Python web scraper written for CS169. It scrapes Eta Kappa Nu's (HKN) course survey website for ratings and Berkeleytime's course catalog for class information and units. Written in Python 2.7 using Scrapy.
Commit: db924f5426 "initial commit" by Clay Shieh
Files: great_course_guide, README.txt, scrape.sh, scrapy.cfg

README.txt

To use the scraper, run ./scrape.sh with one or both of the following options:

-h : scrapes HKN's course survey guide for all the ratings

-b : scrapes Berkeleytime's course catalog and the associated courses for units
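
The contents of scrape.sh are not shown here, so the following is only a rough sketch, in Python, of how the two flags above could be parsed and mapped to scrape targets. The function name parse_flags and the target names "hkn" and "berkeleytime" are hypothetical and are not taken from the project. Note that -h normally means "help" in argparse, so the sketch has to disable the built-in help flag before reusing it.

```python
import argparse


def parse_flags(argv):
    """Return the scrape targets selected by -h and/or -b, in a fixed order."""
    # add_help=False frees up -h, which argparse otherwise reserves for --help.
    parser = argparse.ArgumentParser(prog="scrape.sh", add_help=False)
    parser.add_argument("-h", dest="hkn", action="store_true",
                        help="scrape HKN's course survey ratings")
    parser.add_argument("-b", dest="berkeleytime", action="store_true",
                        help="scrape Berkeleytime's catalog and units")
    args = parser.parse_args(argv)

    targets = []
    if args.hkn:
        targets.append("hkn")           # hypothetical target name
    if args.berkeleytime:
        targets.append("berkeleytime")  # hypothetical target name
    return targets


if __name__ == "__main__":
    import sys
    print(parse_flags(sys.argv[1:]))
```

Under these assumptions, passing both flags selects both targets, and passing neither selects nothing.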

Each run creates a new directory named run_<datetime>/
and stores all files generated by the scrape in that directory.
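
As an illustration of the per-run directory, here is a minimal Python sketch that builds a run_<datetime> directory name and creates it. The exact timestamp format used by scrape.sh is not stated in this README, so the "%Y-%m-%d_%H-%M-%S" format and the helper name make_run_dir are assumptions.

```python
import os
import tempfile
from datetime import datetime


def make_run_dir(base_dir, now=None):
    """Create and return base_dir/run_<datetime> for a single scrape run."""
    if now is None:
        now = datetime.now()
    # Assumed timestamp format; the real script may format <datetime> differently.
    name = "run_" + now.strftime("%Y-%m-%d_%H-%M-%S")
    path = os.path.join(base_dir, name)
    # os.makedirs raises an error if the directory already exists, so two runs
    # started within the same second would not silently share a directory.
    os.makedirs(path)
    return path


if __name__ == "__main__":
    # Demo in a throwaway temporary directory.
    print(make_run_dir(tempfile.mkdtemp()))
```

Because each name embeds the start time, output from earlier runs is kept rather than overwritten.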

Examples:
To scrape only HKN:
./scrape.sh -h
To scrape only Berkeleytime:
./scrape.sh -b
To scrape both HKN and Berkeleytime:
./scrape.sh -h -b