Monday, March 9, 2015

Data Scraping from Online PDFs - oDesk

- Obtain the PDFs of inspection reports for primary and secondary schools from a website. To reach them, search for either "Primary" or "Secondary" in the site's search tab, click on the name of each school, and download that school's PDFs. There are about 6,000 schools with 8 PDFs each (roughly 48,000 files in total).



- From each PDF, extract the "name of the inspector", which appears at the end of the PDF.



- Save the name of the inspector and the identifier for the school in an Excel file.
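
The last two steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a finished solution: it assumes the text of each PDF has already been extracted with a PDF library of your choice, that the name follows a label such as "Inspector:" or "Lead Inspector:" (the posting does not say how the name is printed), and that a CSV file, which Excel opens directly, is an acceptable stand-in for an Excel file. The names find_inspector, build_rows, and write_rows are hypothetical helpers, not part of any library.

```python
import csv
import io
import re

# Assumption: the name is printed after a label like "Inspector:" or
# "Lead Inspector:". The posting does not give the real format.
INSPECTOR_RE = re.compile(r"(?:Lead\s+)?Inspector\s*:\s*(.+)", re.IGNORECASE)


def find_inspector(text):
    """Return the last labelled inspector name in the text, or None."""
    matches = INSPECTOR_RE.findall(text)
    # The name is said to be at the end of the PDF, so take the last match.
    return matches[-1].strip() if matches else None


def build_rows(texts_by_school):
    """Turn {school_id: {pdf_name: extracted_text}} into output rows."""
    rows = []
    for school_id, pdfs in sorted(texts_by_school.items()):
        for pdf_name, text in sorted(pdfs.items()):
            # An empty string marks PDFs where no name was found,
            # so they can be reviewed by hand.
            rows.append((school_id, pdf_name, find_inspector(text) or ""))
    return rows


def write_rows(rows, out):
    """Write the rows as CSV to an open text file object."""
    writer = csv.writer(out)
    writer.writerow(["school_id", "pdf_name", "inspector"])
    writer.writerows(rows)


if __name__ == "__main__":
    # Made-up sample text standing in for extracted PDF contents.
    sample = {"S001": {"report1.pdf": "Report body\nLead Inspector: Jane Doe\n"}}
    buffer = io.StringIO()
    write_rows(build_rows(sample), buffer)
    print(buffer.getvalue())
```

Keeping one row per PDF rather than one per school preserves the case where different reports for the same school name different inspectors.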



Posted On: March 09, 2015 08:55 UTC

ID: 205244778

Country: United States

from Online Job Search
