The client has made the following changes to the job.
Job title changed to "Simple data-extraction script in Google Go (golang) for about 20 sites". Previously: "Simple data-extraction script in Google Go for about 20 sites"
Bidding period extended.
Crawl the New Arrivals sections of about 20 fashion-oriented websites, for example: [obscured] /anthro/category/clothing/shopnew-clothes.jsp
We need somebody to write code that, for each of these roughly 20 websites, takes the HTML of a page as input and does the following:
+ given the contents of a New Arrivals page, extracts the list of products and the URL of each product's page
+ given the contents of a product page, extracts the following: URLs of the detailed product photos, price, product name, and description
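To make the two extraction steps above concrete, here is a minimal sketch of what one per-site parser could look like. Everything in it is an assumption for illustration only: the Product struct, the SiteParser interface, the exampleParser type, and the HTML class names are invented and do not correspond to any real site or to the example parser mentioned in this posting. It uses only the standard library and regular expressions; a real site will likely need a proper HTML parser.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Product holds the fields requested for each product page.
type Product struct {
	Name        string
	Price       string
	Description string
	PhotoURLs   []string
}

// SiteParser is a hypothetical interface; each website gets its own implementation.
type SiteParser interface {
	// ParseListing takes the HTML of a New Arrivals page and returns product page URLs.
	ParseListing(page string) []string
	// ParseProduct takes the HTML of a product page and returns its details.
	ParseProduct(page string) Product
}

// exampleParser targets an invented markup, not any real website.
type exampleParser struct{}

// The patterns below assume made-up class names ("product", "name", and so on).
var (
	productLinkRe = regexp.MustCompile(`<a class="product" href="([^"]+)"`)
	nameRe        = regexp.MustCompile(`<h1 class="name">([^<]*)</h1>`)
	priceRe       = regexp.MustCompile(`<span class="price">([^<]*)</span>`)
	descRe        = regexp.MustCompile(`<p class="description">([^<]*)</p>`)
	photoRe       = regexp.MustCompile(`<img class="detail" src="([^"]+)"`)
)

// firstGroup returns the trimmed first capture group of the first match, or "".
func firstGroup(re *regexp.Regexp, s string) string {
	m := re.FindStringSubmatch(s)
	if m == nil {
		return ""
	}
	return strings.TrimSpace(m[1])
}

// allGroups returns the first capture group of every match, in document order.
func allGroups(re *regexp.Regexp, s string) []string {
	var out []string
	for _, m := range re.FindAllStringSubmatch(s, -1) {
		out = append(out, m[1])
	}
	return out
}

func (exampleParser) ParseListing(page string) []string {
	return allGroups(productLinkRe, page)
}

func (exampleParser) ParseProduct(page string) Product {
	return Product{
		Name:        firstGroup(nameRe, page),
		Price:       firstGroup(priceRe, page),
		Description: firstGroup(descRe, page),
		PhotoURLs:   allGroups(photoRe, page),
	}
}

func main() {
	var p SiteParser = exampleParser{}

	listing := `<a class="product" href="/p/1">A</a><a class="product" href="/p/2">B</a>`
	fmt.Println(p.ParseListing(listing))

	product := `<h1 class="name">Linen Dress</h1><span class="price">$98.00</span>` +
		`<p class="description">Soft and light.</p><img class="detail" src="/img/1.jpg">`
	fmt.Printf("%+v\n", p.ParseProduct(product))
}
```

Keeping one small type per website behind a shared interface means each of the roughly 20 parsers can be written and checked independently against saved HTML.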
The extracted data should be written to a text file in a simple CSV or JSON format.
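As a rough illustration of the output step, the sketch below renders a list of products as JSON (one object per line) and as CSV, then writes each to a text file. The Product struct, the field names, the "|" separator for multiple photo URLs, and the file names products.json and products.csv are all assumptions made for this example; the posting does not specify an exact schema.

```go
package main

import (
	"encoding/csv"
	"encoding/json"
	"log"
	"os"
	"strings"
)

// Product is a hypothetical record; the JSON field names are assumptions.
type Product struct {
	Name        string   `json:"name"`
	Price       string   `json:"price"`
	Description string   `json:"description"`
	PhotoURLs   []string `json:"photo_urls"`
}

// renderJSON returns one JSON object per line, one line per product.
func renderJSON(products []Product) (string, error) {
	var sb strings.Builder
	enc := json.NewEncoder(&sb)
	for _, p := range products {
		if err := enc.Encode(p); err != nil {
			return "", err
		}
	}
	return sb.String(), nil
}

// renderCSV returns a header row followed by one row per product.
// Multiple photo URLs are joined with "|" (an assumed convention).
func renderCSV(products []Product) (string, error) {
	var sb strings.Builder
	w := csv.NewWriter(&sb)
	if err := w.Write([]string{"name", "price", "description", "photo_urls"}); err != nil {
		return "", err
	}
	for _, p := range products {
		row := []string{p.Name, p.Price, p.Description, strings.Join(p.PhotoURLs, "|")}
		if err := w.Write(row); err != nil {
			return "", err
		}
	}
	w.Flush()
	return sb.String(), w.Error()
}

func main() {
	products := []Product{{
		Name:        "Linen Dress",
		Price:       "$98.00",
		Description: "Soft, light.",
		PhotoURLs:   []string{"/a.jpg", "/b.jpg"},
	}}

	jsonOut, err := renderJSON(products)
	if err != nil {
		log.Fatal(err)
	}
	if err := os.WriteFile("products.json", []byte(jsonOut), 0o644); err != nil {
		log.Fatal(err)
	}

	csvOut, err := renderCSV(products)
	if err != nil {
		log.Fatal(err)
	}
	if err := os.WriteFile("products.csv", []byte(csvOut), 0o644); err != nil {
		log.Fatal(err)
	}
}
```

Using encoding/csv rather than joining strings by hand takes care of quoting fields that contain commas or quotes, which product descriptions often do.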
This code needs to be written in Go.
Good style and clear code are important. Unit tests are not required.
System reliability and database integration are not required; all that is needed for this job is writing the individual crawlers for each website (take HTML as input, output structured data to a text file).
One example parser will be provided so that the individual parsers for the new sites can be writt...