Richard Penman | Elance
 
Last Sign-in: Apr 18, 2014

Richard Penman

Web scraping expert
Australia | Melbourne, Victoria | 12:48 am Local Time

Overview

Hi there - for the last few years I have specialized in web scraping. In this time I have extracted data from websites that require parsing JavaScript/AJAX, using proxies, or solving CAPTCHAs, and from websites that contain millions of records.

I use Amazon's EC2 for my web crawling infrastructure.
Much of my web scraping library is open sourced to make sharing solutions with clients easier: http://code.google.com/p/webscraping

I can provide scraped data in whatever format is required: CSV, XML, MySQL, etc. (Most clients prefer CSV)
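
As a rough illustration of what delivering the same records in more than one format can look like, here is a minimal Python sketch using only the standard library. The sample records, the field names, and the helper names `to_csv` and `to_xml` are made up for this example and are not taken from any real project or from the library linked above.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample records; field names are illustrative only.
RECORDS = [
    {"name": "Example Cafe", "address": "1 Sample St", "phone": "555-0100"},
    {"name": "Demo Diner", "address": "2 Test Ave", "phone": "555-0101"},
]


def to_csv(records, fields):
    """Serialize records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_xml(records, fields):
    """Serialize records to an XML string with one <record> element per row."""
    root = ET.Element("records")
    for record in records:
        node = ET.SubElement(root, "record")
        for field in fields:
            # Missing fields become empty elements rather than raising.
            ET.SubElement(node, field).text = record.get(field, "")
    return ET.tostring(root, encoding="unicode")


FIELDS = ["name", "address", "phone"]
csv_text = to_csv(RECORDS, FIELDS)
xml_text = to_xml(RECORDS, FIELDS)
```

Keeping the records as plain dictionaries until the last step means the output format can be switched without touching the extraction code.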

I speak native English,...

Mar 29, 2014 | Software Application | $200 | Completed | 5.0
Feb 9, 2014 | Other IT & Programming | $300 | Completed | 5.0
Feb 5, 2014 | Data Analysis | $0 | Working
Jan 14, 2014 | Web Programming | $0 | Working


Portfolio

Scrapes jobs from job websites and displays them here in a consistent format
Scraped book details for categories into CSV
Scraped result rankings and ads for keywords
Scraped locations of dentists and other professionals
Scraped result rankings and ads for keywords
Scraped details of tradesmen
Scraped motels and restaurant details
Scraped result rankings and ads for keywords
Scraped results for various keywords
Scraped restaurants in multiple locations
Scraped product and seller details for various categories

Skills (12)

Web scraping
Web crawling
HTML5 (HTML 5 Test)
Linux (Linux Test)
Twitter Development (Twitter Developer Test)
Yahoo Developer Skills (Yahoo Developer Skills - YAP Test; Yahoo Developer Skills - Data APIs Test; Yahoo Developer Skills - Tools and Libraries Test)
HTML (HTML 4.01 Test)
Google App Engine (Google App Engine Test)
jQuery (Knowledge of jQuery 1.3.2 Skills Test)
Amazon Web Services (Amazon Web Services Test)
RSS (RSS 2.0 Test)
RDF (RDF Test)

Service Description

This is my usual workflow for web scraping projects:

1) Discuss what data is needed:
- Website URL
- Output fields (business name, address, phone, email, etc.)
- Output format (CSV, XML, MySQL, etc.)


2) Agree on the budget.


3) I prepare a sample and make adjustments based on feedback.


4) Once downloading has finished, I send the full output.

I can still make changes to the output format at this time - all downloaded webpages are cached so it is quick to prepare an update.

If the output file is large, I will provide a download link from my server.
If downloading takes a while, I can provide regular updates of the data on request.
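
The caching idea in this workflow can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code of the library linked in the overview: the class name `PageCache`, the one-file-per-URL layout, and the stand-in `fake_fetch` function are all hypothetical. The point it shows is that once a page is stored on disk, the output can be regenerated without downloading the page again.

```python
import hashlib
import os
import tempfile
import urllib.request


class PageCache:
    """Store downloaded HTML on disk, keyed by a hash of the URL."""

    def __init__(self, directory, fetch=None):
        self.directory = directory
        # fetch is injectable so the cache can be exercised without a network.
        self.fetch = fetch or self._download
        os.makedirs(directory, exist_ok=True)

    def _path(self, url):
        digest = hashlib.sha1(url.encode("utf-8")).hexdigest()
        return os.path.join(self.directory, digest + ".html")

    @staticmethod
    def _download(url):
        with urllib.request.urlopen(url) as response:
            return response.read().decode("utf-8", errors="replace")

    def get(self, url):
        """Return cached HTML if present; otherwise download and store it."""
        path = self._path(url)
        if os.path.exists(path):
            with open(path, encoding="utf-8") as handle:
                return handle.read()
        html = self.fetch(url)
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(html)
        return html


# Demo with a stand-in fetch function, so no network access is needed.
calls = []


def fake_fetch(url):
    calls.append(url)
    return "<html>hello</html>"


cache = PageCache(tempfile.mkdtemp(), fetch=fake_fetch)
first = cache.get("http://example.com/page")   # triggers fake_fetch
second = cache.get("http://example.com/page")  # served from disk
```

With a cache like this in place, changing the output fields or format only requires re-running the extraction step over the stored pages.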


Employment

WebScraping.com
Programmer
2008 - Present
Extracting data from websites and automating website services
The University of Melbourne
Research Assistant
2005 - 2008
Research Assistant in various computer science fields: computer vision, information retrieval, augmented reality

Education

The University of Melbourne
Bachelor of Software Engineering
2003 - 2007

Keywords

Python
web scraping
scraper
crawling
crawler
bot
linux
google app engine
web2py
webkit
My Snapshot
IT & Programming
Elance Level: 8
Reviews: 4.9
Identity
Username: sitescraper
Type: Individual
Member Since: May 2009
Verifications
0
Groups