
Data Extraction

IT & Programming > Web Programming


Job Description


  • Posted: Sat, Dec 24, 2011
  • Time Left: Closed
  • Location: Anywhere
  • Start: Immediately
  • Budget: $50 - $150
  • Fixed Price Job
  • Elance Escrow Protection
  • W9 Not Required

The task requires writing the source code for a program that can extract information from the very large crawled snapshot of the Web hosted on Amazon S3.

If you have not worked with S3 buckets before then this will be a good opportunity.

The program would take a domain as input, run on an EC2 instance, and produce a tab-separated text file with three columns: SourceURL\tAnchorText\tTargetURL, where the TargetURL points to the input domain.
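The per-page part of that program could look roughly like the following. This is only a minimal sketch, assuming Python and assuming each crawled page is already available as a source URL plus an HTML string; reading the snapshot from S3 and running on EC2 are not shown. The names `AnchorCollector`, `points_to_domain`, `extract_rows` and `to_tsv` are hypothetical, and treating subdomains such as www.amazon.com as part of the input domain is an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class AnchorCollector(HTMLParser):
    """Collects (href, anchor text) pairs from <a> elements."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse whitespace so the text cannot contain tabs or newlines.
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None


def points_to_domain(url, domain):
    """True if the URL's host is the domain or one of its subdomains (assumption)."""
    host = (urlparse(url).hostname or "").lower()
    domain = domain.lower()
    return host == domain or host.endswith("." + domain)


def extract_rows(source_url, html, domain):
    """Return (SourceURL, AnchorText, TargetURL) tuples for links into `domain`."""
    parser = AnchorCollector()
    parser.feed(html)
    parser.close()
    rows = []
    for href, text in parser.links:
        target = urljoin(source_url, href)  # resolve relative links
        if points_to_domain(target, domain):
            rows.append((source_url, text, target))
    return rows


def to_tsv(rows):
    """Format rows as tab-separated lines, one link per line."""
    return "".join("\t".join(row) + "\n" for row in rows)
```

Calling `extract_rows` once per crawled page and appending `to_tsv(rows)` to an output file would give one file per input domain.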

The domains I would like to test are amazon.com, overstock.com, and dealerdirectparts.com; that is, please provide three files, one for each of these domains. These will likely be large files, so ideally you can deliver them directly through S3.

The source code should ideally be in Java, Python, or PHP.


Job ID: 27739350

Alexey L. (identity verified)
 Russia  |  
More than 10 years of experience as a developer of web applications. Experienced in .NET, Python, SQL Server. The latest projects are: - high performance...
 0.0   |  Private   |  0 Jobs
Bid ID: 27741021  |  Submitted: Dec 25, 2011 02:40 ET 
Proposal SEALED

 Bosnia and Herzegovina  |  
PHP & MySQL-savvy Linux Administrator with a passion for open source.
 4.7   |  Private   |  2 Jobs
Bid ID: 27739565  |  Submitted: Dec 24, 2011 18:11 ET 
Proposal SEALED