

 (2 results)  
Results for 'MapReduce'
Hourly Rate: Not Sure   |  Duration: Not Sure  |  Posted: Jul 22, 2015  |  Ends: 1d, 9h  |   4 Proposals
Need finished in 3 days!

Objective: Process PDFs to create counts of key words and key phrases in a database table.

High-level process (basic system):
a. Input: Medical journals in PDF
b. Step 1: Use Apache Tika to convert the PDFs to text files
c. Step 2: MapReduce to count words/phrases, output to tab-separated files
d. Step 3: Load the tab-separated files into database tables

I want to be clear on the requirements:
1. Must be able to query the finished tables (the word count table in 3 and the word combination count table in 4)
2. Table to be organized similar to the attached sheet
3. Table should include a count of all words that appear
4. Table should include 5 two-word combinations: 'positive results', 'negative results', 'positive response', 'negative response', 'in remission'

You will need to download the PDF files from an SFTP site and then re-upload the .dat files to my SFTP.
Category: Data Engineering       
Skills: MapReduce, apache tika       
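
For readers unfamiliar with the counting step, the MapReduce stage of the posting above (Step 2) might look roughly like the following in-memory sketch. It is not the client's implementation: it assumes the text has already been extracted from the PDFs (for example by Apache Tika), it runs the map, shuffle, and reduce phases in a single Python process rather than on a cluster, and the names `map_document`, `reduce_counts`, `run_job`, and `to_tsv` are hypothetical. Tokenization is a simple lowercase regex, so a phrase that straddles a sentence boundary would still be counted.

```python
import re
from itertools import groupby

# The five two-word combinations listed in the posting.
PHRASES = {
    ("positive", "results"),
    ("negative", "results"),
    ("positive", "response"),
    ("negative", "response"),
    ("in", "remission"),
}

# Assumption: words are runs of lowercase letters, digits, or apostrophes.
TOKEN_RE = re.compile(r"[a-z0-9']+")


def map_document(text):
    """Map step: emit ((kind, term), 1) for each word and tracked phrase."""
    tokens = TOKEN_RE.findall(text.lower())
    for tok in tokens:
        yield ("word", tok), 1
    # Adjacent token pairs are checked against the tracked phrase set.
    for pair in zip(tokens, tokens[1:]):
        if pair in PHRASES:
            yield ("phrase", " ".join(pair)), 1


def reduce_counts(pairs):
    """Shuffle and reduce: sort by key, then sum the values for each key."""
    for key, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield key, sum(value for _, value in group)


def run_job(documents):
    """Run map over every document, then reduce the combined output."""
    pairs = [kv for doc in documents for kv in map_document(doc)]
    return dict(reduce_counts(pairs))


def to_tsv(counts, kind):
    """Render one table ('word' or 'phrase') as tab-separated lines."""
    rows = sorted((key[1], n) for key, n in counts.items() if key[0] == kind)
    return "\n".join(f"{term}\t{n}" for term, n in rows)


if __name__ == "__main__":
    docs = [
        "Positive results were seen. Patient in remission.",
        "Negative response; positive results again.",
    ]
    counts = run_job(docs)
    print(to_tsv(counts, "phrase"))
```

The two tables requested in the posting correspond to calling `to_tsv(counts, "word")` and `to_tsv(counts, "phrase")`; each output could then be written to a `.dat` file and bulk-loaded into a database table.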

| n****dev
|    United States
Fixed Price: $1,000 - $5,000   |  Posted: Jul 01, 2015  |  Ends: 55d, 16h  |   24 Proposals
Note: Project based in USA but this is 100% remote, work from anywhere.

We are developing a customer intelligence platform that helps software as a service, e-commerce, mobile, and social businesses make smarter decisions using person-based data. Our mission is to help businesses delight their customers. We're currently tracking billions of activities per month. Everyone in our engineering team helps solve interesting problems with scaling, data science, visualization, API architecture, and delivering insights.

WHAT WE'RE LOOKING FOR
We track a LOT of data and are constantly optimizing getting the data in and out of our systems. We are looking for candidates that love tackling the types of problems that come with this. We use a mix of many different technologies and while our environment changes, here are some of the current technologies we are using: Git, Key-Value Stores, Distributed systems, Python, C++, C, Ruby, Map/Reduce, PostgreSQL.

Help us build solutions that make you and our c...
Category: Other IT & Programming       
Skills: C, Python, Ruby, C++, MapReduce       

| I****USA *