Fixed Price: $1,000 - $5,000
| Posted: Mar 11, 2014 | Ends: 85d, 15h |
Fixed Price: Not Sure
| Posted: Mar 03, 2014 | Ends: 3d, 4h |
At PanLex (panlex.org), we have a large backlog of sources, mostly bilingual and multilingual dictionaries, that we would like to have parsed and processed for inclusion in our database. To do this task, you must have a general understanding of our data model ([obscured] /reference/concepts/). The task itself involves writing custom scripts, in a language of your choice, to extract data from files in text, HTML, and PDF format; standardizing and normalizing the extracted data to fit our data model, using our Perl scripts; and submitting the fully processed data for inclusion in the database. For more detail on our workflow, we suggest reading [obscured] /workflow/ and following the source analysis tutorial. In your proposal, please indicate your relevant qualifications for this project and propose an approximate budget per source. We are initially interested in hiring someone to work on one or two sources. If the results are good, we may be interested in hiring on ...