Python basic language processor
natural language processing
Summary: A basic program for a newsroom that searches for phrases and words, line by line, in a series of 30+ lines and outputs an array listing the tags found for each sentence.
1. Write Python code that allows input of 30-40 lines of text into a text box.
2. A button to store each line into memory.
3. Cleaning up text: Set up basic search-and-replace functions that I can reuse on each line to make the wording consistent for searches. For example, a SearchReplace function given (China trader, Chinese trading firm) would replace "China trader" with "Chinese trading firm" in the line.
4. Tagging text: Set up basic search functions for words and phrases, plus a function that stores the results in an array of eight tags. For example, if "Chinese trading firm" is found, store "Chinese trading firm" in the first position of the array; if "offered a cargo" is found, store "off-screen offer" in the second position, and so on.
Note: This job does not include setting up every search function needed to process all the text. That will be work for me, and a subsequent job for you, to tailor further, such as adding regular-expression searches for things like prices. Steps 3 and 4 of this job are to set up the framework of the program.
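The framework described in steps 3 and 4 might be sketched roughly as follows. This is a minimal illustration, not the finished program: the names `REPLACEMENTS`, `TAG_RULES`, `search_replace`, `clean_line`, `tag_line`, and `process_lines` are hypothetical, the only rules included are the two examples from the steps above, and matching is assumed to ignore upper and lower case.

```python
import re

# Step 3: cleanup rules as (text to find, replacement) pairs.
# Add more pairs here as new wording variations turn up.
REPLACEMENTS = [
    ("China trader", "Chinese trading firm"),
]

# Step 4: tagging rules as (phrase to find, array position 0-7, tag to store).
TAG_RULES = [
    ("Chinese trading firm", 0, "Chinese trading firm"),
    ("offered a cargo", 1, "off-screen offer"),
]

NUM_TAGS = 8  # each line gets an array with eight tag positions


def search_replace(line, old, new):
    """Replace every occurrence of `old` with `new`, ignoring case."""
    # re.escape treats `old` as plain text; the lambda inserts `new` as-is.
    return re.sub(re.escape(old), lambda match: new, line, flags=re.IGNORECASE)


def clean_line(line):
    """Apply every cleanup rule to one line, in order."""
    for old, new in REPLACEMENTS:
        line = search_replace(line, old, new)
    return line


def tag_line(line):
    """Return an eight-item list with a tag in each position that matched."""
    tags = [""] * NUM_TAGS
    lowered = line.lower()
    for phrase, position, tag in TAG_RULES:
        if phrase.lower() in lowered:
            tags[position] = tag
    return tags


def process_lines(lines):
    """Clean and tag every line; the result is one row of tags per line."""
    return [tag_line(clean_line(line)) for line in lines]
```

Keeping the rules in plain lists at the top means later tailoring is mostly a matter of adding rows rather than changing the functions.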
Ultimately, beyond this job, the program would take the line "A south China-based trading firm offered a cargo of Atlas fines on a 57% basis with June delivery at /t."
and store it as the series: South China-based trading firm, off-screen offer, Atlas fines, 57%, June delivery, /t, fixed-price, date
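Looking ahead to that later work, the sample line above hints at how regular-expression rules could sit beside the plain phrase rules. A rough sketch only, assuming the hypothetical names `PERCENT_RE`, `DELIVERY_RE`, and `extract_patterns`; it covers just the percentage and the delivery month, and leaves the price out because the sample does not show its format.

```python
import re

# A number followed by a percent sign, such as "57%" or "62.5%".
PERCENT_RE = re.compile(r"\b\d+(?:\.\d+)?%")

# A month name followed by the word "delivery", such as "June delivery".
MONTHS = (
    "January|February|March|April|May|June|"
    "July|August|September|October|November|December"
)
DELIVERY_RE = re.compile(r"\b(?:" + MONTHS + r") delivery\b", re.IGNORECASE)


def extract_patterns(line):
    """Return the first percentage and delivery month found, or empty strings."""
    percent = PERCENT_RE.search(line)
    delivery = DELIVERY_RE.search(line)
    return {
        "basis": percent.group(0) if percent else "",
        "delivery": delivery.group(0) if delivery else "",
    }
```

Each pattern returns the matched text itself rather than a fixed tag, which is what distinguishes these rules from the phrase rules in steps 3 and 4.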
5. A button to execute functions in steps 3 and 4.
6. A button to output the array to the clipboard or an output box so the user can copy and paste it into Excel as comma-delimited rows and columns.
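Steps 1, 2, 5 and 6 could be wired together roughly as below. This sketch assumes Tkinter (the GUI toolkit bundled with Python) for the text box and buttons, and the standard `csv` module for the comma-delimited output. The names `split_lines`, `rows_to_csv`, and `build_app` are hypothetical, and `build_app` expects to be handed whatever function does the cleanup and tagging from steps 3 and 4.

```python
import csv
import io


def split_lines(text):
    """Step 2: turn the text-box contents into a list of non-empty lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def rows_to_csv(rows):
    """Step 6: format rows of tags as comma-delimited text."""
    buffer = io.StringIO()
    # The csv module adds quotes around any tag that itself contains a comma.
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def build_app(process_lines):
    """Build the window; `process_lines` takes a list of lines, returns rows."""
    import tkinter as tk  # imported here so the helpers above work without a display

    root = tk.Tk()
    root.title("Line tagger")

    # Step 1: a text box large enough for 30-40 lines.
    box = tk.Text(root, height=40, width=100)
    box.pack()

    state = {"lines": [], "rows": []}  # what the buttons share between clicks

    def store():
        # Step 2: read everything in the box and keep it in memory.
        state["lines"] = split_lines(box.get("1.0", "end-1c"))

    def run():
        # Step 5: run the cleanup and tagging functions on the stored lines.
        state["rows"] = process_lines(state["lines"])

    def copy():
        # Step 6: put the comma-delimited result on the clipboard.
        root.clipboard_clear()
        root.clipboard_append(rows_to_csv(state["rows"]))

    tk.Button(root, text="Store lines", command=store).pack()
    tk.Button(root, text="Clean and tag", command=run).pack()
    tk.Button(root, text="Copy for Excel", command=copy).pack()
    return root
```

To try it, call `build_app(your_function).mainloop()`, where `your_function` is the processing function. One thing worth checking early: Excel usually splits pasted text into columns on tabs rather than commas, so passing `delimiter="\t"` to `csv.writer` may paste more cleanly than the comma-delimited form.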
In short, part of this job is building a framework that leaves room for later additions and tailoring by me and/or you in a subsequent project. If this project interests you, please let me know in your response your comfort level working with someone like me, who has both feet in journalism and just a toe dipped in coding, so readable code and helpful comments are important.
I would give you sample text and sample output to give you an idea of the end goal, but the aim of this job is not to cover every possible combination of searches needed for that goal, just to set up the framework of search functions for later tailoring.