I have a large amount of data in a flat file, which is just strings separated by newlines like this:
All I wish to ask of my data is "is string xyz in the dataset", also known as candidacy checking. The dataset is never added to, nor subtracted from, once it is created.
The problem is that I need this candidacy checking to happen very fast, many times per second, and the size of my data is several gigabytes. When I took a small sample of my data (2.5 GB) and put it into an indexed SQLite database, the resulting database was 10 GB and was too slow. The slowness was almost entirely due to the fact that at 10 GB the dataset had to be stored on disk rather than in memory, and caching cannot speed up my query times because the string being queried is essentially random.
The plan therefore is to put my data into a trie data structure, which will hopefully be far more compact and will be able to fit into memory. To make the...
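To illustrate the kind of structure meant here, below is a minimal sketch of a trie built from newline-separated strings and used only for membership checks. It is an assumption-laden illustration, not the implementation the job asks for: the names TrieNode, Trie and build_trie are hypothetical, and a pure-Python trie with one dict per node will use far more memory than the raw data, so a real solution would need a much more compact node layout.

```python
class TrieNode:
    # __slots__ avoids a per-instance __dict__, saving some memory per node.
    __slots__ = ("children", "terminal")

    def __init__(self):
        self.children = {}     # maps one character to the child node
        self.terminal = False  # True if a stored string ends at this node


class Trie:
    """Hypothetical build-once, read-only trie for membership checks."""

    def __init__(self):
        self.root = TrieNode()

    def add(self, word):
        node = self.root
        for ch in word:
            child = node.children.get(ch)
            if child is None:
                child = TrieNode()
                node.children[ch] = child
            node = child
        node.terminal = True

    def __contains__(self, word):
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False
        # A prefix of a stored string is not itself a member.
        return node.terminal


def build_trie(lines):
    """Build a trie from an iterable of newline-terminated strings."""
    trie = Trie()
    for line in lines:
        word = line.rstrip("\n")
        if word:  # skip blank lines
            trie.add(word)
    return trie


# Made-up sample data standing in for the flat file.
demo = build_trie(["apple\n", "apply\n", "banana\n"])
found = "apply" in demo
prefix_only = "app" in demo
missing = "cherry" in demo
```

Because build_trie accepts any iterable of lines, an open file object could be passed in directly. Lookup cost depends on the length of the query string, not on the number of stored strings, which is the property that makes a trie attractive for random queries.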