It's actually implemented quite simply.
Read, for example:
Oleg Bartunov, Alexander Korotkov
Full-text search in PostgreSQL in milliseconds
Google's achievement is not speed. It's the quality of the results,
which is what "relevance" means: clever neural networks and all that.
For speed alone, plain old FTS is good enough.
The algorithm there is primitive; you could even implement it over a weekend as a warm-up.
Or just use a ready-made, very fast solution.
On your local computer it is slow because search is not
the machine's main function.
If the developers believed that search was a function of paramount importance, they would simply allocate more resources to it: more CPU for indexing, more disk for index storage, more RAM for caching, and so on.
But that would mean taking resources away from more important
functions of the computer.
The FTS algorithm:
1) Split the text into words.
2) Discard function words (prepositions, etc.). The result is the so-called tokens.
3) Run the resulting tokens through a stemming algorithm: snowball.tartarus.org/algorithms/russian/stemmer.html
4) Stuff the resulting ending-stripped words (called terms) into bitmaps: roaringbitmap.org
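The pipeline above can be sketched in a few lines of Python. This is a toy sketch, not a real implementation: the suffix stripper stands in for an actual Snowball stemmer, and the stop-word list is invented for the example.

```python
# Toy normalization pipeline: text -> tokens -> stems.
# The stemmer is a crude suffix stripper standing in for Snowball;
# the stop-word list is made up for this example.
import re

STOP_WORDS = {"the", "a", "an", "of", "in", "on"}

def tokenize(text):
    """Steps 1-2: split into lowercase words, drop function words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]

def stem(word):
    """Step 3: strip common endings (a stand-in for a real stemmer)."""
    for suffix in ("ings", "ing", "es", "s'", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

print([stem(w) for w in tokenize("bears' strength")])  # ['bear', 'strength']
```

A real system would use a proper Snowball stemmer and a language-specific stop-word list, but the shape of the pipeline is the same.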
It will look like this.
Source objects for the search:
a) "Hey, bear"
b) "bears' strength"
a) -> "hey", "bear" -> "hey", "bear"
b) -> "bears", "strength" -> "bear", "strength"
In the index it looks like this:
"bear" -> 11
(a bitmap with one bit per document: the term occurs in both a and b)
Searching for the word "bear":
1) Look it up in the index and get 11, which means the word we want occurs in both the first and the second sentence.
2) Sort the result by relevance.
Searching for the phrase "Hey, bear":
1) Look up the first word and get 10.
2) Look up the second word and get 11.
3) Do a bitwise AND (intersection) of the results: 10.
4) Sort by relevance.
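Both searches reduce to bitmap lookups plus a bitwise AND. A minimal sketch with the same toy two-document index (plain ints instead of Roaring bitmaps, doc a as the high bit):

```python
# Toy index from the example: term -> bitmap (doc a = 10, doc b = 01).
index = {"hey": 0b10, "bear": 0b11, "strength": 0b01}

def search(terms):
    """AND together the bitmaps of all query terms."""
    result = 0b11  # start with "all documents"
    for term in terms:
        result &= index.get(term, 0)
    return result

# Single word "bear": both documents match.
print(format(search(["bear"]), "02b"))         # 11
# Phrase "hey bear": 10 & 11 = 10 -> only document a matches.
print(format(search(["hey", "bear"]), "02b"))  # 10
# A real engine would now sort the surviving matches by relevance.
```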
It is easy to notice:
a) The stemming algorithm can make mistakes.
b) Relevance is calculated purely mechanically.
But speed, even so, is no problem.
On a local computer, search is simply not the main priority.
Making a fast local search is no problem.