Fuzzy search?


There are two strings: the first is short (1-3 words), the second is longer (10-20 words). We need to determine whether the first string occurs in the second, or what percentage of it is present there. Please recommend algorithms :)

7 Answers

I once wrote a thesis on this. My conclusion was that Russian words are best compared by the length of their longest common prefix, taken as a percentage of the length of the shorter word; that percentage should be above a threshold. As for the comparison itself: compare the words of the two strings pairwise, and derive the similarity function from the distances between the similar words.
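A minimal sketch of the prefix comparison described above, in Python. The function names and the 0.7 threshold are assumptions for illustration; the answer does not give a concrete threshold.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Length of the longest common prefix of a and b."""
    n = 0
    for ca, cb in zip(a, b):
        if ca != cb:
            break
        n += 1
    return n


def prefix_similarity(a: str, b: str) -> float:
    """Common prefix length as a fraction of the shorter word's length."""
    a, b = a.lower(), b.lower()
    shorter = min(len(a), len(b))
    if shorter == 0:
        return 0.0
    return common_prefix_len(a, b) / shorter


def words_match(a: str, b: str, threshold: float = 0.7) -> bool:
    """True if the prefix similarity reaches the threshold (0.7 is an assumed value)."""
    return prefix_similarity(a, b) >= threshold


print(prefix_similarity("книга", "книги"))  # 4 shared letters out of 5
print(words_match("книга", "книги"))
```

Note that very short words match almost anything under this measure (a one-letter word matches every word starting with that letter), so a minimum word length is probably worth adding in practice.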
if(longStr.Contains(shortStr))
There are many different distance measures between words. I would split both phrases into words, take for each word the maximum similarity over its pairwise comparisons with the words of the other phrase, and then average those maxima.
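The averaging idea above could look roughly like this in Python. The word-level measure here is `difflib.SequenceMatcher.ratio()` from the standard library, chosen only as a stand-in; the answer leaves the measure open, and the function names are hypothetical.

```python
from difflib import SequenceMatcher


def word_similarity(a: str, b: str) -> float:
    """Similarity of two words in [0, 1]; SequenceMatcher is an assumed stand-in measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def phrase_coverage(short: str, long: str) -> float:
    """Average, over the words of `short`, of the best match found in `long`."""
    short_words = short.split()
    long_words = long.split()
    if not short_words or not long_words:
        return 0.0
    best = [max(word_similarity(s, w) for w in long_words) for s in short_words]
    return sum(best) / len(best)


print(phrase_coverage("fuzy serch", "an example of fuzzy search in a long text"))
```

Multiplying the result by 100 gives a percentage of how much of the short string is present in the long one. Word order is ignored in this sketch.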
You can compare the strings using the trigram method. It gives a usable result even when the words have different endings, etc.
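A rough Python sketch of a trigram comparison. Since the question asks how much of the short string is present in the long one, this version measures containment (the share of the short string's trigrams found in the long string) rather than a symmetric score; that choice and the function names are assumptions.

```python
def trigrams(text: str) -> set:
    """Set of all 3-character substrings of the lowercased text."""
    text = text.lower()
    if len(text) < 3:
        # Too short for a trigram: fall back to the text itself.
        return {text} if text else set()
    return {text[i:i + 3] for i in range(len(text) - 2)}


def trigram_containment(short: str, long: str) -> float:
    """Fraction of the trigrams of `short` that also occur in `long`."""
    short_grams = trigrams(short)
    if not short_grams:
        return 0.0
    return len(short_grams & trigrams(long)) / len(short_grams)


print(trigram_containment("searches", "fuzzy searching"))  # differing endings still overlap
```

Here "searches" and "searching" share the trigrams of the common stem, so the score stays well above zero even though the endings differ.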
Traditionally, the Levenshtein distance.
But I would recommend using the longest common subsequence. With it, you can introduce a penalty for the gaps between words.
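Both measures mentioned above can be sketched in Python with the standard dynamic-programming formulations. The function names are hypothetical, and the gap penalty from the answer is not implemented here; `lcs_coverage` is just one assumed way to turn the subsequence length into a share of the short string.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence (characters need not be adjacent)."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, 1):
            if ca == cb:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def lcs_coverage(short: str, long: str) -> float:
    """Share of the characters of `short` that appear, in order, in `long`."""
    if not short:
        return 0.0
    return lcs_length(short.lower(), long.lower()) / len(short)


print(levenshtein("kitten", "sitting"))
print(lcs_coverage("fuzy serch", "an example of fuzzy search"))
```

A plain subsequence match is generous on long strings, because the matched characters may be scattered far apart. That is where a gap penalty would come in.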