Algorithm to find similar objects?

0 like 0 dislike
11 views
Hello!


Assume that the similarity between objects can be computed by applying a metric to the results that different classifiers give for those objects.


That is, every object has a result (a match) from every classifier.


Question:


How can I find the n most (or least) similar objects to object X when there are on the order of tens of millions of objects and tens of thousands of classifiers?

4 Answers

0 like 0 dislike
If you need to group a very large number of objects, try building a hash function over the classifier results. It must give the same hash to objects that are supposed to belong to the same group; the converse is not guaranteed, so objects with the same hash may still belong to different groups.

With such a hash function you can sort the objects by hash value even if the hash values do not all fit in RAM (a B-tree works, for example).

Once the objects are sorted into groups with equal hashes, you can apply more accurate algorithms to split each group into the desired sub-groups, since the search space inside a group is much smaller.
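A minimal sketch of this idea, assuming each object is stored as a tuple with one label per classifier. The hash here is a simple locality-sensitive scheme: each table keys on a random subset of classifier positions, and bucket candidates are then re-ranked exactly by the fraction of classifiers that agree. The names `build_index`, `agreement` and `most_similar`, and the parameter values, are made up for illustration; real data would need tuning and on-disk storage.

```python
import random
from collections import defaultdict

def build_index(objects, num_tables=8, keys_per_table=4, seed=0):
    """objects: dict mapping object id -> tuple of classifier labels."""
    rng = random.Random(seed)
    dim = len(next(iter(objects.values())))
    # Each table hashes on its own random subset of classifier positions.
    positions = [rng.sample(range(dim), keys_per_table) for _ in range(num_tables)]
    tables = [defaultdict(list) for _ in range(num_tables)]
    for obj_id, labels in objects.items():
        for pos, table in zip(positions, tables):
            table[tuple(labels[i] for i in pos)].append(obj_id)
    return positions, tables

def agreement(a, b):
    """Fraction of classifiers that give both objects the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def most_similar(query_id, objects, positions, tables, n=3):
    query = objects[query_id]
    # Cheap step: collect everything sharing a bucket with the query.
    candidates = set()
    for pos, table in zip(positions, tables):
        candidates.update(table.get(tuple(query[i] for i in pos), []))
    candidates.discard(query_id)
    # Accurate step: exact comparison, but only within the candidates.
    ranked = sorted(candidates, key=lambda c: agreement(query, objects[c]), reverse=True)
    return ranked[:n]
```

Objects that never share a bucket with X are never compared, which is where the saving comes from. The price is that a truly similar object can be missed if it collides in no table, so more tables or fewer keys per table trade speed for recall.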
0 like 0 dislike
Sort and group the objects by the criteria. MapReduce fits this well.
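To illustrate the shape of such a job, here is a single-process sketch, not a real MapReduce deployment. The map step emits a (key, object id) pair, sorting stands in for the shuffle, and the reduce step collects the ids per key. The function names and the grouping criterion in the usage example are assumptions for the example only.

```python
from itertools import groupby
from operator import itemgetter

def map_phase(records, key_fn):
    """Emit one (group key, object id) pair per record."""
    for obj_id, labels in records:
        yield key_fn(labels), obj_id

def reduce_phase(pairs):
    """Group object ids by key; the sort plays the role of the shuffle."""
    ordered = sorted(pairs, key=itemgetter(0))
    for key, group in groupby(ordered, key=itemgetter(0)):
        yield key, [obj_id for _, obj_id in group]

def group_objects(records, key_fn):
    return dict(reduce_phase(map_phase(records, key_fn)))
```

For example, grouping by the labels of the first two classifiers (a hypothetical criterion): `group_objects([("a", (1, 0, 1)), ("b", (1, 0, 0)), ("c", (0, 1, 1))], lambda labels: labels[:2])` puts `a` and `b` into one group and `c` into another.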
0 like 0 dislike
You could try a Kohonen self-organizing map.
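For readers who have not met it: a self-organizing map trains a small grid of weight vectors so that similar inputs end up on the same or neighbouring nodes, and the node index can then serve as a coarse group id. Below is a minimal one-dimensional sketch in plain Python, assuming objects are numeric vectors. The decay schedule and default parameters are arbitrary choices for illustration, and a dataset of the size in the question would need a vectorized or distributed implementation.

```python
import math
import random

def best_matching_unit(weights, vec):
    """Index of the node whose weight vector is closest to vec."""
    return min(
        range(len(weights)),
        key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], vec)),
    )

def train_som(data, num_nodes=4, epochs=50, lr0=0.5, seed=0):
    """Train a 1-D map; data is a list of equal-length numeric vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    radius0 = num_nodes / 2
    weights = [[rng.random() for _ in range(dim)] for _ in range(num_nodes)]
    for epoch in range(epochs):
        frac = 1 - epoch / epochs            # decays linearly from 1 toward 0
        lr = lr0 * frac
        radius = max(radius0 * frac, 0.5)
        for vec in rng.sample(data, len(data)):   # shuffled pass over the data
            bmu = best_matching_unit(weights, vec)
            for j, w in enumerate(weights):
                # Nodes near the winner on the map are pulled more strongly.
                influence = math.exp(-((j - bmu) ** 2) / (2 * radius ** 2))
                for k in range(dim):
                    w[k] += lr * influence * (vec[k] - w[k])
    return weights
```

After training, `best_matching_unit(weights, x)` gives the node for object X, and only objects on that node and its neighbours need an exact comparison.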
0 like 0 dislike
I would use cluster analysis for this. The FOREL algorithm is fast enough; it is worth a look.
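A minimal sketch of FOREL, one fast clustering algorithm of this kind, assuming objects are numeric points and Euclidean distance is meaningful for them. It places a sphere of fixed radius on a point, moves the sphere to the centroid of the points it covers until the covered set stops changing, records those points as a cluster, removes them, and repeats. Starting from the first remaining point instead of a random one, and the `radius` value itself, are choices made for this example.

```python
import math

def _within(points, center, radius):
    return [p for p in points if math.dist(p, center) <= radius]

def _centroid(points):
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def forel(points, radius):
    """Split points (tuples of numbers) into clusters of the given radius."""
    remaining = list(points)
    clusters = []
    while remaining:
        # Start the sphere on the first unclustered point.
        inside = _within(remaining, remaining[0], radius)
        while True:
            # Move the sphere to the centroid of the points it covers.
            moved = _within(remaining, _centroid(inside), radius)
            if moved == inside:      # covered set is stable: cluster found
                break
            inside = moved
        clusters.append(inside)
        remaining = [p for p in remaining if p not in inside]
    return clusters
```

With `radius=2`, the points `(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)` come out as two clusters: the first three points and the last two. Finding neighbours of X then reduces to searching inside X's cluster.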
