Missing Known Relevant Results

I am searching for just the two most relevant documents with Marqo like so:

mq.index("my-index").search("question", limit=2)

The results are not consistent, though. I tested with a small dataset, and sometimes I do not even get the first result right (that is, there are closer vectors in my data).

In Marqo, the limit parameter is analogous to the ef search parameter of other HNSW implementations, which sets how many candidates the search keeps while traversing the graph.

That means a search with a small limit explores fewer neighbours and may miss the best ones. I would recommend setting limit to at least 20 to increase the recall of the search; for larger indexes you will be better off with a higher limit.
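If you want to see how much recall a given limit costs on a small dataset, you can compare the IDs Marqo returns against an exact ranking you compute yourself by brute force. This is only a minimal sketch: `recall_at_k` is a hypothetical helper, not part of Marqo, and it assumes you already have both ID lists ordered from best to worst.

```python
def recall_at_k(returned_ids, exact_ids, k):
    """Fraction of the true top-k IDs that also appear in the returned top-k.

    returned_ids: IDs from the approximate search, best first.
    exact_ids:    IDs from a brute-force ranking, best first (assumed given).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    expected = set(exact_ids[:k])
    if not expected:
        return 0.0
    found = expected.intersection(returned_ids[:k])
    return len(found) / len(expected)
```

Running this for a few values of limit on the same queries should show whether a larger limit recovers the neighbours you were missing.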

If you are only interested in the top two results, then just take the first two from the results, like so:

results = mq.index("my-index").search("question", limit=50)

hits = results["hits"][:2]
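If you do this in several places, the same pattern can be wrapped in a small function. This is a sketch rather than anything Marqo provides: `top_k` and its `min_limit` parameter are hypothetical names. It assumes `search` is a callable that takes the query and a `limit` keyword and returns a dict with a "hits" list, as in the calls above.

```python
def top_k(search, query, k, min_limit=20):
    """Search with a limit of at least min_limit, then keep the first k hits.

    search: callable taking (query, limit=...) and returning a dict with a
            "hits" list, for example mq.index("my-index").search (assumed).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Never ask for fewer than k results, and never fewer than min_limit.
    limit = max(k, min_limit)
    results = search(query, limit=limit)
    return results["hits"][:k]
```

With the index from the example, this would be called as top_k(mq.index("my-index").search, "question", k=2, min_limit=50).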