V4 Similarity support #327

Closed
wants to merge 30 commits into from

@nzdev nzdev commented Feb 1, 2023

What?

This pull request adds support for setting the Similarity used for querying a Lucene Index.

Why?
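The Similarity determines how documents are scored against a query. Choosing one that suits the content (for example BM25 instead of the classic TF-IDF scoring) can produce more relevant results.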

Example:

var searcher = (BaseLuceneSearcher)indexer.Searcher;
var query = searcher.CreateQuery("cOntent",
    BooleanOperation.And,
    searcher.LuceneAnalyzer,
    searchOptions: new LuceneSearchOptions
    {
        Similarity = LuceneSearchOptionsSimilarities.BM25
    }).All();
var results = query.Execute();

How?

  • Added SimilarityDefinitionCollection to IndexOptions to set up available similarity definitions
  • Added IndexSimilaritiesFactory to IndexOptions to set up available similarity types
  • Added LuceneSearchOptions.SimilarityName property to set the Similarity on a query.
  • Added SimilarityDefinitionCollection.AddExamineLuceneSimilarities() and LuceneSearchOptionsSimilarities, which provide a set of preconfigured defaults
  • Set LuceneSearchOptionsSimilarities.ExamineDefault to the existing default similarity (Lucene.Net.Search.Similarities.DefaultSimilarity)
  • Added recommendation to change LuceneSearchOptionsSimilarities.ExamineDefault to BM25Similarity in Examine V5
  • Added DictionaryPerFieldSimilarityWrapper as a simple way of setting up per field Similarities.
  • Added documentation
  • Added tests
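
To illustrate the per-field idea from the list above, here is a minimal sketch built only on Lucene.NET 4.8's PerFieldSimilarityWrapper. It is not the DictionaryPerFieldSimilarityWrapper added by this pull request; the class name FieldMapSimilarity and the field names "title" and "body" are hypothetical.

```csharp
using System.Collections.Generic;
using Lucene.Net.Search.Similarities;

// Hypothetical illustration only; not the class from this PR.
public sealed class FieldMapSimilarity : PerFieldSimilarityWrapper
{
    private readonly IReadOnlyDictionary<string, Similarity> _perField;
    private readonly Similarity _fallback;

    public FieldMapSimilarity(
        IReadOnlyDictionary<string, Similarity> perField,
        Similarity fallback)
    {
        _perField = perField;
        _fallback = fallback;
    }

    // Lucene passes the field name and uses the returned Similarity for that field.
    public override Similarity Get(string name)
        => _perField.TryGetValue(name, out var similarity) ? similarity : _fallback;
}

public static class FieldMapSimilarityExample
{
    // Assumed field names: short "title" text and long "body" text.
    public static FieldMapSimilarity Create() =>
        new FieldMapSimilarity(
            new Dictionary<string, Similarity>
            {
                ["title"] = new LMJelinekMercerSimilarity(0.1f),
                ["body"] = new LMJelinekMercerSimilarity(0.7f),
            },
            new BM25Similarity());
}
```

Fields without an entry fall back to the similarity passed as the second constructor argument, so unmapped fields are still scored consistently.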

Similarity types

These are the default similarity types provided with Examine.

Similarity Name | Description
--- | ---
Examine.Default | Default Similarity for Examine Lucene (V3/V4: Lucene.Classic; V5: Lucene.BM25)
Lucene.Classic | Classic Similarity for Lucene
Lucene.BM25 | BM25Similarity with default parameters for Lucene
Lucene.LMDirichlet | LMDirichletSimilarity with default parameters for Lucene
Lucene.LMJelinekMercerTitle | LMJelinekMercerSimilarity with parameter 0.1f, which is suitable for title searches
Lucene.LMJelinekMercerLongText | LMJelinekMercerSimilarity with parameter 0.7f, which is suitable for long text searches
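
As a rough guide to what these names stand for, the sketch below maps each one to a Lucene.NET 4.8 similarity instance using the parameters listed above. The SimilaritySketch class and its ByName dictionary are hypothetical and are not how this pull request registers its definitions. Mapping Lucene.Classic to DefaultSimilarity is an assumption based on the PR stating that the existing default is DefaultSimilarity.

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Search.Similarities;

// Hypothetical lookup table; the PR's own registration API may differ.
public static class SimilaritySketch
{
    public static readonly IReadOnlyDictionary<string, Func<Similarity>> ByName =
        new Dictionary<string, Func<Similarity>>
        {
            // Classic TF-IDF scoring (assumed to be DefaultSimilarity in Lucene.NET 4.8).
            ["Lucene.Classic"] = () => new DefaultSimilarity(),
            // Parameterless constructor uses Lucene's defaults (k1 = 1.2, b = 0.75).
            ["Lucene.BM25"] = () => new BM25Similarity(),
            // Parameterless constructor uses Lucene's default mu of 2000.
            ["Lucene.LMDirichlet"] = () => new LMDirichletSimilarity(),
            // Lambda 0.1f for short title fields, 0.7f for long text, as in the table.
            ["Lucene.LMJelinekMercerTitle"] = () => new LMJelinekMercerSimilarity(0.1f),
            ["Lucene.LMJelinekMercerLongText"] = () => new LMJelinekMercerSimilarity(0.7f),
        };
}
```

Each entry is a factory rather than a shared instance, which keeps the sketch free of assumptions about whether similarity objects are reused across searchers.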

@nzdev nzdev marked this pull request as draft March 24, 2023 01:30
@nzdev nzdev marked this pull request as ready for review March 25, 2023 07:11
@nikcio nikcio mentioned this pull request Jul 28, 2023
5 tasks
@nzdev nzdev changed the base branch from release/3.0 to release/4.0 August 15, 2023 22:04
@nzdev nzdev marked this pull request as draft October 31, 2023 12:33
@nzdev nzdev changed the title from "Similarity support" to "V4 Similarity support" Nov 20, 2023
@nzdev nzdev marked this pull request as ready for review December 7, 2023 08:11
@nzdev nzdev marked this pull request as draft December 14, 2023 10:40

nzdev commented Dec 14, 2023

While this does work fine, I'd like to go further with this.

@nzdev nzdev closed this Dec 14, 2023