Fixed document source and score field mismatch in sorted hybrid queries #1043

Open
wants to merge 2 commits into
base: main

@martin-gaievski martin-gaievski commented Dec 24, 2024

Fixed document source and score field mismatch in sorted hybrid queries.
The returned search hits are identified with the help of a min heap, which the sorting functionality uses to get the top X docs. We keep track of the heap leaf element and update it while collecting doc ids. The current code uses only one element for this, but a hybrid query needs a separate element for each sub-query. With a single element, our updates are incorrectly propagated to the results of different sub-queries, which causes several kinds of inconsistency: the doc id, score fields, and score can be incorrect in the final search result.

In this PR I'm changing the min heap leaf element from a single object to an array of objects, one for each sub-query.
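To illustrate the idea, here is a minimal, self-contained sketch of per-sub-query tracking. It is a simplified toy model and not the plugin's actual code: `PerSubQueryTopDocs`, `Entry`, `collect`, and `topDocIds` are hypothetical names, and a plain `java.util.PriorityQueue` stands in for Lucene's `FieldValueHitQueue`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.stream.Collectors;

/** Hypothetical, simplified stand-in for a sorted hybrid collector (not the real class). */
class PerSubQueryTopDocs {
    /** One collected hit: a doc id plus the value it is sorted by. */
    static final class Entry {
        final int docId;
        final double sortValue;

        Entry(int docId, double sortValue) {
            this.docId = docId;
            this.sortValue = sortValue;
        }

        double sortValue() {
            return sortValue;
        }
    }

    private static final Comparator<Entry> BY_VALUE = Comparator.comparingDouble(Entry::sortValue);

    private final int topN;
    // One min heap per sub-query; the head is the weakest hit currently kept.
    private final List<PriorityQueue<Entry>> heaps = new ArrayList<>();
    // One tracker per sub-query, mirroring the change from a single object to an array.
    private final Entry[] bottoms;

    PerSubQueryTopDocs(int subQueryCount, int topN) {
        if (topN < 1) {
            throw new IllegalArgumentException("topN must be at least 1");
        }
        this.topN = topN;
        this.bottoms = new Entry[subQueryCount];
        for (int i = 0; i < subQueryCount; i++) {
            heaps.add(new PriorityQueue<>(BY_VALUE));
        }
    }

    /** Collects a hit for one sub-query, comparing only against that sub-query's own tracker. */
    void collect(int subQuery, int docId, double sortValue) {
        PriorityQueue<Entry> heap = heaps.get(subQuery);
        if (heap.size() < topN) {
            heap.add(new Entry(docId, sortValue));
        } else if (sortValue > bottoms[subQuery].sortValue()) {
            heap.poll(); // evict the weakest hit of this sub-query only
            heap.add(new Entry(docId, sortValue));
        }
        bottoms[subQuery] = heap.peek();
    }

    /** Doc ids kept for one sub-query, ordered by sort value, highest first. */
    List<Integer> topDocIds(int subQuery) {
        return heaps.get(subQuery).stream()
            .sorted(BY_VALUE.reversed())
            .map(e -> e.docId)
            .collect(Collectors.toList());
    }
}
```

In this toy model, suppose sub-query 0 keeps docs with values 303.125 and 19.0, and sub-query 1 keeps docs with values 6.0 and 3.0. If a single tracker were shared, it would hold 3.0 after sub-query 1 was collected, so a later hit with value 13.0 for sub-query 0 would wrongly evict the 19.0 doc. With one tracker per sub-query, that hit is compared against 19.0 and rejected.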

Tested on the data set referenced in the issue and got a correct response in which all fields have consistent values in the _source and sort sections:

{
    "took": 455,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 397,
            "relation": "eq"
        },
        "max_score": 0.8442519,
        "hits": [
            {
                "_index": "templates_prod",
                "_id": "bk2m5np8h467r92kmalxdcvft",
                "_score": null,
                "_source": {
                    "trendingScore": 303.125,
                    "name": "Summer Breeze Design"
                },
                "sort": [
                    303.125
                ]
            },
            {
                "_index": "templates_prod",
                "_id": "bk2m6w28201xn8m3vb6mmdebn",
                "_score": null,
                "_source": {
                    "trendingScore": 303.125,
                    "name": "Winterfrost"
                },
                "sort": [
                    303.125
                ]
            },
            {
                "_index": "templates_prod",
                "_id": "ak2nhbsz900lwxe0xcpir0duc",
                "_score": null,
                "_source": {
                    "trendingScore": 19,
                    "name": "Sunshine Days - Nature Greeting"
                },
                "sort": [
                    19.0
                ]
            },
            {
                "_index": "templates_prod",
                "_id": "ak2n8y4h900csue0x7ij8rwuj",
                "_score": null,
                "_source": {
                    "trendingScore": 13,
                    "name": "Mountain Vista Wedding Suite - Save the Date"
                },
                "sort": [
                    13.0
                ]
            },
            {
                "_index": "templates_prod",
                "_id": "ak2nau2a900e7ue0x2h3nheb9",
                "_score": null,
                "_source": {
                    "trendingScore": 10,
                    "name": "Ocean Waves Thank You"
                },
                "sort": [
                    10.0
                ]
            },
            {
                "_index": "templates_prod",
                "_id": "ak305ic900fr3xe0xjuiin2nt",
                "_score": null,
                "_source": {
                    "trendingScore": 10,
                    "name": "Midnight Dreams - Elegant Celebration"
                },
                "sort": [
                    10.0
                ]
            },
            {
                "_index": "templates_prod",
                "_id": "ak2n2r1t007au00xsn82pvg1",
                "_score": null,
                "_source": {
                    "trendingScore": 6,
                    "name": "Forest Pine - Rustic Wedding Invitation"
                },
                "sort": [
                    6.0
                ]
            },
            {
                "_index": "templates_prod",
                "_id": "ak2n0pxe0060u00xxz0afuaa",
                "_score": null,
                "_source": {
                    "trendingScore": 3.0,
                    "name": "Modern Romance - Classic Black Wedding Invitation"
                },
                "sort": [
                    3.0
                ]
            }
        ]
    }
}

Related Issues

#1044

Check List

  • New functionality includes testing.
  • [ ] New functionality has been documented.
  • [ ] API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • [ ] Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

…is enabled in hybrid query

Signed-off-by: Martin Gaievski <[email protected]>
Fixed mismatch between document source and score fields when sorting is enabled in hybrid query

Signed-off-by: Martin Gaievski <[email protected]>
@martin-gaievski force-pushed the fixed_mismatch_in_doc_source_and_score_fields branch from 093f199 to 3559f12 on December 25, 2024 00:53
@martin-gaievski marked this pull request as ready for review on December 25, 2024 00:55
@martin-gaievski added the labels backport 2.x (Label will add auto workflow to backport PR to 2.x branch) and Bug Fixes (Changes to a system or product designed to handle a programming bug/glitch) on Dec 25, 2024
private FieldValueHitQueue.Entry bottom;
@Getter(AccessLevel.PACKAGE)
@VisibleForTesting
private FieldValueHitQueue.Entry fieldValueLeafTrackers[];

Suggested change
private FieldValueHitQueue.Entry fieldValueLeafTrackers[];
private FieldValueHitQueue.Entry sortFieldValueTrackers[];

@@ -254,6 +259,9 @@ protected void initializePriorityQueuesWithComparators(LeafReaderContext context
initializeLeafFieldComparators(context, i);
}
}
if (Objects.isNull(fieldValueLeafTrackers)) {

QQ: Do you want this code to be executed
a. Once per shard?
b. Once per segment?
c. Once per entire search flow?

I think this code can be moved under the if(compoundScores==null) condition above.

3 participants