-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix for reroute operation during node-left #16468
base: main
Are you sure you want to change the base?
Conversation
❌ Gradle check result for 8a64664: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 96c0766: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
❌ Gradle check result for 83a5525: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 4227892: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
.shardsWithState(ShardRoutingState.INITIALIZING) | ||
.stream() | ||
.filter(r -> !r.primary()) | ||
.map(ShardRouting::shardId) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we keep ShardRouting itself here? Can there be case where for shard ShardRouting object is different in RoutingAllocation vs batch cache?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be a problem in case of multiple replica copies for a shard. RoutingAllocation would have one ShardRouting per copy whereas the batch cache would contain only one of the ShardRouting (picked randomly).
ShardRouting might not exactly match and hence de-dupe by ShardId is required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But then fetching and decision should happen on latest ShardRouting from RoutingAllocation instead of Batch cache
❌ Gradle check result for c506262: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
Description
[Describe what this change achieves]
During node-left task execution, the shards on the leaving node are failed and reroute operation is triggered. The input RoutingAllocation to reroute has updated ShardRouting information. As part of reroute, the existing recoveries are looked for a better match using the ShardRouting cached in ShardBatch cache. Since this can contain the stale ShardRoutring assigned to the node leaving the cluster, it causes NPE when the node is referenced during shard allocation decisions.
Repro for stale batch cache entries as compared to RoutingAllocation
node
uwpG13DpRq-ZvWOUbgpivA
is removed from cluster and the ShardAllocator tries to reference the PrimaryShard[index][2]
which was previously allocated on the node. The batch cache has Replica shard for[index][2]
still in INITIALIZING state although it has been marked as UNASSIGNED in RoutingAllocation.Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.