-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Chore] Improve test flakes #404
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
time.sleep(2) | ||
time_waited += 2 | ||
|
||
if time_waited > max_wait: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
small nit: because of logic ordering, you end up sleeping for 2 seconds before failing on the last iteration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what comes from outsourcing too much typing to github copilot 🤣
desc = pc.describe_index(index) | ||
if desc.status.state == "Ready": | ||
ready_to_delete = True | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why both a break and boolean flag? could just loop/break?
time_waited = 0 | ||
while not ready_to_delete: | ||
desc = pc.describe_index(index) | ||
if desc.status.state == "Ready": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe you can delete in other states as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You would think, wouldn't you? Maybe this is true in some cases, but a lot of flakes seen in the past are about trying to delete while the index is "upgrading". I.e. if you called configure index too recently, your delete request will be rejected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually changed all my integration tests to check both status.state
and status.ready
. There are situations where they are not in sync, and the index is marked as Ready but then the state is something else.
I'd maybe recommend checking both just to be thorough, it's helped with some of my flakiness. I think @aulorbe reported this to the db team at some point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, this also seems in line with some of the approaches we've had to take in other clients to massage some of these issues, specifically around cleanup.
I basically just retry the cleanup delete call a set number of times before bailing in Go.
Problem
I frequently see recurring test flakes, most often at the cleanup stage of
test_configure_index_with_deletion_protection
because the index cannot be deleted while still in an "upgrading" state. The index is in that state while configuration changes are being applied.Solution
Type of Change