Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: add negative test to detect replication failures #1445

Merged
merged 4 commits into from
Oct 14, 2024

Conversation

malandis
Copy link
Contributor

@malandis malandis commented Oct 11, 2024

Previously, our integration tests were written assuming consistent
reads. To avoid false positives caused by stale reads in low-latency
environments, we enforced consistent reads by setting the consistent
read header. While this reduced noise from false positives, it also
meant we lacked test coverage for replica reads and replication behavior.

To address this gap, we've added a negative test that detects
replication failures by verifying that no stale reads occur after the
expected replication delay.

The new test suite replica-reads implements a test that runs 10
trials in parallel. Each trial performs the following steps:

  • Set: Writes a random key-value pair to the cache
  • Wait: 1 second for reads to propagate
  • Get: Reads the value for the key using cacheClientWithBalancedReadConcern.
  • Assert: Confirms that the retrieved value matches the value set
    earlier.

Trials are started with slight delays between them to distribute
load (delayBetweenTrials and a small random offset).

Previously, our integration tests were written assuming consistent
reads. To avoid false positives caused by stale reads in low-latency
environments, we enforced consistent reads by setting the consistent
read header. While this reduced noise from false positives, it also
meant we lacked test coverage for replica reads and replication behavior.

To address this gap, we've added a negative test that detects
replication failures by verifying that no stale reads occur after the
expected replication delay.

The new test suite `replica-reads` implements a test that runs 10
trials in parallel. Each trial performs the following steps:
- Set: Writes a random key-value pair to the cache using
`cacheClientWithBalancedReadConcern`.
- Wait: 1 second for reads to propagate
- Get: Reads the value for the key.
- Assert: Confirms that the retrieved value matches the value set
earlier.

Trials are started with slight delays between them to distribute
load (delayBetweenTrials and a small random offset).
@malandis malandis requested review from krispraws and a team October 11, 2024 17:47
anitarua
anitarua previously approved these changes Oct 12, 2024
Copy link
Contributor

@anitarua anitarua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

krispraws
krispraws previously approved these changes Oct 14, 2024
Copy link
Contributor

@krispraws krispraws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, left a question for clarification.

Previously we had a separate function in the inner loop of promise
creation. Instead there is a single trial function that is
parameterized by the trial number.

We refactor the test to pull out the trial function and build the
promise in the loop.
@malandis malandis merged commit a96bf0a into main Oct 14, 2024
13 checks passed
@malandis malandis deleted the test/replica-reads branch October 14, 2024 20:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants