Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explicit checkpoint wait on Tablet Split is now based on LastRecordCheckpoint #246

Merged
merged 6 commits into from
Jul 13, 2023

Conversation

adithya-kb
Copy link
Contributor

We will populate the last seen record's checkpoint in the "lastRecordCheckpoint" field. This is meant to have the explicit checkpoint details of the last seen record from any response (i,e the from_op_id and the record’s commit_time). We will try to update this value in every GetChanges call.

After “lastRecordCheckpoint” is updated as needed, we will then see the response checkpoint from the current GetChanges response. If the response checkpoint is higher :

  1. We will store the response’s checkpoint, if the response’s checkpoint is higher than “lastReadRecordCheckpoint”
  2. We will add this tablet to a wait-list, where we will first wait until the ExplicitCheckpoint of the tablet is equal to: “lastRecordCheckpoint”.
  3. After the ExplicitCheckpoint was equal to “lastRecordCheckpoint”, then we can update the ExplicitCheckpoint value of the tablet to the response’s checkpoint i.e directly update ExplicitCheckpoint in the “tabletToExplicitCheckpoint“ map.

This will take care of updating the ExplicitCheckpoint in cases of NoOp messages in the RAFT WAL , and also take care of tablets splitting right after a previous split in which case there will be no records to checkpoint.

@adithya-kb adithya-kb changed the title Explicit tablet split Explicit checkpoint wait on Tablet Split is now based on LastRecordCheckpoint Jul 11, 2023
Copy link
Collaborator

@vaibhav-yb vaibhav-yb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes seem fine, I am a little skeptical about the comment I provided, please confirm once if that will not have unintended effects.

Another change I remember you added on the service side was a change in proto files, I believe you will need to update the the yb-client version here in the pom.xml as well. The change is purely a connector side change.

// At this position, we know we have received a callback for split tablet
// handle tablet split and delete the tablet from the waiting list.

// Call getChanges to make sure checkpoint is set on the cdc_state table.
LOGGER.info("Setting explicit checkpoint is set to {}.{}", explicitCheckpoint.getTerm(), explicitCheckpoint.getIndex());
setCheckpointWithGetChanges(tableIdToTable.get(part.getTableId()), part,
cp, explicitCheckpoint, schemaNeeded.get(part.getId()),
lastRecordCheckpoint, explicitCheckpoint, schemaNeeded.get(part.getId()),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will call the GetChanges API by setting the from_op_id as lastRecordCheckpoint, are we sure that won't have any side effects?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, need to revert this to cp again

@vaibhav-yb vaibhav-yb added the bug Something isn't working label Jul 11, 2023
Copy link
Collaborator

@vaibhav-yb vaibhav-yb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two things, in the YugabyteDBStreamingChangeEventSource class:

  1. In the handleTabletSplit method, add a line to remove the instance of split tablet from the explicit map:
// Remove the corresponding entry for explicit checkpoint as well.
tabletToExplicitCheckpoint.remove(splitTabletId);
  1. Initialise the explicit checkpoint for children tablets in the addTabletIfNotPresent:
tabletToExplicitCheckpoint.put(p.getId(), OpId.from(pair.getCdcSdkCheckpoint()).toCdcSdkCheckpoint());

@adithya-kb adithya-kb merged commit 3179462 into main Jul 13, 2023
0 of 2 checks passed
@vaibhav-yb vaibhav-yb deleted the explicit_tablet_split branch August 18, 2023 06:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants