[indexer] use in memory buffer to store obj changes and update snapsh… #18007

emmazzz · 2024-05-31T06:31:30Z

…ot table

Description

Introduce an object change buffer that gets populated by the checkpoint committer and consumed by the snapshot processor to update the objects snapshot table asynchronously.
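
For readers skimming the thread, here is a minimal sketch of the producer/consumer shape described above. It is not the code in this PR: `CheckpointObjectChanges`, `ObjectChangeBuffer`, `new_buffer`, `push_changes` and `take_batch` are hypothetical names, and a plain `Arc<Mutex<VecDeque<_>>>` stands in for whatever buffer type the indexer actually uses.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the object changes produced by one checkpoint.
#[derive(Debug, Clone, PartialEq)]
struct CheckpointObjectChanges {
    checkpoint: u64,
    changed_object_ids: Vec<u64>,
}

/// Hypothetical buffer shared by the committer and the snapshot processor.
#[derive(Default)]
struct ObjectChangeBuffer {
    changes: VecDeque<CheckpointObjectChanges>,
}

type SharedBuffer = Arc<Mutex<ObjectChangeBuffer>>;

fn new_buffer() -> SharedBuffer {
    Arc::new(Mutex::new(ObjectChangeBuffer::default()))
}

/// Committer side: append the changes of a freshly committed checkpoint.
fn push_changes(buffer: &SharedBuffer, changes: CheckpointObjectChanges) {
    buffer.lock().unwrap().changes.push_back(changes);
}

/// Snapshot processor side: take up to `max_checkpoints` entries, oldest first.
fn take_batch(buffer: &SharedBuffer, max_checkpoints: usize) -> Vec<CheckpointObjectChanges> {
    let mut guard = buffer.lock().unwrap();
    let n = max_checkpoints.min(guard.changes.len());
    guard.changes.drain(..n).collect()
}

fn main() {
    let buffer = new_buffer();

    // The committer runs on its own thread and only ever appends.
    let producer = {
        let buffer = Arc::clone(&buffer);
        std::thread::spawn(move || {
            for cp in 0..5 {
                let changes = CheckpointObjectChanges {
                    checkpoint: cp,
                    changed_object_ids: vec![cp * 10],
                };
                push_changes(&buffer, changes);
            }
        })
    };
    producer.join().unwrap();

    // The snapshot processor drains a bounded batch and would upsert it.
    let batch = take_batch(&buffer, 3);
    let cps: Vec<u64> = batch.iter().map(|c| c.checkpoint).collect();
    println!("flushing checkpoints {:?}", cps);
}
```

Taking a bounded batch rather than draining everything keeps a single database write from growing with the backlog.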

Test plan

Tested locally against devnet data.

Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

vercel · 2024-05-31T06:31:33Z

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Comments	Updated (UTC)
multisig-toolkit	✅ Ready (Inspect)	Visit Preview	💬 Add feedback	Jun 5, 2024 11:39pm
sui-docs	✅ Ready (Inspect)	Visit Preview	💬 Add feedback	Jun 5, 2024 11:39pm
sui-kiosk	✅ Ready (Inspect)	Visit Preview	💬 Add feedback	Jun 5, 2024 11:39pm
sui-typescript-docs	✅ Ready (Inspect)	Visit Preview	💬 Add feedback	Jun 5, 2024 11:39pm

crates/sui-indexer/src/handlers/committer.rs

wlmyng · 2024-05-31T07:14:24Z

crates/sui-indexer/src/handlers/objects_snapshot_processor.rs

+// The number of object changes in the buffer before the snapshot processor starts processing them.
+// TODO: placeholder value, need to tune this


Makes sense, the object updates in this buffer will likely be larger than the typical live objects update but we don't want it to be too large either

wlmyng · 2024-05-31T07:15:26Z

crates/sui-indexer/src/handlers/objects_snapshot_processor.rs

@@ -138,14 +161,22 @@ where
            }
        }

-        info!("Objects snapshot processor starts updating objects_snapshot periodically...");
-        loop {
+        // We are not in backfill mode but it's possible that the snapshot checkpoint is behind the


Wouldn't the snapshot checkpoint always be behind, though?

Ah I see, this is to handle the scenario where we've started populating the buffer, but the lhs of the buffer is larger than the max checkpoint of snapshot. In that case, we must still upsert from objects history

I think we can speed up this process in a similar vein by first fetching the object updates, and then separately doing the upsert? Instead of relying on the current upsert logic which takes around 20 mins to index 600 checkpoints

It's possible to be faster, but reading those object updates and then upserting them has the networking overhead of passing the data back and forth.

wlmyng · 2024-05-31T07:20:18Z

crates/sui-indexer/src/handlers/objects_snapshot_processor.rs

+        // Now the cp in objects snapshot is greater than what's in the buffer, so we can start flushing
+        // the buffer to objects snapshot.


Not sure if addressed but one potential issue might arise if snapshot exceeds buffer such that some object in snapshot may be a later version than what we'll end up upserting from buffer

Yes, it's indeed possible, although the buffer will contain later versions too and will overwrite with the correct version. To solve this, I can just add cp to each TransactionObjectChanges in the buffer and discard the ones that are before start_cp.

Addressed in the latest commit.
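
A rough sketch of that fix, not the committed code: `BufferedObjectChanges` and `discard_before` are hypothetical names, the `(object id, version)` pairs are placeholders for the real change data, and the sketch assumes entries are pushed in ascending checkpoint order and that `start_cp` itself is kept.

```rust
use std::collections::VecDeque;

/// Hypothetical buffer entry: object changes tagged with their checkpoint.
#[derive(Debug, Clone, PartialEq)]
struct BufferedObjectChanges {
    checkpoint: u64,
    /// (object id, version) pairs; placeholders for the real change data.
    object_versions: Vec<(u64, u64)>,
}

/// Drop entries from checkpoints strictly before `start_cp` and return how
/// many were dropped. Assumes ascending checkpoint order in the buffer.
fn discard_before(buffer: &mut VecDeque<BufferedObjectChanges>, start_cp: u64) -> usize {
    let mut dropped = 0;
    while buffer.front().map_or(false, |e| e.checkpoint < start_cp) {
        buffer.pop_front();
        dropped += 1;
    }
    dropped
}

fn main() {
    let mut buffer: VecDeque<BufferedObjectChanges> = (98..=102)
        .map(|cp| BufferedObjectChanges {
            checkpoint: cp,
            object_versions: vec![(1, cp)],
        })
        .collect();

    // The snapshot has already moved past 98 and 99, so those entries are
    // stale and must not be upserted over newer rows.
    let dropped = discard_before(&mut buffer, 100);
    println!("dropped {} stale entries, {} remain", dropped, buffer.len());
}
```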

gegaowp

overall looks great, left several comments, thanks for picking this up and working this out quickly!

gegaowp · 2024-05-31T09:40:57Z

crates/sui-indexer/src/handlers/objects_snapshot_processor.rs

+                    if buffer_size > OBJECTS_SNAPSHOT_BUFFER_THRESHOLD {
+                        // flush everything in the buffer
+                        // TODO: what if there's too many things in the buffer? Maybe it's better to flush in batches
+                        let object_changes = self.buffer.lock().unwrap().buffer.drain(..).collect();


iiuc if we drain everything from the buffer and DB commit finishes very quickly, then objects_snapshot will be very close to the latest cp, in other words, the available range can be < 10, which seems off?

an alternative is, we track the buffer_size by checkpoint, it has a max size for example 1800, a flushing starting size for example 900, and a flushing end size for example 300, then

the sender keeps pushing until hitting 1800, if it hits 1800, it will wait to avoid bloating memory

the receiver will flush as long as the total size is > 900, and always flush until 300 checkpoints are left, so that we always have at least 300 checkpoints in the available range

The reason for having a separate, larger max size is to smooth out bursts of checkpoint commits on the receiver end, instead of blocking immediately on the sender side.

Addressed the comment in the latest commit by flushing when the buffer cp is behind by 900 and setting the batch size to be 600.
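
To make the originally proposed thresholds concrete, here is a small sketch of the 1800 / 900 / 300 policy from the review comment. It illustrates the proposal, not the change that was eventually committed; the constant and function names are hypothetical, and sizes are counted in checkpoints.

```rust
/// Thresholds quoted in the review comment; the names are hypothetical.
const MAX_BUFFERED_CHECKPOINTS: usize = 1800;
const FLUSH_START_CHECKPOINTS: usize = 900;
const FLUSH_END_CHECKPOINTS: usize = 300;

/// Sender side: another checkpoint may be pushed only while below the cap,
/// so memory use stays bounded.
fn sender_may_push(buffered: usize) -> bool {
    buffered < MAX_BUFFERED_CHECKPOINTS
}

/// Receiver side: number of checkpoints to flush right now. Nothing is
/// flushed until the start threshold is exceeded, and a flush always leaves
/// `FLUSH_END_CHECKPOINTS` behind so the available range never collapses.
fn checkpoints_to_flush(buffered: usize) -> usize {
    if buffered > FLUSH_START_CHECKPOINTS {
        buffered - FLUSH_END_CHECKPOINTS
    } else {
        0
    }
}

fn main() {
    for buffered in [300, 900, 901, 1800] {
        println!(
            "buffered={} may_push={} flush={}",
            buffered,
            sender_may_push(buffered),
            checkpoints_to_flush(buffered)
        );
    }
}
```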

gegaowp · 2024-05-31T09:49:07Z

crates/sui-indexer/src/handlers/committer.rs

+            let mut buffer = objects_snapshot_buffer.lock().unwrap();
+            // The buffer has never been used so we need to set the startup_checkpoint
+            if buffer.startup_checkpoint.is_none() {
+                buffer.startup_checkpoint = Some(first_checkpoint_seq);


This is the init logic for startup_checkpoint, but I could not find where it gets updated when we remove checkpoints?

It's not updated. It's only here to tell snapshot processor when it can start using the buffer.

gegaowp · 2024-05-31T10:13:23Z

crates/sui-indexer/src/handlers/objects_snapshot_processor.rs

@@ -162,8 +193,36 @@ where
                            .latest_object_snapshot_sequence_number
                            .set(start_cp as i64);
                    }
+
+                    buffer_cp = self.buffer.lock().unwrap().startup_checkpoint;


emmazzz · 2024-05-31T23:25:11Z

crates/sui-indexer/src/handlers/objects_snapshot_processor.rs

@@ -184,6 +191,7 @@ where
                        .unwrap_or_default();

                    if latest_cp > start_cp + self.config.snapshot_max_lag as u64 {
+                        let end_cp =


Oops this line should be deleted

wlmyng

Damn, thanks for tackling this Emma! These changes make sense to me.

On the committer side, we introduce a sender channel, and some new logic so that when we are not in objects_snapshot backfill mode, we'll load object changes into the sender channel one checkpoint at a time.

The story gets a bit more complex on the objects_snapshot_processor side.

Say we start from backfill, and we get to a point where start_cp > fullnode_cp - MAX_LAG. At this instant, the state of indexer db is such that snapshot's max cp = checkpoint table's max cp. Since we have caught up, we flip the switch to transition out of backfill state. There's one more check we need to do before resuming things as normal, and that's making sure that there isn't a gap between objects_snapshot and the buffer. If snapshot's checkpoint is more than 1 checkpoint behind the buffer, we cannot start flushing the buffer into objects_snapshot until we've bridged the gap, otherwise we'll have some missing object updates data. So, we need to update objects_snapshot the old-fashioned way until objects_snapshot_cp >= buffer_cp - 1. The -1 is there because it's fine for objects_snapshot to be at some checkpoint 99 while buffer is 100.

Once this gap has been plugged, we are good to go with flushing buffer into objects_snapshot.
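
The gap check can be sketched as a small decision function. This is an illustration only: `SnapshotSource` and `choose_source` are hypothetical names, and "no gap" is taken to mean the snapshot is at most one checkpoint behind the buffer's first checkpoint, following the 99 / 100 example.

```rust
/// Which source the snapshot processor should read from next.
#[derive(Debug, PartialEq)]
enum SnapshotSource {
    /// Keep upserting from objects history (the pre-buffer path).
    ObjectsHistory,
    /// The gap is closed, so buffered changes can be flushed.
    Buffer,
}

/// `buffer_start_cp` is `None` until the committer has pushed anything.
fn choose_source(snapshot_cp: u64, buffer_start_cp: Option<u64>) -> SnapshotSource {
    match buffer_start_cp {
        // A snapshot at 99 with a buffer starting at 100 is contiguous.
        Some(buffer_cp) if snapshot_cp.saturating_add(1) >= buffer_cp => SnapshotSource::Buffer,
        // Either the buffer is unused or checkpoints would be skipped.
        _ => SnapshotSource::ObjectsHistory,
    }
}

fn main() {
    println!("{:?}", choose_source(98, Some(100)));
    println!("{:?}", choose_source(99, Some(100)));
}
```

If the snapshot ends up ahead of the buffer's first checkpoint, the buffered entries it has already covered are stale and would need to be skipped before flushing.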

semgrep-code-mystenlabs · 2024-06-05T01:49:46Z

Semgrep found 1 ssc-efa14576-9601-4ae6-939c-3da58aa25013 finding:

examples/trading/frontend/pnpm-lock.yaml
- L4700

Risk: Affected versions of vite are vulnerable to Improper Handling Of Case Sensitivity / Exposure Of Sensitive Information To An Unauthorized Actor / Improper Access Control. The vulnerability arises when the Vite development server's option, server.fs.deny, can be circumvented on case-insensitive file systems through the utilization of case-augmented versions of filenames, as the matcher derived from config.server.fs.deny fails to prevent access to sensitive files when raw filesystem paths are requested with augmented casing.

Manual Review Advice: A vulnerability from this advisory is reachable if you host vite's development server on Windows, and you rely on server.fs.deny to deny access to certain files

Fix: Upgrade this library to at least version 4.5.2 at sui/examples/trading/frontend/pnpm-lock.yaml:4700.

Reference(s): GHSA-c24v-8rfc-w8vw, CVE-2023-34092, CVE-2024-23331

github-actions · 2024-08-07T01:54:40Z

This PR is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days.

[indexer] use in memory buffer to store obj changes and update snapsh…

e954120

…ot table

emmazzz requested a review from gegaowp May 31, 2024 06:31

wlmyng reviewed May 31, 2024

gegaowp reviewed May 31, 2024

use metered channel as buffer

f13ec92

emmazzz commented May 31, 2024

wlmyng reviewed Jun 1, 2024

add sleep and cargo fmt

72b5912

use objects snapshot cp as watermark

d7e0710

resolve deadlock

5c067a4

github-actions bot added the Stale label Aug 7, 2024
