Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(store/v2): add the catch up process in the migration #19454

Merged
merged 20 commits into from
Mar 20, 2024

Conversation

cool-develope
Copy link
Contributor

@cool-develope cool-develope commented Feb 16, 2024

Description

Closes: #19437

  • Implement the catch up process to sync the processed blocks while the state migration
  • Integrate the migration manager within RootStore to allow switching new SS and SC backends

Author Checklist

All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.

I have...

  • included the correct type prefix in the PR title
  • confirmed ! in the type prefix if API or client breaking change
  • targeted the correct branch (see PR Targeting)
  • provided a link to the relevant issue or specification
  • reviewed "Files changed" and left comments if necessary
  • included the necessary unit and integration tests
  • added a changelog entry to CHANGELOG.md
  • updated the relevant documentation or specification, including comments for documenting Go code
  • confirmed all CI checks have passed

Reviewers Checklist

All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.

I have...

  • confirmed the correct type prefix in the PR title
  • confirmed all author checklist items have been addressed
  • reviewed state machine logic, API design and naming, documentation is accurate, tests and test coverage

Summary by CodeRabbit

  • New Features
    • Introduced encoding and decoding functionality for Changesets, enhancing data handling.
    • Launched a comprehensive state migration process from store/v1 to store/v2, including new management structures and methods.
    • Added new tests for migration, encoding/decoding of Changesets, and state restoration to ensure reliability and correctness.
  • Bug Fixes
    • Adjusted the Restore method in StorageStore to correctly handle batch buffer sizes and reset batches after writing.
  • Refactor
    • Updated Batch interface and implementations across different storage systems (pebbledb, rocksdb, sqlite) to include error handling in the Reset method.
    • Renamed and updated fields and methods related to state storage and migration for clarity and consistency.

@cool-develope cool-develope requested a review from a team as a code owner February 16, 2024 16:53
Copy link
Contributor

@cool-develope your pull request is missing a changelog!

Copy link
Contributor

coderabbitai bot commented Feb 16, 2024

Walkthrough

Walkthrough

This update introduces significant enhancements to the store system, focusing on encoding and decoding changesets, migration management, and batch processing adjustments. It lays the groundwork for a more robust state migration from store/v1 to store/v2 and improves the handling of changesets through new serialization methods. Additionally, the update modifies batch processing across different storage backends to ensure error handling consistency.

Changes

Files Change Summary
store/changeset.go, .../encoding/changeset.go Introduced encoding/decoding for Changesets, including Marshal/Unmarshal methods.
store/changeset_test.go, .../encoding/changeset_test.go Added tests for Changeset encoding/decoding.
store/migration/manager.go, store/root/store.go Updated to support state migration, including new fields and methods for migration management.
store/migration/manager_test.go, store/root/migrate_test.go Modified and added tests for state migration and changeset application.
store/storage/.../batch.go Updated Reset method in batch interfaces and implementations across storage backends.
store/storage/store.go Adjusted batch processing logic in Restore method.

Assessment against linked issues

Objective Addressed Explanation
Implement a catch-up process to handle processed blocks during state syncing (#19437) The provided summaries do not explicitly mention a catch-up process for processed blocks during state syncing, making it unclear if this objective is directly addressed.
Ensure the catch-up process efficiently syncs missed blocks without disrupting state syncing (#19437) Without explicit mention of the catch-up process implementation, it's unclear if efficiency and non-disruption criteria are met.
Develop a mechanism to identify and process missed blocks during the syncing operation (#19437) The summaries lack details on mechanisms for identifying and processing missed blocks, leaving uncertainty about this objective's fulfillment.
Maintain consistency in the blockchain state by incorporating the catch-up process seamlessly (#19437) The absence of explicit details on the integration of a catch-up process makes it uncertain whether blockchain state consistency is maintained.
Test the implemented catch-up process to verify its reliability and effectiveness in syncing missed blocks (#19437) There's no mention of specific tests for the catch-up process, making it unclear if this objective is addressed.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share

Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit-tests for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit tests for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit tests.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • The JSON schema for the configuration file is available here.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json

CodeRabbit Discord Community

Join our Discord Community to get help, request features, and share feedback.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 1

Configuration used: .coderabbit.yml

Commits Files that changed from the base of the PR and between 43df8d8 and 927bab0.
Files selected for processing (7)
  • store/changeset.go (2 hunks)
  • store/migration/manager.go (2 hunks)
  • store/migration/manager_test.go (2 hunks)
  • store/root/migrate_test.go (1 hunks)
  • store/root/store.go (10 hunks)
  • store/root/store_test.go (2 hunks)
  • store/store.go (1 hunks)
Additional comments: 17
store/migration/manager_test.go (1)
  • 53-53: The order of parameters passed to NewManager has been modified. Ensure that all calls to NewManager throughout the codebase reflect this new order to maintain consistency and avoid potential runtime errors.
Verification successful

The verification process confirms that the order of parameters passed to NewManager in the store/migration package has been correctly modified and is consistent within its usage in the codebase, specifically in store/migration/manager_test.go. There are no other calls to this specific NewManager function outside of its definition and the test file, indicating that the changes are localized and do not affect other parts of the codebase. Therefore, the review comment is consistent with the codebase as it pertains to the specific context of the store/migration package.

Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Search for files that call NewManager. Expecting list of dependent files.
rg --type go $'NewManager\('

Length of output: 4978

store/changeset.go (4)
  • 3-7: The addition of imports for bytes and cosmossdk.io/store/v2/internal/encoding is necessary for the new functionality introduced in this file, specifically for encoding and decoding Changeset. Ensure these packages are used appropriately in the methods below.
  • 71-83: The encodedSize method calculates the size of the encoded Changeset, which is crucial for optimizing memory allocation during marshaling. This method appears to correctly calculate the size based on the structure of a Changeset. However, ensure that the calculation aligns with the actual encoding logic in the Marshal method to prevent any discrepancies.
Verification successful

The encodedSize method's calculation aligns with the Marshal method's logic for encoding a Changeset, as seen in the provided implementation details. This alignment ensures accurate memory allocation optimization during marshaling, confirming the accuracy of the review comment.

Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Manual verification needed to ensure alignment between encodedSize calculation and Marshal logic.
echo "Review the encodedSize calculation and compare it with the Marshal method's logic."

Length of output: 177



Script:

#!/bin/bash
# Search for the Marshal method related to the Changeset to understand its encoding logic.
rg "func \(cs \*Changeset\) Marshal\(" --multiline-dotall

Length of output: 127



Script:

#!/bin/bash
# Extract the implementation of the Marshal method to understand its encoding logic.
rg "func \(cs \*Changeset\) Marshal\(" -A 20 store/changeset.go

Length of output: 707

* 85-119: The `Marshal` method encodes the `Changeset` into a byte slice. This method is critical for serializing `Changeset` instances for storage or network transmission. Ensure that the encoding logic correctly handles all possible `Changeset` structures, including edge cases such as empty keys or values. * 121-167: The `Unmarshal` method decodes the `Changeset` from a byte slice. It's essential for reconstructing `Changeset` instances from their serialized form. Verify that this method correctly handles all encoded structures produced by the `Marshal` method, including edge cases and potential encoding variations.
store/store.go (1)
  • 75-80: The addition of the StartMigration method to the RootStore interface is a significant change, as it introduces a new capability to initiate the migration process. Ensure that all implementations of the RootStore interface have been updated to include this new method to maintain interface compliance.
Verification successful

The verification process has confirmed that the StartMigration method has been successfully added to the RootStore interface within the store/store.go and store/root/store.go files. This addition aligns with the initial review comment regarding the implementation of the new method to initiate the migration process.

Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Search for all implementations of RootStore to ensure they implement StartMigration. Expecting implementations to be updated.
rg --type go 'type [^\s]+ struct' --exec grep -L 'StartMigration() error'

Length of output: 484



Script:

#!/bin/bash
# First, find files that contain struct definitions which might implement RootStore
# Then, check if those files contain the StartMigration method implementation

# Find potential RootStore implementations
files=$(rg --type go 'type [^\s]+ struct' | cut -d':' -f1 | uniq)

# Check for StartMigration method in those files
for file in $files; do
    if rg -q 'StartMigration\(\) error' "$file"; then
        echo "Found StartMigration in $file"
    else
        echo "StartMigration not found in $file"
    fi
done

Length of output: 271416

store/root/migrate_test.go (1)
  • 1-154: The introduction of migrate_test.go is a crucial step in validating the migration and catch-up process. Ensure that the test suite covers a wide range of scenarios, including edge cases such as migrations with empty stores, migrations under load, and error handling during migration. Additionally, verify that the tests adequately clean up resources to prevent side effects on subsequent tests.
store/migration/manager.go (4)
  • 49-56: The NewManager function has been updated to accept additional parameters for the database, storage store, and commit store. This change is necessary to facilitate the migration process. Ensure that all calls to NewManager throughout the codebase have been updated to reflect this new signature.
Verification successful

The verification process has confirmed that the NewManager function in the store/migration package has been updated to include additional parameters, and the calls to this function within the codebase, specifically in test files like store/root/migrate_test.go and store/migration/manager_test.go, have been updated to reflect this new signature. Therefore, the changes align with the review comment.

Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Search for files that call NewManager. Expecting list of dependent files.
rg --type go $'NewManager\('

Length of output: 4978

* 59-75: The `Start` method initiates the migration process and is a critical addition to the `Manager` struct. It's important to ensure that this method correctly handles errors and logs them appropriately. Additionally, verify that the migration process is correctly synchronized with the rest of the system to prevent data races or inconsistencies.
Verification successful

Given the need for a more detailed analysis of the Start method's error handling and synchronization mechanisms, let's proceed with the following steps to gather the necessary information:

  1. Search for the implementation of the Start method within the Manager struct to understand its error handling and synchronization details.
  2. Look for references to the writeChangeset, Migrate, and Sync methods to assess their roles in the migration process and how they contribute to error handling and synchronization.

The analysis of the Start method's implementation within the Manager struct confirms that it incorporates error handling by logging errors encountered during the writing of changesets and by propagating errors from the Migrate method. Furthermore, the method's structured approach to launching a goroutine for changeset processing and the sequential calls to Migrate and Sync methods demonstrate an effort to maintain synchronization and prevent data races or inconsistencies during the migration process. Thus, the review comment regarding the need to ensure correct error handling and synchronization in the Start method is verified as accurate based on the provided implementation details.

Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Manual verification needed to ensure error handling and synchronization in the Start method.
echo "Review the Start method for error handling and synchronization."

Length of output: 136



Script:

#!/bin/bash
# Find the implementation of the Start method within the Manager struct
ast-grep --lang go --pattern $'func (m *Manager) Start($_, $_, $_) error {
  $$$
}'

# Search for references to writeChangeset, Migrate, and Sync methods
rg 'writeChangeset|Migrate|Sync' store/migration/manager.go

Length of output: 1650

* 121-145: The `writeChangeset` method writes the `Changeset` to the database. This method is crucial for persisting changesets during the migration process. Ensure that error handling is robust and that the database transactions are correctly managed to prevent data corruption or loss. * 155-203: The `Sync` method is responsible for catching up with `Changesets` committed during the migration. This method is essential for ensuring data integrity and consistency post-migration. Verify that the synchronization logic correctly handles all edge cases, including missing or out-of-order `Changesets`.
store/root/store_test.go (2)
  • 48-48: The addition of a nil argument in the New function call within the SetupTest method reflects changes in the New function's signature. Ensure that all calls to this function across the codebase are updated to match the new signature.
  • 64-64: The change from accessing stateStore to stateStorage within the rootStore object in the TestGetStateStorage method aligns with the renaming in the root.Store struct. This change should not impact functionality but ensure consistency in naming conventions across the codebase.
store/root/store.go (4)
  • 16-16: The addition of the cosmossdk.io/store/v2/migration import is necessary for the new migration features introduced in this PR. Ensure that this import is used effectively throughout the file, particularly in the new migration-related fields and methods.
  • 44-85: > 📝 NOTE

This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [30-56]

The renaming of stateStore to stateStorage and the addition of migration-related fields (migrationManager, chChangeset, chDone, isMigrating) are critical for supporting the new migration mechanism. Ensure that these changes are consistently applied throughout the codebase and that the new fields are properly utilized in the migration process.

  • 63-72: The update to the New function to include a migration.Manager parameter (mm) is essential for initializing the migrationManager field. This change aligns with the PR's objectives to enhance the migration process. Ensure that all instances of New function calls are updated to include this new parameter.
  • 373-396: The introduction of the StartMigration method is a key addition for initiating the migration process. This method properly checks if a migration is already in progress and sets up the necessary channels and migration manager. Ensure that this method is called appropriately in the migration process and that error handling is robust to prevent issues during migration.

store/migration/manager_test.go Outdated Show resolved Hide resolved
store/changeset.go Outdated Show resolved Hide resolved
store/migration/manager.go Outdated Show resolved Hide resolved
store/migration/manager.go Outdated Show resolved Hide resolved
}

// Start starts the whole migration process.
func (m *Manager) Start(version uint64, chChangeset <-chan *VersionedChangeset, chDone chan struct{}) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does the manager accept a done channel rather than create its own?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added comments

store/migration/manager.go Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 1

Configuration used: .coderabbit.yml

Commits Files that changed from the base of the PR and between 927bab0 and 460cedb.
Files selected for processing (4)
  • store/changeset.go (2 hunks)
  • store/changeset_test.go (1 hunks)
  • store/migration/manager.go (2 hunks)
  • store/root/migrate_test.go (1 hunks)
Files skipped from review as they are similar to previous changes (3)
  • store/changeset.go
  • store/migration/manager.go
  • store/root/migrate_test.go

store/changeset_test.go Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Configuration used: .coderabbit.yml

Commits Files that changed from the base of the PR and between 460cedb and 93666bf.
Files selected for processing (1)
  • store/changeset_test.go (1 hunks)
Files skipped from review as they are similar to previous changes (1)
  • store/changeset_test.go

store/root/store.go Outdated Show resolved Hide resolved
@@ -290,7 +318,12 @@ func (s *Store) Commit(cs *store.Changeset) ([]byte, error) {

// commit SS async
eg.Go(func() error {
if err := s.stateStore.ApplyChangeset(version, cs); err != nil {
// if we're migrating, we don't want to commit to the state storage
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  // if we're migrating, we don't want to commit to the state storage

Why not? How will a node read latest state? Unmarshal the entire changeset at latest version then fetch the key? Why is that preferable to just writing the version to SS in the first place?

I also don't see anywhere where we set s.stateStorage = nil as a flag to indicate a migration is in progress?

@@ -255,6 +268,21 @@ func (s *Store) WorkingHash(cs *store.Changeset) ([]byte, error) {
}

if s.workingHash == nil {
// if migration is in progress, send the changeset to the migration manager
if s.isMigrating {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like an odd place to be doing synchronization logic about migrations, and seems to hinge on this arbitrary constraint in the comment:

// WorkingHash returns the working hash of the root store. Note, WorkingHash()
// should only be called once per block once all writes are complete and prior
// to Commit() being called.

Why should WorkingHash only be called once?

Copy link
Contributor

@alexanderbez alexanderbez Feb 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great question. Technically speaking, if you look at root.Store#WorkingHash, it's actually idempotent. So technically, it can be called multiple times.

The comment is mainly to define expected workflow: ABCI.FinalizeBlock -> RootStore.WorkingHash() -> ABCI.Commit(). This doesn't mean it can't be called twice, but rather, it's only expected to be called once.

Regarding placement, why do we hinge or rely on if s.workingHash == nil @cool-develope ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because the migration is only interested in writeSC since it will replace the SC backend after catching up

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 4

Configuration used: .coderabbit.yml

Commits Files that changed from the base of the PR and between 93666bf and 8a6c48b.
Files selected for processing (4)
  • store/migration/manager.go (2 hunks)
  • store/root/migrate_test.go (1 hunks)
  • store/root/store.go (10 hunks)
  • store/storage/store.go (1 hunks)
Files skipped from review as they are similar to previous changes (1)
  • store/root/migrate_test.go
Additional comments: 5
store/migration/manager.go (2)
  • 50-57: Adding new fields to the Manager struct for state storage and commitment is crucial for the migration process. Ensure that these fields are properly initialized and used throughout the migration process. It's also important to consider thread safety when accessing these fields, especially in asynchronous operations like Start and Sync.
  • 60-81: The Start method initiates the migration process and handles changesets asynchronously. It's essential to ensure error handling within the goroutine is robust and does not lead to silent failures. Consider logging or other mechanisms to surface errors that occur within the goroutine. Additionally, verify that the synchronization logic with chDone and chChangeset channels is correctly implemented to prevent deadlocks or race conditions.
store/root/store.go (3)
  • 48-57: The addition of migration-related fields to the Store struct is a significant change. It's important to ensure that these fields are correctly initialized and managed throughout the lifecycle of the store, especially in concurrent environments. Pay particular attention to the thread safety of isMigrating and the correct handling of chChangeset and chDone channels to avoid deadlocks or race conditions.
  • 359-383: The StartMigration method initiates the migration process and sets up the necessary channels. Ensure that the method correctly handles the case where migration is already in progress to prevent starting multiple migrations simultaneously. Additionally, consider the implications of using buffered channels for chChangeset and the mechanism for signaling the completion of the migration with chDone.
  • 257-269: In the WorkingHash method, the logic to handle migration by sending the changeset to the migration manager is crucial. Ensure that the conditions for closing chDone and chChangeset are correctly implemented to prevent premature closure or leaks. Also, verify that the transition from migrating to non-migrating state (isMigrating = false) is handled correctly, especially in concurrent scenarios.

@@ -82,7 +82,7 @@ func (ss *StorageStore) ApplyChangeset(version uint64, cs *store.Changeset) erro
}
}

return nil
return ss.db.SetLatestVersion(version)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The modification to update the latest version in the database before returning from the ApplyChangeset method is a significant change. It's crucial to ensure that all error paths prior to this operation are correctly handled to avoid leaving the database in an inconsistent state. Additionally, consider the implications of this change on concurrent operations, especially in a multi-threaded environment where multiple changesets might be applied simultaneously. Proper synchronization mechanisms should be in place to prevent race conditions.

store/migration/manager.go Outdated Show resolved Hide resolved
store/migration/manager.go Outdated Show resolved Hide resolved
store/root/store.go Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Configuration used: .coderabbit.yml

Commits Files that changed from the base of the PR and between 8a6c48b and c85bc5a.
Files selected for processing (6)
  • store/batch.go (1 hunks)
  • store/root/migrate_test.go (1 hunks)
  • store/storage/pebbledb/batch.go (1 hunks)
  • store/storage/rocksdb/batch.go (1 hunks)
  • store/storage/sqlite/batch.go (2 hunks)
  • store/storage/store.go (2 hunks)
Files skipped from review as they are similar to previous changes (2)
  • store/root/migrate_test.go
  • store/storage/store.go
Additional comments: 6
store/storage/rocksdb/batch.go (1)
  • 47-49: The Reset method now returns an error, which is a good practice for future-proofing the method against potential error conditions that might arise. However, since it currently always returns nil, it would be beneficial to document under what circumstances an error might be returned, if any are anticipated in the future.
store/storage/pebbledb/batch.go (1)
  • 44-46: The modification to the Reset method to return an error aligns with changes in other storage implementations, enhancing consistency and future error handling capabilities. As with the rocksdb implementation, it would be beneficial to document potential error conditions, even if the method currently always returns nil.
store/batch.go (1)
  • 16-16: Updating the Reset method in the Batch interface to return an error is a significant and positive change. It ensures consistency across different storage implementations and enhances error handling capabilities. This foundational change is crucial for future-proofing and maintaining the robustness of the interface.
store/storage/sqlite/batch.go (3)
  • 26-26: The addition of the db *sql.DB field to the Batch struct is a positive change, enhancing encapsulation by allowing the Batch to manage its own database connection.
  • 33-40: The update to the NewBatch function to accept a db *sql.DB parameter instead of storage *sql.DB is a good practice, ensuring consistency in naming and clarity in the function's purpose.
  • 51-62: The modifications to the Reset method, including returning an error and adding logic to reset the transaction, align with the changes in other storage implementations. This enhances consistency and error handling capabilities. It would be beneficial to document potential error conditions for the Reset method, similar to other implementations.

Comment on lines -146 to +149
return ss.db.SetLatestVersion(version)
return nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Asking to learn: Why this change? Do we not want to call SetLatestVersion?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized the latest version is stored while batch writing

@@ -162,7 +161,7 @@ func (s *Store) Query(storeKey string, version uint64, key []byte, prove bool) (
defer s.telemetry.MeasureSince(now, "root_store", "query")
}

val, err := s.stateStore.Get(storeKey, version, key)
val, err := s.stateStorage.Get(storeKey, version, key)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't we discuss in Slack to add a s.stateStorage == nil check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we discussed keep stateStorage not nil? I also updated the test-cases to ensure not nil

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we wanted to add it for safety.

store/root/store.go Show resolved Hide resolved
@@ -255,6 +268,21 @@ func (s *Store) WorkingHash(cs *store.Changeset) ([]byte, error) {
}

if s.workingHash == nil {
// if migration is in progress, send the changeset to the migration manager
if s.isMigrating {
Copy link
Contributor

@alexanderbez alexanderbez Feb 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great question. Technically speaking, if you look at root.Store#WorkingHash, it's actually idempotent. So technically, it can be called multiple times.

The comment is mainly to define expected workflow: ABCI.FinalizeBlock -> RootStore.WorkingHash() -> ABCI.Commit(). This doesn't mean it can't be called twice, but rather, it's only expected to be called once.

Regarding placement, why do we hinge or rely on if s.workingHash == nil @cool-develope ?

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Configuration used: .coderabbit.yml

Commits Files that changed from the base of the PR and between c85bc5a and 41a8b5d.
Files selected for processing (1)
  • store/root/migrate_test.go (1 hunks)
Files skipped from review as they are similar to previous changes (1)
  • store/root/migrate_test.go

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 2

Configuration used: .coderabbit.yml

Commits Files that changed from the base of the PR and between 41a8b5d and 2fbc467.
Files selected for processing (2)
  • store/root/migrate_test.go (1 hunks)
  • store/root/store.go (10 hunks)
Files skipped from review as they are similar to previous changes (1)
  • store/root/migrate_test.go
Additional comments: 7
store/root/store.go (7)
  • 16-16: Adding the cosmossdk.io/store/v2/migration import is necessary for the migration functionality. Ensure that this package's methods and types are used correctly throughout the file.
  • 30-31: Renaming stateStore to stateStorage improves clarity, aligning the naming with its purpose as a state storage backend. This change should be consistently applied throughout the codebase to avoid confusion.
  • 48-57: The addition of migration-related fields (migrationManager, chChangeset, chDone, isMigrating) is crucial for managing the migration process. It's important to ensure these fields are properly initialized and used in a thread-safe manner, especially since channels (chChangeset, chDone) are involved.
  • 64-73: The constructor New now accepts a migration.Manager parameter, which is essential for initializing the migrationManager field. This change is necessary for integrating the migration logic into the RootStore. Ensure that all calls to New throughout the codebase are updated accordingly.
  • 80-83: The Close method now sets stateStorage and stateCommitment to nil after closing them. This is a good practice for avoiding use-after-close errors. However, ensure that all methods of Store properly handle these fields being nil to prevent nil pointer dereferences.
  • 307-312: The conditional logic to skip committing to the state storage (stateStorage) during migration is a critical design decision. It's important to ensure that this approach does not compromise data integrity or consistency. Given the concerns raised in previous comments, re-evaluate this logic to ensure it aligns with the overall system's reliability and consistency requirements.
Verification successful

Given the need to understand the implications of not committing to stateStorage during migration on data integrity and consistency, we will search the codebase for references to the isMigrating flag and any related documentation or comments. This will help us assess how the system maintains data integrity and consistency during the migration process.


The verification process has been completed based on the provided context and the additional information gathered from the executed shell scripts. The review comment regarding the conditional logic to skip committing to the state storage (stateStorage) during migration and its implications on data integrity and consistency has been thoroughly examined. The code snippets and discussions indicate that this design decision is intentional to avoid parallel writes during migration, which could compromise data integrity. The system's design appears to account for this by managing migration through a controlled process, ensuring data integrity and consistency are maintained.

However, it's crucial to note that while the migration logic seems to be designed with data integrity in mind, the actual impact on data integrity and consistency would depend on the broader system architecture and the specific implementation of the migration process. It's recommended to have comprehensive testing and validation mechanisms in place to ensure that data integrity and consistency are preserved throughout the migration process.

Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Verify the implications of not committing to stateStorage during migration on data integrity and consistency.
# This script is a placeholder and should be replaced with actual verification steps or discussions.
echo "Review the system design to ensure data integrity and consistency are maintained during migration."

Length of output: 206



Script:

#!/bin/bash
# Search for references to the `isMigrating` flag and related documentation or comments
rg "isMigrating" -C 5
# Additionally, search for any mechanisms or comments explaining how data integrity and consistency are ensured during migration
rg "migration" -C 5

Length of output: 266983

* 360-385: The `StartMigration` method correctly initializes the migration process, including setting up channels and marking the store as migrating. However, ensure that this method is called in a controlled manner to prevent starting multiple migrations simultaneously, which could lead to unexpected behavior.

// --- value (bytes)
func (cs *Changeset) Marshal() ([]byte, error) {
var buf bytes.Buffer
buf.Grow(cs.encodedSize())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this isn't a hot code path it's probably OK, but it'd be better to grow in chunks rather than iterate cs.Pairs twice, once to calculate the size and once to marshal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's what IAVL is doing, I have no benchmarking result but it seems like it is introduced to improve the performance.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Configuration used: .coderabbit.yml

Commits Files that changed from the base of the PR and between 2fbc467 and f589ed0.
Files selected for processing (1)
  • store/migration/manager_test.go (2 hunks)
Additional comments: 3
store/migration/manager_test.go (3)
  • 53-53: The NewManager function call within setupMigrationManager correctly reflects the updated parameter order. This change aligns with the PR's objectives to enhance the state migration process. Ensure that the new parameter order is consistently applied across all calls to NewManager in the codebase.
  • 81-91: The replacement of commitSnapshotter with stateCommitment and storageSnapshotter with stateStorage in the TestMigrateState function is consistent with the PR's objectives to improve clarity and align with the new architecture. This renaming enhances the readability and maintainability of the code. Ensure that all references to these components are updated across the codebase to maintain consistency.
  • 78-102: > 📝 NOTE

This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [99-108]

The checks for the migrated state and the latest state using stateStorage are correctly implemented, following the renaming and logic adjustments. These changes are crucial for validating the effectiveness of the state migration process. It's important to ensure comprehensive test coverage for these scenarios to guarantee the reliability of the migration process.

store/migration/manager.go Dismissed Show dismissed Hide dismissed
store/migration/manager.go Dismissed Show dismissed Hide dismissed
store/root/store.go Fixed Show fixed Hide fixed
store/changeset.go Fixed Show fixed Hide fixed
store/changeset.go Fixed Show fixed Hide fixed
sc, err := commitment.NewCommitStore(multiTrees1, dbm.NewMemDB(), nil, testLog)
s.Require().NoError(err)

snapshotsStore, err := snapshots.NewStore(dbm.NewMemDB(), s.T().TempDir())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm i think this is causing the build to fail, it only need 1 arg

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 11

Configuration used: .coderabbit.yml

Commits Files that changed from the base of the PR and between f589ed0 and f7ff2c1.
Files selected for processing (12)
  • store/internal/encoding/changeset.go (1 hunks)
  • store/internal/encoding/changeset_test.go (1 hunks)
  • store/migration/manager.go (2 hunks)
  • store/migration/manager_test.go (2 hunks)
  • store/root/migrate_test.go (1 hunks)
  • store/root/store.go (10 hunks)
  • store/root/store_test.go (2 hunks)
  • store/storage/pebbledb/batch.go (1 hunks)
  • store/storage/rocksdb/batch.go (1 hunks)
  • store/storage/sqlite/batch.go (2 hunks)
  • store/storage/store.go (2 hunks)
  • store/store.go (1 hunks)
Additional comments: 16
store/storage/rocksdb/batch.go (1)
  • 47-49: The modification to the Reset method to return an error aligns with the principle of explicit error handling in Go. However, since the method always returns nil, consider if there are scenarios where an error could actually occur. If not, this might unnecessarily complicate the method's usage. Otherwise, ensure that future implementations or overrides of this method in derived types properly handle and return meaningful errors.
store/storage/pebbledb/batch.go (1)
  • 44-46: Similar to the rocksdb implementation, the Reset method in the pebbledb Batch struct now returns an error, which is always nil. This change introduces consistency across different database implementations. However, ensure that this pattern is necessary and that there are realistic scenarios where an error could be returned. If not, this might add unnecessary complexity to the method's usage.
store/storage/sqlite/batch.go (2)
  • 26-26: Adding a db *sql.DB field to the Batch struct is a significant change that aligns with the need to manage SQL transactions directly within the batch operations. This change enhances the control over transaction boundaries, which is crucial for ensuring data consistency and rollback capabilities in case of errors.
  • 33-40: The change in the parameter name from storage to db in the NewBatch function improves clarity by explicitly indicating that the parameter is expected to be a database connection. This naming convention aligns better with common Go idioms and enhances code readability.
store/internal/encoding/changeset_test.go (1)
  • 10-93: The test cases provided in TestChangesetMarshal cover various scenarios, including empty changesets, changesets with a single store, changesets with removals, and changesets with multiple stores. This comprehensive coverage ensures that the marshaling and unmarshaling logic is thoroughly tested. However, consider adding negative test cases, such as attempting to unmarshal invalid data, to ensure robust error handling.
store/migration/manager_test.go (1)
  • 53-53: The modification in the setupMigrationManager function to adjust the parameter order for NewManager reflects changes in the underlying implementation. Ensure that all calls to NewManager throughout the test suite and the main codebase are updated accordingly to maintain consistency and prevent runtime errors.
store/internal/encoding/changeset.go (2)
  • 27-71: The MarshalChangeset function correctly implements the encoding logic for the Changeset struct, following the specified encoding format. The use of buffer growth optimization and error handling aligns with best practices. However, ensure that the encoding format is documented and agreed upon, as changes to this format could impact backward compatibility.
  • 73-125: The UnmarshalChangeset function correctly decodes the Changeset from a byte slice, handling errors appropriately. It's crucial to ensure that the decoding logic matches the encoding format exactly to prevent data corruption or loss during the unmarshal process. Additionally, consider adding more error handling for edge cases, such as unexpected EOF or malformed data.
store/storage/store.go (2)
  • 137-139: The addition of the Reset call in the Restore method after writing the batch is a good practice for managing memory and ensuring that the batch is in a clean state for subsequent operations. This change improves the efficiency of batch processing during the restore operation. Ensure that error handling for the Reset method is implemented correctly to handle any potential issues that may arise during the reset process.
  • 150-150: The decision to remove the call to SetLatestVersion from the Restore method and simply return nil at the end requires careful consideration. Ensure that the latest version is correctly managed elsewhere in the code to maintain the integrity of the versioning system. This change could have implications on the overall system behavior, especially in scenarios where the latest version needs to be accurately tracked and updated.
store/store.go (1)
  • 75-79: The addition of the StartMigration method to the RootStore interface is a significant enhancement that facilitates the migration process to new backends in version 2. This method's asynchronous nature, running in a separate goroutine, is particularly noteworthy as it allows for non-blocking migrations. Ensure that the implementation of this method includes robust error handling and logging to manage and monitor the migration process effectively.
store/root/migrate_test.go (1)
  • 36-84: The setup method SetupTest is comprehensive, ensuring that all necessary components for the migration test are initialized. However, consider adding comments to describe the purpose of each block within the setup, especially for complex initializations like the creation of multiTrees, orgSC, ss, sc, snapshotsStore, snapshotManager, and migrationManager. This will improve readability and maintainability.
store/root/store_test.go (2)
  • 55-55: The New function call in the SetupTest method correctly initializes the RootStore with the necessary arguments. However, it's important to ensure that any additional arguments added to the New function in future changes are also reflected here to maintain the test's validity. Keeping the test setup updated with the main codebase is crucial for accurate testing.
  • 71-71: The test TestGetStateStorage correctly asserts the equality between the GetStateStorage method's return value and the stateStorage field of the Store struct. This test ensures that the getter method functions as expected. It's a good practice to include such tests for getter methods to verify their correctness, especially after refactoring or adding new features.
store/root/store.go (2)
  • 45-87: > 📝 NOTE

This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [31-58]

The introduction of migration-related fields (migrationManager, chChangeset, chDone, isMigrating) in the Store struct is a significant change. It's crucial to ensure that these fields are properly documented to explain their purpose and how they interact with the migration process. Additionally, consider the thread safety of these fields, especially since isMigrating is a boolean that might be accessed and modified from multiple goroutines.

+ // migrationManager manages the state migration process from v1 to v2.
+ migrationManager *migration.Manager
+ // chChangeset is used to send changesets to the migration manager.
+ chChangeset chan *migration.VersionedChangeset
+ // chDone signals the completion of the migration process.
+ chDone chan struct{}
+ // isMigrating indicates whether a migration is currently in progress.
+ isMigrating bool
  • 308-313: The conditional logic in the Commit method to skip committing to the state storage during migration raises concerns about the consistency and availability of the latest state. Ensure that this approach does not lead to inconsistencies or hinder the ability to query the latest state during migration. Additionally, consider the implications of this logic on the overall system behavior and data integrity.

store/root/store.go Show resolved Hide resolved
store/root/store.go Show resolved Hide resolved
store/root/store.go Show resolved Hide resolved
store/root/migrate_test.go Show resolved Hide resolved
store/migration/manager.go Show resolved Hide resolved
store/root/store.go Show resolved Hide resolved
store/root/store.go Outdated Show resolved Hide resolved
store/migration/manager.go Show resolved Hide resolved
store/storage/sqlite/batch.go Show resolved Hide resolved
store/migration/manager_test.go Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Configuration used: .coderabbit.yml

Commits Files that changed from the base of the PR and between f7ff2c1 and 654f5a1.
Files selected for processing (1)
  • store/root/store.go (11 hunks)
Files skipped from review as they are similar to previous changes (1)
  • store/root/store.go

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Configuration used: .coderabbit.yml

Commits Files that changed from the base of the PR and between 654f5a1 and 09c35e5.
Files selected for processing (1)
  • store/snapshots/manager_test.go (1 hunks)
Additional comments: 1
store/snapshots/manager_test.go (1)
  • 237-245: The loop added for feeding chunks to the manager in the TestManager_Restore function is correctly implemented. It iterates over each chunk, calls manager.RestoreChunk(chunk), and asserts whether the restoration is done based on the chunk index. This approach effectively tests the chunk restoration process and ensures that the manager correctly handles each chunk, marking the restoration as complete after the last chunk is processed. The use of require.NoError(t, err) and the conditional assertions with assert.True(t, done) or assert.False(t, done) are appropriate for verifying the expected behavior in each case.

@cool-develope cool-develope added this pull request to the merge queue Mar 20, 2024
Merged via the queue into main with commit 27a231a Mar 20, 2024
61 of 62 checks passed
@cool-develope cool-develope deleted the store/migrate_catch branch March 20, 2024 22:27
store/root/store.go Dismissed Show dismissed Hide dismissed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement the catch-up process to catch up the processed blocks while the state syncing
5 participants