[Bug] FS Metadata Duplication #990

Open
turan18 opened this issue Dec 11, 2023 · 0 comments
Labels
bug Something isn't working

Comments

Contributor

turan18 commented Dec 11, 2023

Description

When converting TOC to metadata, we create a root FS bucket for every TOC, where each FS bucket is identified by a random ID.

func (r *reader) init(toc ztoc.TOC, rOpts Options) (retErr error) {
	// Initialize root node
	var ok bool
	for i := 0; i < 100; i++ {
		fsID := xid.New().String()
		if err := r.initRootNode(fsID); err != nil {
			if errors.Is(err, bolt.ErrBucketExists) {
				continue // try with another id
			}
			return fmt.Errorf("failed to initialize root node %q: %w", fsID, err)
		}
		ok = true
		break
	}
	if !ok {
		return fmt.Errorf("failed to get a unique id for metadata reader")
	}
	return r.initNodes(toc)
}

This means that we create a new FS tree every time we create a reader for a TOC, regardless of whether one already exists in the DB. This causes a lot of duplication and consumes a lot of disk space on the host machine (e.g., when restoring the snapshotter).

Steps to reproduce the bug

  1. Pull an image
nerdctl pull --snapshotter soci public.ecr.aws/soci-workshop-examples/rabbitmq:latest
  2. Check the size of metadata DB
ls -l --block-size=MB metadata.db

-rw------- 1 root root 17MB Dec 11 19:28 metadata.db
  3. Kill and restart the snapshotter
systemctl kill soci-snapshotter
systemctl restart soci-snapshotter
  4. Re-pull the image
nerdctl pull --snapshotter soci public.ecr.aws/soci-workshop-examples/rabbitmq:latest
  5. Check the size of metadata DB
ls -l --block-size=MB metadata.db

-rw------- 1 root root 23MB Dec 11 19:33 metadata.db

Note: I originally expected the size to double. It grew by only about 6MB, so bbolt may be doing some storage optimization here.

Describe the results you expected

Ideally, we shouldn't write a TOC to the DB if it already exists. One possible solution is to identify FS buckets by the hash of the TOC/zTOC instead of a random ID.
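
A minimal sketch of that idea, not the snapshotter's actual code. It assumes the serialized TOC bytes are available to hash, and it uses a hypothetical in-memory `store` in place of the bbolt DB so the example is self-contained. The names `store`, `createBucket`, `fsIDFromTOC`, `initReader`, and `errBucketExists` are made up for illustration; `errBucketExists` plays the role of `bolt.ErrBucketExists`.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// errBucketExists stands in for bolt.ErrBucketExists in this sketch.
var errBucketExists = errors.New("bucket already exists")

// store is a hypothetical in-memory stand-in for the metadata DB.
// It maps an FS bucket ID to the metadata stored under that bucket.
type store struct {
	buckets map[string][]byte
}

func newStore() *store { return &store{buckets: make(map[string][]byte)} }

// createBucket returns errBucketExists if a bucket with this ID is present.
func (s *store) createBucket(id string, data []byte) error {
	if _, ok := s.buckets[id]; ok {
		return errBucketExists
	}
	s.buckets[id] = data
	return nil
}

// fsIDFromTOC derives a deterministic bucket ID from the serialized TOC,
// so the same TOC always maps to the same bucket.
func fsIDFromTOC(serializedTOC []byte) string {
	sum := sha256.Sum256(serializedTOC)
	return hex.EncodeToString(sum[:])
}

// initReader writes the metadata only if no bucket exists for this TOC.
// It returns the bucket ID and whether an existing bucket was reused.
func initReader(s *store, serializedTOC []byte) (string, bool, error) {
	fsID := fsIDFromTOC(serializedTOC)
	err := s.createBucket(fsID, serializedTOC)
	if errors.Is(err, errBucketExists) {
		return fsID, true, nil // already in the DB: reuse, do not rewrite
	}
	if err != nil {
		return "", false, fmt.Errorf("failed to initialize root node %q: %w", fsID, err)
	}
	return fsID, false, nil
}

func main() {
	s := newStore()
	toc := []byte("example serialized TOC")

	id1, reused1, _ := initReader(s, toc) // first reader: creates the bucket
	id2, reused2, _ := initReader(s, toc) // second reader: reuses it

	fmt.Println(id1 == id2, reused1, reused2, len(s.buckets))
}
```

With a content-derived ID, the "bucket exists" case becomes the signal to reuse the existing tree rather than a collision to retry around. A real change would presumably also need to decide how to handle a bucket left incomplete by an interrupted write.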

Host information

  1. OS:
  2. Snapshotter Version:
  3. Containerd Version:

Any additional context or information about the bug

No response

@turan18 turan18 added the bug Something isn't working label Dec 11, 2023
@github-project-automation github-project-automation bot moved this to ❓ Ungroomed in soci-snapshotter Dec 19, 2023