Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VirtualDev::alloc_blk nblks 16bit overflow can possibly cause silent data corruption #560

Open
yamingk opened this issue Sep 23, 2024 · 0 comments
Labels
bug Something isn't working
Milestone

Comments

@yamingk
Copy link
Contributor

yamingk commented Sep 23, 2024

Problem Statement:

VirtualDev::alloc_blk takes 16bit nblks, however existing caller in homestore take 32bit data size and pass 32bit nblks to vdev layer which will cause overflow when the data size is larger than 256MB, it will cause corruption (worst case data mismatch) when data size being written is larger than 256MB.

Meta Svc layer has protection when the overflow happens, e.g. it can't match the input context_sz with the nblks matched sizes and will hit release assert, however this is still will cause crash.

Code pointer:
repl_req_ctx::alloc_local_blks
--> auto status = data_service().alloc_blks(sisl::round_up(uint32_cast(data_size), // <<< overflow
BlkDataService::alloc_blks
--> return m_vdev->alloc_blks(nblks, hints, out_blkids); // <<< overflow
MetaBlkService::alloc_meta_blks
--> const auto ret = m_sb_vdev->alloc_blks(nblks, blk_alloc_hints{}, bids); // <<< overflow

The underlying alloc_blks when overflow happens, if we go deep into the actual blk allocator (both the bitmap and append blk allocator), it will return SUCCESS with bids.size() equls to zero. There is no bug in the underlying blkallocator, the point is there is no assert in them also to detect the input nblks is actual 0, it will just return SUCCESS with nothing allocated, when this happens, silent data corruption will happen.

The fix needs to be done at vdev layer.

@yamingk yamingk added this to the MileStone4.3 milestone Sep 23, 2024
@yamingk yamingk added the bug Something isn't working label Sep 23, 2024
@yamingk yamingk changed the title VirtualDev::alloc_blk nblks 16bit overflow can cause return SUCCESS with zero blks allocated VirtualDev::alloc_blk nblks 16bit overflow can cause silent data corruption Sep 23, 2024
@yamingk yamingk changed the title VirtualDev::alloc_blk nblks 16bit overflow can cause silent data corruption VirtualDev::alloc_blk nblks 16bit overflow can possibly cause silent data corruption Sep 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant