tcmur: fix check_iovec_length #481
readsector0 checker for multipath is not working, and the log looks like:
[DEBUG_SCSI_CMD] tcmu_print_cdb_info:1069 glfs/block0: 28 0 0 0 0 0 0 0 1 0
[ERROR] check_lba_and_length:107: iov len mismatch: iov len 4096, xfer len 1, block size 512
check_lba_and_length:107: iov len mismatch: iov len 4096, xfer len 1, block size 512
This is because in kernel space the sg->length is aligned to the page size, and also to the ring buffer data area's block size. So here we need to make sure that the iov len is not less than what the SCSI command requires.
Signed-off-by: Xiubo Li <[email protected]>
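To make the failure concrete, here is a minimal sketch of the two kinds of length check being discussed. The function names are hypothetical and are not taken from tcmu-runner; the strict form is only inferred from the "iov len mismatch" log line, and the relaxed form follows the "not less than" wording of the description.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical strict check: the iovec must describe exactly the
 * number of bytes the CDB asks for (sectors * block_size).
 */
static bool iov_len_matches(size_t iov_length, uint32_t sectors,
                            uint32_t block_size)
{
    return iov_length == (size_t)sectors * block_size;
}

/*
 * Hypothetical relaxed check in the spirit of this patch: the iovec
 * only has to be large enough to hold the transfer.
 */
static bool iov_len_sufficient(size_t iov_length, uint32_t sectors,
                               uint32_t block_size)
{
    return iov_length >= (size_t)sectors * block_size;
}
```

With the values from the log (iov len 4096, xfer len 1, block size 512), the strict form rejects the command while the relaxed form accepts it.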
@bgly Thanks.
If the value returned by tcmu_iovec_length does not match the length of the command, we have issues with all the rw paths, because we are using the iovec length to perform IOs. I think with your patch, in the write path for example, we will end up writing 4096 bytes instead of only 512, which is going to cause corruption.
Oh yeah, here is a GH issue you might want to take into consideration when fixing this issue:
Each handler would probably need to make sure it does not write past what it is supposed to? The function @lxbsz is modifying only checks validity, and a command is still valid if iov_length < sectors * tcmu_get_dev_block_size(dev)?
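As an illustration of what "not writing past what they are supposed to" could look like, here is a minimal sketch that trims an iovec array to the command's transfer length before any IO is issued. `clamp_iovec` is a hypothetical helper, not an existing tcmu-runner function, and it assumes the caller has already computed the transfer size in bytes.

```c
#include <stddef.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: trim an iovec array in place so that it
 * describes at most xfer_bytes in total. Returns the number of
 * leading entries needed to cover the transfer; entries after
 * that point are set to length 0.
 */
static size_t clamp_iovec(struct iovec *iov, size_t iov_cnt, size_t xfer_bytes)
{
    size_t used = 0;

    for (size_t i = 0; i < iov_cnt; i++) {
        if (xfer_bytes == 0) {
            /* Transfer already covered: nothing to take from here. */
            iov[i].iov_len = 0;
            continue;
        }
        if (iov[i].iov_len > xfer_bytes)
            iov[i].iov_len = xfer_bytes;
        xfer_bytes -= iov[i].iov_len;
        used = i + 1;
    }
    return used;
}
```

For a single 4096-byte iovec and a 512-byte write, this leaves one entry of 512 bytes, so a handler that sizes its IO from the iovec would no longer write the extra 3584 bytes.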
Yeah, in kernel space the iov_length is aligned to 4096 (if the page size is 4k), so the iov_length will always be larger than or equal to sectors * tcmu_get_dev_block_size(dev).
This change won't avoid the corruption then.
@mikechristie @bgly Thanks.
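The arithmetic behind the page-alignment claim can be sketched as follows. `align_up` is a hypothetical helper and the 4096-byte page size is the assumption stated in the comment above; this is not the kernel's actual code path.

```c
#include <stddef.h>

/*
 * Round len up to the next multiple of align.
 * Assumes align is a power of two (e.g. a 4096-byte page).
 */
static size_t align_up(size_t len, size_t align)
{
    return (len + align - 1) & ~(align - 1);
}
```

Under that assumption a 512-byte transfer would be reported as `align_up(512, 4096)`, i.e. 4096 bytes, which is never smaller than the requested length but can be considerably larger.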
I am not 100% sure. Yes, that will fix the issue, but I think iov_len will be buggy where sometimes it describes the size of the buffer pointed to by iov_base and sometimes it is the number of bytes that are in the buffer at iov_base. I am still looking through the kernel and user space code. I think that the original code meant for iov_len to be the number of bytes in the buffer at iov_base that were supposed to be transferred, and somewhere along the way we messed that up. If that is the case we should fix the kernel, because that bug will cause issues with other user space apps.
What target driver did you hit this with? Did you hit it with loop?
For drivers that go through target_alloc_sgl, won't the sg->length be 512 bytes for a 512 byte RW? cmd->data_length = 512, so sgl_alloc_order gets passed 512. It then does min(512, PAGE_SIZE << order), which would be 512 for elem_len, which gets set to sg->length in sg_set_page. So later, when target_core_user does a min(sg_remaining, block_remaining), we should get iov_len = 512.
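The two min() steps described above can be traced with a small stand-alone sketch. This is not kernel code: `SKETCH_PAGE_SIZE`, `sg_elem_len` and `iov_len_for` are hypothetical names, and the 4096-byte page size is an assumption.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

static size_t min_size(size_t a, size_t b)
{
    return a < b ? a : b;
}

/* Element length as described above: min(data_length, PAGE_SIZE << order). */
static size_t sg_elem_len(size_t data_length, unsigned int order)
{
    return min_size(data_length, (size_t)SKETCH_PAGE_SIZE << order);
}

/* iov_len as described above: min(sg_remaining, block_remaining). */
static size_t iov_len_for(size_t sg_remaining, size_t block_remaining)
{
    return min_size(sg_remaining, block_remaining);
}
```

For a 512-byte RW with order 0, `sg_elem_len(512, 0)` gives 512, and `iov_len_for(512, 4096)` then also gives 512, which matches the expectation that iov_len should be 512 on this path.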
The LIO/TCMU for the gluster-block. [...]
This should be what we are expecting. Currently, it seems the sg->length is the page size. I need to check the LIO core code about this.
Does gluster-block always use iscsi? I am just asking because some fabric drivers like loop allocate scatter gather entries differently than iscsi, because the sg list comes from the scsi/block initiator layer.
What kernel is this with? I am actually seeing the iov_len at 512 for 512 byte RWs.
Sorry for the late reply. Yes, currently it is; only iscsi is supported.
The kernel version is 3.10.0-862.14.2.el7.x86_64.debug.