GH-43949: [C++] io::BufferedInput: Fixing the invalid state with SetBufferSize #44387

Open
wants to merge 1 commit into
base: main
from


mapleFU
Member

@mapleFU mapleFU commented Oct 12, 2024

Rationale for this change

See #43949

The problem is that Peek and Read both call SetBufferSize, but with different requirements:

  1. Read implicitly assumes that, when it calls SetBufferSize, the previous buffer contents are no longer required. In this scenario, bytes_buffered_ is always 0, since Read has consumed all buffered data, and new_buffer_size = std::min(new_buffer_size, raw_read_bound_ - raw_read_total_).
  2. Peek still needs the current buffer contents, so bytes_buffered_ might not be 0. When called from Peek, new_buffer_size ends up smaller than expected, which shrinks the buffer.
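
To make the shrinking concrete, here is a minimal sketch of the clamp quoted in item 1, written as a free function. The function name and its parameters are hypothetical and only mirror the member names used above; this is not the actual Arrow code.

```cpp
#include <algorithm>
#include <cstdint>

// Clamp from item 1: only the bytes still readable from the raw stream are
// considered; bytes already held in the buffer are ignored.
// Hypothetical helper, named for illustration only.
int64_t ClampIgnoringBufferedBytes(int64_t new_buffer_size,
                                   int64_t raw_read_bound,
                                   int64_t raw_read_total) {
  return std::min(new_buffer_size, raw_read_bound - raw_read_total);
}
```

With made-up numbers raw_read_bound = 100 and raw_read_total = 90, a Peek that asks for a 20-byte buffer gets 10 back. If the buffer already holds more than 10 unread bytes, that result is too small for the data Peek still needs.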

What changes are included in this PR?

Update the logic of SetBufferSize:

  1. If bytes_buffered_ == 0, SetBufferSize can discard the current buffer.
  2. Otherwise, SetBufferSize should resize to at most buffer_size_ + (raw_read_bound_ - raw_read_total_), since it must take the current buffer into account.
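
The two cases can be sketched roughly as below. This is an illustration under stated assumptions, not the patch itself: the struct and function are hypothetical stand-ins for the member variables named above, and treating a negative raw_read_bound as "unbounded" is an assumption.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for the stream state named in the description.
struct StreamState {
  int64_t buffer_size;     // current capacity of the internal buffer
  int64_t bytes_buffered;  // unread bytes still held in the buffer
  int64_t raw_read_total;  // bytes already read from the raw stream
  int64_t raw_read_bound;  // limit on raw reads; negative = unbounded (assumption)
};

// Returns the buffer size to resize to for a requested size.
int64_t ClampedBufferSize(const StreamState& s, int64_t requested) {
  if (s.raw_read_bound < 0) {
    return requested;  // no bound, nothing to clamp against
  }
  const int64_t remaining = s.raw_read_bound - s.raw_read_total;
  if (s.bytes_buffered == 0) {
    // Case 1: the buffer is empty and can be discarded, so only the
    // remaining raw bytes matter.
    return std::min(requested, remaining);
  }
  // Case 2: buffered data must be kept, so the current buffer size is
  // added on top of the remaining raw bytes.
  return std::min(requested, s.buffer_size + remaining);
}
```

For example, with buffer_size = 16, raw_read_total = 90 and raw_read_bound = 100, a request for 64 bytes is clamped to 10 when nothing is buffered and to 26 when 8 bytes are still buffered.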

Are these changes tested?

Yes

Are there any user-facing changes?

Bugfix

⚠️ GitHub issue #43949 has been automatically assigned in GitHub to PR creator.

@mapleFU
Member Author

mapleFU commented Oct 12, 2024

@pitrou @felipecrv @wgtmac This might be a fairly critical fix.

Also cc @biljazovic, sorry for the slow reply.

@github-actions github-actions bot added the awaiting review Awaiting review label Oct 12, 2024
@@ -51,6 +51,7 @@ class BufferedBase {
return !is_open_;
}

// Reset the `buffer_` to a new buffer of size `buffer_size_`
Member

Suggested change
// Reset the `buffer_` to a new buffer of size `buffer_size_`
// Resize buffer_ to buffer_size_

Let's keep consistent variable names in the comment across this file.

@@ -284,17 +285,30 @@ class BufferedInputStream::Impl : public BufferedBase {

// Resize internal read buffer. Note that the internal buffer-size
// should be not larger than the raw_read_bound_.
Member

Suggested change
// should be not larger than the raw_read_bound_.
// should not be larger than the raw_read_bound_.

Comment on lines +288 to +289
//
// SetBufferSize will not change the buffer_pos_ and bytes_buffered_.
Member

Suggested change
//
// SetBufferSize will not change the buffer_pos_ and bytes_buffered_.
// It will not change buffer states including buffer_pos_ and bytes_buffered_.

Comment on lines +309 to +310
new_buffer_size =
std::min(new_buffer_size, buffer_size_ + (raw_read_bound_ - raw_read_total_));
Member

Suggested change
new_buffer_size =
std::min(new_buffer_size, buffer_size_ + (raw_read_bound_ - raw_read_total_));
new_buffer_size =
std::min(new_buffer_size, buffer_pos_ + buffer_size_ + (raw_read_bound_ - raw_read_total_));

Isn't buffer_pos_ required to be kept as well?

@@ -350,7 +364,7 @@ class BufferedInputStream::Impl : public BufferedBase {
}

Status DoBuffer() {
// Fill buffer
// Refill the buffer from the raw stream with `buffer_size_` bytes.
Member

Suggested change
// Refill the buffer from the raw stream with `buffer_size_` bytes.
// Fill the buffer from the raw stream with at most `buffer_size_` bytes.

@github-actions github-actions bot added awaiting committer review Awaiting committer review and removed awaiting review Awaiting review labels Oct 12, 2024