Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix hang on TTY read #192

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

danielstuart14
Copy link
Contributor

In some cases, when a underlying error happens to a TTY port between poll::wait_read_fd and unistd::read, the read function would hang waiting for some data that is never received.

This commit sets the port to non-canonical mode, with VMIN = VTIME = 0. With this change, it has the effect of making reads non-blocking, returning right away.

The timeout behaviour is maintained, as prior to reading we call unix::poll through poll::wait_read_fd.

Fixes: #7

In some cases, when a underlying error happens to a TTY port between poll::wait_read_fd and unistd::read, the read function would hang waiting for some data that is never received.

This commit sets the port to non-canonical mode, with VMIN = VTIME = 0. With this change, it has the effect of making reads non-blocking, returning right away.

The timeout behaviour is maintained, as prior to reading we call unix::poll through poll::wait_read_fd.

Fixes: serialport#7
@danielstuart14
Copy link
Contributor Author

@sirhcel @eldruin let me know if there is anything you'd like me to change.

Copy link
Contributor

@sirhcel sirhcel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for looking into the dusty corners of our issue list @danielstuart14 and creating this PR!

In some cases, when a underlying error happens to a TTY port between poll::wait_read_fd and unistd::read, the read function would hang waiting for some data that is never received.

Yes, at a first glance this looks like something the actual code does not consider. Do you have an example to reproduce such a situation?

This commit sets the port to non-canonical mode, with VMIN = VTIME = 0. With this change, it has the effect of making reads non-blocking, returning right away.

Setting VMIN and VTIME to zero looks like the right thing for not blocking on the read after successfully polling for new data to read. But as annotated below - is VTIME set to `timeout unintentionally then?

And isn't there some explicit handling required in case reading returns 0, eventually with errno set?

The timeout behaviour is maintained, as prior to reading we call unix::poll through poll::wait_read_fd.

Wouldn't VTIME == timeout change the timeout behavior in when read is callen after polling?

pub(crate) fn set_timeout(termios: &mut Termios, timeout: Duration) {
let timeout = u128::min(timeout.as_millis() / 100, u8::MAX as u128) as u8;
termios.c_cc[libc::VMIN as usize] = 0;
termios.c_cc[libc::VTIME as usize] = timeout;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The description (and the documentation of termios as well) mention setting VTIME to 0 as well. What's the reason for setting it to timeout here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made this utils function to bridge how timeouts are set on termios (by setting VTIME to a value with 100ms steps) with the Duration type on rust.
If you check at

termios::set_timeout(&mut termios, Duration::from_millis(0));
, this function is being called with a duration = 0, which means that VTIME is set to zero.

@danielstuart14
Copy link
Contributor Author

danielstuart14 commented Jul 10, 2024

Thank you for looking into the dusty corners of our issue list @danielstuart14 and creating this PR!

In some cases, when a underlying error happens to a TTY port between poll::wait_read_fd and unistd::read, the read function would hang waiting for some data that is never received.

Yes, at a first glance this looks like something the actual code does not consider. Do you have an example to reproduce such a situation?

This commit sets the port to non-canonical mode, with VMIN = VTIME = 0. With this change, it has the effect of making reads non-blocking, returning right away.

Setting VMIN and VTIME to zero looks like the right thing for not blocking on the read after successfully polling for new data to read. But as annotated below - is VTIME set to `timeout unintentionally then?

And isn't there some explicit handling required in case reading returns 0, eventually with errno set?

The timeout behaviour is maintained, as prior to reading we call unix::poll through poll::wait_read_fd.

Wouldn't VTIME == timeout change the timeout behavior in when read is callen after polling?

Hey there!
I'm maintaining a testing jig with 25 serial devices connected to a single computer, and I saw this issue happen maybe three times in the last 6 months. I can't reproduce it on the bench, as the timings (and maybe even the intrinsics of it) are very particular. My guess right now is that a packet dropped by some issue with the USB controller (which I verified was the case through some dmesg errors) triggered this behavior. Just disconnecting the USB doesn't cause it. I got stuck for quite some time with this intermittent issue, but then found the aforementioned issue and added some debug messages to the code, which confirmed that it was indeed getting stuck in the read.

About VTIME, I'm setting it to zero through the bridge function I created, check my reply above.

Any internal errors should be returned by the unistd "read" implementation. What maybe we could do is return an error if the poll function returned successfully but the read didn't return any data, but it would need some thinking regarding the behavior when timeout = 0 or buf length = 0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TTYPort::read doesn't respect timeout
2 participants