Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connection close issue after upgrading from 2.10 to 2.12 #1654

Open
dergraf opened this issue Sep 24, 2024 · 6 comments
Open

Connection close issue after upgrading from 2.10 to 2.12 #1654

dergraf opened this issue Sep 24, 2024 · 6 comments

Comments

@dergraf
Copy link
Contributor

dergraf commented Sep 24, 2024

After upgrading Cowboy 2.10 to 2.12 we've noticed a different behaviour when downloading a large file using http1.1 over a rather bad internet connection. It's a Java client producing a Exception in thread "main" org.apache.hc.core5.http.ConnectionClosedException: Premature end of Content-Length delimited message body (expected: 119354644; received: 23919978) error. So my guess is the changes related to the timeouts in 2.12 trigger this issue. idle_timeout is unchanged at 5 minutes , so I tried to set the new reset_idle_timeout_on_send which didn't help. Maybe anyone has an idea what could cause the changed behaviour here. Thanks!

@essen
Copy link
Member

essen commented Sep 24, 2024

Could also be 1a175e7

* HTTP/1.1 connections would sometimes use the wrong timeout
  value to determine whether the connection should be closed.
  This resulted in connections staying up longer than
  intended. This should no longer be the case.

A git bisect would tell us much. There aren't that many commits and half of them are documentation so it should be straightforward.

@dergraf
Copy link
Contributor Author

dergraf commented Sep 24, 2024

You were spot on with 1a175e7 . I've dropped this commit from your main branch and deployed a 2.12', the download worked flawlessly. I'm going to try tomorrow over a even slower internet connection to double check that this was the issue.

@dergraf
Copy link
Contributor Author

dergraf commented Sep 25, 2024

Double checked on the slowest connection I could find. Took me 3 minutes to download the 120MB, the version without the change worked like a charm. ;)

Should I create a PR reverting the change?

@essen
Copy link
Member

essen commented Sep 25, 2024

No we should find where it goes wrong and fix it. It's probably using request_timeout where it should be idle_timeout. It was the opposite before (using idle_timeout when it should have been using request_timeout). Both behaviors are wrong but the commit should be closer to what we want (and it has tests).

My guess is that it doesn't check something properly and thinks there are no ongoing requests, but only debugging would tell us what's really happening.

A good first step would be writing a test that sends a large enough body that fails in the same way as you're seeing it.

@riccardomanfrin
Copy link

riccardomanfrin commented Oct 3, 2024

Hi guys, the problem is with this commit 1a175e7 as it was mentioned
It changes the behavior for connections that were long standing in http 1.1

Originally I started cowboy with

:cowboy.start_tls(__MODULE__, cowboy_opts, %{
        idle_timeout: :infinity,
        inactivity_timeout: :infinity,
        env: %{dispatch: compiled_routes}
      })

To revert to the original behavior you have to add the option request_timeout: :infinity, like so:

:cowboy.start_tls(__MODULE__, cowboy_opts, %{
        request_timeout: :infinity,
        idle_timeout: :infinity,
        inactivity_timeout: :infinity,
        env: %{dispatch: compiled_routes}
      })

To be fair it's confusing to me the meaning of these 3 timeouts.. and at least in my mind request_timeout makes me think of the timeout while a request is being processed, not the timeout of the connection itself.

Unfortunately in my case I'm trying to prolong the lifetime of the connection even though there were no ongoing connections, which you might say to be an abuse of the protocol.. but it fits my use case..

I might switch to http2 if you think that the eventual fix is going to not allow me to do the trick in the future.

@essen
Copy link
Member

essen commented Oct 3, 2024

The added option should do what you want, including when the issue in this ticket is fixed. The issue in this ticket is that it sometimes uses the wrong timeout, but if you configure all of them as infinity, no problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants