bug: WebTransport session establishment failed. Too many pending WebTransport sessions (64) #1896
Comments
I believe this is a blocker for ipfs/helia#182 because:
@maschad are you actively working on this?
I'm not actively working on it at the moment @SgtPooki, although I think @achingbrain's PR #1947 may be related.
I think #1947 will help with the unstable bit, but I can't help but wonder if there's some cleanup we're missing that would prevent the "Too many pending sessions" thing in the first place.
Refactors session closing to happen in one function and call that function when the session has closed or failed to init. Doesn't quite solve the "Too many pending WebTransport Sessions" problem but does slow it down a little bit. Refs: #1896
This may be a bug in Chrome. When we forcibly close WebTransport connections whose More details here: https://bugs.chromium.org/p/chromium/issues/detail?id=1473980
I've tried to add a global count to the WebTransport transport to ensure we don't go over 64 "pending" connections, taking "pending" as meaning "has yet to resolve/reject the Counting the various WebTransport sessions that have been opened and what happened to them, it seems sessions that reject* their Therefore regardless of any limit we set on how many connections we open simultaneously, once the number of errored connections plus the number of yet-to-resolve/reject connections reaches 65 no further WebTransport connections can be opened. This is bad news and needs a browser fix because once 65 connections have errored it's essentially game over until the page is reloaded. I've updated the chromium bug report with this information. * = The rejection reasons are normal network things - an unreachable host, a handshake timeout, etc.
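To make the counting above concrete, here is a minimal sketch of that kind of bookkeeping. It is not the code from the transport: `SessionTally` and its method names are hypothetical, and the default ceiling of 64 is taken from the warning text (the comment above observes the cutoff at 65, so the exact boundary is an assumption). The point it illustrates is that errored sessions are never subtracted, so the budget only shrinks.

```typescript
// Hypothetical bookkeeping for illustration only; not part of js-libp2p
// or of the WebTransport API.
class SessionTally {
  private pending = 0 // dials that have neither succeeded nor failed yet
  private errored = 0 // dials that failed; the browser seems to keep counting these
  private readonly ceiling: number

  // 64 comes from the Chrome warning text; the exact boundary is an assumption.
  constructor (ceiling: number = 64) {
    this.ceiling = ceiling
  }

  // Call when a new session is requested.
  opened (): void {
    this.pending++
  }

  // Call when session establishment succeeds.
  succeeded (): void {
    this.pending--
  }

  // Call when session establishment fails (unreachable host, timeout, ...).
  failed (): void {
    this.pending--
    this.errored++
  }

  // Slots the browser appears to treat as "pending": unresolved plus errored.
  blockedSlots (): number {
    return this.pending + this.errored
  }

  canDial (): boolean {
    return this.blockedSlots() < this.ceiling
  }
}
```

With a ceiling of 3, three failed dials leave `canDial()` returning `false` permanently, which mirrors the "game over until the page is reloaded" behaviour described above.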
A comment on the Chromium bug links to this design doc - it seems Chromium unilaterally applies an anti-DoS measure by keeping "failed" connections in the "pending" state for 5 minutes after the failure. This also seems to include sessions that have had their This seriously limits the number of connections that can be opened over time. The Chromium bug is still valid, I think - because the 5 minute delay does not seem to be applied, failed connections are "pending" I've put a simple demo page together here that doesn't have any libp2p code in it - https://webtransport-pending-sessions.on.fleek.co/ We can use this to see if the issue has been resolved over time. Interestingly, Firefox does not apply the 5 minute wait, though it does crash quite reliably. I've tried adding a dial queue to the WebTransport transport that applies the 5 minute wait for new dials once 64 have errored, but we request dial slots quicker than the old ones time out so everything sort of grinds to a halt. We may be able to do something about this by increasing the auto dial retry threshold to something over 5 minutes. This should give Chromium enough time to reach its internal timeout, once the bug that means it never reaches its internal timeout is fixed 🫠
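A rough sketch of the "wait 5 minutes once 64 have errored" idea might look like the following. This is not the dial queue that was actually tried: `FailureHoldGate`, its parameters, and the injected `now` timestamp are all hypothetical, and it assumes the browser really does release a failed session after the hold period, which the comment above says Chromium currently does not.

```typescript
// Hypothetical gate for illustration; assumes each failed session occupies
// a slot for holdMs and is then released.
class FailureHoldGate {
  private readonly failures: number[] = [] // failure timestamps in ms, oldest first
  private readonly maxHeld: number
  private readonly holdMs: number

  constructor (maxHeld: number = 64, holdMs: number = 5 * 60 * 1000) {
    this.maxHeld = maxHeld
    this.holdMs = holdMs
  }

  // Record a failed dial; timestamps are expected in non-decreasing order.
  recordFailure (now: number): void {
    this.failures.push(now)
  }

  // Forget failures whose hold period has elapsed.
  private expire (now: number): void {
    while (this.failures.length > 0 && now - this.failures[0] >= this.holdMs) {
      this.failures.shift()
    }
  }

  // Milliseconds to wait before another dial could succeed; 0 if a slot is free.
  delayBeforeDial (now: number): number {
    this.expire(now)
    if (this.failures.length < this.maxHeld) {
      return 0
    }
    // A slot frees up once enough of the oldest failures have expired.
    const index = this.failures.length - this.maxHeld
    return this.failures[index] + this.holdMs - now
  }
}
```

If failures arrive faster than one per `holdMs / maxHeld`, `delayBeforeDial` rarely returns 0, which is the "grinds to a halt" effect described above.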
Thanks for staying on top of this one and keeping us updated @achingbrain
@lidel and Javier from Igalia are working with the Chrome team to get a fix into Chrome. Firefox nightly does have WebTransport and seems to work.
link to test page: https://libp2p-webtransport-sessions.on.fleek.co/
Waiting on Igalia to submit a patch to Chrome that fixes this.
Notes from Igalia work stream (under various |
Sometimes I see thousands instances of this warning in Chrome:
WebTransport session establishment failed. Too many pending WebTransport sessions (64)
This module may need some sort of dial queue to ensure it doesn't open too many connections and trigger this error.
ported over from libp2p/js-libp2p-webtransport#64
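For reference, the kind of dial queue suggested in the original report could be sketched as below. `DialQueue` and `run` are hypothetical names, not an existing js-libp2p API, and `dial` stands in for whatever function opens a session. As the comments above point out, capping concurrency alone does not avoid the warning, because failed sessions keep counting against the browser's limit.

```typescript
// Hypothetical concurrency-limited queue for illustration only.
class DialQueue {
  private active = 0
  private readonly waiting: Array<() => void> = []
  private readonly maxConcurrent: number

  constructor (maxConcurrent: number) {
    this.maxConcurrent = maxConcurrent
  }

  // Run dial() once a slot is free; at most maxConcurrent dials run at a time.
  async run<T> (dial: () => Promise<T>): Promise<T> {
    if (this.active >= this.maxConcurrent) {
      // Park until a finishing dial hands its slot over.
      await new Promise<void>(resolve => { this.waiting.push(resolve) })
    } else {
      this.active++
    }

    try {
      return await dial()
    } finally {
      const next = this.waiting.shift()
      if (next !== undefined) {
        next() // pass the slot straight to the next waiter; active is unchanged
      } else {
        this.active--
      }
    }
  }
}
```

A caller would wrap each dial, for example `queue.run(() => openSession(url))`, where `openSession` is a placeholder for the transport's own dial function.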