Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle limited open Connections due to keepalive connections #9916

Open
wants to merge 18 commits into
base: master
Choose a base branch
from

Conversation

sgothel
Copy link

@sgothel sgothel commented Aug 28, 2024

Summary

  • ProtocolHandlerInterface::checkTimeout(..)
    is now called via StreamSocket::checkRemoval()
    to ensure tests even w/o input data (was: handlePoll())

  • ProtocolHandlerInterface::checkTimeout(..)

    • Add bool return value: true -> shutdown connection, caller shall stop processing
    • Implemented for http::Session
      • Timeout (30s) net::Defaults::HTTPTimeout with missing response
    • Implemented for WebSocketHandler
      • Timeout (2s = net::Defaults::WSPingTimeout)
        after missing WS native frame ping/pong (server only)
  • StreamSocket -> Socket (properties moved)

    • bytes sent/received
    • closed state
  • Socket (added properties)

    • creation- and last-seen -time
    • socket type and port
    • checkRemoval(..)
      • called directly from SocketPoll::poll()
      • only for IPv4/v6 network connections
      • similar to ProtocolHandlerInterface::checkTimeout(..)
      • added further criteria (age, throughput, ..)
        • Timeout (64s = net::Defaults::SocketPollTimeout)
          if (now - lastSeen) > timeout
        • Timeout (net::Defaults::MinBytesPerSec)
          if Throughput < MinBytesPerSec (disabled by default)
        • TODO: Add maximimal IPv4/IPv6 socket-count criteria, drop oldest.
        • age via Socket::creationTime is currently not used as requested
  • SocketPoll::poll()

    • Additionally erases if !socket->isOpen() || socket->checkRemoval()

Changes to original PR

  • net::Config -> net::Defaults, without any xml config file side-effects
    • only allow net::Defaults to be changed in unit tests via dedicated callback
  • removed Socket age criteria via Socket::creationTime and net::defaults::MaxDuration
    • Socket::creationTime still relevant for minimum throughput criteria
    • dropped the net::defaults::MaxDuration
  • max-connections
    • removed under- overflow checks
    • using SC atomic ops (not just relaxed)
  • enhanced API doc of Util::TimeAverage (clarification)
  • unit tests in one place / commit
  • more unification/squashing to single semantic commits
  • more sensible order of commits
  • removed dead code
  • dropped noexcept patch

Further:

  • cleaned up the commit messages
  • removed the COOLWSD.cpp default config reordering
  • reordered the 'Unify socket stats', as it must be upfront the actual connection count limitation
    and before moved _lastSeenHTTPHeader
  • fixed another intermediate breakage in the commit chain
  • bisectable proof (but no unit test runs ofc) .. via
    • passed git rebase -i --exec "make -j 20" for all commits
  • Use scoped enums with efficient types (STATE_ENUM)
    • now using desired STATE_ENUM

Unit tests: +7 files, 879 insertions(+), remaining changed lines +800, -222

Checklist

  • I have run make prettier-write and formatted the code.
  • All commits have Change-Id
  • I have run tests with make check
  • Passed certain unit tests and cypress desktop annotations.js
  • I have issued make run and manually verified that everything looks okay
  • Documentation (manuals or wiki) has been updated or is not required

@sgothel sgothel force-pushed the private/sgothel/cool_9833_connection_limit branch 3 times, most recently from a2c3068 to 3ef99c5 Compare August 28, 2024 00:58
@sgothel sgothel self-assigned this Aug 28, 2024
@sgothel
Copy link
Author

sgothel commented Aug 28, 2024

This change set created a linkage error (android), the 'usual' type_info, vtable thing.
Can't figure a missed 'virtual' or even pure virtual '=0' virtual key in the code, also reviewed the MOBILEAPP macro.
Hence I need to reproduce and build the android locally here .. WIP.

net/Socket.cpp Outdated
@@ -1298,6 +1385,52 @@ LocalServerSocket::~LocalServerSocket()
# define LOG_CHUNK(X)
#endif

bool StreamSocket::checkForcedRemoval(std::chrono::steady_clock::time_point now) noexcept
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might need to move this impl to after

"#endif // !MOBILEAPP" (if mobile needs it) and/or put ifdef MOBILEAPP around it in the header if it doesn't or a !MOBILE stub, something like that anyway

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah. Great eyes. Thx!

Copy link
Contributor

@Ashod Ashod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @sgothel. There are some really good stuff here!

I understand this is a draft, but it reads like a mix of random fixes and improvements (some of it is good and useful, like adding const and = default, make_shared, etc.), re-inventing methods for doing things already supported, and debugging/tracing support for your understanding (to be removed).

It's very hard to read, as there is no clear theme or purpose.

I left various comments, but I didn't dive too deep in a number of areas because it was unclear why we are either replacing an existing approach, or inventing a new one when we can improve on the existing one (where there are shortcomings).

At a minimum, I'd separate the backtrace improvements from the socket improvements, and likely the misc. improvements from new code/features/tests.

Finally, I'm troubled by vformatString and the re-working a lot of the socket states and internals with a huge set of new members that change the existing semantics with unclear benefits.

Perhaps there is some context I'm missing. Perhaps this was agreed on and I'm unaware.

net/HttpRequest.hpp Outdated Show resolved Hide resolved
net/HttpRequest.hpp Show resolved Hide resolved
net/HttpRequest.hpp Show resolved Hide resolved
net/ServerSocket.hpp Outdated Show resolved Hide resolved
net/Socket.cpp Outdated Show resolved Hide resolved
test/UnitSocketLifecycle.cpp Outdated Show resolved Hide resolved
test/UnitSocketLifecycle.cpp Outdated Show resolved Hide resolved
test/WebSocketSession.hpp Outdated Show resolved Hide resolved
common/Util.cpp Outdated Show resolved Hide resolved
net/Socket.cpp Outdated Show resolved Hide resolved
@sgothel
Copy link
Author

sgothel commented Aug 28, 2024

Thank you @Ashod for the review.

I understand this is a draft, but it reads like a mix of random fixes and improvements (some of it is good and useful, like adding const and = default, make_shared, etc.), re-inventing methods for doing things already supported, and debugging/tracing support for your understanding (to be removed).

The draft status was motivated by my intention to discuss the actual semantics and goals,
i.e. mitigate limited connections - or better, ensure connections are closed as well as
to reduce the risk of running out of sockets/file-handles.

Which brings us to the next point I guess.

It's very hard to read, as there is no clear theme or purpose.

When reading commit by commit, I believe everything is well split apart semantically.
I believed the purpose was well described in #9833 and the commit 3ef99c5
actually starting to address (matching this drafts' description) was meaningful.

Indeed, I added a few unrelated single commits into this review. Understanding that we allow
reasonable small changes w/o issue report, esp of intrinsic nature, I believe it should be OK.
(admin vs workflow compromise as discussed earlier)

I left various comments, but I didn't dive too deep in a number of areas because it was unclear why we are either replacing an existing approach, or inventing a new one when we can improve on the existing one (where there are shortcomings).

I am not aware of really replacing anything. I merely moved and added functionality.

At a minimum, I'd separate the backtrace improvements from the socket improvements, and likely the misc. improvements from new code/features/tests.

As described above, a zero side-effect change in its own commit.

The Backtrace object in particular is useful for all, as we can produce an instance
capturing the current stack, move it around and output where desired.
As discussed, I did not add addr2line or libunwind dependencies, which I understand ofc.

Finally, I'm troubled by vformatString and the re-working a lot of the socket states and internals with a huge set of new members that change the existing semantics with unclear benefits.

I just like printf formatting as it is more performant and flexible (at least to me).
IMHO I use it in a most reasonable save fashion, while being supported by the compiler to
remove the risk of wrong format specifiers and stack values. The latter is still a weakness, which I am aware of.

Perhaps there is some context I'm missing. Perhaps this was agreed on and I'm unaware.

Allow me to continue answering your other remarks below.

@sgothel sgothel force-pushed the private/sgothel/cool_9833_connection_limit branch from 2d51475 to 508ccb0 Compare August 28, 2024 19:01
@caolanm
Copy link
Contributor

caolanm commented Aug 29, 2024

I think it'll be easier to manage this if its split up so that the mostly uncontroversial misc cleanups of existing stuff where no change in logic is intended, the make_unique, make_shared, privator ctor -> = delete, constifying, expansion of auto& to real type, emplace_back, etc goes into a separate pr.

The backtrace debugging aid looks probably useful to have to me, but can also be its own pr I think (and probably not worth fighting the "CodeQL scanning - action / Analyze" step which apparently can't handle the constexpr use which is blocking ci from passing this pr).

And generally trim this pr down to the core change to make it easier to think about it

@Ashod
Copy link
Contributor

Ashod commented Aug 30, 2024

Thanks for the responses, @sgothel! We've discussed some offline, so didn't reply here (e.g. printf-style formatting being fast--in fact it's very slow compared to only serialization).

Indeed, I added a few unrelated single commits into this review. Understanding that we allow reasonable small changes w/o issue report, esp of intrinsic nature, I believe it should be OK. (admin vs workflow compromise as discussed earlier)

In general, agreed. But I'll echo @caolanm's comment here that breaking up into PRs, that can be decided to merge independent of other parts, is always helpful.

I should also point out that this PR is well above a 1000 lines. This is a rare size. Considering that some parts change the interface of a large hierarchy (the sockets), which is a fundamental core part of the whole project and very delicate (can easily break things catastrophically), it requires reading and studying far more code (i.e. the context and consequences) to review properly. An isolated and low-risk change of a few 100 lines is fine. A complex few 100 line change (such as in the sockets) is challenging and time consuming. Sometimes we have no choice because we can't break it up. But a 1000 lines that can be easily broken up is hardly necessary.

Finally, even though we can review individual commits, it's preferable to have a single theme to a PR. Helps the reviewer. In some cases a few misc. changes can go together (comment fixes, cosmetics, formatting, logging, etc.). But mixing functional changes with non-functional changes dilutes the process and can reduce the quality of the review. More so with high-risk changes (i.e. socket code) mixed with debugging code (i.e. backtracing).

Regardless, you have a lot of great improvements here, and I don't want to undermine the importance of your contribution!
Thank you.

@sgothel sgothel force-pushed the private/sgothel/cool_9833_connection_limit branch from 508ccb0 to cb801db Compare September 2, 2024 03:41
@sgothel
Copy link
Author

sgothel commented Sep 2, 2024

Thank you both for your review and kind words.

I have split up single commits into 4 branches, i.e. this pull-request branch and

For this feature work, I need to complete the unit test incl. the timeout configuration.
Looks OK .. shall provide it in 1-2 days.

net/HttpRequest.hpp Fixed Show fixed Hide fixed
net/Socket.cpp Fixed Show fixed Hide fixed
net/Socket.hpp Fixed Show fixed Hide fixed
net/WebSocketHandler.hpp Fixed Show fixed Hide fixed
net/WebSocketHandler.hpp Fixed Show fixed Hide fixed
net/WebSocketHandler.hpp Fixed Show fixed Hide fixed
@sgothel sgothel force-pushed the private/sgothel/cool_9833_connection_limit branch from cb801db to 6fc374e Compare September 6, 2024 11:28
net/Socket.cpp Fixed Show fixed Hide fixed
net/Socket.hpp Fixed Show fixed Hide fixed
net/Socket.hpp Fixed Show fixed Hide fixed
net/Socket.hpp Fixed Show fixed Hide fixed
@sgothel sgothel force-pushed the private/sgothel/cool_9833_connection_limit branch from 6fc374e to 31c1b94 Compare September 6, 2024 11:51
net/NetUtil.hpp Show resolved Hide resolved
@Ashod
Copy link
Contributor

Ashod commented Sep 6, 2024

@sgothel, there were multiple suggestions to split this large PR up to reduce its size. That was when it had 18 commits and was "only" a 1000 lines. Now it's past 1500 lines and is 1 commit! I'm sensing this isn't going in the direction of the feedback.

What's the plan, please?

test/UnitTimeoutNone.cpp Outdated Show resolved Hide resolved
Copy link
Contributor

@mmeeks mmeeks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As said needs to be split up; lets try to get the less controversial pieces in and merged as separate pull requests - to slim this down =) Can you work on that Sven ?

wsd/COOLWSD.cpp Outdated Show resolved Hide resolved
@sgothel
Copy link
Author

sgothel commented Sep 6, 2024

As said needs to be split up; lets try to get the less controversial pieces in and merged as separate pull requests - to slim this down =) Can you work on that Sven ?

will do the split later then (tomorrow) + finishing the tests

@sgothel
Copy link
Author

sgothel commented Sep 6, 2024

@sgothel, there were multiple suggestions to split this large PR up to reduce its size. That was when it had 18 commits and was "only" a 1000 lines. Now it's past 1500 lines and is 1 commit! I'm sensing this isn't going in the direction of the feedback.

What's the plan, please?

As I wrote, I will split out the config and TimeAverage (~400 lines or so) and put the remainder at the bottom -> 3 commits within this pull request. Perhaps more, w/ a fresh look.

But splitting this up in multiple pull request (PR) makes would probably makes no sense, as these pieces are required and each PR must be tested. IMHO.

@sgothel
Copy link
Author

sgothel commented Sep 26, 2024

Added minor changes addressing @Ashod earlier notes.

@sgothel sgothel requested a review from mmeeks September 26, 2024 14:02
net/Socket.hpp Show resolved Hide resolved
@sgothel
Copy link
Author

sgothel commented Sep 26, 2024

Note: Unit tests: +7 files, 879 insertions(+), remaining changed lines +800, -222

Copy link
Contributor

@mmeeks mmeeks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok - this is great; thanks Sven & Caolan - much easier to review, can go in at the beginning of the next Red merge window; ignore my comments about unit tests - I see them at the end; other minor cleanups can come later / get tacked on.
Nice work - thanks ! =)

net/Socket.hpp Outdated Show resolved Hide resolved
net/Socket.cpp Outdated Show resolved Hide resolved
net/Socket.cpp Outdated Show resolved Hide resolved
net/Socket.cpp Outdated Show resolved Hide resolved
net/Socket.cpp Show resolved Hide resolved
@mmeeks
Copy link
Contributor

mmeeks commented Oct 3, 2024

Actually - I lie the c1,2,3,4 thing really needs fixing or explaining; its not clear to me that it will work =)

@sgothel sgothel force-pushed the private/sgothel/cool_9833_connection_limit branch from fe29153 to 62a5cc2 Compare October 9, 2024 11:49
@sgothel
Copy link
Author

sgothel commented Oct 9, 2024

@mmeeks @caolanm addresses Michael's notes

  • inlined the changes in rebased commits
  • rebased to (Monday's) master

@sgothel
Copy link
Author

sgothel commented Oct 9, 2024

* rebased to (Monday's) master

I apologize: Should have pushed with a rebase to master 1st, then another push with the changes.
This would have allowed us to see the push delta on the fixed notes only using this github UI diff.
Now it shows the total change :-(

Sven Göthel added 18 commits October 10, 2024 15:13
Preparation for cool#9833.

Socket Restructuring:
- Hoist open/close state from StreamSocket to Socket

Signed-off-by: Sven Göthel <[email protected]>
Change-Id: I7e1a9329e0848c40a210f6250e29e26950da6fbc
…own. SocketPoll::poll robustness

Preparation for cool#9833.

Socket Restructuring:
- Unify closing

SocketPoll::poll robustness
- Ensure erasing Socket if !isOpen()

Signed-off-by: Sven Göthel <[email protected]>
Change-Id: I7e1a9329e0848c40a210f6250e29e26950da6fbc
…hutdown*

Preparation for cool#9833.

WebSocketHandler::shutdown*
- Add shutdownSilent(), preparation to force shutdown on timeout
  w/o closing frame.

Signed-off-by: Caolán McNamara <[email protected]>
Signed-off-by: Sven Göthel <[email protected]>
Change-Id: I68d1d18311a059f94b84ef1867f8343ef7a09c1d
Signed-off-by: Caolán McNamara <[email protected]>
Signed-off-by: Sven Göthel <[email protected]>
Change-Id: Ia79f8027eb5cc8911d5cdf71ba1a73fdd1b0dde3
Signed-off-by: Caolán McNamara <[email protected]>
Signed-off-by: Sven Göthel <[email protected]>
Change-Id: I9c34d3d9b2caf4aa0e9ba09782bf4a51c5e9a239
Signed-off-by: Caolán McNamara <[email protected]>
Signed-off-by: Sven Göthel <[email protected]>
Change-Id: I9c5302b292c406bdf05eb031615a93d8939b0a88
Signed-off-by: Caolán McNamara <[email protected]>
Signed-off-by: Sven Göthel <[email protected]>
Change-Id: Ifaa7ea79d05d0f1580a6f8ffcc6db66814005199
Signed-off-by: Caolán McNamara <[email protected]>
Signed-off-by: Sven Göthel <[email protected]>
Change-Id: If736c3e1b00fbadd15504e97970a6d4d8f5873d7
- DelaySocket::State
- ServerSocket::Type
- ::SharedFDType,
- StreamSocket::ReadType
- StreamSocket::WSState

Signed-off-by: Caolán McNamara <[email protected]>
Signed-off-by: Sven Göthel <[email protected]>
Change-Id: I33c1a233632b25d0632a9130a03aee2991a6e3e7
Preparation for cool#9833, to be used to provide a time-based
ping response time average.

TimeAverage allows us to calculate the time based average of a value
while adding new time-point/value tuples at arbitrary time-points.

The result is stable against small fluctuations
and shall provide a reasonable actionable criteria.

Signed-off-by: Sven Göthel <[email protected]>
Change-Id: I7e1a9329e0848c40a210f6250e29e26950da6fbc
…imeout)

Preparation for cool#9833.

Using runtime mutable defaults for network properties allows us to
test the socket limitation code as well as to fine-tune certain properties for clients.

Costs are removal of constexpr properties, using a static local (singleton) instance,
cached where used - hence rendering the code less optimized and more pessimistic.

However, the singleton instance is fetched at object creation w/o cache hit using
a static-local, networking timing outweights memory reads and no hot-spot
performance heavy loop is affected.
Hence rendering the feature testable at runtime shall outweight the lesser costs.

- All network limits and timeouts are configurable within our unit tests only
    - see net::Defaults in NetUtil.hpp

Signed-off-by: Sven Göthel <[email protected]>
Signed-off-by: Caolán McNamara <[email protected]>
Change-Id: I7e1a9329e0848c40a210f6250e29e26950da6fbc
…as creationTime to ctor

This could reduce costs involved via std::chrono::steady_clock::now(),
but also allows socket creation using a dedicated time-point.

Signed-off-by: Caolán McNamara <[email protected]>
Signed-off-by: Sven Göthel <[email protected]>
Change-Id: I7e1a9329e0848c40a210f6250e29e26950da6fbc
…incl. creation- and lastSeenTime

Socket Restructuring:
- Unify stats(bytes + throughput)
- Persistent creationTime (immutable) and lastSeenTime (mutable)

Signed-off-by: Caolán McNamara <[email protected]>
Signed-off-by: Sven Göthel <[email protected]>
Change-Id: If9e3b28cfd4c6fdb73b978b80ef2063c63def6d1
moved `StreamSocket::_lastSeenHTTPHeader` to
`std::chrono::steady_clock::time_point &lastHTTPHeader`
where it is being required - reduces general footprint.

Signed-off-by: Sven Göthel <[email protected]>
Signed-off-by: Caolán McNamara <[email protected]>
Change-Id: I5148fa7ee2aa5ce88ef8bf31c2e036921d01277c
- ProtocolHandlerInterface::checkTimeout(..)
    is now called via StreamSocket::checkRemoval()
    to ensure tests even w/o input data (was: handlePoll())

- ProtocolHandlerInterface::checkTimeout(..)
    - Add bool return value: true -> shutdown connection, caller shall stop processing
    - Implemented for http::Session
    - Timeout (30s) net::Defaults::HTTPTimeout with missing response
    - Implemented for WebSocketHandler
    - Timeout (2s = net::Defaults::WSPingTimeout)
        after missing WS native frame ping/pong (server only)

- StreamSocket::checkRemoval(..)
    - called directly from SocketPoll::poll()
    - only for IPv4/v6 network connections
    - similar to ProtocolHandlerInterface::checkTimeout(..)
    - calls ProtocolHandlerInterface::checkTimeout(..)
    - added further criteria (throughput, ..)
    - Timeout (64s = net::Defaults::SocketPollTimeout)
        if (now - lastSeen) > timeout
    - Timeout (net::Defaults::MinBytesPerSec)
        if Throughput < MinBytesPerSec (disabled by default)
    - TODO: Add maximal IPv4/IPv6 socket-count criteria, drop oldest.
    - age via Socket::creationTime is currently not used as requested

- SocketPoll::poll()
    - Additionally erases if !socket->isOpen() || socket->checkRemoval()

Signed-off-by: Sven Göthel <[email protected]>
Signed-off-by: Caolán McNamara <[email protected]>
Change-Id: I7e1a9329e0848c40a210f6250e29e26950da6fbc
Adding connection limit where registered via SocketPoll::setLimiter(),
i.e. DocumentBrokerPoll and WebServerPoll only.

SocketPoll::poll() will drop _new_ overhead connections exceeding MaxConnections.
This has been discussed, in favor of dropping the oldest connections.

Aligned net::Defaults::MaxConnections w/ pre-existing MAX_CONNECTIONS.

SocketPoll::setLimiter():
- Increments given non-zero connectionLimit by one
  for WS upgrade socket.

Signed-off-by: Sven Göthel <[email protected]>
Change-Id: I7e1a9329e0848c40a210f6250e29e26950da6fbc
test/UnitTimeoutBase.hpp
  - Base for UnitTimeout* tests
test/UnitTimeoutNone.cpp
  - Test without limitation, no disconnection
test/UnitTimeoutConnections.cpp
  - Test disconnection with limited max-connections (5) via net::Defaults
test/UnitTimeoutWSPing.cpp
  - Test disconnection with limited WSPingTimout (20us) using WSPingPeriod (10ms) via net::Defaults
test/UnitTimeoutSocket.cpp
  - Test disconnection with limited MinBytesPerSec (100kBps) via net::Defaults

test/WebSocketSession's WebSocketSession::poll()
added a isConnected() check allowing to abort.

Signed-off-by: Sven Göthel <[email protected]>
Change-Id: I7e1a9329e0848c40a210f6250e29e26950da6fbc
Signed-off-by: Sven Göthel <[email protected]>
Signed-off-by: Caolán McNamara <[email protected]>
Change-Id: I9f7a6fbbf93381de72471ea878582f8e5db88962
@sgothel sgothel force-pushed the private/sgothel/cool_9833_connection_limit branch from 62a5cc2 to acc1d92 Compare October 10, 2024 13:26
@sgothel
Copy link
Author

sgothel commented Oct 10, 2024

resolved merge conflict due to changed net/Socket.*

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: To Test
Development

Successfully merging this pull request may close these issues.

Handle limited open Connections due to keepalive connections (cool#9621)
4 participants