v0.11.7
Many important fixes for RTS/DB and the language in general!
Added
- Bash completion is now part of the Debian packages & brew formula
Changed
- actondb now uses a default value for gossip port of RPC port +1 [#913]
- The gossip protocol only propagates the RPC port, and parts of the
implementation have a hard-coded assumption that the gossip port has a +1
offset.
- In order to avoid configuration errors, the default gossip port is now RPC
port + 1 and if another gossip port is explicitly configured, an error log
message is emitted on startup.
- While this is marked as a change, it could really be considered a fix as any
other configuration of the system was invalid anyway.
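The port rule above can be sketched as follows. This is an illustration in Python, not the actondb code: the function name, the logger name and the choice to still return the explicitly configured port are all assumptions.

```python
import logging

log = logging.getLogger("actondb-sketch")  # hypothetical logger name

GOSSIP_PORT_OFFSET = 1  # the hard-coded +1 offset described above


def resolve_gossip_port(rpc_port, configured_gossip_port=None):
    """Return the gossip port to use for a given RPC port.

    With no explicit setting the default is RPC port + 1. An explicit value
    that differs from the default is reported with an error log message.
    Returning the explicit value anyway is an assumption of this sketch.
    """
    expected = rpc_port + GOSSIP_PORT_OFFSET
    if configured_gossip_port is None:
        return expected
    if configured_gossip_port != expected:
        log.error(
            "gossip port %d does not match RPC port %d + %d",
            configured_gossip_port, rpc_port, GOSSIP_PORT_OFFSET,
        )
    return configured_gossip_port
```

For example, `resolve_gossip_port(32000)` yields 32001, while `resolve_gossip_port(32000, 40000)` logs an error because peers would assume 32001.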
Fixed
- Fixed include path for M1
- /opt/homebrew/include added to header include path [#892]
- Actually fixes builds on M1!
- This has "worked" because the only M1 where Acton was tested also had header
files in /usr/local/include but on a fresh install it errored out.
- Fix up-to-date check in compiler for imported modules from stdlib [#890]
- Fix seed arg parsing in actondb that led to "Illegal instruction" error
- Fix nested dict definitions [#869]
- Now possible to directly define nested dicts
- Avoid inconsistent view between RTS & DB in certain situations [#788]
- This could happen if an RTS node was stopped & quickly rejoined, or if a
transient partition happened and the gossip round did not complete before the
partition healed.
- We now wait for the gossip round to complete.
- This ensures that local actor placement doesn't fail during such events.
- Fix handling of missed timer events [#907]
- Circumstances such as suspending the Acton RTS or resuming a system from the
database could lead to a negative timeout, i.e. sleep for less than 0 seconds.
- The libuv timeout argument is a uint64 and feeding in a negative signed
integer results in a value like 18446744073709550271, which roughly meant
sleeping for 584 million years, i.e. effectively blocking the RTS timerQ.
- It's now fixed by treating negative timeouts as 0, so we immediately wake up
to handle the event, however late we might be.
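The arithmetic behind those numbers can be checked with a short sketch. It is written in Python for illustration only; the RTS itself is not Python and the helper names are assumptions. The value quoted above is what a timeout of -1345 becomes when reinterpreted as an unsigned 64-bit integer, and libuv timer timeouts are given in milliseconds.

```python
UINT64_MODULUS = 2 ** 64
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def as_uint64(value):
    """Reinterpret a signed integer the way a C conversion to uint64_t does."""
    return value % UINT64_MODULUS


def clamp_timeout(timeout_ms):
    """Treat a negative (already overdue) timeout as 0, as in the fix above."""
    return max(timeout_ms, 0)


# -1345 is inferred from the value quoted in the changelog entry.
wrapped_ms = as_uint64(-1345)
years = wrapped_ms / 1000 / SECONDS_PER_YEAR

print(wrapped_ms)            # 18446744073709550271
print(round(years / 1e6))    # 585, i.e. just over 584 million years
print(clamp_timeout(-1345))  # 0
```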
- Timer events now wake up WT threads after system resumption [#907]
- Worker Threads (WT) are created in the NoExist state and should transition
into Idle once initiated, however that was missing, leading to a deadlock.
- This was masked as, in most cases, a WT will transition into Working once
it has carried out some work and then back into Idle.
- The wake_wt function, which is called to wake up a WT after a timer event is
triggered, wakes up threads that are currently in the Idle state; if they are
in NoExist, it will do nothing.
- If there is no work, such as the case after system resumption from the DB,
WTs will stay in the NoExist state and then wake_wt will do nothing, so
the system is blocked.
- WTs now properly transition into Idle.
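A toy model of the state handling described above may help. It is a Python sketch, not the RTS code: only the state names and `wake_wt` come from the entry, while the class layout and the `start` method (with its `fixed` switch) are made up for illustration.

```python
from enum import Enum


class WTState(Enum):
    NO_EXIST = "NoExist"
    IDLE = "Idle"
    WORKING = "Working"


class WorkerThread:
    def __init__(self):
        # Worker threads are created in the NoExist state.
        self.state = WTState.NO_EXIST

    def start(self, fixed=True):
        # The fix: move to Idle once initiated. With fixed=False the
        # transition is skipped, mimicking the old behaviour.
        if fixed:
            self.state = WTState.IDLE


def wake_wt(wt):
    """Wake a worker only if it is Idle; return whether it was woken."""
    if wt.state is WTState.IDLE:
        wt.state = WTState.WORKING
        return True
    return False  # NoExist (or already Working): nothing happens


old = WorkerThread()
old.start(fixed=False)
new = WorkerThread()
new.start()
print(wake_wt(old), wake_wt(new))  # False True
```

With no prior work to push a thread through Working and back to Idle, the old behaviour leaves every wake-up without effect, which is the blocked state the entry describes.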
- Only communicate with live DB nodes from RTS DB client [#910] [#916]
- When the RTS communicates with the DB nodes, we used to broadcast messages
to all servers we know about. If they were down, they had their socket fd set
to 0 to signal that the server is down. However, fd=0 is not invalid, it is
stdin, so we ended up sending data to stdin, creating lots of garbage output
on the terminal.
- fd -1 is now used to signal an invalid fd, which prevents similar mistakes.
- The DB node status is inspected and messages are only sent to live servers.
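The idea of the fix can be sketched like this. The sketch is in Python and every name in it (`DBServer`, `live_servers`, `broadcast`, the `send` callback) is hypothetical; it only illustrates using -1 as the invalid descriptor and filtering on node status before sending.

```python
from dataclasses import dataclass

INVALID_FD = -1  # 0 is a valid descriptor (stdin), so it cannot mean "down"


@dataclass
class DBServer:
    name: str
    fd: int = INVALID_FD
    live: bool = False


def live_servers(servers):
    """Keep only servers that are marked live and hold a valid descriptor."""
    return [s for s in servers if s.live and s.fd != INVALID_FD]


def broadcast(servers, payload, send):
    """Call send(fd, payload) for every live server; return names contacted."""
    targets = live_servers(servers)
    for server in targets:
        send(server.fd, payload)
    return [s.name for s in targets]
```

A server that is down keeps `fd == INVALID_FD` and is skipped, so nothing is ever written to descriptor 0 by accident.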
- Avoid segfault on resuming TCP listener & TCP listener connection [#922]
- Invalidate fds on actor resumption [#917]
- Remove remaining ending new lines from RTS log messages [#926]
- Remove ending new lines from DB log messages [#932]
Testing / CI
- Rewritten RTS / DB tests [#925] [#929]
- More robust event handling, directly reacting when something happens, for
example if a DB server segfaults or we see unexpected output we can abort
the test
- Now has much better combined output of DB & app output for simple
correlation during failures
- Test orchestrator now written in Acton (previously Python), at least async
IO callback style is better supported to directly react to events...