
Web socket syncing #478

Open · wants to merge 4 commits into master
Conversation

@tdroxler (Member) commented Jun 23, 2023

Resolves: #353

After thousands of tests and adjustments, I think we can start the discussion. It still needs some unit tests, but I'm not sure yet how easy it will be to test the websockets.

The PR needs https://github.com/alephium/dev-alephium/pull/827 from the node, because we need the events to match the data we receive from REST calls.

The strategy is:

  • Sync with BlockFlowSyncService as usual.
  • Once we are up to date (the latest block is less than 30s old), we
    move to websocket syncing.
  • If websocket messages are late (> 30s old), we close the websocket
    and move back to HTTP syncing. This can happen in case of a network
    issue; it is safer because we could suddenly receive lots of messages
    through the websocket and the DB can't keep up (this happened once in
    my tests).
  • If the websocket closes for any reason, we also move back to HTTP
    syncing.

The main idea is to always rely on, and fall back to, our BlockFlowSyncService, which is well tested in production.

I tried various edge cases, like cutting my network etc.
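The fallback rule above boils down to a single staleness check against the 30s threshold. A minimal sketch of that decision (hypothetical names, not the PR's actual code; only the 30-second threshold comes from the description above):

```scala
import java.time.Instant
import java.time.temporal.ChronoUnit

// Illustrative sketch only: the identifiers here are made up for this
// example; the real PR wires this logic into BlockFlowSyncService.
object SyncStrategy {
  sealed trait Mode
  case object HttpSync      extends Mode
  case object WebSocketSync extends Mode

  // Threshold from the PR description: data older than 30s is "late".
  val stalenessLimitSeconds: Long = 30

  def ageInSeconds(latestTs: Instant, now: Instant): Long =
    ChronoUnit.SECONDS.between(latestTs, now)

  // Pure decision: fresh data (latest block or websocket message) keeps
  // us on (or moves us to) websocket syncing; stale data sends us back
  // to HTTP syncing. A websocket closing for any reason also maps to
  // HttpSync, regardless of freshness.
  def nextMode(latestTs: Instant, now: Instant): Mode =
    if (ageInSeconds(latestTs, now) < stalenessLimitSeconds) WebSocketSync
    else HttpSync
}
```

Keeping the decision a pure function of timestamps makes the fallback easy to unit-test even if the websocket plumbing itself is hard to exercise.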

@simerplaha (Member) commented Aug 1, 2023

Neat! We have WebSockets! Is the way to test this to do a full sync and then manually terminate the server to see how it behaves?

Tricky one to test. I'm letting it sync overnight and manually causing abrupt server shutdowns to see how the connection & sync behave, with debug statements. If I'm not careful enough, it corrupts my RocksDB instance and I have to start over.

Surely there is an easier way to test. Any suggestion?

> it still needs some unit tests, but I'm not sure yet how easy it will be to test the websockets.

Yep, tricky one indeed.

@tdroxler (Member, Author) commented Aug 2, 2023

> Tricky one to test. I'm letting it sync overnight and kinda manually causing abrupt server shutdowns to see how the connection & sync behaves, with debug statements, if I'm not careful enough it corrupts my RocksDB instance and I have to start over.

If you shut down the node, then the explorer-backend should also stop, but it sounds very strange that the node's RocksDB gets corrupted; the explorer-backend shouldn't have any influence on the node.

How are you causing the abrupt shutdown? I could try to reproduce it.

@simerplaha (Member) commented
Oh, it's not a bug. I'm intentionally causing abrupt/forced shutdowns to see how the connection behaves, with println debug statements. Node and explorer behave as expected. Not sure how else to test it; I was trying to convey that testing is tricky, as you've pointed out.

Successfully merging this pull request may close these issues.

Sync using full node's web socket