Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Retry sentinel startup on failure #200

Merged
merged 1 commit into from
Dec 15, 2022

Conversation

HalosGhost
Copy link
Collaborator

@HalosGhost HalosGhost commented Dec 5, 2022

In particular, infinitely retry starting the sentinel if its coordinator client fails to initialize.

Note: this probably isn't ideal; exponential back-off delay with a retry threshold (after which we should fatally error out) is probably better in the long-run. This, however, will suffice for now.

Edit: As infinite-retry causes the tests to fail, I've gone ahead and implemented a really basic threshold and increasing-delay. Currently, the threshold is 4 retries, and the delay is exponential (increasing by a factor of two). I will be double-checking that the README walkthrough runs correctly (on a reasonably low-powered machine of mine) after the tests pass before I request review from someone else to try to make sure this will fix the issue (at least for most people).

Fixes #186

@HalosGhost HalosGhost force-pushed the fix/sentinel-startup-retry branch 2 times, most recently from 3411d13 to 0f529d4 Compare December 6, 2022 21:40
In particular, retry N times (N = 4), increasing the retry
delay on each failure (exponentially by factor of 2).

This combination seems to give a reasonable level of stability.

Co-authored-by: Michael L. Szulczewski <[email protected]>
Signed-off-by: Sam Stuewe <[email protected]>
@HalosGhost
Copy link
Collaborator Author

CC: @karlovskiy, @mszulcz-mitre, @AlexRamRam, @metalicjames

Would any of you mind pulling this change and testing locally. It seems to work quite reliably on my two test machines (one of which is woefully under-powered), but I'd love to get a second tACK before I merge it.

@karlovskiy
Copy link

@HalosGhost I've had chance to check your commit and it just works!

@HalosGhost
Copy link
Collaborator Author

@karlovskiy want to request your own review and add an approval? PR authors cannot approve their own PRs. :)

@karlovskiy
Copy link

@HalosGhost Ok. I'll be glad to ;)

Copy link

@darbha-ram darbha-ram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested startup and basic wallet operations, they work.

@HalosGhost HalosGhost merged commit 86b4118 into mit-dci:trunk Dec 15, 2022
@HalosGhost HalosGhost deleted the fix/sentinel-startup-retry branch December 15, 2022 20:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Part of "Run the Code" section in README.md appears broken
3 participants