Add Retry Mechanism to Handle Message Distribution Errors in NSQ #1475

Closed


@bagusandrian bagusandrian commented Nov 29, 2023

Issue

The current NSQ implementation lacks proper handling for message distribution errors, particularly in scenarios where errors such as Out of Memory (OOM) or Out of Disk Space occur during message distribution from topic to channel. This results in lost messages for some channels listed on the topic.

Changes Made

I have introduced a retry mechanism to address this issue. The code now includes a retry step, governed by the value of msg.maxRetryChannel. This modification aims to improve the reliability of message distribution and prevent message loss when errors such as OOM or disk space exhaustion are encountered.

Proposed Solution

The solution involves checking the value of msg.maxRetryChannel and retrying the message distribution process accordingly. By incorporating this retry mechanism, we aim to enhance the robustness of NSQ in handling errors during message distribution.
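As a rough illustration only (this is not the code in this pull request, and not nsqd's actual API), a per-channel retry loop of the kind described above might look like the sketch below. The `Message`, `Channel`, `flakyChannel`, and `distribute` names are hypothetical, and the reading of `maxRetryChannel` as "maximum extra attempts per channel" is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// Message is a hypothetical stand-in for an NSQ message. Only the retry
// budget named in the description is modeled.
type Message struct {
	Body            []byte
	maxRetryChannel int // assumed meaning: extra put attempts allowed per channel
}

// Channel is a hypothetical interface, not nsqd's real Channel type.
type Channel interface {
	Name() string
	PutMessage(m *Message) error
}

// distribute hands msg to every channel. A failed put is retried up to
// msg.maxRetryChannel more times. It returns the names of the channels
// that still failed after the retry budget was used up.
func distribute(msg *Message, channels []Channel) []string {
	var failed []string
	for _, ch := range channels {
		var err error
		for attempt := 0; attempt <= msg.maxRetryChannel; attempt++ {
			if err = ch.PutMessage(msg); err == nil {
				break
			}
		}
		if err != nil {
			failed = append(failed, ch.Name())
		}
	}
	return failed
}

// flakyChannel simulates a channel whose first `failures` puts fail.
type flakyChannel struct {
	name     string
	failures int
	received int
}

func (c *flakyChannel) Name() string { return c.name }

func (c *flakyChannel) PutMessage(m *Message) error {
	if c.failures > 0 {
		c.failures--
		return errors.New("simulated disk-full error")
	}
	c.received++
	return nil
}

func main() {
	a := &flakyChannel{name: "a", failures: 2} // recovers within the budget
	b := &flakyChannel{name: "b", failures: 5} // outlasts the budget
	msg := &Message{Body: []byte("hello"), maxRetryChannel: 3}
	fmt.Println(distribute(msg, []Channel{a, b})) // prints [b]
}
```

Note that this sketch retries immediately, with no backoff, so it only helps when the underlying error clears between attempts. A channel that keeps failing is still reported as failed once the budget is exhausted.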

Impact

These changes should improve the reliability of NSQ, especially in environments where OOM or disk space constraints may arise during message distribution. However, the potential side effects and risks of the retry mechanism still need to be identified and reviewed.

I welcome feedback and suggestions for further improvements. This enhancement aims to address a critical issue in NSQ's reliability and prevent message loss in error-prone scenarios.

@mreiferson
Member

Hi @bagusandrian, I appreciate you taking the time to submit this, but this has been a much discussed aspect of NSQ (see #510, with many links to other threads). If we ever decide to actually invest in meaningfully changing this aspect of NSQ, I don't think that a retry mechanism is the way to go.

@mreiferson mreiferson closed this Dec 26, 2023