-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Adds Prometheus metrics to ODIS combiner daemon (Odis Combiner to k8s) #49
Conversation
|
I'd be in favor of sunsetting cloud-build and replacing by GitHub Actions. Happy to address that in a separate PR. |
No top level dependency changes detected. Learn more about Socket for GitHub ↗︎ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reminder to add a changeset
This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thanks!! Before I merge the odisCombinerK8s branch I'll go through one last time and ensure we aren't double counting errors anywhere
if (!res.headersSent) { | ||
if (err instanceof OdisError) { | ||
sendFailure(err.code, err.status, res, req.url) | ||
} else { | ||
Counters.errors.labels(req.url).inc() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is double counting I believe
config.phoneNumberPrivacy.fullNodeRetryDelayMs | ||
).catch((err) => { | ||
logger.error({ err, account }, 'failed to get on-chain DEK for account') | ||
Counters.errors.labels('getDEK').inc() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a preset for this?
@@ -23,10 +24,12 @@ export function disableDomain( | |||
): ResultHandler<DisableDomainRequest> { | |||
return async (request, response) => { | |||
if (!disableDomainRequestSchema(DomainSchema).is(request.body)) { | |||
Counters.warnings.labels(request.url, WarningMessage.INVALID_INPUT).inc() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this go in errorResult?
Description
Docker images:
Adds the following counters and histograms:
combiner_requests_total
combiner_responses_total
combiner_blockchain_errors_total
combiner_bls_compute_errors_total
combiner_endpoint_handler_errors_total
combiner_not_enough_sig_errors_total
combiner_sig_request_errors_total
combiner_sig_response_errors_total
combiner_sig_inconsistency_errors_total
combiner_sig_response_total
combiner_unknown_errors_total
combiner_warnings_total
combiner_endpoint_latency
combiner_full_node_latency
combiner_signer_latency
I couldn't find a way to create the event loop lag Histogram :(
Current metrics can be seen here: https://clabs.grafana.net/goto/2qF4RaWSg?orgId=1
Other changes
Describe any minor or "drive-by" changes here.
Tested
An explanation of how the changes were tested or an explanation as to why they don't need to be.
Related issues
Backwards compatibility
Brief explanation of why these changes are/are not backwards compatible.
Documentation
The set of community facing docs that have been added/modified because of this change