feat(localenv): span metrics generation #2849
- adds configuration that generates span metrics from tempo traces
- can see new `traces_spanmetrics_bucket` etc. in local grafana dashboard
I considered whether we wanted to add visualizations for each resolver, like stat metrics for the 25th, 50th, 95th percentile etc., or the heatmap/histogram like we have for the pay times, but opted not to. First, I feel like we will better understand what details we need as we actually consume these (as part of performance testing analysis?). Second, I think we probably mostly care about the extreme high end (i.e. 95th, 99th percentile etc.), in which case maybe we just add another bar gauge like the included one but for the 99th percentile. Open to other ideas for what visualizations we need for this, but I think this one gives us the gist of what we're looking for.
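Percentile panels like the ones discussed here are typically computed from bucketed span metrics with PromQL's `histogram_quantile`. As a rough illustration of why the bucket layout matters for the 95th/99th percentile, here is a minimal Python sketch of the linear interpolation that function performs on cumulative buckets. The bucket bounds and counts are made-up sample data, and the helper name is hypothetical; this is not code from the PR.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate a quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by upper_bound,
    with at least two entries and a final +Inf bucket, mirroring the `le`
    label of a classic Prometheus histogram.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The rank falls in the +Inf bucket: report the highest
                # finite upper bound instead of interpolating.
                return buckets[-2][0]
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return math.nan


# Hypothetical latency buckets in seconds (assumed sample data).
sample = [(0.1, 50), (0.5, 90), (1.0, 99), (math.inf, 100)]

if __name__ == "__main__":
    for q in (0.5, 0.95, 0.99):
        print(q, estimate_quantile(q, sample))
```

The estimate can only be as precise as the bucket boundaries, so high percentiles that land in a wide top bucket are coarse. That is one reason to decide which percentiles matter before tuning buckets or adding panels.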
I'm curious what exactly our plan is with the local dashboard. Are we using it for dev? Are we using it to measure only certain local metrics? It's not exactly clear to me.
"refId": "A"
}
],
"title": "Panel Title",
Let's update this title.
"uid": "PBFA97CFB590B2093"
},
"editorMode": "code",
"expr": "histogram_quantile(0.95, sum(rate(traces_spanmetrics_latency_bucket{span_name=~\"^(mutation|query).*\"}[$__rate_interval])) by (le, span_name))", |
Should this be something other than `$__rate_interval`, such as the selected interval of the dashboard? That way you can see the timings per last x minutes/seconds, etc.
> should this be something other than `$__rate_interval`, but instead the selected interval of the dashboard?
From what I can tell it does factor in the current time range. I spun up the localenv, ran some queries and saw the data in this visualization with 5m time range. I waited 5m+ and saw no data until I bumped to 15m time range.
I'm also seeing it generally recommended as a starting point for the rate arg.
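For context on the question above, Grafana documents `$__rate_interval` as `max($__interval + scrape interval, 4 * scrape interval)`, where `$__interval` is derived from the dashboard time range and the panel's resolution. The sketch below just evaluates that formula; the function name is hypothetical and the 15s scrape interval is an assumed default, not a value taken from this PR's localenv config.

```python
def rate_interval(interval_s: float, scrape_interval_s: float = 15.0) -> float:
    """Approximate Grafana's $__rate_interval, in seconds.

    interval_s: the panel's $__interval (grows as the time range widens).
    scrape_interval_s: the data source scrape interval (assumed 15s here).
    """
    return max(interval_s + scrape_interval_s, 4 * scrape_interval_s)


if __name__ == "__main__":
    # Print the resulting rate window for a few example $__interval values.
    for interval in (15, 60, 300):
        print(f"$__interval={interval}s -> rate window={rate_interval(interval)}s")
```

So the rate window does grow with the selected time range, but only through `$__interval`; it is not the full range. If the intent is "timings over the whole selected range", Grafana's `$__range` variable is the usual alternative to evaluate.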
Mainly for performance testing and debugging.
I'm mostly using it to validate the metric collection and develop visualizations. If it were applicable to the live version I would add them there after merging this (although I guess technically it wouldn't have any data until the next release). I don't think we need to maintain parity with the live version or have examples for every single metric, but it's nice to have some basic proof-of-concept visualizations for the different types of metrics (traces, histograms, counts, etc.) IMO. Thinking back to our conversation about development workflow, I think in theory it would be nice to develop locally, commit, then publish to grafana from CI. This would unify it with our general change workflow and it would be version controlled. But I'm not sure it's worth the setup tbh.
* feat(localenv): add span metric generation
  - adds configuration that generates span metrics from tempo traces
  - can see new `traces_spanmetrics_bucket` etc. in local grafana dashboard
* feat(localenv): add gql resolver metric
* chore(localenv): give panel title
Context
fixes: #2848