Do conditional locking of pgsk->lock LWLock #39
This lock might be the source of the lock contention.
Hi, I still think that this would be better solved by allowing extensions to store custom data in the new stats in dynshm infrastructure, rather than trying to add workarounds that silently ignore some activity in each and every extension. That's especially true for pg_stat_kcache: this extension requires pg_stat_statements to be loaded and inherits the same number of entries, so even if this patch somehow helps, you should still get a similar eviction pattern and therefore a similar need for an exclusive lwlock on the pg_stat_statements side. More to the point: if your monitoring system is so heavy that you regularly need to ignore many metrics, what value does this monitoring system have? You will only see the data when everything runs smoothly and ignore data during the incident, leading to totally unreliable metrics.
@rjuju thank you for your very quick response!
"Totally" here seems to be a very strong word. A new entry is not recorded only if someone (monitoring or human) is concurrently reading the stats, which should be rare (but when it happens, it causes bad latency spikes for dozens, sometimes hundreds of backends – I'm talking about heavily loaded cases, 10-50k TPS). But other entries continue receiving updates – metrics are increasing since it doesn't require exclusive locks. Currently, or both pgss and pgsk we already cannot claim that we have fully reliable stats – we are limited by For example, if we consider I would prefer the "recording" moment to be less invasive. These entries are already "in the tail" of my stats, which are truncated anyway (sometimes even to lower number – say, if you have Prometheus, you probably truncating pgss / pgsk to as few as Top-100-500 entries). So if those who are evicted and willing to jump back to my car slow me down, I'd prefer them to try again a bit later. Maybe we could have a new GUC for this, allowing admin to decide what to prefer (perhaps, not changing the default behavior)? I, personally, would definitely turn this approach on in most loaded systems. |
No, it could also happen if any other backend is holding the lwlock in any mode for long enough. So any other backend concurrently updating an existing entry, adding a new entry or trying to evict some will lead to new entries not being recorded.
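The point about lock modes can be illustrated with a toy shared/exclusive lock in Python. This is a simplified model, loosely following the semantics of PostgreSQL's `LWLockConditionalAcquire` in exclusive mode, and not its implementation; the class `SharedExclusiveLock` and its method names are hypothetical.

```python
import threading


class SharedExclusiveLock:
    """Toy shared/exclusive lock with non-blocking acquire only (hypothetical)."""

    def __init__(self):
        self._mutex = threading.Lock()  # protects the two fields below
        self._shared_holders = 0
        self._exclusive_held = False

    def conditional_acquire(self, mode):
        """Try to take the lock in 'shared' or 'exclusive' mode without waiting."""
        with self._mutex:
            if mode == "shared":
                if self._exclusive_held:
                    return False
                self._shared_holders += 1
                return True
            if mode == "exclusive":
                # Fails if anyone holds the lock, in either mode.
                if self._exclusive_held or self._shared_holders > 0:
                    return False
                self._exclusive_held = True
                return True
            raise ValueError(f"unknown mode: {mode}")

    def release(self, mode):
        with self._mutex:
            if mode == "shared":
                self._shared_holders -= 1
            else:
                self._exclusive_held = False


lock = SharedExclusiveLock()

# A backend updating an existing entry holds the lock in shared mode.
lock.conditional_acquire("shared")
blocked_by_reader = not lock.conditional_acquire("exclusive")
lock.release("shared")

# A backend adding or evicting an entry holds it in exclusive mode.
lock.conditional_acquire("exclusive")
blocked_by_writer = not lock.conditional_acquire("exclusive")
lock.release("exclusive")

# Only a completely free lock can be taken exclusively without waiting.
free_lock_ok = lock.conditional_acquire("exclusive")
```

In this model a conditional exclusive acquire fails whenever any holder exists, which is why a new entry could be dropped even when nobody is reading the stats view.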
Yes, it's known that there's a limit on the total number of entries that can be recorded. But at least you have the guarantee that the current entries are the most recent ones, which should also be the most interesting ones in case of a performance issue. With this patch that guarantee disappears, and you can't even rely on seeing the activity during the performance problem. Now, if you take a snapshot of the entries, say, every 5 minutes, and during that time you generate more than pg_stat_statements.max new entries, then you clearly have a problem, and you should either increase the number of entries, change your app, or switch to pgbadger or similar.
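The snapshot argument is simple arithmetic, sketched here in Python. The function name and the rates are made up for illustration; 5000 is used as an example value for pg_stat_statements.max.

```python
def snapshot_can_miss_entries(new_entries_per_second, snapshot_interval_seconds, max_entries):
    """True if more new entries appear between two snapshots than the table can hold."""
    new_entries = new_entries_per_second * snapshot_interval_seconds
    return new_entries > max_entries


# 5-minute snapshots with a table of 5000 entries (illustrative numbers).
quiet = snapshot_can_miss_entries(10, 300, 5000)   # 3000 new entries per interval
churny = snapshot_can_miss_entries(20, 300, 5000)  # 6000 new entries per interval
```

In the second case some entries are created and evicted between two snapshots, so the snapshots never see them, with or without conditional locking.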
If you mean calling pg_stat_statements_reset() or pg_stat_kcache_reset(), that sounds like a really bad idea. Those functions are expensive, especially for pg_stat_statements due to the query text handling.
How will that help, since pg_stat_statements will still have the same bottleneck? Again, the new pgstat infrastructure is designed to address such problems, and that would give a way to fix all similar extensions.
I take it this commit refers to this issue.
@vitabaks I thought about it too earlier, but looking back at the pointed thread, that's a different issue. The underlying issue was mostly an escalation of lock conflicts due to querying a quite big pg_stat_statements view very often, until at one point an exclusive lwlock is needed (eviction, new entry, reset...) and then all hell breaks loose.
This lock might be the source of the lock contention.
A way to reproduce the problem is described here.
@NikolayS asked me to create a prototype for benchmarking. I hope he will provide some feedback on testing this.