Do conditional locking of pgsk->lock LWLock #39

Draft · wants to merge 2 commits into master

Conversation

@x4m commented Apr 21, 2023

This lock might be the source of the lock contention.

A way to reproduce the problem is described here.

@NikolayS asked me to create a prototype for benchmarking. I hope he will provide some feedback on testing this.
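
For context, the idea is to use PostgreSQL's LWLockConditionalAcquire() on the path that creates a new hash-table entry: if the lock is busy, the sample is simply dropped instead of the backend queueing behind the lock. A minimal sketch of that pattern (the struct layout and the pgsk_entry_alloc() helper are illustrative placeholders, not the actual diff):

```c
#include "postgres.h"
#include "storage/lwlock.h"

/* Illustrative shared state; the real pg_stat_kcache layout differs. */
typedef struct pgskSharedState
{
	LWLock	   *lock;			/* protects the shared hash table */
} pgskSharedState;

extern pgskSharedState *pgsk;
extern void pgsk_entry_alloc(const void *key);	/* hypothetical insert helper */

static void
pgsk_store_new_entry(const void *key)
{
	/*
	 * Creating a new entry requires the lock in exclusive mode.  With a
	 * conditional acquire we skip this sample when another backend (e.g.
	 * a monitoring query holding the lock in shared mode) is in the way,
	 * instead of stalling the query path until the lock frees up.
	 */
	if (!LWLockConditionalAcquire(pgsk->lock, LW_EXCLUSIVE))
		return;					/* lock busy: silently drop this sample */

	pgsk_entry_alloc(key);

	LWLockRelease(pgsk->lock);
}
```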

Andrey M. Borodin and others added 2 commits April 21, 2023 16:57
This lock might be the source of the lock contention.
@rjuju self-requested a review on Apr 21, 2023
@rjuju self-assigned this on Apr 21, 2023
@rjuju (Member) commented Apr 21, 2023

Hi,

I still think that this would be better solved by allowing extensions to store custom data in the new stats-in-dynamic-shared-memory infrastructure, rather than trying to add workarounds that silently ignore some activity in each and every extension.

That's especially true for pg_stat_kcache: this extension requires pg_stat_statements to be loaded and inherits the same number of entries, so even if this patch somehow helps, you would still get a similar eviction pattern and therefore a similar need to take the exclusive lwlock on the pg_stat_statements side.

More to the point: if your monitoring system is so heavy that you regularly need to ignore many metrics, what value does this monitoring system have? You will only see the data when everything runs smoothly and ignore data during the incident, leading to totally unreliable metrics.

@NikolayS commented Apr 21, 2023

@rjuju thank you for your very quick response!

...You will only see the data when everything runs smoothly and ignore data during the incident, leading to totally unreliable metrics.

"Totally" here seems to be a very strong word.

A new entry is not recorded only if someone (monitoring or human) is concurrently reading the stats, which should be rare (but when it happens, it causes bad latency spikes for dozens, sometimes hundreds of backends – I'm talking about heavily loaded cases, 10-50k TPS).

But other entries continue receiving updates – their metrics keep increasing, since updating an existing entry doesn't require an exclusive lock.

Currently, for both pgss and pgsk, we already cannot claim that we have fully reliable stats – we are limited by pg_stat_statements.max, whose default of 5000 is far from enough to cover all normalized queries in some systems, especially if we have a lot of different DBs and apps in a PG cluster, and/or track_utility is on and some Java app floods our stats table by emitting set application_name to $session_id...; and so on. Some entries are evicted and recorded again, back and forth, all the time. In other words, we currently cannot guarantee totally reliable metrics.

For example, if we consider sum(some metric), we should always keep in mind that it covers only part of the whole.

I would prefer the "recording" moment to be less invasive. These entries are already "in the tail" of my stats, which get truncated anyway (sometimes even to a lower number – say, if you use Prometheus, you are probably truncating pgss / pgsk to as few as the top 100–500 entries). So if entries that were evicted and want to jump back in slow me down, I'd prefer them to try again a bit later.

Maybe we could have a new GUC for this, allowing the admin to decide what to prefer (perhaps without changing the default behavior)? Personally, I would definitely turn this approach on in the most loaded systems.
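
For illustration, such a switch could be declared as a custom GUC in the extension's _PG_init(); the sketch below assumes a hypothetical pg_stat_kcache.conditional_locking setting (the name and default are made up, not part of this patch):

```c
#include "postgres.h"
#include "utils/guc.h"

/* Hypothetical toggle: drop new entries instead of waiting for the lock. */
static bool pgsk_conditional_locking = false;

static void
pgsk_define_gucs(void)
{
	DefineCustomBoolVariable("pg_stat_kcache.conditional_locking",
							 "Skip recording new entries when the shared "
							 "hash table lock is contended.",
							 NULL,
							 &pgsk_conditional_locking,
							 false,		/* keep the current behavior by default */
							 PGC_SIGHUP,
							 0,
							 NULL, NULL, NULL);
}
```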

@rjuju (Member) commented Apr 22, 2023

A new entry is not recorded only if someone (monitoring or human) is concurrently reading the stats, which should be rare

No, it could also happen if any other backend is holding the lwlock in any mode for long enough. So any other backend concurrently updating an existing entry, adding a new entry or trying to evict some will lead to new entries not being recorded.

Currently, for both pgss and pgsk, we already cannot claim that we have fully reliable stats – we are limited by pg_stat_statements.max

Yes, it's known that there's a limit on the total number of entries that can be recorded. But at least you have the guarantee that the current entries are the most recent ones, which should also be the most interesting ones in case of a performance issue. With this patch that guarantee disappears and you can't even rely on seeing the activity during the performance problem.

Now if you take a snapshot of the entries, say, every 5 minutes and during that time you generate more than pg_stat_statements.max new entries, then you clearly have a problem, and you should either increase the number of entries, change your app, or switch to pgbadger or similar.

which get truncated anyway (sometimes even to a lower number – say, if you use Prometheus, you are probably truncating pgss / pgsk to as few as the top 100–500 entries

If you mean calling pg_stat_statements_reset() or pg_stat_kcache_reset(), that sounds like a really bad idea. Those functions are expensive, especially for pg_stat_statements due to the query text handling.

Maybe we could have a new GUC for this, allowing the admin to decide what to prefer (perhaps without changing the default behavior)? Personally, I would definitely turn this approach on in the most loaded systems.

How will that help since pg_stat_statements will still have the same bottleneck?

Again, the new pgstat infrastructure is designed to address such problems, and that would give a way to fix all similar extensions.

@vitabaks commented
I take it this commit refers to this issue.

@rjuju (Member) commented Jan 19, 2024

@vitabaks I thought about that too earlier, but looking back at the linked thread, it's a different issue. The underlying issue was mostly an escalation of lock conflicts due to querying a quite big pg_stat_statements view very often, until at some point an exclusive lwlock is needed (eviction, new entry, reset...) and then all hell breaks loose.
