Re: Postgres is not able to handle more than 4k tables!?

From: Konstantin Knizhnik
Subject: Re: Postgres is not able to handle more than 4k tables!?
Date:
Msg-id: f8cdadbc-6c6b-b4ea-6896-5f1f4dd81778@postgrespro.ru
In response to: Re: Postgres is not able to handle more than 4k tables!?  (Nikolay Samokhvalov <samokhvalov@gmail.com>)
List: pgsql-hackers

On 09.07.2020 19:19, Nikolay Samokhvalov wrote:
> Hi Konstantin, a silly question: do you consider the workload you have 
> as well-optimized? Can it be optimized further? Reading this thread I 
> have a strong feeling that a very basic set of regular optimization 
> actions is missing here (or not explained): query analysis and 
> optimization based on pg_stat_statements (and, maybe pg_stat_kcache), 
> some method to analyze the state of the server in general, resource 
> consumption, etc.
>
> Do you have some monitoring that covers pg_stat_statements?
>
> Before looking under the hood, I would use multiple pg_stat_statements 
> snapshots (can be analyzed using, say, postgres-checkup or pgCenter) 
> to understand the workload and identify the heaviest queries -- first 
> of all, in terms of total_time, calls, shared buffers reads/hits, 
> temporary files generation. Which query groups are Top-N in each 
> category, have you looked at it?
>
> You mentioned some crazy numbers for the planning time, but why not 
> analyze the picture holistically and look at the overall numbers? The 
> queries with increased planning time -- what share of total_time do 
> they account for in the overall picture, in %? (Unfortunately, we cannot 
> see Top-N by planning time in pg_stat_statements until PG13, but that 
> doesn't mean we cannot get a good understanding of the overall 
> picture today, it just requires more work.)
>
> If workload analysis & optimization was done holistically already, or 
> not possible due to some reason — pardon me. But if not and if your 
> primary goal is to improve this particular setup ASAP, then the topic 
> could be started in the -performance mailing list first, discussing 
> the workload and its aspects, and only after it's done, raised in 
> -hackers. No?

Certainly, both we and the customer have done workload analysis & 
optimization. It is not a problem of particular queries, bad plans, 
resource exhaustion, and so on.
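
For example, top-N snapshots of the kind suggested above come down to 
queries like this one (a minimal sketch; on PG12 and earlier the column 
is total_time, from PG13 it is total_exec_time):

    -- Heaviest query groups by cumulative execution time,
    -- with buffer and temp-file usage for context.
    SELECT queryid,
           calls,
           total_time,
           shared_blks_hit,
           shared_blks_read,
           temp_blks_written,
           left(query, 60) AS query_head
    FROM pg_stat_statements
    ORDER BY total_time DESC
    LIMIT 20;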

Unfortunately, there are many scenarios in which Postgres does not 
degrade gradually with increasing workload, but shows a "snow avalanche" 
where negative feedback causes very fast paralysis of the system.

This case is just one of these scenarios. It is hard to say for sure what 
triggers the avalanche... A long-lived transaction, a huge number of tables,
aggressive autovacuum settings... But there is a cascade of negative 
events which causes a system that normally functions for months to stop 
working at all.
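
The first of these triggers is easy to check from pg_stat_activity; a 
minimal sketch, not specific to this setup:

    -- Sessions holding the oldest open transactions.
    SELECT pid,
           now() - xact_start AS xact_age,
           state,
           left(query, 60) AS current_query
    FROM pg_stat_activity
    WHERE xact_start IS NOT NULL
    ORDER BY xact_start
    LIMIT 10;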

In this particular case we have the following chain:

- a long-lived transaction causes autovacuum to send a lot of invalidation 
messages;
- these messages overflow the invalidation message queues, forcing 
backends to invalidate their caches and reload entries from the catalog;
- the too-small fast-path lock cache causes many concurrent accesses 
to the shared lock hash;
- contention for the LWLocks, caused by the small number of lock 
partitions, causes starvation (its wait pattern can be seen with the 
snapshot query below).
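
When the avalanche develops, it shows up as a pile-up of active backends 
waiting on the lock-manager LWLock. A minimal snapshot sketch (the 
wait_event is reported as lock_manager up to PG12 and LockManager from 
PG13):

    -- What active backends are currently waiting on.
    SELECT wait_event_type, wait_event, count(*)
    FROM pg_stat_activity
    WHERE state = 'active'
    GROUP BY wait_event_type, wait_event
    ORDER BY count(*) DESC;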



