Re: Sample rate added to pg_stat_statements

From Ilia Evdokimov
Subject Re: Sample rate added to pg_stat_statements
Date
Msg-id 18631d46-1741-4edc-b116-8d9631cdf919@tantorlabs.com
In response to Re: Sample rate added to pg_stat_statements  (Sami Imseih <samimseih@gmail.com>)
Responses Re: Sample rate added to pg_stat_statements
List pgsql-hackers
On 28.01.2025 23:50, Ilia Evdokimov wrote:
>
>>
>>> If anyone has the capability to run this benchmark on machines with 
>>> more
>>> CPUs or with different queries, it would be nice. I’d appreciate any
>>> suggestions or feedback.
>> I wanted to share some additional benchmarks I ran as well
>> on an r8g.48xlarge (192 vCPUs, 1,536 GiB of memory) configured
>> with 16GB of shared_buffers. I also attached the benchmark.sh
>> script used to generate the output.
>> The benchmark is running the select-only pgbench workload,
>> so we have a single heavily contentious entry, which is the
>> worst case.
>>
>> The test shows that the spinlock (SpinDelay waits)
>> becomes an issue at high connection counts and will
>> become worse on larger machines. A sample_rate going from
>> 1 to .75 shows a 60% improvement; but this is on a single
>> contentious entry. Most workloads will likely not see this type
>> of improvement. I also could not really observe
>> this type of difference on smaller machines (i.e. 32 vCPUs),
>> as expected.
>>
>> ## init
>> pgbench -i -s500
>>
>> ### 192 connections
>> pgbench -c192 -j20 -S -Mprepared -T120 --progress 10
>>
>> sample_rate = 1
>> tps = 484338.769799 (without initial connection time)
>> waits
>> -----
>>    11107  SpinDelay
>>     9568  CPU
>>      929  ClientRead
>>       13  DataFileRead
>>        3  BufferMapping
>>
>> sample_rate = .75
>> tps = 909547.562124 (without initial connection time)
>> waits
>> -----
>>    12079  CPU
>>     4781  SpinDelay
>>     2100  ClientRead
>>
>> sample_rate = .5
>> tps = 1028594.555273 (without initial connection time)
>> waits
>> -----
>>    13253  CPU
>>     3378  ClientRead
>>      174  SpinDelay
>>
>> sample_rate = .25
>> tps = 1019507.126313 (without initial connection time)
>> waits
>> -----
>>    13397  CPU
>>     3423  ClientRead
>>
>> sample_rate = 0
>> tps = 1015425.288538 (without initial connection time)
>> waits
>> -----
>>    13106  CPU
>>     3502  ClientRead
>>
>> ### 32 connections
>> pgbench -c32 -j20 -S -Mprepared -T120 --progress 10
>>
>> sample_rate = 1
>> tps = 620667.049565 (without initial connection time)
>> waits
>> -----
>>     1782  CPU
>>      560  ClientRead
>>
>> sample_rate = .75
>> tps = 620663.131347 (without initial connection time)
>> waits
>> -----
>>     1736  CPU
>>      554  ClientRead
>>
>> sample_rate = .5
>> tps = 624094.688239 (without initial connection time)
>> waits
>> -----
>>     1741  CPU
>>      648  ClientRead
>>
>> sample_rate = .25
>> tps = 628638.538204 (without initial connection time)
>> waits
>> -----
>>     1702  CPU
>>      576  ClientRead
>>
>> sample_rate = 0
>> tps = 630483.464912 (without initial connection time)
>> waits
>> -----
>>     1638  CPU
>>      574  ClientRead
>>
>> Regards,
>>
>> Sami
>
>
> Thank you so much for benchmarking this on a pretty large machine with 
> a large number of CPUs. The results look fantastic, and I truly 
> appreciate your effort.
>
> BTW, I realized that the 'sampling' test needs to be added not only to 
> the Makefile but also to meson.build. I've included that in the v14 
> patch.
>
> -- 
> Best regards,
> Ilia Evdokimov,
> Tantor Labs LLC.


In my opinion, if we can't observe a spinlock bottleneck on 32 CPUs, we 
should determine the CPU count at which it becomes one. This will help us 
understand the scale of the problem. Does this make sense, or are there 
simply no real workloads where the same query runs on more than 32 CPUs, 
and we've been trying to solve a non-existent problem?
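
To make the "at which CPU count" question concrete, here is a rough sketch of how such a sweep could be evaluated. It is not part of the patch or of benchmark.sh. The helper names (pgbench_cmd, find_contention_threshold) and the 5% tolerance are my own assumptions; the pgbench flags and the two data points are taken from Sami's runs quoted above. The idea is to compare tps at sample_rate = 1 against tps at sample_rate = 0 for each client count and report the first count where the gap exceeds the tolerance.

```python
from typing import Dict, List, Optional, Tuple


def pgbench_cmd(clients: int, threads: int = 20, seconds: int = 120) -> List[str]:
    """Build the select-only pgbench invocation used in the thread.

    Hypothetical helper: only the flags come from the quoted benchmark.
    The thread count is capped at the client count for small sweeps.
    """
    return [
        "pgbench",
        f"-c{clients}",
        f"-j{min(threads, clients)}",
        "-S",
        "-Mprepared",
        f"-T{seconds}",
        "--progress",
        "10",
    ]


def find_contention_threshold(
    results: Dict[int, Tuple[float, float]],
    tolerance: float = 0.05,  # assumed cutoff, not a value from the thread
) -> Optional[int]:
    """Return the smallest client count where tracking every query costs
    more than `tolerance` of the throughput seen with sampling disabled.

    `results` maps client count -> (tps at sample_rate=1, tps at sample_rate=0).
    """
    for clients in sorted(results):
        tps_full, tps_off = results[clients]
        loss = 1.0 - tps_full / tps_off
        if loss > tolerance:
            return clients
    return None


# The two data points reported above (r8g.48xlarge, select-only workload).
measured = {
    32: (620667.049565, 630483.464912),
    192: (484338.769799, 1015425.288538),
}

print(" ".join(pgbench_cmd(32)))
print(find_contention_threshold(measured))  # prints 192
```

With only two points this just confirms what we already know (about 1.6% loss at 32 connections, about 52% at 192). Filling in intermediate client counts, e.g. 48, 64, 96 and 128, would show where the loss actually starts.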

--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.



