Re: Improving connection scalability (src/backend/storage/ipc/procarray.c)

From: Ranier Vilela
Subject: Re: Improving connection scalability (src/backend/storage/ipc/procarray.c)
Date:
Msg-id: CAEudQAocUznNq59O5ykQBTN_KB4odiJK0r5ow_4dy78FOBYg_A@mail.gmail.com
In reply to: Re: Improving connection scalability (src/backend/storage/ipc/procarray.c)  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses: Re: Improving connection scalability (src/backend/storage/ipc/procarray.c)  (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers
On Thu, May 26, 2022 at 22:30, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
> On 5/27/22 02:11, Ranier Vilela wrote:
>
> ...
>
> Here the results with -T 60:

> Might be a good idea to share your analysis / interpretation of the
> results, not just the raw data. After all, the change is being proposed
> by you, so do you think this shows the change is beneficial?

I think so, but my expectations have diminished.
I expected that performance would keep improving with more connections,
but for both patch and head this doesn't happen in these tests:
performance degrades with a greater number of connections.
Isn't GetSnapshotData() a bottleneck?
 

> Linux Ubuntu 64 bits
> shared_buffers = 128MB
>
> ./pgbench -M prepared -c $conns -j $conns -T 60 -S -n -U postgres
>
> pgbench (15beta1)
> transaction type: <builtin: select only>
> scaling factor: 1
> query mode: prepared
> number of clients: 100
> number of threads: 100
> maximum number of tries: 1
> duration: 60 s
>
> conns            tps head              tps patched
>
> 1         17126.326108            17792.414234
> 10       82068.123383            82468.334836
> 50       73808.731404            74678.839428
> 80       73290.191713            73116.553986
> 90       67558.483043            68384.906949
> 100     65960.982801            66997.793777
> 200     62216.011998            62870.243385
> 300     62924.225658            62796.157548
> 400     62278.099704            63129.555135
> 500     63257.930870            62188.825044
> 600     61479.890611            61517.913967
> 700     61139.354053            61327.898847
> 800     60833.663791            61517.913967
> 900     61305.129642            61248.336593
> 1000   60990.918719            61041.670996
>

> These results look much saner, but IMHO it also does not show any clear
> benefit of the patch. Or are you still claiming there is a benefit?

We agree that these are micro-optimizations.
However, I think they should be treated as micro-optimizations in inner loops,
because everything in procarray.c is a hot path.
The first objective, I believe, was achieved: no performance regression.
I agree the gains are small in the tests done.
But, IMHO, this is a good approach; small gains turn into big gains in the end, when applied across all the code.

Consider GetSnapshotData():
1. Most of the time the snapshot is not null, so a test like
if (snapshot == NULL) will fail most of the time.
With the patch:
if (snapshot->xip != NULL)
{
    if (GetSnapshotDataReuse(snapshot))
        return snapshot;
}

Most of the time that test is true, and GetSnapshotDataReuse() is never called for new snapshots.

count, subcount and suboverflowed do not have to be initialized for every snapshot.

2. If the snapshot is taken during recovery,
pgprocnos and ProcGlobal->subxidStates are not touched unnecessarily,
only when not suboverflowed.
All InvalidTransactionId entries, mypgxactoff, backends doing logical decoding,
and XIDs >= xmax are skipped.

3. GetSnapshotDataReuse() is called without first acquiring ProcArrayLock.
There's agreement that this would be fine, for now.
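
To make the intended control flow concrete, below is a minimal, self-contained sketch of the pattern (illustrative only: SimpleSnapshot, try_reuse_snapshot and build_snapshot are invented names, and this is not the real procarray.c code). The idea is to try the cheap reuse test before taking any lock or re-initializing anything, and to skip uninteresting entries as early as possible inside the loop.

/* Illustrative sketch only -- names are invented, this is not procarray.c. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

typedef struct SimpleSnapshot
{
    TransactionId *xip;              /* in-progress XIDs; NULL until first use */
    int            xcnt;
    uint64_t       completion_count; /* counter value seen when last built */
} SimpleSnapshot;

/* Stand-in for the shared transaction-completion counter. */
static uint64_t shared_completion_count = 42;

/* Reuse is possible when nothing committed or aborted since the last build. */
static bool
try_reuse_snapshot(const SimpleSnapshot *snap)
{
    return snap->completion_count == shared_completion_count;
}

static void
build_snapshot(SimpleSnapshot *snap, const TransactionId *xids, int nxids,
               int my_index, TransactionId xmax)
{
    /* 1. Cheap reuse test first; most calls can return here,
     *    before any lock is taken or any field is re-initialized. */
    if (snap->xip != NULL && try_reuse_snapshot(snap))
        return;

    /* (the real code would acquire ProcArrayLock around the loop below) */

    if (snap->xip == NULL)
        snap->xip = malloc(sizeof(TransactionId) * nxids);

    snap->xcnt = 0;
    for (int i = 0; i < nxids; i++)
    {
        TransactionId xid = xids[i];

        /* 2. Skip entries that can never end up in the snapshot. */
        if (xid == InvalidTransactionId)
            continue;               /* no XID assigned */
        if (i == my_index)
            continue;               /* our own entry (mypgxactoff) */
        if (xid >= xmax)
            continue;               /* only XIDs before xmax matter */
        /* (the real code also skips backends doing logical decoding) */

        snap->xip[snap->xcnt++] = xid;
    }
    snap->completion_count = shared_completion_count;
}

int
main(void)
{
    TransactionId xids[] = {0, 100, 0, 105, 250};
    SimpleSnapshot snap = {0};

    build_snapshot(&snap, xids, 5, 3, 200);   /* first call builds the snapshot */
    build_snapshot(&snap, xids, 5, 3, 200);   /* second call hits the reuse path */
    printf("xcnt = %d\n", snap.xcnt);         /* prints 1 (only xid 100 qualifies) */
    free(snap.xip);
    return 0;
}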

Consider ComputeXidHorizons():
1. ProcGlobal->statusFlags is read before taking the lock.
2. allStatusFlags[index] is not read for every one of the numProcs entries.
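
A similar minimal sketch of the same idea applied to a horizon-style loop (FakeProcGlobal and compute_horizon are invented names, and the lock is omitted here): cache the flags array pointer once up front, and read the per-entry flag byte only after the cheaper XID test has passed.

/* Illustrative sketch only -- not the real ComputeXidHorizons(). */
#include <stdint.h>
#include <stdio.h>

#define NPROCS 4

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

typedef struct
{
    uint8_t       statusFlags[NPROCS];   /* per-backend flag bytes */
    TransactionId xids[NPROCS];          /* per-backend XIDs, 0 = none */
} FakeProcGlobal;

static TransactionId
compute_horizon(const FakeProcGlobal *pg)
{
    /* read the flags array pointer once, before the per-entry work
     * (in the real code this read would happen before taking the lock) */
    const uint8_t *allStatusFlags = pg->statusFlags;
    TransactionId horizon = UINT32_MAX;

    for (int index = 0; index < NPROCS; index++)
    {
        TransactionId xid = pg->xids[index];

        /* cheap test first: entries without an XID are skipped ... */
        if (xid == InvalidTransactionId)
            continue;

        /* ... so the flags byte is only read for entries that matter */
        uint8_t flags = allStatusFlags[index];

        if (flags & 0x01)        /* e.g. an "ignore this backend" flag */
            continue;

        if (xid < horizon)
            horizon = xid;
    }
    return horizon;
}

int
main(void)
{
    FakeProcGlobal pg = {
        .statusFlags = {0, 1, 0, 0},
        .xids        = {0, 90, 120, 0},
    };

    printf("horizon = %u\n", (unsigned) compute_horizon(&pg));  /* prints "horizon = 120" */
    return 0;
}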

All changes were made with the aim of avoiding or postponing unnecessary work.


> BTW it's generally a good idea to do multiple runs and then use the
> average and/or median. Results from a single run may be quite noisy.

>
> Linux Ubuntu 64 bits
> shared_buffers = 2048MB
>
> ./pgbench -M prepared -c $conns -j $conns -S -n -U postgres
>
> pgbench (15beta1)
> transaction type: <builtin: select only>
> scaling factor: 1
> query mode: prepared
> number of clients: 100
> number of threads: 100
> maximum number of tries: 1
> number of transactions per client: 10
>
> conns          tps head                tps patched
>
> 1           2918.004085              3211.303789
> 10       12262.415696            15540.015540
> 50       13656.724571            16701.182444
> 80       14338.202348            16628.559551
> 90       16597.510373            16835.016835
> 100     17706.775793            16607.433487
> 200     16877.067441            16426.969799
> 300     16942.260775            16319.780662
> 400     16794.514911            16155.023607
> 500     16598.502151            16051.106724
> 600     16717.935001            16007.171213
> 700     16651.204834            16004.353184
> 800     16467.546583            16834.591719
> 900     16588.241149            16693.902459
> 1000   16564.985265            16936.952195
>

> I think we've agreed these results are useless.

>
> Linux Ubuntu 64 bits
> shared_buffers = 2048MB
>
> ./pgbench -M prepared -c $conns -j $conns -T 60 -S -n -U postgres
>
> pgbench (15beta1)
> transaction type: <builtin: select only>
> scaling factor: 1
> query mode: prepared
> number of clients: 100
> number of threads: 100
> maximum number of tries: 1
> duration: 60 s
>
> conns           tps head               tps patched
>
> 1         17174.265804            17792.414234
> 10       82365.634750            82468.334836
> 50       74593.714180            74678.839428
> 80       69219.756038            73116.553986
> 90       67419.574189            68384.906949
> 100     66613.771701            66997.793777
> 200     61739.784830            62870.243385
> 300     62109.691298            62796.157548
> 400     61630.822446            63129.555135
> 500     61711.019964            62755.190389
> 600     60620.010181            61517.913967
> 700     60303.317736            61688.044232
> 800     60451.113573            61076.666572
> 900     60017.327157            61256.290037
> 1000   60088.823434            60986.799312
>

> I have no idea why shared buffers 2GB would be interesting. The proposed
> change was related to procarray, not shared buffers. And scale 1 is
> ~15MB of data, so it fits into 128MB just fine.

I thought about doing this benchmark in the most common usage configuration (shared_buffers at 25% of RAM).


> Also, the first ~10 results for "patched" case match results for 128MB
> shared buffers. That seems very unlikely to happen by chance, so this
> seems rather suspicious.

Probably a copy-and-paste mistake.
I redid this test for the patched case:

Linux Ubuntu 64 bits
shared_buffers = 2048MB

./pgbench -M prepared -c $conns -j $conns -T 60 -S -n -U postgres

pgbench (15beta1)
transaction type: <builtin: select only>
scaling factor: 1
query mode: prepared
number of clients: 100
number of threads: 100
maximum number of tries: 1
duration: 60 s

conns           tps head                tps patched

1         17174.265804            17524.482668
10       82365.634750            81840.537713
50       74593.714180            74806.729434
80       69219.756038            73116.553986
90       67419.574189            69130.749209
100     66613.771701            67478.234595
200     61739.784830            63094.202413
300     62109.691298            62984.501251
400     61630.822446            63243.232816
500     61711.019964            62827.977636
600     60620.010181            62838.051693
700     60303.317736            61594.629618
800     60451.113573            61208.629058
900     60017.327157            61171.001256
1000   60088.823434            60558.067810
 
regards,
Ranier Vilela
