Re: what to revert

From Ants Aasma
Subject Re: what to revert
Msg-id CA+CSw_taAWC5zqa8cjQ6GG0Ca3rTXeWXJ_jD3BTDyLbPwf6EEw@mail.gmail.com
In response to what to revert  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: what to revert  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Tue, May 3, 2016 at 9:57 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> If you tell me how to best test it, I do have a 4-socket server sitting idly
> in the corner (well, a corner reachable by SSH). I can get us some numbers,
> but I haven't been following the snapshot_too_old so I'll need some guidance
> on what to test.

I worry about two contention points with the current implementation.

The main one is the locking within MaintainOldSnapshotTimeMapping()
that gets called every time a snapshot is taken. AFAICS this should
show up by setting old_snapshot_threshold to any positive value and
then running a simple read-only pgbench, at a scale factor that fits
within shared buffers, at high concurrency (number of CPUs or a small
multiple). On a single-socket system this does not show up.
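
A minimal sketch of how that first test might be driven. The database
name, scale factor and duration are assumptions for illustration, not
values from any measurement; it also assumes old_snapshot_threshold has
already been set to a positive value in postgresql.conf and the server
restarted. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch only: database name, scale factor and duration are assumptions.
# Prerequisite (not done here): old_snapshot_threshold set to a positive
# value in postgresql.conf, then a server restart.
DB=${DB:-bench}
SCALE=${SCALE:-100}        # pick one whose data fits in shared_buffers
CLIENTS=${CLIENTS:-$(getconf _NPROCESSORS_ONLN)}   # CPU count, or a small multiple
DURATION=${DURATION:-60}   # seconds
RUN=${RUN-echo}            # dry run by default; invoke with RUN= to execute

run_readonly_test() {
    # Initialize the standard pgbench tables at the chosen scale factor.
    $RUN pgbench -i -s "$SCALE" "$DB"
    # Built-in select-only workload with one worker thread per client.
    $RUN pgbench -S -M prepared -c "$CLIENTS" -j "$CLIENTS" \
        -T "$DURATION" "$DB"
}

run_readonly_test
```

Comparing the reported tps with old_snapshot_threshold disabled and
enabled, on the same machine, should show whether the locking matters.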

The second one is probably a bit harder to hit:
GetOldSnapshotThresholdTimestamp() has a spinlock that gets hit
every time a scan sees a page that has been modified after the snapshot
was taken. A workload that would tickle this is something that uses a
repeatable read snapshot, builds a non-temporary table and runs
reporting on it. Something like this would work:

BEGIN ISOLATION LEVEL REPEATABLE READ;
DROP TABLE IF EXISTS test_:client_id;
CREATE TABLE test_:client_id (x int, filler text);
INSERT INTO test_:client_id  SELECT x, repeat(' ', 1000) AS filler
FROM generate_series(1,1000) x;
SELECT (SELECT COUNT(*) FROM test_:client_id WHERE x != y) FROM
generate_series(1,1000) y;
COMMIT;
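
One way the script above could be fed to pgbench, as a sketch. The file
name snapshot_test.sql, the database name and the duration are
assumptions; :client_id is pgbench's built-in per-client variable, so
each client works on its own table. By default it only prints the
command.

```shell
#!/bin/sh
# Sketch only: file name, database name and duration are assumptions.
# Assumes old_snapshot_threshold is already set to a positive value.
DB=${DB:-bench}
SCRIPT=${SCRIPT:-snapshot_test.sql}   # the SQL script above, saved to a file
CLIENTS=${CLIENTS:-4}                 # -c4, matching the run reported in this message
DURATION=${DURATION:-60}              # seconds
RUN=${RUN-echo}                       # dry run by default; invoke with RUN= to execute

run_snapshot_test() {
    # -n skips vacuuming the standard pgbench tables, which this script
    # does not use; -f runs the custom script instead of the built-in one.
    $RUN pgbench -n -f "$SCRIPT" -c "$CLIENTS" -j "$CLIENTS" \
        -T "$DURATION" "$DB"
}

run_snapshot_test
```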

With this script running with -c4 on a 4-core workstation, I'm seeing
the following kind of contention and a >2x loss in throughput:

+   14.77%  postgres  postgres           [.] GetOldSnapshotThresholdTimestamp
-    8.01%  postgres  postgres           [.] s_lock
   - s_lock
      + 88.15% GetOldSnapshotThresholdTimestamp
      + 10.47% TransactionIdLimitedForOldSnapshots
      + 0.71% TestForOldSnapshot_impl
      + 0.57% GetSnapshotCurrentTimestamp
 

Now this is kind of an extreme example, but I'm willing to bet that on
multi-socket hosts similar issues can crop up with common real-world
use cases.

Regards,
Ants Aasma


