Fix for visibility check on 14.5 fails on tpcc with high concurrency

From Dimos Stamatakis
Subject Fix for visibility check on 14.5 fails on tpcc with high concurrency
Date
Msg-id CO2PR0801MB2310579F65529380A4E5EDC0E20A9@CO2PR0801MB2310.namprd08.prod.outlook.com
Responses Re: Fix for visibility check on 14.5 fails on tpcc with high concurrency  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers

Hi hackers,

 

When running tpcc on sysbench with high concurrency (96 threads, scale factor 5), we found that a fix for the visibility check (introduced in PG 14.5) causes sysbench to fail in 1 out of 70 runs.

The error is the following:

 

SQL error, errno = 0, state = 'XX000': new multixact has more than one updating member

 

And it is caused by the following statement:

 

UPDATE warehouse1
   SET w_ytd = w_ytd + 234
 WHERE w_id = 3;

 

The commit that fixes the visibility check is the following:

https://github.com/postgres/postgres/commit/e24615a0057a9932904317576cf5c4d42349b363

 

After we reverted this commit, tpcc no longer fails, which points to this change as the cause.

Steps to reproduce:

1. Install sysbench

  https://github.com/akopytov/sysbench

2. Install percona sysbench TPCC

  https://github.com/Percona-Lab/sysbench-tpcc

3. Run percona sysbench -- prepare

  # sysbench-tpcc/tpcc.lua --pgsql-host=localhost --pgsql-port=5432 --pgsql-user={USER} --pgsql-password={PASSWORD} --pgsql-db=test_database --db-driver=pgsql --tables=1 --threads=96 --scale=5 --time=60 prepare

4. Run percona sysbench -- run

  # sysbench-tpcc/tpcc.lua --pgsql-host=localhost --pgsql-port=5432 --pgsql-user={USER} --pgsql-password={PASSWORD} --pgsql-db=test_database --db-driver=pgsql --tables=1 --report-interval=1 --rand-seed=1 --threads=96 --scale=5 --time=60 run
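Because the failure shows up in only about 1 out of 70 runs, step 4 usually has to be repeated many times. The following is a minimal sketch of a wrapper for that, not something we used verbatim: the function name run_until_match and the limit of 200 runs are assumptions for illustration, and the error text is the one quoted above.

```shell
#!/bin/sh
# Hypothetical helper: rerun a command until its output contains a literal string.
# Usage: run_until_match MAX_RUNS PATTERN COMMAND [ARGS...]
# Note: plain POSIX sh, so the variables below are global, not local.
run_until_match() {
  max_runs=$1
  pattern=$2
  shift 2
  i=1
  while [ "$i" -le "$max_runs" ]; do
    # Capture stdout and stderr; the exit status is ignored on purpose,
    # because only the printed error text is inspected.
    output=$("$@" 2>&1) || :
    case $output in
      *"$pattern"*)
        echo "pattern found on run $i"
        return 0
        ;;
    esac
    i=$((i + 1))
  done
  echo "pattern not found in $max_runs runs"
  return 1
}

# Example invocation (same options as step 4; {USER} and {PASSWORD} are placeholders):
# run_until_match 200 "new multixact has more than one updating member" \
#   sysbench-tpcc/tpcc.lua --pgsql-host=localhost --pgsql-port=5432 \
#   --pgsql-user={USER} --pgsql-password={PASSWORD} --pgsql-db=test_database \
#   --db-driver=pgsql --tables=1 --report-interval=1 --rand-seed=1 \
#   --threads=96 --scale=5 --time=60 run
```

The helper stops at the first run whose output contains the error string, so the server logs from that run can be inspected right away.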

 

We tested on a machine with 2 NUMA nodes, 16 physical cores per node, and 2 threads per core, for a total of 64 hardware threads. The total memory is 376 GB.

Attached please find the configuration file we used (postgresql.conf).

 

This commit was supposed to fix a race condition during the visibility check. Please let us know whether you are aware of this issue and if there is a quick fix.

Any input is highly appreciated.

 

Thanks,

Dimos
