High rate of transaction failure with the Serializable Isolation Level

Поиск
Список
Период
Сортировка
От Reza Taheri
Тема High rate of transaction failure with the Serializable Isolation Level
Дата
Msg-id ded604822b28427a80b0751a973e4258@EX13-MBX-013.vmware.com
обсуждение исходный текст
Ответы Re: High rate of transaction failure with the Serializable Isolation Level  (Craig Ringer <craig@2ndquadrant.com>)
Re: High rate of transaction failure with the Serializable Isolation Level  (Craig Ringer <craig@2ndquadrant.com>)
Re: High rate of transaction failure with the Serializable Isolation Level  (Kevin Grittner <kgrittn@ymail.com>)
Список pgsql-performance

Hello PGSQL performance community,

[By way of introduction, we are a TPC subcommittee that is developing a benchmark with cloud-like characteristics for virtualized databases. The end-to-end benchmarking kit will be publicly available, and will run on PGSQL]

 

I am running into very high failure rates when I run with the Serializable Isolation Level. I have simplified our configuration to a single database with a constant workload, a TPC-E workload if you will, to focus on this this problem. We are running with PGSQL 9.2.4, ODBC 2.2.14 (as well as 2.3.3pre, which didn’t help), RHEL 6.4, and a 6-way VM with 96GB of memory on a 4-socket Westmere server.

 

With our 9 transactions running with a mix of SQL_TXN_READ_COMMITTED and SQL_TXN_REPEATABLE_READ, we get less than 1% deadlocks, all of which occur because each row in one table, BROKER, may be read or written by multiple transactions at the same time. So, there are legitimate conflicts, which we deal with using an exponential backoff algorithm that sleeps for 10ms/30ms/90ms/etc.

 

When we raise the Trade-Result transaction to SQL_TXN_SERIALIZABLE, we face a storm of conflicts. Out of 37,342 Trade-Result transactions, 15,707 hit an error, and have to be rolled back and retired one or more times. The total failure count (due to many transactions failing more than once) is 31,388.

 

What is unusual is that the majority of the failures occur in a statement that should not have any isolation conflicts. About 17K of failures are from the statement below:

2014-07-23 11:27:15 PDT 26085 ERROR:  could not serialize access due to read/write dependencies among transactions

2014-07-23 11:27:15 PDT 26085 DETAIL:  Reason code: Canceled on identification as a pivot, during write.

2014-07-23 11:27:15 PDT 26085 HINT:  The transaction might succeed if retried.

2014-07-23 11:27:15 PDT 26085 CONTEXT:  SQL statement "update   TRADE

                set     T_COMM = comm_amount,

                        T_DTS = trade_dts,

                        T_ST_ID = st_completed_id,

                        T_TRADE_PRICE = trade_price

                where   T_ID = trade_id"

        PL/pgSQL function traderesultframe5(ident_t,value_t,character,timestamp without time zone,trade_t,s_price_t) line 15 at SQL statement

 

This doesn’t make sense since at any given time, only one transaction might possibly be accessing the row that is being updated. There should be no conflicts if we have row-level locking/isolation


The second most common conflict happens 7.6K times in the statement below:

2014-07-23 11:27:23 PDT 26039 ERROR:  could not serialize access due to read/write dependencies among transactions

2014-07-23 11:27:23 PDT 26039 DETAIL:  Reason code: Canceled on identification as a pivot, during conflict in checking.

2014-07-23 11:27:23 PDT 26039 HINT:  The transaction might succeed if retried.

2014-07-23 11:27:23 PDT 26039 CONTEXT:  SQL statement "insert

                into    SETTLEMENT (    SE_T_ID,

                                        SE_CASH_TYPE,

                                        SE_CASH_DUE_DATE,

                                        SE_AMT)

                values (        trade_id,

                                cash_type,

                                due_date,

                                se_amount

                        )"

        PL/pgSQL function traderesultframe6(ident_t,timestamp without time zone,character varying,value_t,timestamp without time zone,trade_t,smallint,s_qty_t,character) line 23 at SQL statement

 

I don’t understand why an insert would hit a serialization conflict

 

We also have 4.5K conflicts when we try to commit:

2014-07-23 11:27:23 PDT 26037 ERROR:  could not serialize access due to read/write dependencies among transactions

2014-07-23 11:27:23 PDT 26037 DETAIL:  Reason code: Canceled on identification as a pivot, during commit attempt.

2014-07-23 11:27:23 PDT 26037 HINT:  The transaction might succeed if retried.

2014-07-23 11:27:23 PDT 26037 STATEMENT:  COMMIT

 

 

Does PGSQL raise locks to page level when we run with SQL_TXN_SERIALIZABLE? Are there any knobs I can play with to alleviate this?  FWIW, the same transactions on MS SQL Server see almost no conflicts.

 

Thanks,
Reza

В списке pgsql-performance по дате отправления:

Предыдущее
От: Felipe Santos
Дата:
Сообщение: Re: Building multiple indexes on one table.
Следующее
От: Rural Hunter
Дата:
Сообщение: Re: Very slow planning performance on partition table