Re: [HACKERS] logical decoding of two-phase transactions

From Stas Kelvich
Subject Re: [HACKERS] logical decoding of two-phase transactions
Date
Msg-id 1FE466EA-7058-484D-B0DB-42CD81FA59F0@postgrespro.ru
In reply to Re: [HACKERS] logical decoding of two-phase transactions  (Craig Ringer <craig@2ndquadrant.com>)
Responses Re: [HACKERS] logical decoding of two-phase transactions
List pgsql-hackers
> On 31 Jan 2017, at 12:22, Craig Ringer <craig@2ndquadrant.com> wrote:
>
> Personally I don't think lack of access to the GID justifies blocking 2PC logical decoding. It can be added
> separately. But it'd be nice to have especially if it's cheap.

Agreed.

> On 2 Feb 2017, at 00:35, Craig Ringer <craig@2ndquadrant.com> wrote:
>
> Stas was concerned about what happens in logical decoding if we crash between PREPARE TRANSACTION and COMMIT
> PREPARED. But we'll always go back and decode the whole txn again anyway so it doesn't matter.

Not exactly. It seems that in previous discussions we were not on the same page, probably because my arguments
were unclear.

From my point of view there are no problems (or at least no new problems compared to ordinary 2PC) with preparing
transactions on slave servers with something like “#{xid}#{node_id}” instead of the GID, if the issuing node is the
coordinator of that transaction. In case of failure, restart, or crash we have the same options for deciding what to
do with uncommitted transactions.

My concern is about the situation with an external coordinator. That scenario is quite important for users of
postgres native 2PC, notably J2EE users. Suppose a user (or their framework) issues “prepare transaction 'mytxname';”
to servers with ordinary synchronous physical replication. If the master crashes and a replica is promoted, then the
user can reconnect to it and commit/abort that transaction using the same GID. It is unclear to me how to achieve the
same behaviour with logical replication of 2PC without the GID in the commit record. If we prepare with
“#{xid}#{node_id}” on acceptor nodes, then if the donor node crashes we lose the mapping between the user's GID and
our internal GID; conversely, we can prepare with the user's GID on acceptors, but then we will not know that GID on
the donor during commit decode (by the time decoding happens all memory state is already gone and we can't map our
xid back to the GID).
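
The two failure cases above can be illustrated with a small toy model. This is only a sketch and not PostgreSQL
code: the Acceptor class, the internal_gid() naming scheme, and the node name "node1" are hypothetical stand-ins
invented for illustration, and the commit record is reduced to "carries a GID or does not".

```python
from typing import Optional, Set


class Acceptor:
    """Toy stand-in for a logical-replication target node (hypothetical)."""

    def __init__(self) -> None:
        self.prepared: Set[str] = set()

    def prepare(self, gid: str) -> None:
        self.prepared.add(gid)

    def commit_prepared(self, gid: Optional[str]) -> bool:
        """Return True if a prepared transaction with this GID existed."""
        if gid in self.prepared:
            self.prepared.remove(gid)
            return True
        return False


def internal_gid(xid: int, node_id: str) -> str:
    # Hypothetical naming scheme built from xid and node id.
    return f"{xid}_{node_id}"


def failover_with_internal_gid(user_gid: str, xid: int) -> bool:
    """Case 1: acceptor prepares under an internal GID, then the donor dies."""
    acceptor = Acceptor()
    acceptor.prepare(internal_gid(xid, "node1"))
    # The external coordinator only knows the GID it chose itself.
    return acceptor.commit_prepared(user_gid)


def replay_commit(user_gid: str, gid_in_commit_record: Optional[str]) -> bool:
    """Case 2: acceptor prepares under the user's GID; donor later commits."""
    acceptor = Acceptor()
    acceptor.prepare(user_gid)
    # The decoder can only pass on what the commit record carries.
    return acceptor.commit_prepared(gid_in_commit_record)


print(failover_with_internal_gid("mytxname", 1234))  # False
print(replay_commit("mytxname", None))               # False
print(replay_commit("mytxname", "mytxname"))         # True
```

Only the last call succeeds, which corresponds to the case where the GID is available in the commit record.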

I performed some tests to understand the real impact on WAL size. I compared postgres -master with wal_level =
logical after 3M 2PC transactions against patched postgres where GIDs are stored inside the commit record too,
testing with 194-byte and 6-byte GIDs (the maximum GID size is 200 bytes).

-master, 6-byte GID after 3M transactions: pg_current_xlog_location = 0/9572CB28
-patched, 6-byte GID after 3M transactions: pg_current_xlog_location = 0/96C442E0

So with 6-byte GIDs the difference in WAL size is less than 1%.

-master, 194-byte GID after 3M transactions: pg_current_xlog_location = 0/B7501578
-patched, 194-byte GID after 3M transactions: pg_current_xlog_location = 0/D8B43E28

And with 194-byte GIDs the difference in WAL size is about 18%.

So using big GIDs (as J2EE does) can cause notable WAL bloat, while the overhead of small GIDs is almost unnoticeable.
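
The percentages can be re-derived from the reported locations. The sketch below assumes that each
pg_current_xlog_location value can be read as an absolute byte offset (high word, slash, low word, both in hex) and
that the master and patched runs started from the same WAL position, which is not stated explicitly above. The
helper names lsn_to_bytes and wal_growth are made up for this illustration.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert an LSN written as 'hi/lo' (hexadecimal) to a byte offset."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def wal_growth(master_lsn: str, patched_lsn: str) -> float:
    """Relative extra WAL written by the patched build versus master."""
    base = lsn_to_bytes(master_lsn)
    return (lsn_to_bytes(patched_lsn) - base) / base


# Locations reported for the 3M-transaction runs.
runs = {
    "6-byte GID": ("0/9572CB28", "0/96C442E0"),
    "194-byte GID": ("0/B7501578", "0/D8B43E28"),
}

for label, (master, patched) in runs.items():
    print(f"{label}: {wal_growth(master, patched):.2%}")
# 6-byte GID: 0.88%
# 194-byte GID: 18.22%
```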

Maybe we can introduce a configuration option track_commit_gid, by analogy with track_commit_timestamp, and make
that behaviour optional? Any objections to that?

--
Stas Kelvich
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




