[HACKERS] snapbuild woes

Поиск
Список
Период
Сортировка
От Petr Jelinek
Тема [HACKERS] snapbuild woes
Дата
Msg-id f37e975c-908f-858e-707f-058d3b1eb214@2ndquadrant.com
обсуждение исходный текст
Ответы Re: [HACKERS] snapbuild woes  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
Re: [HACKERS] snapbuild woes  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
Hi,

I recently found couple of issues with the way initial logical decoding
snapshot is made.

First one is outright bug, which has to do with how we track running
transactions. What snapbuild basically does while doing initial snapshot
is read the xl_running_xacts record, store the list of running txes and
then wait until they all finish. The problem with this is that
xl_running_xacts does not ensure that it only logs transactions that are
actually still running (to avoid locking PGPROC) so there might be xids
in xl_running_xacts that already committed before it was logged. This in
turn means that the snapbuild will never make snapshot if the situation
occurs. The reason why this didn't bite us much yet is that the
snapbuild will consider all transactions finished if the empty
xl_running_xacts comes which usually happens after a while (but might
not on a consistently busy server).

The fix I came up with this is to extend the mechanism to also consider
all transactions finished when xl_running_xacts which has xmin bigger
then that previous xmax comes. This seems to work pretty well on busy
servers. This fix is attached as
0001-Mark-snapshot-consistent-when-all-running-txes-have.patch. I
believe it should be backpatched all the way to 9.4.


The other issue is performance problem again on busy servers with
initial snapshot. We track transactions for catalog modifications so
that we can do proper catalog time travel for decoding of changes. But
for transactions that were running while we started trying to get
initial consistent snapshot, there is no good way to tell if they did
catalog changes or not so we consider them all as catalog changing. We
make separate historical snapshot for every such transaction. This by
itself is fine, the problem is that current implementation also
considers all the transactions that started after we started watching
for changes but before we reached consistent state to also do catalog
changes even though there we actually do know if they did any catalog
change or not. In practice it means we make snapshots that are not
really necessary and if there was long running transaction for which the
snapshot builder has to wait for then we can create thousands of unused
snapshots which affects performance in bad ways (I've seen the initial
snapshot taking hour because of this).

The attached 0002-Skip-unnecessary-snapshot-builds.patch changes this
behavior so that we don't make snapshots for transactions that we seen
wholly and know that they didn't make catalog changes even if we didn't
reach consistency yet.

-- 
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Artur Zakirov
Дата:
Сообщение: Re: [HACKERS] [sqlsmith] Crash in tsquery_rewrite/QTNBinary
Следующее
От: Petr Jelinek
Дата:
Сообщение: Re: [HACKERS] snapbuild woes