Re: [PATCH] Logical decoding support for sequence advances

Поиск
Список
Период
Сортировка
От Craig Ringer
Тема Re: [PATCH] Logical decoding support for sequence advances
Дата
Msg-id CAMsr+YHF6-gUhzXYZLo3gEb8ceSP7_6zbm4vrWjwSCsiDEApPA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [PATCH] Logical decoding support for sequence advances  (Petr Jelinek <petr@2ndquadrant.com>)
Список pgsql-hackers
On 11 March 2016 at 22:24, Petr Jelinek <petr@2ndquadrant.com> wrote:
On 02/03/16 08:05, Craig Ringer wrote:
On 1 March 2016 at 05:30, Petr Jelinek <petr@2ndquadrant.com
<mailto:petr@2ndquadrant.com>> wrote:
 


    I wonder if it would be acceptable to create new info flag for
    RM_SEQ_ID that would behave just like XLOG_SEQ_LOG but would be used
    only for the nontransactional updates (nextval) so that decoding
    could easily differentiate between transactional and
    non-transactional update of sequence and then just either call the
    callback immediately or add the change to reorder buffer based on
    that. The redo code could just have simple OR expression to behave
    same with both of the info flags.


That's much cleaner than trying to keep track of sequence creations and
really pretty harmless. I'll give that a go and see how it looks.

    Seems like simpler solution than building all the tracking code on
    the decoding side to me.


+1


Except this won't work for sequences that have been created in same transaction as the nextval()/setval() was called because in those cases we don't want to decode the advancement of sequence until the end of transaction and we can't map the relfilenode to sequence without going through reorder buffer in those cases either


I'll explain this a bit (for when I forget all about it and come back to it confused, or if someone else picks this up):

The issue is transactions like

BEGIN;
CREATE TABLE blah (id serial primary key, something text);
INSERT INTO blah (something) SELECT .... ;
COMMIT;

Here we create the sequence, then we advance the sequence in subsequent statements that're part of the same xact but not directly connected to the sequence creation. There's no convenient way to tell, when we see the Form_pg_sequence updates in WAL for the newly created sequence, that it's for a not-yet-committed xact so we shouldn't send the advance to the client yet.

Once the xact that created the sequence commits we have to make sure we send its latest state, not the initial state when it was created. So with the above proposal we'd still need to look up those new-info-flagged entries against a map of uncommitted sequences by relfilenode and decide whether to send it immediately or update the latest state of an uncommitted sequence in a reorder buffer. 

IOW we have to do pretty much what I described before. We can still log sequence updates with a different info flag but we need to know how to associate the record with the xact that created it, so we have to log the creating xid in the record for the initial state of a newly created sequence. At least that'd be less ugly than trying to peek at decoded catalog updates in the reorder buffer to spot new sequence creation and can be done only when wal_level = logical, but it'd mean that the two record types were different in more than just info flag.

The other wrinkle Petr refers to is that when decoding XLOG_SEQ_LOG we only have a relfilenode. We don't know the oid of the sequence, which we need to look up its name. The reorder buffer code uses RelidByRelfilenode for that, which requires a snapshot. I'm not sure what problem that poses, since we'd obviously need a snapshot set up to look up the name by oid anyway and we'd be using the most recently committed historic snapshot for both.

Anyway, this is still complicated because of the mess with sequences being both transactional and not-transactional in ways that rely on how the low level storage and WAL works.

Unfortunately I don't expect to have time to produce a new patch for 9.6.

(BTW, I'd be interested in seeing what code breaks if we introduced a compile option in src/include/pg_config_manual.h to force oid and relfilenode randomization rather than starting off with them being the same.)

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: Speed up Clog Access by increasing CLOG buffers
Следующее
От: Ioseph Kim
Дата:
Сообщение: Re: propose: detail binding error log