Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
From | Masahiko Sawada |
---|---|
Subject | Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns |
Date | |
Msg-id | CAD21AoC=jJEaNx5ersq_nxUQtuE2KcH_Rip0rNOWnvoYwVdSOg@mail.gmail.com |
In reply to | Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
List | pgsql-hackers |
On Tue, Jul 19, 2022 at 4:35 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

Thank you for the comments!

>
> At Tue, 19 Jul 2022 10:17:15 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
> > Good work. I wonder without comments this may create a problem in the
> > future. OTOH, I don't see adding a check "catchange.xcnt > 0" before
> > freeing the memory any less robust. Also, for consistency, we can use
> > a similar check based on xcnt in the SnapBuildRestore to free the
> > memory in the below code:
> > + /* set catalog modifying transactions */
> > + if (builder->catchange.xip)
> > + pfree(builder->catchange.xip);
>
> But xip must be positive there. We can add a comment explains that.
>

Yes, if we add a comment for it, we would probably need to explain a
gcc optimization, but that seems to be too much to me.

>
> + * Array of transactions and subtransactions that had modified catalogs
> + * and were running when the snapshot was serialized.
> + *
> + * We normally rely on HEAP2_NEW_CID and XLOG_XACT_INVALIDATIONS records to
> + * know if the transaction has changed the catalog. But it could happen that
> + * the logical decoding decodes only the commit record of the transaction.
> + * This array keeps track of the transactions that have modified catalogs
>
> (Might be only me, but) "track" makes me think that xids are added and
> removed by activities. On the other hand the array just remembers
> catalog-modifying xids in the last life until the all xids in the list
> gone.
>
> + * and were running when serializing a snapshot, and this array is used to
> + * add such transactions to the snapshot.
> + *
> + * This array is set once when restoring the snapshot, xids are removed
>
> (So I want to add "only" between "are removed").
>
> + * from the array when decoding xl_running_xacts record, and then eventually
> + * becomes empty.

Agreed. Will fix.
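For readers following along, here is a minimal sketch of the lifecycle that comment describes: the array is filled once, entries are only ever removed, and the storage is released based on the count rather than on the pointer. This is not the patch's code. `XidArray`, `xid_precedes` and `xid_array_purge` are hypothetical names, plain `malloc`/`free` stand in for PostgreSQL's memory contexts, and the array is assumed to be sorted oldest-first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint32_t xid_t;             /* stand-in for TransactionId */

/* Hypothetical container; the patch keeps this in builder->catchange. */
typedef struct XidArray
{
    size_t  xcnt;                   /* number of valid entries in xip */
    xid_t  *xip;                    /* malloc'd, oldest first; NULL when empty */
} XidArray;

/* "a is older than b" in modulo-2^32 arithmetic (special xids ignored). */
static bool
xid_precedes(xid_t a, xid_t b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Remove every xid older than oldest_running and return how many were
 * removed.  Entries are never added here, so repeated calls can only
 * shrink the array until it is empty.
 */
static size_t
xid_array_purge(XidArray *arr, xid_t oldest_running)
{
    size_t  off = 0;

    /* An empty array must not carry a stale pointer, and vice versa. */
    assert((arr->xcnt == 0) == (arr->xip == NULL));

    /* Count the leading entries that are no longer running. */
    while (off < arr->xcnt && xid_precedes(arr->xip[off], oldest_running))
        off++;

    if (off == 0)
        return 0;

    arr->xcnt -= off;
    if (arr->xcnt > 0)
    {
        /* Slide the surviving entries to the front. */
        memmove(arr->xip, arr->xip + off, arr->xcnt * sizeof(xid_t));
    }
    else
    {
        /* Last entry gone: release the storage and clear the pointer. */
        free(arr->xip);
        arr->xip = NULL;
    }
    return off;
}
```

With an array holding 100, 105 and 110, purging with an oldest running xid of 106 leaves only 110; a later purge with 200 empties the array and releases it.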
>
>
> + catchange_xip = ReorderBufferGetCatalogChangesXacts(builder->reorder);
>
> catchange_xip is allocated in the current context, but ondisk is
> allocated in builder->context. I see it kind of inconsistent (even if
> the current context is same with build->context).

Right. I thought that since the lifetime of catchange_xip is short,
until the end of SnapBuildSerialize() function we didn't need to
allocate it in builder->context. But given ondisk, we need to do that
for catchange_xip as well. Will fix it.

>
>
> + if (builder->committed.xcnt > 0)
> + {
>
> It seems to me comitted.xip is always non-null, so we don't need this.
> I don't strongly object to do that, though.

But committed.xcnt could be 0, right? We don't need to copy anything
by calling memcpy with size = 0 in this case. Also, it looks more
consistent with what we do for catchange_xcnt.

>
> - * Remove TXN from its containing list.
> + * Remove TXN from its containing lists.
>
> The comment body only describes abut txn->nodes. I think we need to
> add that for catchange_node.

Will add.

>
>
> + Assert((xcnt > 0) && (xcnt == rb->catchange_ntxns));
>
> (xcnt > 0) is obvious here (otherwise means dlist_foreach is broken..).
> (xcnt == rb->catchange_ntxns) is not what should be checked here. The
> assert just requires that catchange_txns and catchange_ntxns are
> consistent so it should be checked just after dlist_empty.. I think.
>

If we want to check if catchange_txns and catchange_ntxns are
consistent, should we check (xcnt == rb->catchange_ntxns) as well, no?
This function requires the caller to use rb->catchange_ntxns as the
length of the returned array. I think this assertion ensures that the
actual length of the array is consistent with the length we
pre-calculated.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
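To make the two positions on the assertion concrete, the following is a rough sketch of a function that copies xids out of a list whose length is tracked in a separate counter. It shows both candidate checks side by side and takes no side on which one the patch should keep. Everything here is an assumption for illustration: `TxnNode`, `TxnList` and `txn_list_get_xids` are hypothetical names, a singly linked list stands in for the dlist, and `malloc` stands in for palloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t xid_t;             /* stand-in for TransactionId */

typedef struct TxnNode
{
    xid_t           xid;
    struct TxnNode *next;
} TxnNode;

/* Hypothetical list that keeps its own element count up to date. */
typedef struct TxnList
{
    TxnNode *head;
    size_t   ntxns;
} TxnList;

/*
 * Copy the xids of all listed transactions into a newly allocated array.
 * The caller uses list->ntxns as the array length; NULL means "empty".
 */
static xid_t *
txn_list_get_xids(const TxnList *list)
{
    xid_t  *xids;
    size_t  xcnt = 0;

    if (list->head == NULL)
    {
        /* Check placed right after the emptiness test: list and counter agree. */
        assert(list->ntxns == 0);
        return NULL;
    }

    xids = malloc(list->ntxns * sizeof(xid_t));
    if (xids == NULL)
        abort();

    for (const TxnNode *node = list->head; node != NULL; node = node->next)
    {
        assert(xcnt < list->ntxns);     /* never write past the allocation */
        xids[xcnt++] = node->xid;
    }

    /* Check placed after the loop: entries written match the advertised length. */
    assert(xcnt == list->ntxns);

    return xids;
}
```

The check after the loop is the one that protects a caller who trusts the counter as the array length, which is the argument made in the reply above.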