Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

Поиск
Список
Период
Сортировка
От Kyotaro Horiguchi
Тема Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Дата
Msg-id 20220719.163544.1139566326103773644.horikyota.ntt@gmail.com
обсуждение исходный текст
Ответ на Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns  (Masahiko Sawada <sawada.mshk@gmail.com>)
Список pgsql-hackers
At Tue, 19 Jul 2022 10:17:15 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in 
> Good work. I wonder without comments this may create a problem in the
> future. OTOH, I don't see adding a check "catchange.xcnt > 0" before
> freeing the memory any less robust. Also, for consistency, we can use
> a similar check based on xcnt in the SnapBuildRestore to free the
> memory in the below code:
> + /* set catalog modifying transactions */
> + if (builder->catchange.xip)
> + pfree(builder->catchange.xip);

But xip must be positive there.  We can add a comment explains that.


+     * Array of transactions and subtransactions that had modified catalogs
+     * and were running when the snapshot was serialized.
+     *
+     * We normally rely on HEAP2_NEW_CID and XLOG_XACT_INVALIDATIONS records to
+     * know if the transaction has changed the catalog. But it could happen that
+     * the logical decoding decodes only the commit record of the transaction.
+     * This array keeps track of the transactions that have modified catalogs

(Might be only me, but) "track" makes me think that xids are added and
removed by activities. On the other hand the array just remembers
catalog-modifying xids in the last life until the all xids in the list
gone.

+     * and were running when serializing a snapshot, and this array is used to
+     * add such transactions to the snapshot.
+     *
+     * This array is set once when restoring the snapshot, xids are removed

(So I want to add "only" between "are removed").

+     * from the array when decoding xl_running_xacts record, and then eventually
+     * becomes empty.


+    catchange_xip = ReorderBufferGetCatalogChangesXacts(builder->reorder);

catchange_xip is allocated in the current context, but ondisk is
allocated in builder->context.  I see it kind of inconsistent (even if
the current context is same with build->context).


+    if (builder->committed.xcnt > 0)
+    {

It seems to me comitted.xip is always non-null, so we don't need this.
I don't strongly object to do that, though.

-     * Remove TXN from its containing list.
+     * Remove TXN from its containing lists.

The comment body only describes abut txn->nodes. I think we need to
add that for catchange_node.


+    Assert((xcnt > 0) && (xcnt == rb->catchange_ntxns));

(xcnt > 0) is obvious here (otherwise means dlist_foreach is broken..).
(xcnt == rb->catchange_ntxns) is not what should be checked here. The
assert just requires that catchange_txns and catchange_ntxns are
consistent so it should be checked just after dlist_empty.. I think.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



В списке pgsql-hackers по дате отправления:

Предыдущее
От: John Naylor
Дата:
Сообщение: Re: NAMEDATALEN increase because of non-latin languages
Следующее
От: Masahiko Sawada
Дата:
Сообщение: Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns