Re: "unexpected duplicate for tablespace" problem in logical replication

Поиск
Список
Период
Сортировка
От Kyotaro Horiguchi
Тема Re: "unexpected duplicate for tablespace" problem in logical replication
Дата
Msg-id 20220412.114539.151610727144113536.horikyota.ntt@gmail.com
обсуждение исходный текст
Ответ на Re: "unexpected duplicate for tablespace" problem in logical replication  ("wangsh.fnst@fujitsu.com" <wangsh.fnst@fujitsu.com>)
Ответы Re: "unexpected duplicate for tablespace" problem in logical replication
Re: "unexpected duplicate for tablespace" problem in logical replication
Список pgsql-bugs
At Mon, 11 Apr 2022 10:24:26 +0000, "wangsh.fnst@fujitsu.com" <wangsh.fnst@fujitsu.com> wrote in 
> Please see the attachment

Sorry for being late, but this seems to be in the wrong direction.

In the problem case, as you know, GetNewRelFileNode does *not* check
oid uniqueness in pg_class.  The on-catalog duplicate check is done
only when the result is to be used as the *table oid* at creation.  In
other cases it chooses a relfilenode that does not duplicate in
storage file names irrelevantly to the relation oid.  As the result
Non-temporary and temporary tables have separate relfilenode oid
spaces, either intentional or not..

Thus, I think we don't want GetNewRelFileNode to check for duplicate
oid on pg_class if pg_class is not passed since that significantly
narrow the oid space for relfilenode.  If we did something like that,
we might do check the existence of file names of the both non-temp and
temp in GetNewRelFileNode(), but that also narrows the relfilenode oid
space.

RelidByRelfilenode is called by autoprewarm, logical replication, and
pg_filenode_relation. In the autoprewarm and logical replication
cases, it is called only for non-temprary relations so letting the
function ignore (oid duplicating) temp tables works.
pg_relfilienode_relation is somewhat dubious.  It is affected by this
bug.  In the attached PoC, to avoid API change (in case for
back-patching), RelidByRelfilenode ignores duplcates of differernt
persistence.  Also the PoC prioritize on persistent tables to
temporary ones but I'm not sure it's good decision, but otherwise we
need to change the signature of pg_filenode_relation.

The attached is an PoC of that.  The first one is a reproduction-aid
code. With it applied, the following steps create duplicate
relfilenode relations.

(apply the first diff)
$ rm /tmp/hoge
$ (start postgres)
# touch /tmp/hoge
$ psql
=# create table xp (a int);
=# truncate xp;
=# create temp table xt (a int);
=# truncate xt;
=# select oid, relname, relfilenode from pg_class where relname like 'x%';
  oid  | relname | relfilenode 
-------+---------+-------------
 16384 | xp      |       55555
 55558 | xt      |       55555
(2 rows)
=# select pg_filenode_relation(0, 55555);
ERROR:  unexpected duplicate for tablespace 0, relfilenode 55555


(apply the second diff)
=# create temp table xt (a int);
=# truncate xt;
=# select pg_filenode_relation(0, 55555);
 pg_filenode_relation 
----------------------
 xp
(1 row)


What do you think about this direction?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Вложения

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: BUG #17462: Invalid memory access in heapam_tuple_lock
Следующее
От: "David G. Johnston"
Дата:
Сообщение: Re: Docs for psql regarding default database name are incorrect