Re: Add 64-bit XIDs into PostgreSQL 15

From Pavel Borisov
Subject Re: Add 64-bit XIDs into PostgreSQL 15
Msg-id CALT9ZEEsj54k9+xmcSAYLe7YsEHYbzFUuDwpVzkjmR6CCPHpww@mail.gmail.com
In response to Re: Add 64-bit XIDs into PostgreSQL 15  (Andres Freund <andres@anarazel.de>)
Responses Re: Add 64-bit XIDs into PostgreSQL 15  (Aleksander Alekseev <aleksander@timescale.com>)
List pgsql-hackers
Hi, Andres!

I've revised the README a little to address your corrections and questions. Thank you very much for them!
A patchset with the revised README is attached as v8 (the code is unchanged and identical to v7).
 
> > +The downside of this is that we can not use tuple's XMIN and XMAX right away.
> > +We often need to re-read t_xmin and t_xmax - which could actually be pointers
> > +into a page in shared buffers and therefore they could be updated by any other
> > +backend.
>
> Ugh, that's not great.

Agreed. This part is one of the candidates for revision, as per the proposals above [1], i.e.:
"2A. Probably refactor it to store precalculated XMIN/XMAX in memory
tuple representation instead of t_xid_base/t_multi_base". 

We are working on this change.
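
To make the idea in 2A concrete, here is a minimal sketch rather than the patch's actual code. It assumes that a full 64-bit XID is the page-level base plus the 32-bit value stored in the tuple, and that special XIDs below FirstNormalTransactionId (3) are never offset. All Sketch*/sketch_* names are hypothetical. The point is that the 64-bit values are computed once, while the page is still pinned and locked, so later readers do not need to re-read t_xmin and t_xmax from shared buffers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Special XIDs (invalid, bootstrap, frozen) sit below this value. */
#define SKETCH_FIRST_NORMAL_XID 3

/* Hypothetical on-page tuple header fields: 32-bit values relative to a base. */
typedef struct SketchOnPageTuple
{
    uint32_t t_xmin;
    uint32_t t_xmax;
} SketchOnPageTuple;

/* Hypothetical in-memory representation holding precalculated 64-bit XIDs. */
typedef struct SketchMemTuple
{
    uint64_t xmin;
    uint64_t xmax;
} SketchMemTuple;

/* Assumption: normal XIDs are offset by the base, special XIDs pass through. */
static uint64_t
sketch_short_to_full(uint64_t base, uint32_t xid)
{
    return (xid >= SKETCH_FIRST_NORMAL_XID) ? base + xid : (uint64_t) xid;
}

/*
 * Compute both 64-bit values in one step. The caller is assumed to hold the
 * buffer lock, so the on-page fields and bases cannot change underneath us.
 * xmax uses the multixact base when it stores a multixact id.
 */
static SketchMemTuple
sketch_materialize(const SketchOnPageTuple *tup, uint64_t xid_base,
                   uint64_t multi_base, bool xmax_is_multi)
{
    SketchMemTuple mem;

    assert(tup != NULL);
    mem.xmin = sketch_short_to_full(xid_base, tup->t_xmin);
    mem.xmax = sketch_short_to_full(xmax_is_multi ? multi_base : xid_base,
                                    tup->t_xmax);
    return mem;
}
```

Once materialized, the in-memory copy is unaffected by later changes another backend makes to the page, at the cost of 16 bytes per in-memory tuple.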
 
> What happens if the first access happens on a replica?
>
> What is the approach for dealing with multixact files? They have xids
> embedded?  And currently the SLRUs will break if you just let the offsets SLRU
> grow without bounds.
>
> Wait. So you just modify the page without WAL logging or marking it dirty on a
> standby? I fail to see how that can be correct.
>
> Imagine the cluster is promoted, the page is dirtied, and we write it
> out. You'll have written out a completely changed page, without any WAL
> logging. There's plenty other scenarios.

In this part, I suppose you've found a definite bug. Thanks! There are a couple
of ways it could be fixed:

1. If we enforce a checkpoint at replica promotion, then full-page writes will be forced for pages modified afterward.

2. Maybe it's worth using a BufferDesc bit to mark the page as converted to 64-bit XIDs but not yet written to disk? For example, one of the four bits from BUF_USAGECOUNT:
BM_MAX_USAGE_COUNT = 5, so 3 bits are enough to store it. This would change the in-memory page representation but would not need WAL logging, which is impossible on a replica.
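
A rough sketch of the bit arithmetic behind option 2, under stated assumptions. BUF_USAGECOUNT_ONE, BUF_USAGECOUNT_SHIFT and BM_MAX_USAGE_COUNT mirror the values in buf_internals.h as far as I know. The narrowed mask, the flag name and the helper functions are hypothetical and are not part of PostgreSQL or of the patch. The real buffer state word is updated atomically, which this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors buf_internals.h: the usage count starts at bit 18 of the state word. */
#define BUF_USAGECOUNT_ONE    (1U << 18)
#define BUF_USAGECOUNT_SHIFT  18
#define BM_MAX_USAGE_COUNT    5

/* Hypothetical: keep only the low 3 of the 4 usage-count bits (18..20)... */
#define SKETCH_USAGECOUNT_MASK   0x001C0000U
/* ...and reuse the freed top bit (21) as "converted to 64-bit XIDs, unsaved". */
#define SKETCH_BM_XID64_UNSAVED  0x00200000U

static uint32_t
sketch_usagecount(uint32_t state)
{
    return (state & SKETCH_USAGECOUNT_MASK) >> BUF_USAGECOUNT_SHIFT;
}

/* Increment the usage count, saturating at BM_MAX_USAGE_COUNT. */
static uint32_t
sketch_bump_usagecount(uint32_t state)
{
    if (sketch_usagecount(state) < BM_MAX_USAGE_COUNT)
        state += BUF_USAGECOUNT_ONE;
    /* The count never exceeds 5, so it can never carry into bit 21. */
    assert(sketch_usagecount(state) <= BM_MAX_USAGE_COUNT);
    return state;
}

/* Set when the page is converted in memory (for example, on a replica). */
static uint32_t
sketch_mark_converted(uint32_t state)
{
    return state | SKETCH_BM_XID64_UNSAVED;
}

/* Clear once the converted page has been written out. */
static uint32_t
sketch_mark_written(uint32_t state)
{
    return state & ~SKETCH_BM_XID64_UNSAVED;
}

static bool
sketch_is_converted_unsaved(uint32_t state)
{
    return (state & SKETCH_BM_XID64_UNSAVED) != 0;
}
```

Because the count saturates at 5 (binary 101), three bits always suffice and the flag bit is never disturbed by usage-count updates.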

What do you think about it? 
