Re: [HACKERS] Challenges preventing us moving to 64 bit transactionid (XID)?

From: Alexander Korotkov
Subject: Re: [HACKERS] Challenges preventing us moving to 64 bit transactionid (XID)?
Message-ID: CAPpHfdtOv_kuPXz7=ixA=m91oCaR-Y6EHO=94H2J7zTJB5_0qw@mail.gmail.com
In reply to: Re: [HACKERS] Challenges preventing us moving to 64 bit transactionid (XID)?  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Responses: Re: [HACKERS] Challenges preventing us moving to 64 bit transaction id (XID)?
Re: [HACKERS] Challenges preventing us moving to 64 bit transactionid (XID)?
Re: [HACKERS] Challenges preventing us moving to 64 bit transactionid (XID)?
List: pgsql-hackers
On Wed, Jun 7, 2017 at 11:33 AM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
> On Tue, Jun 6, 2017 at 4:05 PM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
>> On 6/6/17 08:29, Bruce Momjian wrote:
>>> On Tue, Jun  6, 2017 at 06:00:54PM +0800, Craig Ringer wrote:
>>>> Tom's point is, I think, that we'll want to stay pg_upgrade
>>>> compatible. So when we see a pg10 tuple and want to add a new page
>>>> with a new page header that has an epoch, but the whole page is full
>>>> so there isn't 32 bits left to move tuples "down" the page, what do we
>>>> do?
>>>
>>> I guess I am missing something.  If you see an old page version number,
>>> you know none of the tuples are from running transactions so you can
>>> just freeze them all, after consulting the pg_clog.  What am I missing?
>>> If the page is full, why are you trying to add to the page?
>>
>> The problem is if you want to delete from such a page.  Then you need to
>> update the tuple's xmax and stick the new xid epoch somewhere.
>>
>> We had an unconference session at PGCon about this.  These issues were
>> all discussed and some ideas were thrown around.  We can expect a patch
>> to appear soon, I think.
>
> Right.  I'm now working on splitting my large patch for 64-bit xids into patchset.
> I'm planning to post patchset in the beginning of next week.

Work on this patch took longer than I expected.  It is still not in very good shape, but I decided to publish it anyway so as not to stall progress in this area.
I also tried to split the patch into several parts, but I managed to separate only a few small pieces; most of the changes remain in a single big diff.
Long story short, the patchset is attached.

0001-64bit-guc-relopt-1.patch
This patch implements 64-bit GUCs and relation options, which are used in the later patches.

0002-heap-page-special-1.patch
Putting the xid and multixact bases into PageHeaderData would take an extra 16 bytes on index pages too, which would be a waste of space for indexes.  This is why I decided to put the bases into the special area of heap pages.
This patch adds a special area to heap pages containing the prune xid and a magic number.  The magic number differs between regular heap pages and sequence pages.
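
For readers who have not opened the patch, here is a rough sketch of the idea of a heap page special area that carries a prune xid and a magic number.  It is not the layout from the patch: the struct name, the helper names, the field widths and both magic values are hypothetical, and the special area is simply assumed to sit at the very end of an 8192-byte page.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE   8192          /* assumed block size */
#define SKETCH_HEAP_MAGIC  0x48454150u   /* hypothetical magic for heap pages */
#define SKETCH_SEQ_MAGIC   0x53455155u   /* hypothetical magic for sequence pages */

/* Hypothetical special area: field names and widths are assumptions. */
typedef struct HeapSpecialSketch
{
    uint32_t prune_xid;   /* oldest prunable xid on the page */
    uint32_t magic;       /* tells heap pages from sequence pages */
} HeapSpecialSketch;

typedef enum
{
    SKETCH_KIND_UNKNOWN,
    SKETCH_KIND_HEAP,
    SKETCH_KIND_SEQUENCE
} SketchPageKind;

/* Store the special area in the last bytes of the page. */
static void
sketch_write_special(uint8_t *page, const HeapSpecialSketch *special)
{
    assert(page != NULL && special != NULL);
    memcpy(page + SKETCH_PAGE_SIZE - sizeof(*special), special, sizeof(*special));
}

/* Read the special area back; memcpy avoids unaligned access. */
static HeapSpecialSketch
sketch_read_special(const uint8_t *page)
{
    HeapSpecialSketch special;

    assert(page != NULL);
    memcpy(&special, page + SKETCH_PAGE_SIZE - sizeof(special), sizeof(special));
    return special;
}

/* Classify a page by the magic number found in its special area. */
static SketchPageKind
sketch_page_kind(const uint8_t *page)
{
    HeapSpecialSketch special = sketch_read_special(page);

    if (special.magic == SKETCH_HEAP_MAGIC)
        return SKETCH_KIND_HEAP;
    if (special.magic == SKETCH_SEQ_MAGIC)
        return SKETCH_KIND_SEQUENCE;
    return SKETCH_KIND_UNKNOWN;
}
```

Because the special area exists only on heap pages, index pages pay nothing for it, which is the space argument made above.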

0003-64bit-xid-1.patch
This is the major patch.  It redefines TransactionId as a 64-bit integer and defines a 32-bit ShortTransactionId, which is used for t_xmin and t_xmax.  Transaction id comparison becomes a plain integer comparison instead of a circular one.  Base values for xids and multixact ids are stored in the heap page special area.  SLRUs also become 64-bit and non-circular.  To make it possible to calculate xmin/xmax without accessing the heap page, the base values are copied into HeapTuple.  Correspondingly, HeapTupleHeader(Get|Set)(Xmin|Xmax) become HeapTuple(Get|Set)(Xmin|Xmax), which require a HeapTuple rather than just a HeapTupleHeader.  heap_page_prepare_for_xid() is used to ensure that a given xid fits the base of a particular page.  If it doesn't fit, the base of the page is shifted, which may require a single-page freeze.  The WAL format is changed in order to prevent unaligned access to TransactionId.  The *_age GUCs and relation options are changed to 64-bit.  Forced "autovacuum to prevent wraparound" is removed, but freezing is still needed in order to truncate SLRUs.
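
The base arithmetic described above can be sketched roughly as follows.  This is an illustration under stated assumptions, not code from the patch: the helper names are hypothetical, and the treatment of xids below 3 as special values stored without a base is my assumption for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t TransactionId;        /* full 64-bit xid */
typedef uint32_t ShortTransactionId;   /* value stored in t_xmin / t_xmax */

/* Assumption: xids below this are special and stored without a base. */
#define FirstNormalTransactionId ((TransactionId) 3)

/* Can xid be stored as a 32-bit offset from this page base? */
static bool
xid_fits_base(TransactionId base, TransactionId xid)
{
    if (xid < FirstNormalTransactionId)
        return true;            /* special values fit any base */
    if (xid < base)
        return false;           /* older than the base: base must shift */
    return (xid - base) >= FirstNormalTransactionId &&
           (xid - base) <= UINT32_MAX;
}

/* Convert a full xid to the short form kept in the tuple header. */
static ShortTransactionId
xid_to_short(TransactionId base, TransactionId xid)
{
    assert(xid_fits_base(base, xid));
    if (xid < FirstNormalTransactionId)
        return (ShortTransactionId) xid;
    return (ShortTransactionId) (xid - base);
}

/* Recover the full xid from the short form and the page base. */
static TransactionId
short_to_xid(TransactionId base, ShortTransactionId sxid)
{
    if (sxid < FirstNormalTransactionId)
        return (TransactionId) sxid;
    return base + sxid;
}

/* With 64-bit xids, ordering is a plain integer comparison. */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    return a < b;
}
```

When xid_fits_base() returns false for a new xid, that is the situation in which heap_page_prepare_for_xid() would have to shift the page base, freezing tuples on the page if the old and new xids cannot share one base.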

0004-base-values-for-testing-1.patch
This patch is used to test that calculations using 64-bit bases and short 32-bit xid values are correct.  It provides initdb options for the initial xid, multixact id and multixact offset values.  The regression tests initialize the cluster with large (more than 2^32) values.

There are a lot of open items, but I would like to note some of them:
 * WAL becomes significantly larger because it stores 8-byte xids instead of 4-byte xids.  We probably need to use the base approach in WAL too.
 * As discussed at the developer unconference, we need to write a special background worker that ensures each heap page has room for the bases.  This background worker must finish its work before the database can be pg_upgraded.  Alternatively, we could find a way to store the bases in the existing page header.
 * BTPageOpaqueData contains a TransactionId in the special area.  BTPageOpaqueData should be changed to some pg_upgradable format.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 