Re: Make relfile tombstone files conditional on WAL level

From: Dilip Kumar
Subject: Re: Make relfile tombstone files conditional on WAL level
Msg-id: CAFiTN-sJqzCnXOVFb0QVAKidXdV1QzA9tR9EDwM65JAScGr0VQ@mail.gmail.com
In reply to: Re: Make relfile tombstone files conditional on WAL level  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses: Re: Make relfile tombstone files conditional on WAL level  (Dilip Kumar <dilipbalaut@gmail.com>)
List: pgsql-hackers

On Wed, Jan 19, 2022 at 10:37 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Jan 6, 2022 at 7:22 PM Robert Haas <robertmhaas@gmail.com> wrote:
>>
>> On Thu, Jan 6, 2022 at 3:47 AM Thomas Munro <thomas.munro@gmail.com> wrote:
>> > Another problem is that relfilenodes are normally allocated with
>> > GetNewOidWithIndex(), and initially match a relation's OID.  We'd need
>> > a new allocator, and they won't be able to match the OID in general
>> > (while we have 32 bit OIDs at least).
>>
>> Personally I'm not sad about that. Values that are the same in simple
>> cases but diverge in more complex cases are kind of a trap for the
>> unwary. There's no real reason to have them ever match. Yeah, in
>> theory, it makes it easier to tell which file matches which relation,
>> but in practice, you always have to double-check in case the table has
>> ever been rewritten. It doesn't seem worth continuing to contort the
>> code for a property we can't guarantee anyway.
>
>
> Makes sense. I have started working on this idea and will try to post the first version by early next week.

Here is the first working patch. With it, we no longer need to keep the
tombstone file around until the next checkpoint. This is still a WIP
patch, but with it I can see that the problem with WAL-logged ALTER
DATABASE SET TABLESPACE, which Robert reported a couple of mails above
in this thread, is solved.

General idea of the patch:
- Change RelFileNode.relNode to be 64 bits wide, of which 8 bits are
for the fork number and 56 bits are for the relNode, as shown below. [1]
- GetNewRelFileNode() will just generate a new unique relfilenode and
check whether the file exists; if it already exists, it throws an
error, so there is no retry loop. We also need to add logic for
preserving nextRelNode across restarts and for WAL-logging it, but that
is similar to what is done for nextOid.
- mdunlinkfork() will directly forget the relfilenode, so we can get rid
of all the unlinking code.
- We no longer need any post-checkpoint unlinking activity.
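
To make the no-loop allocation concrete, here is a minimal sketch, not code from the patch. All names (sketch_get_new_relnode, sketch_exists_fn, SKETCH_MAX_RELNODE) are hypothetical. The file-existence check is passed in as a callback so the example stays self-contained, and a false return stands in for where the real code would raise an error. Persisting the counter across restarts and WAL-logging it are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Largest value that fits in the 56 bits reserved for the relNode. */
#define SKETCH_MAX_RELNODE ((UINT64_C(1) << 56) - 1)

/* Hypothetical callback: does a file for this relnode already exist? */
typedef bool (*sketch_exists_fn) (uint64_t relnode);

/*
 * Hand out the next relnode from a monotonically increasing counter.
 *
 * There is no retry loop: if the counter is exhausted or the file already
 * exists, return false (the real code would throw an error here).
 * Assumption: the counter is advanced even when the existence check fails.
 */
static bool
sketch_get_new_relnode(uint64_t *next_relnode, sketch_exists_fn exists,
                       uint64_t *out)
{
    uint64_t    candidate;

    assert(next_relnode != NULL && exists != NULL && out != NULL);

    if (*next_relnode > SKETCH_MAX_RELNODE)
        return false;           /* 56-bit space exhausted */

    candidate = (*next_relnode)++;

    if (exists(candidate))
        return false;           /* unexpected collision: error, no retry */

    *out = candidate;
    return true;
}

/* Trivial existence checks, used only to exercise the sketch. */
static bool
sketch_never_exists(uint64_t relnode)
{
    (void) relnode;
    return false;
}

static bool
sketch_always_exists(uint64_t relnode)
{
    (void) relnode;
    return true;
}
```

Starting the counter above 2^32 is an easy way to exercise values that no longer fit in a 32-bit OID.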

[1]
/*
 * RelNodeId:
 *
 * This is a storage type for RelNode. The reasoning behind it is the same
 * as for BlockId, so refer to the comment atop BlockId.
 */
typedef struct RelNodeId
{
    uint32      rn_hi;
    uint32      rn_lo;
} RelNodeId;

typedef struct RelFileNode
{
    Oid         spcNode;        /* tablespace */
    Oid         dbNode;         /* database */
    RelNodeId   relNode;        /* relation */
} RelFileNode;
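
As an illustration of how the two 32-bit halves could be combined into the 64-bit value, here is a small sketch modeled on the way BlockId is set and read. The helper names (sketch_relnodeid_set, sketch_relnodeid_get, sketch_pack_fork) are hypothetical and not taken from the patch. Placing the fork number in the top 8 bits is purely an assumption; the mail only says 8 bits go to the fork number and 56 to the relNode.

```c
#include <assert.h>
#include <stdint.h>

/* Mask for the 56 bits reserved for the relNode. */
#define SKETCH_RELNODE_BITS 56
#define SKETCH_RELNODE_MASK ((UINT64_C(1) << SKETCH_RELNODE_BITS) - 1)

/* Same shape as the struct in the mail, with uint32_t standing in for uint32. */
typedef struct RelNodeId
{
    uint32_t    rn_hi;
    uint32_t    rn_lo;
} RelNodeId;

/* Split a 56-bit relnode into the two 32-bit halves. */
static inline void
sketch_relnodeid_set(RelNodeId *id, uint64_t relnode)
{
    assert(relnode <= SKETCH_RELNODE_MASK);
    id->rn_hi = (uint32_t) (relnode >> 32);
    id->rn_lo = (uint32_t) (relnode & UINT64_C(0xFFFFFFFF));
}

/* Reassemble the 64-bit value from the two halves. */
static inline uint64_t
sketch_relnodeid_get(const RelNodeId *id)
{
    return ((uint64_t) id->rn_hi << 32) | id->rn_lo;
}

/*
 * Assumed layout only: fork number in the top 8 bits, relNode in the low
 * 56 bits of a single 64-bit value.
 */
static inline uint64_t
sketch_pack_fork(uint8_t forknum, uint64_t relnode)
{
    assert(relnode <= SKETCH_RELNODE_MASK);
    return ((uint64_t) forknum << SKETCH_RELNODE_BITS) | relnode;
}
```

Storing two uint32 fields instead of one uint64 keeps every member of RelFileNode 4 bytes wide, which appears to be the BlockId-style reasoning the comment refers to.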

TODO:

There are a couple of TODOs and FIXMEs that I plan to address by next
week. I also plan to test the case where the relfilenode consumes more
than 32 bits; for that, we could set FirstNormalRelfileNode to a higher
value for testing purposes. I will also improve the comments.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
