Обсуждение: Remove custom redundant full page write description from GIN
Hi hackers! While reading waldump output for the GIN index I noticed $subj. Here is an example: ``` rmgr: Gin len (rec/tot): 53/ 113, tx: 788, lsn: 0/01D2B700, prev 0/01D2B590, desc: INSERT isdata: F isleaf: T (full page image), blkref #0: rel 1663/16384/32857 blk 1 FPW ``` Notice we have "(full page image)" from gindesc and "FPW" from generic waldump code I suggest removing this custom FPW support. -- Best regards, Kirill Reshke
Вложения
> On 15 Sep 2025, at 17:56, Kirill Reshke <reshkekirill@gmail.com> wrote: > > I suggest removing this custom FPW support. I agree that extra message adds no value. Generic FPW message has the same "for verification" details too. I've checked if there are any other similar cases, but found non FPW indications in other resource managers. Maybe a litter comment about why we don't describe anything in presence of FPW would be good. But nearby code is not veryverbose... Best regards, Andrey Borodin.
On Mon, 15 Sept 2025 at 18:27, Andrey Borodin <x4mmm@yandex-team.ru> wrote: > > > > > On 15 Sep 2025, at 17:56, Kirill Reshke <reshkekirill@gmail.com> wrote: > > > > I suggest removing this custom FPW support. > > I agree that extra message adds no value. Generic FPW message has the same "for verification" details too. > I've checked if there are any other similar cases, but found non FPW indications in other resource managers. Thanks for review > Maybe a litter comment about why we don't describe anything in presence of FPW would be good. But nearby code is not veryverbose... > I dunno. Looks like after removing code, this comment about FPW will be just out of place. We already have documentation about pg_waldump in general, don't we? By the way, I have spotted a few new places which can be enhanced in GIN redo & waldump. PFA V2 series. 0001: Basically same as v1-0001 0002: During my work, I have to modify GIN pg_waldump to get more information about some wal records. In v2-0002 there are two modifications in XLOG_GIN_UPDATE_META_PAGE and XLOG_GIN_INSERT_LISTPAGE, which has added the most value (for me). I also used to display meta-page info, but I discarded this custom info display logic, as it adds little value(is it?) 0003: Small nitpick. I copy-pasted this code from pg core to my cpp project and the compiler noticed the `payload` variable was not used after the last modification. I find this true. 0004: Remove the RelFileLocator field of ginxlogUpdateMeta walrecord. It is not used in relay logic. 0005: Small nitpicky patch to document ginxlogInsertListPage's backup blocks. 0006: CREATE_PTREE always includes page contents in its walrecord on HEAD (correct me if i'm wrong, but this is what I see in source and in by-hand testing). In this patch I completely removes XLOG_GIN_CREATE_PTREE custom logic and replace it with a general FPW mechanism. This patch reduces wal record size in case wal_compression is enabled. Before 0006: ``` reshke@yezzey-cbdb-bench:~$ /home/reshke/pg19/bin/bin/pg_waldump -f -r GIN db/pg_wal/000000010000000000000005 | grep create_p -i rmgr: Gin len (rec/tot): 8117/ 8117, tx: 831, lsn: 0/05AF9B18, prev 0/05AF9A98, desc: CREATE_PTREE size: 8064, blkref #0: rel 1663/16384/16541 blk 517 rmgr: Gin len (rec/tot): 8117/ 8117, tx: 831, lsn: 0/05B13078, prev 0/05B13030, desc: CREATE_PTREE size: 8064, blkref #0: rel 1663/16384/16541 blk 525 rmgr: Gin len (rec/tot): 8117/ 8117, tx: 831, lsn: 0/05B2C5D8, prev 0/05B2C590, desc: CREATE_PTREE size: 8064, blkref #0: rel 1663/16384/16541 blk 533 rmgr: Gin len (rec/tot): 8117/ 8117, tx: 831, lsn: 0/05B45B20, prev 0/05B45AD8, desc: CREATE_PTREE size: 8064, blkref #0: rel 1663/16384/16541 blk 541 ``` After 0006: ``` reshke@yezzey-cbdb-bench:~$ /home/reshke/pg19/bin/bin/pg_waldump -f -r GIN db/pg_wal/000000010000000000000004 | grep CREATE rmgr: Gin len (rec/tot): 51/ 410, tx: 795, lsn: 0/041FE560, prev 0/041FE4E0, desc: CREATE_PTREE , blkref #0: rel 1663/16384/16527 blk 517 FPW rmgr: Gin len (rec/tot): 51/ 410, tx: 795, lsn: 0/042000B8, prev 0/04200070, desc: CREATE_PTREE , blkref #0: rel 1663/16384/16527 blk 525 FPW rmgr: Gin len (rec/tot): 51/ 410, tx: 795, lsn: 0/04201BF8, prev 0/04201BB0, desc: CREATE_PTREE , blkref #0: rel 1663/16384/16527 blk 533 FPW rmgr: Gin len (rec/tot): 51/ 410, tx: 795, lsn: 0/04203750, prev 0/04203708, desc: CREATE_PTREE , blkref #0: rel 1663/16384/16527 blk 541 FPW ``` Size reduced 8117-> 410 bytes. WDYT? -- Best regards, Kirill Reshke
Вложения
> On 26 Sep 2025, at 16:39, Kirill Reshke <reshkekirill@gmail.com> wrote: > > > WDYT? > > <v2-0002-Better-gin-pg_waldump.patch><v2-0004-Remove-locator.patch><v2-0003-fix-oversight-of-631118fe1e8f.patch><v2-0001-Remove-custom-full-page-write-decribption-from-GI.patch><v2-0005-Add-comment-to-ginxlogInsertListPage.patch><v2-0006-Remove-custom-logic-for-ginxlogCreatePostingTree-.patch> Hi! I think this thread should stick to patches 1 and 2. Which look good to me, but need few words in commit messages. Patches 3, 4 and 5 deserve their own thread about GIN code beautification, but may be reviewed here. But perhaps, it's OKfor them to live here too. They also look correct to me, but, again, lack appropriate commit messages. Patch 6 is significant improvement and for sure must be discussed in another thread. The idea seems straightforward to me,but it's WAL logging, there might be some unforeseen consequences of any small change. Best regards, Andrey Borodin.
On Tue, Sep 30, 2025 at 05:59:49PM +0500, Andrey Borodin wrote: > I think this thread should stick to patches 1 and 2. Which look good > to me, but need few words in commit messages. With XLogRecGetBlockRefInfo() already doing the same job with much more information about the block, 0001 makes sense. + int32 ntuples; + ntuples = ((ginxlogUpdateMeta *) rec)->ntuples; + appendStringInfo(buf, "ntuples: %d", ntuples); In 0002. We have a routine for this type of assignments: XLogRecGetBlockData(). Let's use it here for clarity. prevTail and newRightlink are block numbers, meaning %u not %d when printed. 0001 and 0002 can just be merged together. > Patches 3, 4 and 5 deserve their own thread about GIN code > beautification, but may be reviewed here. But perhaps, it's OK for > them to live here too. They also look correct to me, but, again, > lack appropriate commit messages. In 0003, "payload" is local to its block or code in ginRedoInsert(), still it's true that the extra assignment is useless. I don't think that I'm going to agree with 0004, this removes a RelFileLocator that could be useful. 0005 looks OK. > Patch 6 is significant improvement and for sure must be discussed in > another thread. The idea seems straightforward to me, but it's WAL > logging, there might be some unforeseen consequences of any small > change. Yep, not sure that I see the full benefit of what you are doing here. A thread describing problems around the WAL descriptions is not suited for what you are qualifying as an optimization. Same for 0004. 0003 is small enough that I would not bother with a different thread, now that you've posted a patch.. -- Michael
Вложения
On Wed, Oct 01, 2025 at 03:39:35PM +0900, Michael Paquier wrote: > Yep, not sure that I see the full benefit of what you are doing here. > A thread describing problems around the WAL descriptions is not suited > for what you are qualifying as an optimization. Same for 0004. 0003 > is small enough that I would not bother with a different thread, now > that you've posted a patch.. Worth a note. I had some time and applied patches 0001, 0003 and 0005, as they are worth their own. -- Michael
Вложения
On Wed, Oct 01, 2025 at 03:39:35PM +0900, Michael Paquier wrote: > + int32 ntuples; > + ntuples = ((ginxlogUpdateMeta *) rec)->ntuples; > + appendStringInfo(buf, "ntuples: %d", ntuples); > > In 0002. We have a routine for this type of assignments: > XLogRecGetBlockData(). Let's use it here for clarity. This should be XLogRecGetData(), not XLogRecGetBlockData(), as the data is assigned to the record. My apologies for the mistake. -- Michael
Вложения
On Tue, 7 Oct 2025 at 11:46, Michael Paquier <michael@paquier.xyz> wrote: > > On Wed, Oct 01, 2025 at 03:39:35PM +0900, Michael Paquier wrote: > > + int32 ntuples; > > + ntuples = ((ginxlogUpdateMeta *) rec)->ntuples; > > + appendStringInfo(buf, "ntuples: %d", ntuples); > > > > In 0002. We have a routine for this type of assignments: > > XLogRecGetBlockData(). Let's use it here for clarity. > > This should be XLogRecGetData(), not XLogRecGetBlockData(), as the > data is assigned to the record. My apologies for the mistake. > -- > Michael Yes, thank you. Turns out we already use XLogRecGetData in assignment: char *rec = XLogRecGetData(record). But I still did alter v2 patch a bit, introducing new variable xlrec, to avoid clunky casts between char * and ginxlogUpdateMeta. FPA v3. -- Best regards, Kirill Reshke
Вложения
On Tue, Oct 07, 2025 at 02:08:02PM +0500, Kirill Reshke wrote: > Turns out we already use XLogRecGetData in assignment: char *rec = > XLogRecGetData(record). But I still did alter v2 patch a bit, > introducing new variable xlrec, to avoid clunky casts between char * > and ginxlogUpdateMeta. For UPDATE_META_PAGE, ntuples == 0 could also mean that we may show a lot of invalid block numbers, as well, which feels a bit pointless. I have switched that to check InvalidBlockNumber instead. While reviewing the whole, I have noticed that rightlink was missing for INSERT_LISTPAGE, as well as the right/left children pages for SPLIT. I have added this information, applied the result. Thanks! -- Michael
Вложения
On Wed, 8 Oct 2025 at 10:14, Michael Paquier <michael@paquier.xyz> wrote: > > On Tue, Oct 07, 2025 at 02:08:02PM +0500, Kirill Reshke wrote: > > Turns out we already use XLogRecGetData in assignment: char *rec = > > XLogRecGetData(record). But I still did alter v2 patch a bit, > > introducing new variable xlrec, to avoid clunky casts between char * > > and ginxlogUpdateMeta. > > For UPDATE_META_PAGE, ntuples == 0 could also mean that we may show > a lot of invalid block numbers, as well, which feels a bit pointless. > I have switched that to check InvalidBlockNumber instead. Thank you for pushing! > While reviewing the whole, I have noticed that rightlink was missing > for INSERT_LISTPAGE, as well as the right/left children pages for > SPLIT. I have added this information, applied the result. Thanks! After this change, I was wondering what is the purpose of leftChildBlkno/rightChildBlkno fields of SPLIT record. Turns out all of ginxlogSplit's fields except flags are of no use. Should we remove them, reducing overall cognitive complexity of GIN internals and reducing WAL footprint? PFA patch for that. If this topic is too out-of-scope of this thread, I can start another one, just let me know. -- Best regards, Kirill Reshke
Вложения
> On 9 Oct 2025, at 17:33, Kirill Reshke <reshkekirill@gmail.com> wrote: > > Should we remove them, reducing overall cognitive complexity of GIN > internals and reducing WAL footprint? The patch does not add a single line... that's impressive :) Why not wipe ginxlogSplit entirely? Will the code be clearer with XLogRegisterData(&flags, sizeof(uint16))? Best regards, Andrey Borodin.
Hi, On Fri, Oct 10, 2025 at 10:00 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote: > > > > > On 9 Oct 2025, at 17:33, Kirill Reshke <reshkekirill@gmail.com> wrote: > > > > Should we remove them, reducing overall cognitive complexity of GIN > > internals and reducing WAL footprint? > > The patch does not add a single line... that's impressive :) > > Why not wipe ginxlogSplit entirely? Will the code be clearer with XLogRegisterData(&flags, sizeof(uint16))? > > Looks like we will not be able to process old split records after this, as 'flags' field offset was changed. So probably these fields are for backward compatibility. Does it make sense? Best regards, Arseniy Mukhin
On Tue, 14 Oct 2025, 01:24 Arseniy Mukhin, <arseniy.mukhin.dev@gmail.com> wrote:
Hi,
On Fri, Oct 10, 2025 at 10:00 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
>
>
>
> > On 9 Oct 2025, at 17:33, Kirill Reshke <reshkekirill@gmail.com> wrote:
> >
> > Should we remove them, reducing overall cognitive complexity of GIN
> > internals and reducing WAL footprint?
>
> The patch does not add a single line... that's impressive :)
>
> Why not wipe ginxlogSplit entirely? Will the code be clearer with XLogRegisterData(&flags, sizeof(uint16))?
>
>
Looks like we will not be able to process old split records after
this, as 'flags' field offset was changed. So probably these fields
are for backward compatibility. Does it make sense?
Best regards,
Arseniy Mukhin
Hi! We do not need to support anything WAL related in new major version, since we do new initdb. There are couple of threads nearby that change WAL record layout or even drop them entirely, and that OK.
Also, we have WAL magic number for this purpose
On Mon, Oct 13, 2025 at 11:59 PM Kirill Reshke <reshkekirill@gmail.com> wrote: > > > On Tue, 14 Oct 2025, 01:24 Arseniy Mukhin, <arseniy.mukhin.dev@gmail.com> wrote: >> >> Hi, >> >> On Fri, Oct 10, 2025 at 10:00 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote: >> > >> > >> > >> > > On 9 Oct 2025, at 17:33, Kirill Reshke <reshkekirill@gmail.com> wrote: >> > > >> > > Should we remove them, reducing overall cognitive complexity of GIN >> > > internals and reducing WAL footprint? >> > >> > The patch does not add a single line... that's impressive :) >> > >> > Why not wipe ginxlogSplit entirely? Will the code be clearer with XLogRegisterData(&flags, sizeof(uint16))? >> > >> > >> >> Looks like we will not be able to process old split records after >> this, as 'flags' field offset was changed. So probably these fields >> are for backward compatibility. Does it make sense? >> >> >> Best regards, >> Arseniy Mukhin > > > Hi! We do not need to support anything WAL related in new major version, since we do new initdb. There are couple of threadsnearby that change WAL record layout or even drop them entirely, and that OK. > Also, we have WAL magic number for this purpose I see, sorry for the noise then and thanks for the explanation! Best regards, Arseniy Mukhin