Обсуждение: get rid of RM_HEAP2_ID

Поиск
Список
Период
Сортировка

get rid of RM_HEAP2_ID

От
John Naylor
Дата:
Making the XLogRecord header nicer has been proposed several times
[1][2][3]. In [2], Robert wondered if with 2 bytes of xl_info, we
could get rid of the separate xl_xact_info. There could be other
possibilities as well, but I haven't done anything in that direction.
The attached just gets as far as the modest goal mentioned in the
subject.

0001: Split xl_info into xl_info for rmgr-specific info and xl_geninfo
for generic flags. I used the XLogInsertExtended() idea from one of
Matthias's patches in [3], but wrote the rest a different way to keep
churn small.

0002 and 0003: To simplify the rmgrs that have an opmask and separate
flag (I saw two obvious ones but I didn't make sure they were the only
ones), reserve the high 4 bits of xl_info for the "record type" (I see
xlogstats.c calls it a "recid", so that might be a better name) and
the lower 4 bits for the flag. Using the same scheme everywhere
simplifies things.

I've put the mask into these macros to reduce the places that know about them:

#define XLogRecGetRecType(decoder) ((decoder)->record->header.xl_info
& XLR_REC_TYPE_MASK)
#define XLogRecGetRecFlags(decoder) ((decoder)->record->header.xl_info
& XLR_REC_FLAGS_MASK)

The former is for callers that explicitly want to know about the
record type only, without flags. The latter is unused, since checking
a mask doesn't need to isolate all the masks first, so it may not end
up in the final version.

There are new XLR_* masks whose names reflect their new purpose
better, but they are the same values as the old ones. XLR_INFO_MASK is
kept around for compatibility, but it doesn't do anything useful since
the rmgr has the full byte available. In order for ~XLR_INFO_MASK to
become a no-op on a full byte, XLR_INFO_MASK has to be zero, which
looks very odd. Removing it would be clearer, at the cost of more
churn.

There is a small wrinkle in that heap_identify does different things
depending on presence of the  XLOG_HEAP_INIT_PAGE flag, but
xact_identify() doesn't do that with XLOG_XACT_HAS_INFO, so the latter
still masks out the flags.

0004: get rid of RM_HEAP2_ID. This was simple once the prerequisites
were in place. All of the HEAP2_* macros keep the same name and only
differ in value. Heap rmgr is completely full so it might be good to
increase to 5 bits for the record type to give it some breathing room.

I wasn't sure about this comment in heapam_xlog.c -- it seems strange
that this property would occur exactly at the point where the first
heap rmgr id ran out of bits, but check world passes without doing
anything additional:

/*
 * These operations don't overwrite MVCC data so no conflict processing is
 * required. The ones in heap2 rmgr do.
 */

[1] https://www.postgresql.org/message-id/20220715173731.6t3km5cww3f5ztfq%40awork3.anarazel.de
[2] https://www.postgresql.org/message-id/CA%2BTgmoa7pNxxe_K%3D3mTHHZGSmnrc_YgApArx3OFHN2g57nzLNw%40mail.gmail.com
[3] https://www.postgresql.org/message-id/CAEze2Wjd3jY_UhhOGdGGnC6NO%3D%2BNmtNOmd%3DJaYv-v-nwBAiXXA%40mail.gmail.com

--
John Naylor
Amazon Web Services

Вложения

Re: get rid of RM_HEAP2_ID

От
Nathan Bossart
Дата:
On Mon, Oct 06, 2025 at 04:04:00PM +0700, John Naylor wrote:
> 0004: get rid of RM_HEAP2_ID. This was simple once the prerequisites
> were in place. All of the HEAP2_* macros keep the same name and only
> differ in value. Heap rmgr is completely full so it might be good to
> increase to 5 bits for the record type to give it some breathing room.

I haven't looked at the patch set in depth yet, but +1 to adding some
breathing room if possible.  In my early attempts at partial HOT, I had to
add a third heap rmgr because the two existing ones were already full.
It'd be nice to avoid going back to multiple heap rmgrs.

-- 
nathan



Re: get rid of RM_HEAP2_ID

От
Michael Paquier
Дата:
On Mon, Oct 06, 2025 at 04:04:00PM +0700, John Naylor wrote:
> Making the XLogRecord header nicer has been proposed several times
> [1][2][3]. In [2], Robert wondered if with 2 bytes of xl_info, we
> could get rid of the separate xl_xact_info. There could be other
> possibilities as well, but I haven't done anything in that direction.
> The attached just gets as far as the modest goal mentioned in the
> subject.

Splitting the generic flags by stealing one of the padding bytes in
XLogRecord sounds like an improvement on clarity grounds, at least.

> 0001: Split xl_info into xl_info for rmgr-specific info and xl_geninfo
> for generic flags. I used the XLogInsertExtended() idea from one of
> Matthias's patches in [3], but wrote the rest a different way to keep
> churn small.

Removing XLR_RMGR_INFO_MASK becomes natural here, causing
XLR_INFO_MASK to also become unnecessary.  One benefit of this
proposal is that we would not need a heap3 once the opcodes are
completely full for the two existing ones, which should be fine enough
for many years to come.  I have noticed after writing this sentence
that it is what Matthias has done in his patch to separate these
fields.  More code churn, but I'd be in favor of fully biting the
bullet and do as Matthias is proposing, getting rid of XLR_INFO_MASK
entirely if we let the full byte at disposal of the RMGRs.

> 0002 and 0003: To simplify the rmgrs that have an opmask and separate
> flag (I saw two obvious ones but I didn't make sure they were the only
> ones), reserve the high 4 bits of xl_info for the "record type" (I see
> xlogstats.c calls it a "recid", so that might be a better name) and
> the lower 4 bits for the flag. Using the same scheme everywhere
> simplifies things.

XLR_REC_TYPE_MASK and XLR_REC_FLAGS_MASK seem a bit confusing to me.
This is an attempt to generalize a rule that two RMGRs have been using
locally.  In short I am not sure what we are gaining here, coming back
to the previous point that we should get rid of XLR_INFO_MASK, IMO.
That's the kind of rules I'd leave up to the RMGRs, so this does not
like a necessary abstraction, at least IMO.

> There are new XLR_* masks whose names reflect their new purpose
> better, but they are the same values as the old ones. XLR_INFO_MASK is
> kept around for compatibility, but it doesn't do anything useful since
> the rmgr has the full byte available. In order for ~XLR_INFO_MASK to
> become a no-op on a full byte, XLR_INFO_MASK has to be zero, which
> looks very odd. Removing it would be clearer, at the cost of more
> churn.

I think that it would be weird to keep XLR_INFO_MASK for compatibility
reasons, with it set at 0, so I would bite the bullet and just get rid
of it entirely.  There was not so long ago a proposal to get entirely
rid of most the footprint of XLR_INFO_MASK in the code, switching to
an equivalent macro:
https://www.postgresql.org/message-id/CAGjhLkNYhNfbhsZhR_sEJ=1VqJmFCzx1BVfWWZ0+it2ucqG-pw@mail.gmail.com

That's also what you are doing here..

> 0004: get rid of RM_HEAP2_ID. This was simple once the prerequisites
> were in place. All of the HEAP2_* macros keep the same name and only
> differ in value. Heap rmgr is completely full so it might be good to
> increase to 5 bits for the record type to give it some breathing room.

This is making an increase in size a prerequisite for anybody who
would want to add a new heap record, which is a problem because this
is pushing this responsibility to the one who would like to add a new
record.  In short, I am not sure that we will ever get rid of heap2,
but I see a pretty good set of arguments to do things so as we'll
never need an equivalent for heap3, or an extra RMGR for any of the
existing RMGRs that use already all of their opcodes (Gin seems to be
using 8 opcodes currently, as one example, so it's full).
--
Michael

Вложения

Re: get rid of RM_HEAP2_ID

От
John Naylor
Дата:
On Thu, Oct 9, 2025 at 1:43 PM Michael Paquier <michael@paquier.xyz> wrote:
> > 0001: Split xl_info into xl_info for rmgr-specific info and xl_geninfo
> > for generic flags. I used the XLogInsertExtended() idea from one of
> > Matthias's patches in [3], but wrote the rest a different way to keep
> > churn small.
>
> Removing XLR_RMGR_INFO_MASK becomes natural here, causing
> XLR_INFO_MASK to also become unnecessary.  One benefit of this
> proposal is that we would not need a heap3 once the opcodes are
> completely full for the two existing ones, which should be fine enough
> for many years to come.  I have noticed after writing this sentence
> that it is what Matthias has done in his patch to separate these
> fields.  More code churn, but I'd be in favor of fully biting the
> bullet and do as Matthias is proposing, getting rid of XLR_INFO_MASK
> entirely if we let the full byte at disposal of the RMGRs.

Okay.

> > 0002 and 0003: To simplify the rmgrs that have an opmask and separate
> > flag (I saw two obvious ones but I didn't make sure they were the only
> > ones), reserve the high 4 bits of xl_info for the "record type" (I see
> > xlogstats.c calls it a "recid", so that might be a better name) and
> > the lower 4 bits for the flag. Using the same scheme everywhere
> > simplifies things.
>
> XLR_REC_TYPE_MASK and XLR_REC_FLAGS_MASK seem a bit confusing to me.
> This is an attempt to generalize a rule that two RMGRs have been using
> locally.  In short I am not sure what we are gaining here, coming back
> to the previous point that we should get rid of XLR_INFO_MASK, IMO.
> That's the kind of rules I'd leave up to the RMGRs, so this does not
> like a necessary abstraction, at least IMO.

Leaving that up to the rmgr makes sense. One consideration I didn't
mention was for xlogstats.c -- "record_stats[rmid][recid]" would get
16x larger if recid could take up the full byte. Maybe that's
harmless, but I didn't want to assume. Any thoughts on that?

--
John Naylor
Amazon Web Services



Re: get rid of RM_HEAP2_ID

От
Michael Paquier
Дата:
On Thu, Oct 09, 2025 at 03:15:11PM +0700, John Naylor wrote:
> Leaving that up to the rmgr makes sense. One consideration I didn't
> mention was for xlogstats.c -- "record_stats[rmid][recid]" would get
> 16x larger if recid could take up the full byte. Maybe that's
> harmless, but I didn't want to assume. Any thoughts on that?

I've missed this interaction, thanks for mentioning it.  XLogStats is
a local state that's only used by pg_walinspect and pg_waldump, so
this extra memory consumed does worry me much; this stuff interacts
with no critical paths.
--
Michael

Вложения

Re: get rid of RM_HEAP2_ID

От
John Naylor
Дата:
On Fri, Oct 10, 2025 at 9:47 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Oct 09, 2025 at 03:15:11PM +0700, John Naylor wrote:
> > Leaving that up to the rmgr makes sense. One consideration I didn't
> > mention was for xlogstats.c -- "record_stats[rmid][recid]" would get
> > 16x larger if recid could take up the full byte. Maybe that's
> > harmless, but I didn't want to assume. Any thoughts on that?
>
> I've missed this interaction, thanks for mentioning it.  XLogStats is
> a local state that's only used by pg_walinspect and pg_waldump, so
> this extra memory consumed does worry me much; this stuff interacts
> with no critical paths.

Okay, v2 gets rid of the general info mask (split out into 0002 for
readability) and leaves alone the RMGR-specific masks (i.e. leaves out
v1 0002/3). It runs fine with installcheck while streaming to a
standby with wal_consistency_checking. I've also left out the removal
of HEAP2 for now. Giving existing RMGRs more breathing room seems like
a better motivator, going by Nathan's comment and yours.

It's worth thinking about backward compatibility if we did go as far
as a 2-byte xl_info (upthread: to allow more RMGR-specific flags, so
e.g. XACT wouldn't need xl_xact_info) In that case, we'd probably
still want a convention that only the lowest byte can contain the
record type. XLogStats could simply assume that in most cases. For
XACT 8 flags in the upper byte still won't be enough, and it will
still need to have its own opcode mask, but that's no worse than the
situation we have already.

--
John Naylor
Amazon Web Services

Вложения

Re: get rid of RM_HEAP2_ID

От
Heikki Linnakangas
Дата:
On 14/10/2025 11:13, John Naylor wrote:
> Okay, v2 gets rid of the general info mask (split out into 0002 for
> readability) and leaves alone the RMGR-specific masks (i.e. leaves out
> v1 0002/3). It runs fine with installcheck while streaming to a
> standby with wal_consistency_checking. I've also left out the removal
> of HEAP2 for now. Giving existing RMGRs more breathing room seems like
> a better motivator, going by Nathan's comment and yours.
> 
> It's worth thinking about backward compatibility if we did go as far
> as a 2-byte xl_info (upthread: to allow more RMGR-specific flags, so
> e.g. XACT wouldn't need xl_xact_info) In that case, we'd probably
> still want a convention that only the lowest byte can contain the
> record type. XLogStats could simply assume that in most cases. For
> XACT 8 flags in the upper byte still won't be enough, and it will
> still need to have its own opcode mask, but that's no worse than the
> situation we have already.

First let me say that I'm not objecting to this patch. It makes the code 
a little bit more clear, which is good, so I think I'm +0.5 on this 
overall. With that said:

I'm not sure I agree with the premise that we should try to get rid of 
RM_HEAP2_ID. There's nothing wrong with that scheme as such. As an 
alternative, we could easily teach e.g pg_waldump to treat RM_HEAP_ID 
and RM_HEAP2_ID the same for statistics purposes.

This patch consumes one of the padding bytes. That's not entirely free, 
as there is an opportunity cost: we could squeeze out the padding bytes 
and save 2 bytes on every WAL record instead.


> typedef struct XLogRecord
> {
>     uint32        xl_tot_len;        /* total len of entire record */
>     TransactionId xl_xid;        /* xact id */
>     XLogRecPtr    xl_prev;        /* ptr to previous record in log */
>     uint8        xl_info;        /* RMGR-specific info */
>     RmgrId        xl_rmid;        /* resource manager for this record */
>     uint8        xl_geninfo;        /* flag bits, see below */
>     /* 1 byte of padding here, initialize to zero */
>     pg_crc32c    xl_crc;            /* CRC for this record */
> 
>     /* XLogRecordBlockHeaders and XLogRecordDataHeader follow, no padding */
> 
> } XLogRecord;

I'd suggest some minor reordering and renaming:

typedef struct XLogRecord
{
     uint32        xl_tot_len;    /* total len of entire record */
     TransactionId xl_xid;        /* xact id */
     XLogRecPtr    xl_prev;       /* ptr to previous record in log */
     uint8         xl_flags;      /* flag bits, see below */
     RmgrId        xl_rmid;       /* resource manager for this record */
     uint8         xl_rminfo;     /* RMGR-specific info */
     /* 1 byte of padding here, initialize to zero */
     pg_crc32c     xl_crc;        /* CRC for this record */

     /* XLogRecordBlockHeaders and XLogRecordDataHeader follow, no 
padding */

} XLogRecord;


In summary:

- Rename 'xl_info' to 'xl_rminfo' to make it more clear that it's 
RMGR-specific.

- Rename 'xl_geninfo' to 'xl_flags'. I guess the 'gen' meant 'generic' 
or 'general' to distinguish from the rmgr-specific field. But we don't 
use a 'gen' prefix like that for any of the other non-rmgr-specific 
fields. We could keep it under the old 'xl_info' name instead, but it's 
nice to rename it to avoid confusion with the old xl_info field. It now 
only contains flags, so 'xl_flags' seems appropriate.

- Reorder the fields so that 'xl_rmid' comes before 'xl_rminfo'. I find 
that more intuitive, because the contents of 'xl_rminfo' depends on 
'xl_rmid'.

While we're at it, I wonder if it would be more clear to have explicit 
'xl_padding' field for the unused byte.

- Heikki




Re: get rid of RM_HEAP2_ID

От
Michael Paquier
Дата:
On Tue, Oct 14, 2025 at 03:20:16PM +0300, Heikki Linnakangas wrote:
> I'm not sure I agree with the premise that we should try to get rid of
> RM_HEAP2_ID. There's nothing wrong with that scheme as such. As an
> alternative, we could easily teach e.g pg_waldump to treat RM_HEAP_ID and
> RM_HEAP2_ID the same for statistics purposes.

Yeah, I'd rather keep heap2 as well.  As long as there is more room
for the record IDs, we're still going to need it in the long run.

> This patch consumes one of the padding bytes. That's not entirely free, as
> there is an opportunity cost: we could squeeze out the padding bytes and
> save 2 bytes on every WAL record instead.

Do you recall an alternative where it would have been possible to save
2 bytes for each record by removing the padding, and still have the
full byte of xl_info be usable freely by each RMGR?  I cannot recall
any magic based on how XLogRecord is designed now, but perhaps I
have missed an argument.

We could move out xl_xid, which should not be required for all
records, shaving 4 bytes from the base XLogRecord.  I'm afraid of the
duplication this would create if we push this data to each RMGR, which
would, I guess, require a new RMGR callback to retrieve this field on
a per-record basis.  But perhaps it would not be that bad.

> In summary:
>
> - Rename 'xl_info' to 'xl_rminfo' to make it more clear that it's
> RMGR-specific.
>
> - Rename 'xl_geninfo' to 'xl_flags'. I guess the 'gen' meant 'generic' or
> 'general' to distinguish from the rmgr-specific field. But we don't use a
> 'gen' prefix like that for any of the other non-rmgr-specific fields. We
> could keep it under the old 'xl_info' name instead, but it's nice to rename
> it to avoid confusion with the old xl_info field. It now only contains
> flags, so 'xl_flags' seems appropriate.
>
> - Reorder the fields so that 'xl_rmid' comes before 'xl_rminfo'. I find that
> more intuitive, because the contents of 'xl_rminfo' depends on 'xl_rmid'.

Okay.

> While we're at it, I wonder if it would be more clear to have explicit
> 'xl_padding' field for the unused byte.

Adding an extra field to document the padding sounds like a good idea
here.  It should always be zero, so perhaps it could even be used as a
sanity check?
--
Michael

Вложения

Re: get rid of RM_HEAP2_ID

От
John Naylor
Дата:
On Tue, Oct 14, 2025 at 7:20 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> This patch consumes one of the padding bytes. That's not entirely free,
> as there is an opportunity cost: we could squeeze out the padding bytes
> and save 2 bytes on every WAL record instead.

I must be misunderstanding, because I'm not sure how that would work
with the alignment requirement of later parts.

If you mean put some other parts of certain records into header (like
the xl_xact_info idea mentioned earlier, or something else), it's
worth thinking about that -- maybe some things would be helped by this
patch, and others would be closed off by it.

> I'd suggest some minor reordering and renaming:
>
> typedef struct XLogRecord
> {
>      uint32        xl_tot_len;    /* total len of entire record */
>      TransactionId xl_xid;        /* xact id */
>      XLogRecPtr    xl_prev;       /* ptr to previous record in log */
>      uint8         xl_flags;      /* flag bits, see below */
>      RmgrId        xl_rmid;       /* resource manager for this record */
>      uint8         xl_rminfo;     /* RMGR-specific info */
>      /* 1 byte of padding here, initialize to zero */
>      pg_crc32c     xl_crc;        /* CRC for this record */

Works for me.

> - Rename 'xl_info' to 'xl_rminfo' to make it more clear that it's
> RMGR-specific.

That makes sense here in the definition. There are also lots of local
variables named "info", and it would be less invasive to keep those as
they are. Likewise for the macro XLogRecGetInfo.

--
John Naylor
Amazon Web Services



Re: get rid of RM_HEAP2_ID

От
John Naylor
Дата:
On Wed, Oct 15, 2025 at 6:55 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, Oct 14, 2025 at 03:20:16PM +0300, Heikki Linnakangas wrote:
> > I'm not sure I agree with the premise that we should try to get rid of
> > RM_HEAP2_ID. There's nothing wrong with that scheme as such. As an
> > alternative, we could easily teach e.g pg_waldump to treat RM_HEAP_ID and
> > RM_HEAP2_ID the same for statistics purposes.
>
> Yeah, I'd rather keep heap2 as well.  As long as there is more room
> for the record IDs, we're still going to need it in the long run.

Okay, I'll drop that aspect.

> We could move out xl_xid, which should not be required for all
> records, shaving 4 bytes from the base XLogRecord.  I'm afraid of the
> duplication this would create if we push this data to each RMGR, which
> would, I guess, require a new RMGR callback to retrieve this field on
> a per-record basis.  But perhaps it would not be that bad.

I've wondered if it would be possible to make xl_tot_len a varint that
starts in the last byte of the header, with the next bytes being like
XLogRecordDataHeader[Short|Long], but likewise using a varint.

--
John Naylor
Amazon Web Services



Re: get rid of RM_HEAP2_ID

От
Michael Paquier
Дата:
On Wed, Oct 15, 2025 at 12:01:44PM +0700, John Naylor wrote:
> On Wed, Oct 15, 2025 at 6:55 AM Michael Paquier <michael@paquier.xyz> wrote:
>> We could move out xl_xid, which should not be required for all
>> records, shaving 4 bytes from the base XLogRecord.  I'm afraid of the
>> duplication this would create if we push this data to each RMGR, which
>> would, I guess, require a new RMGR callback to retrieve this field on
>> a per-record basis.  But perhaps it would not be that bad.
>
> I've wondered if it would be possible to make xl_tot_len a varint that
> starts in the last byte of the header, with the next bytes being like
> XLogRecordDataHeader[Short|Long], but likewise using a varint.

This suggestion gives me some shivers, TBH.  We've had a couple of
issues with the read of records that spawn across multiple pages,
relying now very strongly on xl_tot_len being the first field in
XLogRecord.  Making this field variable may be really tricky while the
code we have now in xlogreader.h is pretty stable.
--
Michael

Вложения