Discussion: Clean up NamedLWLockTranche stuff
While working on the new shmem allocation functions, I looked at how the NamedLWLockTranche stuff works now in lwlock.c, and I have to say it's a bit of a mess.

What is a "named tranche"? It usually means tranches requested with RequestNamedLWLockTranche() at postmaster startup. But all tranches have a name. LWLockTrancheNames includes all user-defined tranches, also ones assigned with LWLockNewTrancheId(), and MAX_NAMED_TRANCHES is the maximum for all of them.

At postmaster startup, NamedLWLockTrancheRequests points to a backend-private array. But after startup, and always in backends, it points to a copy in shared memory and LocalNamedLWLockTrancheRequestArray is used to hold the original. It took me a while to realize that NamedLWLockTrancheRequests in shared memory is *not* updated when you call LWLockNewTrancheId(), it only holds the requests made with RequestNamedLWLockTranche() before startup.

I propose the attached refactorings to make this less confusing. See commit messages for details.

- Heikki
Attachments
- v1-0001-Rename-MAX_NAMED_TRANCHES-to-MAX_USER_DEFINED_TRA.patch
- v1-0002-Refactor-how-user-defined-LWLock-tranches-are-sto.patch
- v1-0003-Use-a-separate-spinlock-to-protect-LWLockTranches.patch
- v1-0004-Use-ShmemInitStruct-to-allocate-lwlock.c-s-shared.patch
- v1-0005-Move-ShmemIndexLock-into-ShmemAllocator.patch
On Thu, Mar 26, 2026 at 02:16:52PM +0200, Heikki Linnakangas wrote:
> At postmaster startup, NamedLWLockTrancheRequests points to a
> backend-private array. But after startup, and always in backends, it points
> to a copy in shared memory and LocalNamedLWLockTrancheRequestArray is used
> to hold the original. It took me a while to realize that
> NamedLWLockTrancheRequests in shared memory is *not* updated when you call
> LWLockNewTrancheId(), it only holds the requests made with
> RequestNamedLWLockTranche() before startup.

Right. LocalNamedLWLockTrancheRequestArray is needed so that we can
re-initialize shared memory after a crash. See commit c3cc2ab87d.

> I propose the attached refactorings to make this less confusing. See commit
> messages for details.

Thanks for doing this, Heikki. I agree that we ought to make this stuff
cleaner. I've asked Sami Imseih, who worked on LWLocks with me last year,
to look at this patch set, too.

> Subject: [PATCH v1 1/5] Rename MAX_NAMED_TRANCHES to MAX_USER_DEFINED_TRANCHES

Seems fine to me.

0002:

> + foreach(lc, NamedLWLockTrancheRequests)

nitpick: These foreach loops seem like good opportunities to use
foreach_ptr.

The comment atop NumLWLocksForNamedTranches might benefit from mentioning
RequestNamedLWLockTranche() and the fact that it only works in the
postmaster. Perhaps an assertion is warranted, too.

+ SpinLockAcquire(ShmemLock);
+ LocalNumUserDefinedTranches = LWLockTranches->num_user_defined;
+ SpinLockRelease(ShmemLock);

Not critical, but it might be worth making num_user_defined an atomic.

Overall, 0002 looks reasonable to me upon a first read-through.

> Subject: [PATCH v1 3/5] Use a separate spinlock to protect LWLockTranches

Seems fine to me.

0004:

> +++ b/src/backend/storage/ipc/shmem.c
> @@ -379,7 +379,8 @@ ShmemInitStruct(const char *name, Size size, bool *foundPtr)
>
> Assert(ShmemIndex != NULL);
>
> - LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
> + if (IsUnderPostmaster)
> + LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);

Am I understanding that we assume ShmemInitStruct() is only called by the
postmaster when there are no other backends yet?

0005:

> - if (IsUnderPostmaster)
> - LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
> + LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);

Oh, this reverts many of these changes from 0004. Maybe the patches could
be reordered to avoid this?

--
nathan
On 26/03/2026 16:37, Nathan Bossart wrote:
> On Thu, Mar 26, 2026 at 02:16:52PM +0200, Heikki Linnakangas wrote:
>> 0002:
>>
>> + foreach(lc, NamedLWLockTrancheRequests)
>
> nitpick: These foreach loops seem like good opportunities to use
> foreach_ptr.
>
> The comment atop NumLWLocksForNamedTranches might benefit from mentioning
> RequestNamedLWLockTranche() and the fact that it only works in the
> postmaster. Perhaps an assertion is warranted, too.
There's already this check in RequestNamedLWLockTranche():
    if (!process_shmem_requests_in_progress)
        elog(FATAL, "cannot request additional LWLocks outside shmem_request_hook");
shmem_request_hooks are only called early at postmaster startup.
> + SpinLockAcquire(ShmemLock);
> + LocalNumUserDefinedTranches = LWLockTranches->num_user_defined;
> + SpinLockRelease(ShmemLock);
>
> Not critical, but it might be worth making num_user_defined an atomic.
Yeah, I considered that. The lock is still needed in
LWLockNewTrancheId(), though, to prevent two LWLockNewTrancheId()
calls from running concurrently. Using an atomic would allow the extra
optimization of reading the value without acquiring the spinlock, but it
seems clearer to have a clear-cut rule that you must always hold the
spinlock whenever accessing the field.
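As an illustration of the rule described above, here is a minimal standalone sketch, not the actual lwlock.c code. It assumes a C11 `atomic_flag` as a stand-in for a PostgreSQL spinlock, and the names `TrancheState`, `new_tranche_id`, `refresh_local_count`, and the `MAX_USER_DEFINED` limit are hypothetical. Both the writer and the reader touch the shared counter only while holding the lock, and the reader keeps a process-local copy.

```c
#include <assert.h>
#include <stdatomic.h>

#define MAX_USER_DEFINED 256	/* hypothetical limit for this sketch */

/* Hypothetical stand-in for the shared struct; not the real LWLockTranches. */
typedef struct TrancheState
{
	atomic_flag lock;			/* stand-in for a PostgreSQL spinlock */
	int			num_user_defined;	/* only accessed while holding lock */
} TrancheState;

static TrancheState shared_state = {.lock = ATOMIC_FLAG_INIT, .num_user_defined = 0};

/* process-local copy of shared_state.num_user_defined */
static int	local_num_user_defined = 0;

static void
state_lock(TrancheState *s)
{
	/* spin until the flag was previously clear */
	while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
		;
}

static void
state_unlock(TrancheState *s)
{
	atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Allocate the next slot; the lock serializes concurrent callers. */
int
new_tranche_id(void)
{
	int			id;

	state_lock(&shared_state);
	assert(shared_state.num_user_defined < MAX_USER_DEFINED);
	id = shared_state.num_user_defined++;
	state_unlock(&shared_state);
	return id;
}

/* Refresh the local copy; even a plain read takes the lock. */
int
refresh_local_count(void)
{
	state_lock(&shared_state);
	local_num_user_defined = shared_state.num_user_defined;
	state_unlock(&shared_state);
	return local_num_user_defined;
}
```

With an atomic counter, `refresh_local_count()` could skip the lock, but the writer would still need it to serialize allocation, which is the trade-off discussed above.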
> 0004:
>
>> +++ b/src/backend/storage/ipc/shmem.c
>> @@ -379,7 +379,8 @@ ShmemInitStruct(const char *name, Size size, bool *foundPtr)
>>
>> Assert(ShmemIndex != NULL);
>>
>> - LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
>> + if (IsUnderPostmaster)
>> + LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
>
> Am I understanding that we assume ShmemInitStruct() is only called by the
> postmaster when there are no other backends yet?
Yeah. LWLockAcquire has this:
    /*
     * We can't wait if we haven't got a PGPROC. This should only occur
     * during bootstrap or shared memory initialization. Put an Assert here
     * to catch unsafe coding practices.
     */
    Assert(!(proc == NULL && IsUnderPostmaster));
To be honest, I didn't realize we tolerate that, calling LWLockAcquire in
the postmaster, until I started to work on this. It might be worth having
some extra sanity checks here, e.g. to throw an error if LWLockAcquire is
called from the postmaster after startup. But this isn't new.
> 0005:
>
>> - if (IsUnderPostmaster)
>> - LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
>> + LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
>
> Oh, this reverts many of these changes from 0004. Maybe the patches could
> be reordered to avoid this?
Makes sense.
Thanks for the review!
- Heikki
Hi,
Thanks for the patches!
> I propose the attached refactorings to make this less confusing. See
> commit messages for details.
I only took a look at 0001 so far, and I do agree with this statement
in the commit message:
"The "user defined" term was already used in LWTRANCHE_FIRST_USER_DEFINED,
so let's standardize on that to mean tranches allocated with either
RequestNamedLWLockTranche() or LWLockNewTrancheId()."
I do wonder if 0001 is going far enough though.
Instead of just standardizing that "user defined" could mean tranches allocated
with RequestNamedLWLockTranche() or LWLockNewTrancheId(), how about we also
rename these APIs to reflect that as well? This way we remove all concept of
"named tranche" which is what it sounds like to me you are proposing.
rename RequestNamedLWLockTranche() to RequestUserDefinedLWLockTranche()
and LWLockNewTrancheId() to RegisterUserDefinedLWLockTranche()
RequestNamedLWLockTranche() requests the lwlock at shmem_request time,
which is later registered via LWLockNewTrancheId() when lwlocks are
initialized by the postmaster.
Also, the name LWLockNewTrancheId() is selling what this function does
too short.
It does return a new tranche ID, but it also takes in a user-defined tranche
name and copies ("registers") that name into LWLockTrancheNames.
v19 is already changing the signature of LWLockNewTrancheId(), so maybe
improving the names of these APIs makes sense to do.
--
Sami Imseih
Amazon Web Services (AWS)
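
The behavior described in the message above, returning a new tranche ID while also copying the caller's name into a table, can be sketched in standalone C. This is an illustration only, not the lwlock.c implementation: `register_tranche`, `tranche_name`, `FIRST_USER_DEFINED`, and the size limits are hypothetical, and the locking that a shared table would need is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FIRST_USER_DEFINED 100	/* hypothetical: first ID after the built-ins */
#define MAX_USER_DEFINED 8		/* hypothetical table capacity */
#define NAME_LEN 64				/* hypothetical maximum name size, incl. NUL */

static char names[MAX_USER_DEFINED][NAME_LEN];
static int	num_user_defined = 0;

/*
 * Assign the next free ID and copy ("register") the name into the table.
 * Returns -1 if the table is full or the name does not fit.
 */
int
register_tranche(const char *name)
{
	size_t		len;
	int			idx;

	assert(name != NULL);
	len = strlen(name);
	if (len >= NAME_LEN || num_user_defined >= MAX_USER_DEFINED)
		return -1;

	idx = num_user_defined++;
	memcpy(names[idx], name, len + 1);	/* copy includes the trailing NUL */
	return FIRST_USER_DEFINED + idx;
}

/* Look up the registered name for an ID, or NULL if the ID is unknown. */
const char *
tranche_name(int id)
{
	int			idx = id - FIRST_USER_DEFINED;

	if (idx < 0 || idx >= num_user_defined)
		return NULL;
	return names[idx];
}
```

Because the name is copied, the caller's buffer can be freed or reused after the call, which is the "registers" part that the name LWLockNewTrancheId() does not convey.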
Thanks!
On 26/03/2026 18:34, Sami Imseih wrote:
>> I propose the attached refactorings to make this less confusing. See
>> commit messages for details.
>
> I only took a look at 0001 so far, and I do agree with this statement
> in the commit message:
>
> "The "user defined" term was already used in LWTRANCHE_FIRST_USER_DEFINED,
> so let's standardize on that to mean tranches allocated with either
> RequestNamedLWLockTranche() or LWLockNewTrancheId()."
>
> I do wonder if 0001 is going far enough though.
>
> Instead of just standardizing that "user defined" could mean tranches allocated
> with RequestNamedLWLockTranche() or LWLockNewTrancheId(), how about we also
> rename these APIs to reflect that as well? This way we remove all concept of
> "named tranche" which is what it sounds like to me you are proposing.
>
> rename RequestNamedLWLockTranche() to RequestUserDefinedLWLockTranche()
> and LWLockNewTrancheId() to RegisterUserDefinedLWLockTranche()
I'd rather not change RequestNamedLWLockTranche(), because I think
LWLockNewTrancheId() is better and should be used in new code. I
consider RequestNamedLWLockTranche() to be a legacy function, for
backwards compatibility.
> RequestNamedLWLockTranche() requests the lwlock at shmem_request time,
> which is later registered via LWLockNewTrancheId() when lwlocks are
> initialized by the postmaster.
>
> Also, the name LWLockNewTrancheId() is selling what this function does
> too short.
> It does return a new tranche ID, but it also takes in a user-defined tranche
> name and copies ("registers") that name into LWLockTrancheNames.
>
> v19 is already changing the signature of LWLockNewTrancheId(), so maybe
> improving the names of these APIs makes sense to do.
Oh, I didn't realize we changed the LWLockNewTrancheId() signature!
Yeah, if we're changing it anyway, we might as well rename it. I'm not
sure if I like RegisterUserDefinedLWLockTranche() better, but let's
think it through.
- Heikki
On 26/03/2026 18:57, Heikki Linnakangas wrote:
> Thanks!
>
> On 26/03/2026 18:34, Sami Imseih wrote:
>>> I propose the attached refactorings to make this less confusing. See
>>> commit messages for details.
>>
>> I only took a look at 0001 so far, and I do agree with this statement
>> in the commit message:

I committed these now, but I'm all ears if you still have comments on
the rest of the patches.

- Heikki
Hi,

> > Thanks!
> >
> > On 26/03/2026 18:34, Sami Imseih wrote:
> >>> I propose the attached refactorings to make this less confusing. See
> >>> commit messages for details.
> >>
> >> I only took a look at 0001 so far, and I do agree with this statement
> >> in the commit message:
>
> I committed these now, but I'm all ears if you still have comments on
> the rest of the patches.

Sorry for the delay. I see you committed the rest. The only issue I found
is with d6eba30

+/* backend-local copy of NamedLWLockTranches->num_user_defined */
+static int LocalNumUserDefinedTranches;

The comment here should reference "LWLockTranches->num_user_defined"
instead.

>> rename RequestNamedLWLockTranche() to RequestUserDefinedLWLockTranche()
>> and LWLockNewTrancheId() to RegisterUserDefinedLWLockTranche()

> I'd rather not change RequestNamedLWLockTranche(), because I think
> LWLockNewTrancheId() is better and should be used in new code.

That's fair.

>> v19 is already changing the signature of LWLockNewTrancheId(), so maybe
>> improving the names of these APIs makes sense to do.

> Oh, I didn't realize we changed the LWLockNewTrancheId() signature!
> Yeah, if we're changing it anyway, we might as well rename it. I'm not
> sure if I like RegisterUserDefinedLWLockTranche() better, but let's
> think it through.

Maybe, RegisterNewLWLockTrancheId() could be more meaningful?

Also, there are a few places in lwlock.c where "named tranches" is mentioned.
Maybe we should just say "user-defined tranches" instead?

--
Sami Imseih
Amazon Web Services (AWS)
> +/* backend-local copy of NamedLWLockTranches->num_user_defined */
> +static int LocalNumUserDefinedTranches;

> The comment here should reference "LWLockTranches->num_user_defined"
> instead.

> Also, there are a few places in lwlock.c where "named tranches" is mentioned.
> Maybe we should just say "user-defined tranches" instead?

Like the attached.

--
Sami
Attachments
On 27/03/2026 06:49, Sami Imseih wrote:
>> +/* backend-local copy of NamedLWLockTranches->num_user_defined */
>> +static int LocalNumUserDefinedTranches;
>
>> The comment here should reference "LWLockTranches->num_user_defined"
>> instead.
>
>> Also, there are a few places in lwlock.c where "named tranches" is mentioned.
>> Maybe we should just say "user-defined tranches" instead?
>
> Like the attached.

> @@ -460,7 +460,7 @@ LWLockShmemInit(void)
> }
>
> /*
> - * Initialize LWLocks that are fixed and those belonging to named tranches.
> + * Initialize LWLocks that are fixed and those belonging to user-defined tranches.
> */
> static void
> InitializeLWLocks(int numLocks)

Only tranches requested with RequestNamedLWLockTranche() have locks in the
main array, so I reworded this some more to:

/*
 * Initialize LWLocks for built-in tranches and those requested with
 * RequestNamedLWLockTranche().
 */

Committed with that little change, thanks!

- Heikki
> Committed with that little change, thanks!

Thanks! I think there is one more comment cleanup in lwlock.c

 /*
- * This points to the main array of LWLocks in shared memory. Backends inherit
- * the pointer by fork from the postmaster (except in the EXEC_BACKEND case,
- * where we have special measures to pass it down).
+ * This points to the main array of LWLocks in shared memory.
 */

we no longer need to take special measures to pass down MainLWLockArray
through the BackendParameters.

--
Sami
Attachments
Hi,

On 2026-03-27 11:45:56 +0200, Heikki Linnakangas wrote:
> Committed with that little change, thanks!

This seems to have broken buildfarm animal batta:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=batta&dt=2026-03-27%2002%3A05%3A01

# Running: pg_rewind --debug --source-pgdata /home/admin/batta/buildroot/HEAD/pgsql.build/src/bin/pg_rewind/tmp_check/t_001_basic_standby_local_data/pgdata --target-pgdata /home/admin/batta/buildroot/HEAD/pgsql.build/src/bin/pg_rewind/tmp_check/t_001_basic_primary_local_data/pgdata --no-sync --config-file /home/admin/batta/buildroot/HEAD/pgsql.build/src/bin/pg_rewind/tmp_check/tmp_test_QbsG/primary-postgresql.conf.tmp
pg_rewind: executing "/home/admin/batta/buildroot/HEAD/pgsql.build/tmp_install/home/admin/batta/buildroot/HEAD/inst/bin/postgres" for target server to complete crash recovery
TRAP: failed Assert("MemoryContextIsValid(context)"), File: "mcxt.c", Line: 1270, PID: 230491
/home/admin/batta/buildroot/HEAD/pgsql.build/tmp_install/home/admin/batta/buildroot/HEAD/inst/bin/postgres(ExceptionalCondition+0x54)[0xaaaae186c204]
/home/admin/batta/buildroot/HEAD/pgsql.build/tmp_install/home/admin/batta/buildroot/HEAD/inst/bin/postgres(MemoryContextAllocExtended+0x0)[0xaaaae18a2a24]
/home/admin/batta/buildroot/HEAD/pgsql.build/tmp_install/home/admin/batta/buildroot/HEAD/inst/bin/postgres(RequestNamedLWLockTranche+0x6c)[0xaaaae16e7310]
/home/admin/batta/buildroot/HEAD/pgsql.build/tmp_install/home/admin/batta/buildroot/HEAD/inst/bin/postgres(process_shmem_requests+0x28)[0xaaaae1881628]
/home/admin/batta/buildroot/HEAD/pgsql.build/tmp_install/home/admin/batta/buildroot/HEAD/inst/bin/postgres(PostgresSingleUserMain+0xc4)[0xaaaae1701a34]
/home/admin/batta/buildroot/HEAD/pgsql.build/tmp_install/home/admin/batta/buildroot/HEAD/inst/bin/postgres(main+0x6ac)[0xaaaae12a2adc]
/lib/aarch64-linux-gnu/libc.so.6(__libc_start_main+0xe8)[0xffff99713dd8]
/home/admin/batta/buildroot/HEAD/pgsql.build/tmp_install/home/admin/batta/buildroot/HEAD/inst/bin/postgres(+0xf2b98)[0xaaaae12a2b98]
Aborted
pg_rewind: error: postgres single-user mode in target cluster failed
pg_rewind: detail: Command was: /home/admin/batta/buildroot/HEAD/pgsql.build/tmp_install/home/admin/batta/buildroot/HEAD/inst/bin/postgres --single -F -D/home/admin/batta/buildroot/HEAD/pgsql.build/src/bin/pg_rewind/tmp_check/t_001_basic_primary_local_data/pgdata -c config_file=/home/admin/batta/buildroot/HEAD/pgsql.build/src/bin/pg_rewind/tmp_check/tmp_test_QbsG/primary-postgresql.conf.tmp template1< /dev/null

Presumably the reason that batta failed is its special configuration:
shared_preload_libraries = 'pg_stat_statements'; regress_dump_restore; wal_consistency_checking; compute_query_id = regress; --enable-injection-points

Greetings,

Andres Freund
On Fri, Mar 27, 2026 at 05:22:33PM -0400, Andres Freund wrote:
> TRAP: failed Assert("MemoryContextIsValid(context)"), File: "mcxt.c", Line: 1270, PID: 230491
> [...](ExceptionalCondition+0x54)[0xaaaae186c204]
> [...](MemoryContextAllocExtended+0x0)[0xaaaae18a2a24]
> [...](RequestNamedLWLockTranche+0x6c)[0xaaaae16e7310]
> [...](process_shmem_requests+0x28)[0xaaaae1881628]
> [...](PostgresSingleUserMain+0xc4)[0xaaaae1701a34]
> [...](main+0x6ac)[0xaaaae12a2adc]
> /lib/aarch64-linux-gnu/libc.so.6(__libc_start_main+0xe8)[0xffff99713dd8]
> [...](+0xf2b98)[0xaaaae12a2b98]
> Aborted
> pg_rewind: error: postgres single-user mode in target cluster failed
Hm. AFAICT PostmasterContext isn't created in single-user mode, and the
commit in question has RequestNamedLWLockTranche() allocate requests there.
I guess the idea is to allow backends to free that memory after forking
from postmaster, but we don't do that for the NamedLWLockTrancheRequests
list. Maybe we should surround the last part of that function with
MemoryContextSwitchTo(...) to either TopMemoryContext or PostmasterContext
depending on whether we're in single-user mode.
--
nathan
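
The idea in the message above can be sketched in standalone C. This is only a guess at the shape of the fix, not the patch itself: the `Context` type, `choose_request_context`, and the `single_user_mode` parameter are hypothetical stand-ins, and real PostgreSQL code would use `MemoryContextSwitchTo()` with `TopMemoryContext` or `PostmasterContext` rather than these mock objects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mock of a memory context; only the name matters here. */
typedef struct Context
{
	const char *name;
} Context;

static Context top_context = {"TopMemoryContext"};
static Context postmaster_context = {"PostmasterContext"};

/*
 * Pick the context that tranche requests should be allocated in.
 *
 * Assumption for this sketch: in single-user mode no postmaster context
 * exists, so allocating there would trip a "valid context" assertion like
 * the one in the backtrace. Fall back to the top-level context instead.
 */
Context *
choose_request_context(bool single_user_mode)
{
	Context    *ctx = single_user_mode ? &top_context : &postmaster_context;

	assert(ctx != NULL && ctx->name != NULL);
	return ctx;
}
```

A caller would switch to the returned context only around the allocation of the request entry and then switch back, so that the rest of the function keeps using whatever context was current.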
On Fri, Mar 27, 2026 at 04:50:12PM -0500, Nathan Bossart wrote:
> On Fri, Mar 27, 2026 at 05:22:33PM -0400, Andres Freund wrote:
>> TRAP: failed Assert("MemoryContextIsValid(context)"), File: "mcxt.c", Line: 1270, PID: 230491
>> [...](ExceptionalCondition+0x54)[0xaaaae186c204]
>> [...](MemoryContextAllocExtended+0x0)[0xaaaae18a2a24]
>> [...](RequestNamedLWLockTranche+0x6c)[0xaaaae16e7310]
>> [...](process_shmem_requests+0x28)[0xaaaae1881628]
>> [...](PostgresSingleUserMain+0xc4)[0xaaaae1701a34]
>> [...](main+0x6ac)[0xaaaae12a2adc]
>> /lib/aarch64-linux-gnu/libc.so.6(__libc_start_main+0xe8)[0xffff99713dd8]
>> [...](+0xf2b98)[0xaaaae12a2b98]
>> Aborted
>> pg_rewind: error: postgres single-user mode in target cluster failed
>
> Hm. AFAICT PostmasterContext isn't created in single-user mode, and the
> commit in question has RequestNamedLWLockTranche() allocate requests there.
> I guess the idea is to allow backends to free that memory after forking
> from postmaster, but we don't do that for the NamedLWLockTrancheRequests
> list. Maybe we should surround the last part of that function with
> MemoryContextSwitchTo(...) to either TopMemoryContext or PostmasterContext
> depending on whether we're in single-user mode.
Concretely, like the attached.
--
nathan
Attachments
On 28/03/2026 00:05, Nathan Bossart wrote:
> On Fri, Mar 27, 2026 at 04:50:12PM -0500, Nathan Bossart wrote:
>> On Fri, Mar 27, 2026 at 05:22:33PM -0400, Andres Freund wrote:
>>> TRAP: failed Assert("MemoryContextIsValid(context)"), File: "mcxt.c", Line: 1270, PID: 230491
>>> [...](ExceptionalCondition+0x54)[0xaaaae186c204]
>>> [...](MemoryContextAllocExtended+0x0)[0xaaaae18a2a24]
>>> [...](RequestNamedLWLockTranche+0x6c)[0xaaaae16e7310]
>>> [...](process_shmem_requests+0x28)[0xaaaae1881628]
>>> [...](PostgresSingleUserMain+0xc4)[0xaaaae1701a34]
>>> [...](main+0x6ac)[0xaaaae12a2adc]
>>> /lib/aarch64-linux-gnu/libc.so.6(__libc_start_main+0xe8)[0xffff99713dd8]
>>> [...](+0xf2b98)[0xaaaae12a2b98]
>>> Aborted
>>> pg_rewind: error: postgres single-user mode in target cluster failed
>>
>> Hm. AFAICT PostmasterContext isn't created in single-user mode, and the
>> commit in question has RequestNamedLWLockTranche() allocate requests there.
>> I guess the idea is to allow backends to free that memory after forking
>> from postmaster, but we don't do that for the NamedLWLockTrancheRequests
>> list. Maybe we should surround the last part of that function with
>> MemoryContextSwitchTo(...) to either TopMemoryContext or PostmasterContext
>> depending on whether we're in single-user mode.
>
> Concretely, like the attached.
LGTM, thanks! Will you commit or want me to pick it up?
- Heikki
On Sat, Mar 28, 2026 at 12:07:26AM +0200, Heikki Linnakangas wrote:
> LGTM, thanks! Will you commit or want me to pick it up?

I'm not able to commit it right this second, so feel free to take it. Else
it'll probably be a day or two before I can get to it.

--
nathan
On 28/03/2026 00:10, Nathan Bossart wrote:
> On Sat, Mar 28, 2026 at 12:07:26AM +0200, Heikki Linnakangas wrote:
>> LGTM, thanks! Will you commit or want me to pick it up?
>
> I'm not able to commit it right this second, so feel free to take it. Else
> it'll probably be a day or two before I can get to it.

Ok, committed, thanks!

- Heikki
Hi Heikki,

Just raising this again to make sure it doesn’t get overlooked [1].

Thanks!

[1] [https://www.postgresql.org/message-id/CAA5RZ0vPWNMvTBqyH7nqDRrHd6Y4Et5iNqXFuwpbsPOk3cL4rQ%40mail.gmail.com]

--
Sami
On 28/03/2026 19:20, Sami Imseih wrote:
> Hi Heikki,
>
> Just raising this again to make sure it doesn’t get overlooked [1].

Fixed, thanks!

- Heikki