Thread: Lock tag of relation extend lock

Lock tag of relation extend lock

From
Jingtang Zhang
Date:
Hi all,

In a recent debugging session I found two processes conflicting on the
relation extension lock: one was holding it for a MAIN fork extension, while
the other was trying to do an FSM extension. It seems that the extension lock
uses the logical relid of a table as its lock tag, but smgrextend is
independent for each fork.

LockRelationForExtension is used to lock out concurrent extension so that
smgrnblocks (mostly of the MAIN fork) gives an accurate position to extend the
fork from. The exception is bufmgr.c, where the forknum is passed in as a
parameter, so MAIN/FSM/VM extension shares the same code.

Would it be more reasonable to use a physical identifier as the lock tag, such
as rlocator + fork? In that case, smgr*extend would not block across separate
forks. It would also be easier to share code between recovery and normal
operation (see what the definition of struct BufferManagerRelation says),
because currently the relation extension lock needs a relcache entry to be
passed in, and we have to build a fake relcache entry during xlog recovery.
The lockinfo of the fake relcache entry may actually be wrong, although that
is not a problem in practice. If we used the physical information as the lock
tag, the lockinfo of the fake relcache entry would be less of a hack.
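
A minimal sketch of the difference between the two tagging schemes, under
stated assumptions: the types below are simplified, hypothetical stand-ins
(LogicalExtendTag, PhysicalExtendTag, SketchFork and the helper functions are
invented for illustration) and are not PostgreSQL's real LOCKTAG or
RelFileLocator definitions. Two extension requests conflict when their tags
compare equal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for illustration only; not the real PostgreSQL types. */
typedef uint32_t Oid;

typedef enum { FORK_MAIN = 0, FORK_FSM, FORK_VM, FORK_COUNT } SketchFork;

/* Logical tag: identifies the relation only, so all forks share one lock. */
typedef struct
{
    Oid dbOid;
    Oid relOid;
} LogicalExtendTag;

/* Physical tag (the proposal): file identity plus fork number. */
typedef struct
{
    Oid        spcOid;
    Oid        dbOid;
    Oid        relNumber;
    SketchFork fork;
} PhysicalExtendTag;

/* Two logical tags conflict whenever they name the same relation. */
bool
logical_tags_conflict(LogicalExtendTag a, LogicalExtendTag b)
{
    return a.dbOid == b.dbOid && a.relOid == b.relOid;
}

/* Build a physical tag, rejecting out-of-range fork numbers. */
PhysicalExtendTag
physical_tag(Oid spcOid, Oid dbOid, Oid relNumber, SketchFork fork)
{
    assert(fork < FORK_COUNT);
    PhysicalExtendTag tag = { spcOid, dbOid, relNumber, fork };
    return tag;
}

/* Two physical tags conflict only for the same file AND the same fork. */
bool
physical_tags_conflict(PhysicalExtendTag a, PhysicalExtendTag b)
{
    return a.spcOid == b.spcOid &&
           a.dbOid == b.dbOid &&
           a.relNumber == b.relNumber &&
           a.fork == b.fork;
}
```

With the logical tag, a MAIN extension and an FSM extension of the same table
produce equal tags and therefore serialize; with the physical tag they differ
in the fork field and would not.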

The upside is that different forks of the same relfile could be extended
concurrently by different processes. I'm not sure about any side effects.

Any thoughts?

--
Regards, Jingtang




Re: Lock tag of relation extend lock

From
Andres Freund
Date:
Hi,

On 2025-10-06 19:39:18 +0800, Jingtang Zhang wrote:
> In a recent debugging session I found two processes conflicting on the
> relation extension lock: one was holding it for a MAIN fork extension, while
> the other was trying to do an FSM extension. It seems that the extension lock
> uses the logical relid of a table as its lock tag, but smgrextend is
> independent for each fork.
>
> LockRelationForExtension is used to lock out concurrent extension so that
> smgrnblocks (mostly of the MAIN fork) gives an accurate position to extend the
> fork from. The exception is bufmgr.c, where the forknum is passed in as a
> parameter, so MAIN/FSM/VM extension shares the same code.

What workload actually has significant enough extension workload on the VM/FSM
to make this a problem?

Greetings,

Andres Freund



Re: Lock tag of relation extend lock

From
Jingtang Zhang
Date:
Hi~

> What workload actually has significant enough extension workload on the VM/FSM
> to make this a problem?


The workload I was running is concurrent INSERTs into the same table. I'm
running PostgreSQL with direct I/O on a distributed file system, where the
latency of an extend is significantly higher than on local storage, about
200us per extend, which makes contention on the extension lock really serious
(it can be mimicked by adding pg_usleep at the end of smgrzeroextend).
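
A rough back-of-the-envelope model of why that latency hurts, as a sketch
only. It assumes the extension lock is held for the whole duration of each
extend and that nothing else is the bottleneck; the function names and the
pages_per_extend parameter are hypothetical and not taken from PostgreSQL.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Upper bound on extends per second through one serializing lock,
 * given how long (in microseconds) each extend holds that lock.
 */
uint64_t
max_extends_per_second(uint64_t hold_us)
{
    assert(hold_us > 0);
    return UINT64_C(1000000) / hold_us;
}

/*
 * Upper bound on pages added per second if every extend adds
 * pages_per_extend pages (an assumed bulk-extension size).
 */
uint64_t
max_pages_per_second(uint64_t hold_us, uint64_t pages_per_extend)
{
    return max_extends_per_second(hold_us) * pages_per_extend;
}
```

Under these assumptions, a 200us hold time caps a single lock at 5000 extends
per second no matter how many backends are inserting, and every FSM extension
that has to take the same lock comes out of that same budget.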

Most of the conflicts happen on MAIN fork extension; however, a conflict
sometimes happens on the FSM, because the bulk-extended pages need to be added
to the FSM. That can conflict with either a MAIN fork extension or an FSM fork
extension in another process. But does FSM fork extension actually need to
conflict with MAIN fork extension?

#5  0x00000000008a717e in WaitOnLock
#6  0x00000000008a7d4b in LockAcquireExtended
#7  0x00000000008a876e in LockAcquire
#8  0x00000000008a584f in LockRelationForExtension
#9  0x000000000087fd6b in ExtendBufferedRelShared
#10 0x00000000008816ab in ExtendBufferedRelCommon
#11 ExtendBufferedRelTo
#12 0x000000000088f538 in fsm_extend
#13 fsm_readbuf
#14 0x000000000088f627 in fsm_set_and_search
#15 0x0000000000501e81 in RelationAddBlocks

--
Regards, Jingtang