Обсуждение: Is element access after HASH_REMOVE ever OK?
Hi, After hearing from a couple of directions about systems spending too much time scanning the local lock hash table, I wrote the trivial patch to put them in a linked list, before learning that people have considered that before, so I should probably go and read some history on that and find out why it hasn't been done... However, I noticed in passing that RemoveLocalLock() accesses *locallock after removing it from the hash table (in assertion builds only). So one question I have is whether it's actually a programming rule that you can't do that (at most you can compare the pointer against NULL), or whether it's supposed to be safe-if-you-know-what-you're-doing, as the existing comments hints. Here also is a patch that does wipe_mem on removed elements, as threatened last time this topic came up[1], which reveals the problem. I'm also not exactly sure why it's only a WARNING if your local lock table is out of sync, but perhaps that's in the archives too. [1] https://www.postgresql.org/message-id/flat/CAHut%2BPs-pL%2B%2Bf6CJwPx2%2BvUqXuew%3DXt-9Bi-6kCyxn%2BFwi2M7w%40mail.gmail.com
Вложения
Thomas Munro <thomas.munro@gmail.com> writes:
> However, I noticed in passing that RemoveLocalLock() accesses
> *locallock after removing it from the hash table (in assertion builds
> only). So one question I have is whether it's actually a programming
> rule that you can't do that (at most you can compare the pointer
> against NULL), or whether it's supposed to be
> safe-if-you-know-what-you're-doing, as the existing comments hints.
I'd say it's, at best, unwarranted familiarity with the dynahash
implementation ...
> Here also is a patch that does wipe_mem on removed elements, as
> threatened last time this topic came up[1], which reveals the problem.
... one good reason being that it'll fail under this sort of
entirely-reasonable debugging aid. Can we get rid of the unsafe
access easily?
regards, tom lane
I wrote:
> ... Can we get rid of the unsafe
> access easily?
Oh, shoulda read your second patch first. Looking at that,
I fear it might not be quite that simple, because the
comment on CheckAndSetLockHeld says very clearly
* It is callers responsibility that this function is called after
* acquiring/releasing the relation extension/page lock.
so your proposed patch violates that specification.
I'm inclined to think that this API spec is very poorly thought out
and should be changed --- why is it that the flags should change
*after* the lock change in both directions? But we'd have to take
a look at the usage of these flags to understand what's going on
exactly.
regards, tom lane
Hi, On 2021-05-10 20:15:41 -0400, Tom Lane wrote: > I wrote: > > ... Can we get rid of the unsafe > > access easily? > > Oh, shoulda read your second patch first. Looking at that, > I fear it might not be quite that simple, because the > comment on CheckAndSetLockHeld says very clearly > > * It is callers responsibility that this function is called after > * acquiring/releasing the relation extension/page lock. > > so your proposed patch violates that specification. It wouldn't be too hard to fix this though - we can just copy the locktag into a local variable. Or use one of the existing local copies, higher in the stack. But: > I'm inclined to think that this API spec is very poorly thought out > and should be changed --- why is it that the flags should change > *after* the lock change in both directions? But we'd have to take > a look at the usage of these flags to understand what's going on > exactly. I can't see a need to do it after the HASH_REMOVE at least - as we don't return early if that fails, there's no danger getting out of sync if we reverse the order. I think the comment could just be changed to say that the function has to be called after it is inevitable that the lock is acquired/released. Greetings, Andres Freund