Re: buildfarm: could not read block 3 in file "base/16384/2662": read only 0 of 8192 bytes

From: Andres Freund
Subject: Re: buildfarm: could not read block 3 in file "base/16384/2662": read only 0 of 8192 bytes
Msg-id: 20180831221331.3l22nwxrokqeujav@alap3.anarazel.de
In reply to: Re: buildfarm: could not read block 3 in file "base/16384/2662": read only 0 of 8192 bytes  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: buildfarm: could not read block 3 in file "base/16384/2662": read only 0 of 8192 bytes  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
On 2018-08-29 17:58:19 -0400, Tom Lane wrote:
> I wrote:
> > We could perhaps fix this with a less invasive change than what you
> > suggest here, by attacking the missed-call-due-to-recursion aspect
> > rather than monkeying with how relcache rebuild itself works.
> 
> Seeing that rearranging the relcache rebuild logic is looking less than
> trivial, I thought it'd be a good idea to investigate this alternative
> approach.
> 
> Essentially, the problem here is that lmgr.c supposes that
> AcceptInvalidationMessages is an atomic operation, which it most
> certainly isn't.  In reality, it's unsafe to suppose that we can skip
> reading incoming inval messages until we have *completed*
> AcceptInvalidationMessages, not just invoked it.
> 
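For context, the lmgr.c pattern at issue is, roughly (paraphrasing
LockRelationOid() from memory, not quoting it):

    res = LockAcquire(&tag, lockmode, false, false);

    /*
     * Skip the inval processing if we already held the lock, assuming it
     * was done to completion when the lock was first acquired.  That
     * assumption is exactly what fails when the earlier
     * AcceptInvalidationMessages() call was entered but never finished.
     */
    if (res != LOCKACQUIRE_ALREADY_HELD)
        AcceptInvalidationMessages();
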
> However, we can fix that very easily, by retaining one additional bit
> of state in each LOCALLOCK entry, which records whether we know we have
> completed AcceptInvalidationMessages at least once subsequent to obtaining
> that lock.  In the attached draft patch, I refer to that state as having
> "cleared" the lock, but I'm not super pleased with that terminology ---
> anybody have a better idea?
> 
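Just to make sure I'm reading the draft the same way: the extra state is
essentially a per-LOCALLOCK boolean plus a setter, along these lines (the
field name here is my guess, MarkLockClear is the name you use further
down):

    typedef struct LOCALLOCK
    {
        /* ... existing fields: tag, nLocks, lockOwners, and so on ... */

        /*
         * Set once we know that a full AcceptInvalidationMessages() pass
         * completed at some point after this lock was acquired.
         */
        bool        lockCleared;
    } LOCALLOCK;

    /* called by lmgr.c after AcceptInvalidationMessages() has returned */
    void
    MarkLockClear(LOCALLOCK *locallock)
    {
        Assert(locallock->nLocks > 0);
        locallock->lockCleared = true;
    }
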
> A small problem with the lock.c API as it stands is that we'd have to do
> an additional hashtable lookup to re-find the relevant LOCALLOCK entry.
> In the attached, I fixed that by extending LockAcquireExtended's API
> to allow the caller to obtain a pointer to the LOCALLOCK entry, making
> it trivially cheap to set the flag after we finish
> AcceptInvalidationMessages.  LockAcquireExtended has only one external
> caller right now, which makes me think it's OK to change its API rather
> than introduce YA variant entry point, but that could be argued.
> 
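I.e. the prototype presumably just grows an out parameter, something like
(exact placement guessed):

    extern LockAcquireResult LockAcquireExtended(const LOCKTAG *locktag,
                                                 LOCKMODE lockmode,
                                                 bool sessionLock,
                                                 bool dontWait,
                                                 bool reportMemoryError,
                                                 LOCALLOCK **locallockp);

with callers that don't need the LOCALLOCK presumably passing NULL.
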
> There are two ways we could adjust the return value from
> LockAcquire(Extended) to cope with this requirement.  What I did here
> was to add an additional LockAcquireResult code, so that "already held"
> can be distinguished from "already held and cleared".  But we don't
> really need to make that distinction, at least not in the core code.
> So another way would be to redefine LOCKACQUIRE_ALREADY_HELD as meaning
> "held and cleared", and then just return LOCKACQUIRE_OK if you haven't
> called MarkLockClear for the lock.  I'm not entirely sure which way is
> less likely to break any third-party code that might be inspecting the
> result of LockAcquire.  You could probably argue it either way depending
> on what you think the third-party code is doing with the result.
> 
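FWIW, with a separate result code the lmgr.c call sites presumably
collapse to a pattern like this (the result-code name is invented here
for illustration):

    res = LockAcquireExtended(&tag, lockmode, false, false, true,
                              &locallock);

    /*
     * If we were already holding the lock and have completed a full
     * AcceptInvalidationMessages() pass since acquiring it, the inval
     * processing can be skipped; otherwise do it and then mark the lock.
     */
    if (res != LOCKACQUIRE_ALREADY_CLEAR)
    {
        AcceptInvalidationMessages();
        MarkLockClear(locallock);
    }
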
> Anyway, the attached appears to fix the problem: Andres' test script
> fails in fractions of a second with HEAD on my workstation, but it's
> run error-free for quite awhile now with this patch.  And this is sure
> a lot simpler than any relcache rebuild refactoring we're likely to come
> up with.
> 
> Discuss.

This is a neat idea.

I have one intuitive correctness concern that I can't quite nail down; I
plan to spend more time on those aspects (partially around previously
held locks and invalidations generated with non-conflicting lock modes).

Leaving that aside, I think there's one architectural aspect of my
approach that I prefer over yours: deduplicating eager cache rebuilds,
as my approach does, seems quite advantageous. There are a lot of cases
where we send out many invalidations that hit the same cache entries;
because we do not defer rebuilds, that leads to rebuilding the same
entry repeatedly. My approach avoids that overhead, as long as the
invalidations are queued close enough together.
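
To spell out what I mean by deduplication, in toy form (a standalone
illustration only, nothing like the actual patch): instead of rebuilding
an entry immediately for every message, note the affected relation and
rebuild each one exactly once at the end of the batch.

    #include <stdio.h>

    #define MAX_PENDING 64

    static unsigned pending[MAX_PENDING];
    static int  npending = 0;

    /* queue a rebuild request, dropping duplicates */
    static void
    note_invalidation(unsigned relid)
    {
        for (int i = 0; i < npending; i++)
            if (pending[i] == relid)
                return;             /* already queued: this is the win */
        if (npending < MAX_PENDING)
            pending[npending++] = relid;
        /* a real implementation would grow the queue or force a full reset */
    }

    /* perform each queued rebuild exactly once */
    static void
    process_pending_rebuilds(void)
    {
        for (int i = 0; i < npending; i++)
            printf("rebuilding relcache entry for rel %u\n", pending[i]);
        npending = 0;
    }

    int
    main(void)
    {
        /* three messages hitting pg_class_oid_index, one hitting pg_class */
        note_invalidation(2662);
        note_invalidation(2662);
        note_invalidation(2662);
        note_invalidation(1259);
        process_pending_rebuilds(); /* rebuilds 2662 and 1259, once each */
        return 0;
    }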

We could obviously just do both, but that seems unnecessary.

Greetings,

Andres Freund

