Re: BUG #15900: `executor could not find named tuplestore` intriggers with transition table and row locks

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: BUG #15900: `executor could not find named tuplestore` intriggers with transition table and row locks
Дата
Msg-id 20190711005202.nq4hdomsdi7ml4fr@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: BUG #15900: `executor could not find named tuplestore` in triggers with transition table and row locks  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-bugs
Hi,

On 2019-07-09 11:38:13 -0400, Tom Lane wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> >> Alex's repro doesn't work on 11 though,
> >> because EPQ is not entered at all.  Which raises the question: why do
> >> we need to enter EPQ after commit ad0bda5d on 12/master, for a row
> >> that hasn't been updated by anyone else?
> 
> > Explanation: since ad0bda5d24ea, ExecLockRows() always calls
> > EvalPlanQualBegin() which initialises the plan state, and in this case
> > ExecInitNamedTuplestoreScan() errors out due to the bug.  Before, you
> > needed the right concurrency scenario (epq_needed) before we did that,
> > as the reporter of bug #15720 discovered.
> 
> I'm quite desperately unhappy about this observation, because
> EvalPlanQualBegin is a *large* amount of overhead that is usually
> unnecessary, and is now going to be paid for *every locked row*
> whether there's any conflict on it or not.  I do not find that
> acceptable.  Why is it necessary to do this before finding that
> there's an update conflict?

Two main reasons:

Previously we referenced tuples from LockRowsState->lr_curtuples, and
the management of that was pretty tightly interlinked with EPQ, and only
worked for heap tuples.Keeping that scheme would have been somewhat
complicated to continue to maintain.

Secondly, previously the tuple fetched by heap_lock_tuple() was just
stored in a local variable - but now that happens via a slot (as we
otherwise cannot reasonably handle things like heap wanting to return a
pinned buffer, and others not).  As we potentially need to lock rows
from multiple tables, we'd need multiple slots suitable to lock/fetch those
rows.

EPQ already needed similar infrastructure internally - and those tuples
were fetched and retained for EPQ's benefit. So it seemed sensible to
just use the slots from EPQ.  Note that we don't need
EvalPlanQualFetch() anymore, as it's work is done inside the AM -
obviously there's no AM independent way to perform correct ctid chasing.

Which large overhead you mean is going to be paid for "*every locked
row*"? EvalPlanQualBegin() ought to be fairly fast for repeated calls in
the same node - although I do admit that it'd be a lot nicer if the
ExecSetParamPlanMulti() work wouldn't need be redone.

I think it might be reasonable to split EvalPlanQualBegin() (and perhaps
EvalPlanQualStart()) into two pieces: One to just have a minimal
EPQState, with valid slots, and one to get ready to actually run EPQ?

Unfortunately I'm going to be unreachable pretty soon till Monday
(hiking without any network access), so I won't be immediately able to respond.

Greetings,

Andres Freund



В списке pgsql-bugs по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: BUG #15888: Bogus "idle in transaction" state for logicaldecoding client after creating a slot
Следующее
От: Peter Geoghegan
Дата:
Сообщение: Re: BUG #15896: pg_upgrade from 10-or-earlier: TRAP: FailedAssertion(»!(metad->btm_version >= 3)«