Обсуждение: BUG #15900: `executor could not find named tuplestore` in triggers with transition table and row locks
BUG #15900: `executor could not find named tuplestore` in triggers with transition table and row locks
От
PG Bug reporting form
Дата:
The following bug has been logged on the website: Bug reference: 15900 Logged by: Alex Aktsipetrov Email address: alex.akts@gmail.com PostgreSQL version: 12beta2 Operating system: Ubuntu 16.04 Description: SELECT FOR UPDATE query that references a transition table in AFTER INSERT/UPDATE triggers produces an unexpected error. The same query with FOR UPDATE omitted finishes without any error, which is my expectation for the original one as well. AFTER DELETE triggers were not tested. For example, the following query: create table testtr (a int, b text); create function testtr_trigger() returns trigger language plpgsql as $$begin perform( select array_agg(a) from (select testtr.a from testtr join new_table on testtr.a = new_table.a for update) as tmp ); return new; end$$; create trigger testtr_trigger after insert on testtr referencing new table as new_table for each statement execute procedure testtr_trigger(); insert into testtr values (1, 'one'), (2, 'two'); produces the following error: ERROR: executor could not find named tuplestore "new_table" CONTEXT: SQL statement "SELECT ( select array_agg(a) from (select testtr.a from testtr join new_table on testtr.a = new_table.a for update) as tmp )" I think the issue was introduced in ad0bda5d24ea2bcc72b5e50020e3c79bab10836b as the query finishes successfully in its ancestor.
Re: BUG #15900: `executor could not find named tuplestore` intriggers with transition table and row locks
От
Александр Акципетров
Дата:
On Tue, 9 Jul 2019 at 01:22, PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:
Bug reference: 15900
Logged by: Alex Aktsipetrov
Email address: alex.akts@gmail.com
PostgreSQL version: 12beta2
Operating system: Ubuntu 16.04
Description:
SELECT FOR UPDATE query that references a transition table in AFTER
INSERT/UPDATE triggers produces an unexpected error. The same query with FOR
UPDATE omitted finishes without any error, which is my expectation for the
original one as well. AFTER DELETE triggers were not tested.
For example, the following query:
create table testtr (a int, b text);
create function testtr_trigger() returns trigger language plpgsql as
$$begin
perform(
select array_agg(a) from
(select testtr.a from testtr join new_table on testtr.a = new_table.a
for update)
as tmp
);
return new;
end$$;
create trigger testtr_trigger
after insert on testtr
referencing new table as new_table
for each statement execute procedure testtr_trigger();
insert into testtr values (1, 'one'), (2, 'two');
produces the following error:
ERROR: executor could not find named tuplestore "new_table"
CONTEXT: SQL statement "SELECT (
select array_agg(a) from
(select testtr.a from testtr join new_table on testtr.a = new_table.a
for update)
as tmp
)"
I think the issue was introduced in ad0bda5d24ea2bcc72b5e50020e3c79bab10836b
as the query finishes successfully in its ancestor.
Sorry, the previous email was a fat finger error.
Attaching the patch that fixes the issue, although I am not familiar with the codebase to be sure that it is the right idea.
Attaching the patch that fixes the issue, although I am not familiar with the codebase to be sure that it is the right idea.
Вложения
On Tue, Jul 9, 2019 at 11:39 AM Alex Aktsipetrov <alex.akts@gmail.com> wrote: > Attaching the patch that fixes the issue, although I am not familiar with the codebase to be sure that it is the rightidea. Thanks for the report and the proposed fix! Hmm, I wonder if EPQ might be involved in bug report #15720 (version 11.2): https://www.postgresql.org/message-id/flat/15720-38c2b29e5d720187%40postgresql.org -- Thomas Munro https://enterprisedb.com
On Tue, Jul 9, 2019 at 11:49 AM Thomas Munro <thomas.munro@gmail.com> wrote: > On Tue, Jul 9, 2019 at 11:39 AM Alex Aktsipetrov <alex.akts@gmail.com> wrote: > > Attaching the patch that fixes the issue, although I am not familiar with the codebase to be sure that it is the rightidea. > > Thanks for the report and the proposed fix! The repro works for me, and I think the patch is correct. Here's a version with that line moved to a more natural place IMHO. > Hmm, I wonder if EPQ might be involved in bug report #15720 (version 11.2): > > https://www.postgresql.org/message-id/flat/15720-38c2b29e5d720187%40postgresql.org I think it's highly likely that bug #15720 was a case of this bug and would be fixed by this patch. Alex's repro doesn't work on 11 though, because EPQ is not entered at all. Which raises the question: why do we need to enter EPQ after commit ad0bda5d on 12/master, for a row that hasn't been updated by anyone else? -- Thomas Munro https://enterprisedb.com
Вложения
On Tue, Jul 9, 2019 at 1:13 PM Thomas Munro <thomas.munro@gmail.com> wrote: > > Hmm, I wonder if EPQ might be involved in bug report #15720 (version 11.2): > > > > https://www.postgresql.org/message-id/flat/15720-38c2b29e5d720187%40postgresql.org > > I think it's highly likely that bug #15720 was a case of this bug and > would be fixed by this patch. Alex's repro doesn't work on 11 though, > because EPQ is not entered at all. Which raises the question: why do > we need to enter EPQ after commit ad0bda5d on 12/master, for a row > that hasn't been updated by anyone else? Explanation: since ad0bda5d24ea, ExecLockRows() always calls EvalPlanQualBegin() which initialises the plan state, and in this case ExecInitNamedTuplestoreScan() errors out due to the bug. Before, you needed the right concurrency scenario (epq_needed) before we did that, as the reporter of bug #15720 discovered. I'm planning to commit that patch tomorrow. -- Thomas Munro https://enterprisedb.com
Thomas Munro <thomas.munro@gmail.com> writes: > On Tue, Jul 9, 2019 at 1:13 PM Thomas Munro <thomas.munro@gmail.com> wrote: >> I think it's highly likely that bug #15720 was a case of this bug and >> would be fixed by this patch. Agreed. I think your version of the fix is good, and you should mention #15720 too in the commit message. >> Alex's repro doesn't work on 11 though, >> because EPQ is not entered at all. Which raises the question: why do >> we need to enter EPQ after commit ad0bda5d on 12/master, for a row >> that hasn't been updated by anyone else? > Explanation: since ad0bda5d24ea, ExecLockRows() always calls > EvalPlanQualBegin() which initialises the plan state, and in this case > ExecInitNamedTuplestoreScan() errors out due to the bug. Before, you > needed the right concurrency scenario (epq_needed) before we did that, > as the reporter of bug #15720 discovered. I'm quite desperately unhappy about this observation, because EvalPlanQualBegin is a *large* amount of overhead that is usually unnecessary, and is now going to be paid for *every locked row* whether there's any conflict on it or not. I do not find that acceptable. Why is it necessary to do this before finding that there's an update conflict? regards, tom lane
On Wed, Jul 10, 2019 at 3:38 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Thomas Munro <thomas.munro@gmail.com> writes: > > On Tue, Jul 9, 2019 at 1:13 PM Thomas Munro <thomas.munro@gmail.com> wrote: > >> I think it's highly likely that bug #15720 was a case of this bug and > >> would be fixed by this patch. > > Agreed. I think your version of the fix is good, and you should > mention #15720 too in the commit message. Thanks. Pushed. > >> Alex's repro doesn't work on 11 though, > >> because EPQ is not entered at all. Which raises the question: why do > >> we need to enter EPQ after commit ad0bda5d on 12/master, for a row > >> that hasn't been updated by anyone else? > > > Explanation: since ad0bda5d24ea, ExecLockRows() always calls > > EvalPlanQualBegin() which initialises the plan state, and in this case > > ExecInitNamedTuplestoreScan() errors out due to the bug. Before, you > > needed the right concurrency scenario (epq_needed) before we did that, > > as the reporter of bug #15720 discovered. > > I'm quite desperately unhappy about this observation, because > EvalPlanQualBegin is a *large* amount of overhead that is usually > unnecessary, and is now going to be paid for *every locked row* > whether there's any conflict on it or not. I do not find that > acceptable. Why is it necessary to do this before finding that > there's an update conflict? I haven't seriously looked into it and haven't succeeded in finding the discussion of why this is absolutely necessary in the commit's thread, but you'd think it should be possible to defer slot creation a bit, or if not, do something cheaper than EPQBegin() that just initialises the slots. Andres, Haribabu, Ashutosh? -- Thomas Munro https://enterprisedb.com
Hi, On 2019-07-09 11:38:13 -0400, Tom Lane wrote: > Thomas Munro <thomas.munro@gmail.com> writes: > >> Alex's repro doesn't work on 11 though, > >> because EPQ is not entered at all. Which raises the question: why do > >> we need to enter EPQ after commit ad0bda5d on 12/master, for a row > >> that hasn't been updated by anyone else? > > > Explanation: since ad0bda5d24ea, ExecLockRows() always calls > > EvalPlanQualBegin() which initialises the plan state, and in this case > > ExecInitNamedTuplestoreScan() errors out due to the bug. Before, you > > needed the right concurrency scenario (epq_needed) before we did that, > > as the reporter of bug #15720 discovered. > > I'm quite desperately unhappy about this observation, because > EvalPlanQualBegin is a *large* amount of overhead that is usually > unnecessary, and is now going to be paid for *every locked row* > whether there's any conflict on it or not. I do not find that > acceptable. Why is it necessary to do this before finding that > there's an update conflict? Two main reasons: Previously we referenced tuples from LockRowsState->lr_curtuples, and the management of that was pretty tightly interlinked with EPQ, and only worked for heap tuples.Keeping that scheme would have been somewhat complicated to continue to maintain. Secondly, previously the tuple fetched by heap_lock_tuple() was just stored in a local variable - but now that happens via a slot (as we otherwise cannot reasonably handle things like heap wanting to return a pinned buffer, and others not). As we potentially need to lock rows from multiple tables, we'd need multiple slots suitable to lock/fetch those rows. EPQ already needed similar infrastructure internally - and those tuples were fetched and retained for EPQ's benefit. So it seemed sensible to just use the slots from EPQ. Note that we don't need EvalPlanQualFetch() anymore, as it's work is done inside the AM - obviously there's no AM independent way to perform correct ctid chasing. Which large overhead you mean is going to be paid for "*every locked row*"? EvalPlanQualBegin() ought to be fairly fast for repeated calls in the same node - although I do admit that it'd be a lot nicer if the ExecSetParamPlanMulti() work wouldn't need be redone. I think it might be reasonable to split EvalPlanQualBegin() (and perhaps EvalPlanQualStart()) into two pieces: One to just have a minimal EPQState, with valid slots, and one to get ready to actually run EPQ? Unfortunately I'm going to be unreachable pretty soon till Monday (hiking without any network access), so I won't be immediately able to respond. Greetings, Andres Freund