Re: generic plans and "initial" pruning

From: Robert Haas
Subject: Re: generic plans and "initial" pruning
Msg-id: CA+TgmoZ=92LxPeVJ20vf4vJ4o8dr7Q2seDQR=xLY04puLjDs_A@mail.gmail.com
In response to: generic plans and "initial" pruning (Amit Langote <amitlangote09@gmail.com>)
Responses: Re: generic plans and "initial" pruning (Amit Langote <amitlangote09@gmail.com>)
           Re: generic plans and "initial" pruning (Simon Riggs <simon.riggs@enterprisedb.com>)
List: pgsql-hackers

On Fri, Dec 24, 2021 at 10:36 PM Amit Langote <amitlangote09@gmail.com> wrote:
> However, using an idea that Robert suggested to me off-list a little
> while back, it seems possible to determine the set of partitions that
> we can safely skip locking.  The idea is to look at the "initial" or
> "pre-execution" pruning instructions contained in a given Append or
> MergeAppend node when AcquireExecutorLocks() is collecting the
> relations to lock and consider relations from only those sub-nodes
> that survive performing those instructions. I've attempted
> implementing that idea in the attached patch.

Hmm. The first question that occurs to me is whether this is fully safe.

Currently, AcquireExecutorLocks() calls LockRelationOid() for every
relation involved in the query. That means we will probably lock at
least one relation on which we previously had no lock, and thus call
AcceptInvalidationMessages(). If anything relevant has changed, that
will end up marking the plan as no longer valid, and CheckCachedPlan()
will notice this and tell the caller to replan. In the corner case
where we already hold all the required locks, we will not accept
invalidation messages at this point, but we must have done so after
acquiring the last of the required locks, and if that didn't mark the
plan invalid, it can't be invalid now either. Either way, everything
is fine.

With the proposed patch, we might never lock some of the relations
involved in the query. Therefore, if one of those relations has been
modified in some way that would invalidate the plan, we will
potentially fail to discover this, and will use the plan anyway. For
instance, suppose there's one particular partition that has an extra
index and the plan involves an Index Scan using that index. Now
suppose that the scan of the partition in question is pruned, but
meanwhile, the index has been dropped. Now we're running a plan that
scans a nonexistent index. Admittedly, we're not running that part of
the plan. But is that enough for this to be safe? There are things
(like EXPLAIN or auto_explain) that we might try to do even on a part
of the plan tree that we don't try to run. Those things might break,
because for example we won't be able to look up the name of an index
in the catalogs for EXPLAIN output if the index is gone.
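
To make the hazard concrete, here is a toy model of it. This is not
PostgreSQL code: the dictionary "catalog", the OIDs, and the functions
execute(), explain() and drop_index() are all hypothetical stand-ins,
and it simply assumes that the EXPLAIN-like step walks every sub-node,
including the pruned ones.

```python
# Toy model only; none of these names are PostgreSQL APIs.
# catalog maps a (hypothetical) index OID to its name.
catalog = {16401: "p1_a_idx", 16402: "p2_extra_idx"}

# A cached plan: each sub-node scans one partition through one index.
plan = [
    {"partition": "p1", "index_oid": 16401},
    {"partition": "p2", "index_oid": 16402},
]


def drop_index(catalog, index_oid):
    # Concurrent DDL on a partition that the plan user never locked.
    del catalog[index_oid]


def execute(plan, catalog, pruned):
    # Execution only touches sub-nodes that survive initial pruning.
    return [
        f"scan {node['partition']} via {catalog[node['index_oid']]}"
        for node in plan
        if node["partition"] not in pruned
    ]


def explain(plan, catalog):
    # Assumption: the EXPLAIN-like step looks up the index name for every
    # sub-node in the tree, pruned or not.
    return [
        f"Index Scan using {catalog[node['index_oid']]} on {node['partition']}"
        for node in plan
    ]
```

In this model, after drop_index(catalog, 16402), calling
execute(plan, catalog, pruned={"p2"}) still succeeds because p2 is
never touched, while explain(plan, catalog) raises KeyError on the
missing index.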

This is just a relatively simple example and I think there are
probably a bunch of others. There are a lot of kinds of DDL that could
be performed on a partition that gets pruned away: DROP INDEX is just
one example. The point is that to my knowledge we have no existing
case where we try to use a plan that might be only partly valid, so if
we introduce one, there's some risk there. I thought for a while, too,
about whether changes to some object in a part of the plan that we're
not executing could break things for the rest of the plan even if we
never do anything with the plan but execute it. I can't quite see any
actual hazard. For example, I thought about whether we might try to
get the tuple descriptor for the pruned-away object and get a
different tuple descriptor than we were expecting. I think we can't,
because (1) the pruned object has to be a partition, and tuple
descriptors have to match throughout the partitioning hierarchy,
except for column ordering, which currently can't be changed
after the fact, (2) IIRC, the tuple descriptor is stored in the
plan and not reconstructed at runtime, and (3) if we don't end up
opening the relation because it's pruned, then we certainly can't do
anything with its tuple descriptor. But it might be worth giving more
thought to the question of whether there's any other way we could be
depending on the details of an object that ended up getting pruned.

> Note that "initial" pruning steps are now performed twice when
> executing generic plans: once in AcquireExecutorLocks() to find
> partitions to be locked, and a 2nd time in ExecInit[Merge]Append() to
> determine the set of partition sub-nodes to be initialized for
> execution, though I wasn't able to come up with a good idea to avoid
> this duplication.

I think this is something that will need to be fixed somehow. Apart
from the CPU cost, it's scary to imagine that the set of nodes on
which we acquired locks might be different from the set of nodes that
we initialize. If we do the same computation twice, there must be some
non-zero probability of getting a different answer the second time,
even if the circumstances under which it would actually happen are
remote. Consider, for example, a function that is labeled IMMUTABLE
but is really VOLATILE. Now maybe you can get the system to lock one
set of partitions and then initialize a different set of partitions. I
don't think we want to try to reason about what consequences that
might have and prove that somehow it's going to be OK; I think we want
to nail the door shut very tightly to make sure that it can't.
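
A toy sketch of that failure mode follows. It is not PostgreSQL code
and all names in it are hypothetical. mislabeled_immutable() stands in
for a function declared IMMUTABLE that actually changes its answer, and
run_pruning_once() is just one possible shape of "compute it once and
reuse the result", shown for illustration rather than as a proposed
design.

```python
import itertools

# Toy model only; none of these names are PostgreSQL APIs.
_calls = itertools.count()

PARTITIONS = ["p0", "p1"]


def mislabeled_immutable():
    # Labeled IMMUTABLE in this toy model, but really volatile:
    # it alternates between 0 and 1 on successive calls.
    return next(_calls) % 2


def initial_pruning():
    # Keep only the partition whose position matches the function result.
    key = mislabeled_immutable()
    return {name for pos, name in enumerate(PARTITIONS) if pos == key}


def run_pruning_twice():
    locked = initial_pruning()       # stands in for AcquireExecutorLocks()
    initialized = initial_pruning()  # stands in for ExecInit[Merge]Append()
    return locked, initialized


def run_pruning_once():
    # Compute the surviving set once and hand the same result to both steps.
    survivors = initial_pruning()
    return survivors, survivors
```

With run_pruning_twice(), the locked set and the initialized set always
differ in this model, because the two evaluations see different
function results. With run_pruning_once() they are identical by
construction, whatever the function returns.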

-- 
Robert Haas
EDB: http://www.enterprisedb.com


