Re: [HACKERS] Parallel Hash take II

From Thomas Munro
Subject Re: [HACKERS] Parallel Hash take II
Date
Msg-id CAEepm=2q6BnXdiySNWmf+5y3K_ZF+Kq-vgULka4HW6cjrJgj8g@mail.gmail.com
In reply to Re: [HACKERS] Parallel Hash take II  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] Parallel Hash take II  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Sat, Sep 2, 2017 at 5:13 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Aug 31, 2017 at 8:53 AM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> Check out ExecReScanGather(): it shuts down and waits for all workers
>> to complete, which makes the assumptions in ExecReScanHashJoin() true.
>> If a node below Gather but above Hash Join could initiate a rescan
>> then the assumptions would not hold.  I am not sure what it would mean
>> though and we don't generate any such plans today to my knowledge.  It
>> doesn't seem to make sense for the inner side of Nested Loop to be
>> partial.  Have I missed something here?
>
> I bet this could happen, although recent commits have demonstrated
> that my knowledge of how PostgreSQL handles rescans is less than
> compendious.  Suppose there's a Nested Loop below the Gather and above
> the Hash Join, implementing a join condition that can't give rise to a
> parameterized path, like a.x + b.x = 0.

Hmm.  I still don't see how that could produce a rescan of a partial
path without an intervening Gather, and I would really like to get to
the bottom of this.

At the risk of mansplaining the code that you wrote and turning out to
be wrong:  A Nested Loop can't ever have a partial path on the inner
side.  Under certain circumstances it can have a partial path on the
outer side, because its own results are partial, but for each outer
row it needs to do a total (non-partial) scan of the inner side so
that it can reliably find or not find matches.  Therefore we'll never
rescan partial paths directly; we'll only ever rescan them indirectly
via a Gatheroid node that will synchronise the rescan of all children
to produce a non-partial result.
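
To make the argument above concrete, here is a toy model in Python.  It
is not PostgreSQL code: plain lists stand in for relations, "partial"
means each worker sees a disjoint slice of the rows, and the names
split and nested_loop are made up for the illustration.  It uses the
join condition a.x + b.x = 0 from the quoted example and compares a
partial outer side with a complete inner side against a plan where the
inner side is partial too.

```python
# Toy model only: lists stand in for relations, and a "partial" scan
# gives each worker a disjoint slice of the rows.

def split(rows, nworkers):
    """Deal rows out round-robin, one disjoint slice per worker."""
    return [rows[i::nworkers] for i in range(nworkers)]

def nested_loop(outer, inner):
    """Join on a.x + b.x = 0, scanning all of `inner` for each outer row."""
    return [(a, b) for a in outer for b in inner if a + b == 0]

a = [1, 2, 3, 4]
b = [-4, -3, -2, -1]

# Reference result from a single process.
expected = sorted(nested_loop(a, b))

# Partial outer, complete inner: the union of the workers' results
# is the full join.
partial_outer = sorted(
    row
    for outer_slice in split(a, 2)
    for row in nested_loop(outer_slice, b)
)

# Partial outer AND partial inner: each worker can only match against
# its own slice of the inner side, so matches are silently lost.
partial_both = sorted(
    row
    for outer_slice, inner_slice in zip(split(a, 2), split(b, 2))
    for row in nested_loop(outer_slice, inner_slice)
)
```

With this particular data the two workers' inner slices happen to hold
none of the rows their outer slices need, so the plan with a partial
inner side finds no matches at all, while the plan with a partial
outer side reproduces the single-process result.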

There may be more reasons to rescan that I'm not thinking of.  But the
whole idea of a rescan seems to make sense only for non-partial paths.
What would it even mean for a worker process to decide to rescan (say)
a Seq Scan without any kind of consensus?

Thought experiment: I suppose we could consider replacing Gather's
clunky shut-down-and-relaunch-workers synchronisation technique with a
new protocol where the Gather node sends a 'rescan!' message to each
worker and then discards their tuples until it receives 'OK, rescan
starts here', and then each parallel-aware node type supplies its own
rescan synchronisation logic as appropriate.  For example, Seq Scan
would somehow need to elect one participant to run
heap_parallelscan_reinitialize, and the others would wait until it has
finished.  This might not be worth the effort, but thinking about this
problem helped me see that rescan of a partial plan without a Gather
node to coordinate doesn't make any sense.
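
A rough sketch of that thought experiment, as a Python simulation
rather than anything resembling executor code.  Everything here is an
assumption made for illustration: threads stand in for worker
processes, a queue per worker stands in for its tuple queue, an Event
stands in for the 'rescan!' message, and SharedScan, worker and
gather_with_rescan are hypothetical names.  The election uses the fact
that threading.Barrier.wait() hands each waiting thread a distinct
index, so exactly one participant resets the shared scan position (the
role heap_parallelscan_reinitialize would play) while the others wait.

```python
import queue
import threading

RESCAN_MARKER = object()  # stands in for 'OK, rescan starts here'
DONE = object()           # end of a worker's output

class SharedScan:
    """Hypothetical shared scan state: blocks handed out under a lock."""

    def __init__(self, nblocks, nparticipants):
        self.nblocks = nblocks
        self.next_block = 0
        self.reinit_count = 0
        self.lock = threading.Lock()
        self.barrier = threading.Barrier(nparticipants)

    def next(self):
        with self.lock:
            if self.next_block >= self.nblocks:
                return None
            block = self.next_block
            self.next_block += 1
            return block

    def rescan(self):
        # First wait: nobody is still scanning once all have arrived.
        # wait() returns a distinct index per thread; index 0 is elected.
        if self.barrier.wait() == 0:
            self.next_block = 0
            self.reinit_count += 1
        # Second wait: the others proceed only after the reset is done.
        self.barrier.wait()

def worker(shared, out, rescan_requested):
    # First pass: emit blocks until exhausted or a rescan is requested.
    while not rescan_requested.is_set():
        block = shared.next()
        if block is None:
            rescan_requested.wait()
            break
        out.put(block)
    out.put(RESCAN_MARKER)
    shared.rescan()
    # Second pass: scan again from the beginning.
    while (block := shared.next()) is not None:
        out.put(block)
    out.put(DONE)

def gather_with_rescan(nblocks=8, nworkers=3):
    shared = SharedScan(nblocks, nworkers)
    rescan_requested = threading.Event()
    queues = [queue.Queue() for _ in range(nworkers)]
    threads = [
        threading.Thread(target=worker, args=(shared, q, rescan_requested))
        for q in queues
    ]
    for t in threads:
        t.start()
    rescan_requested.set()  # the 'rescan!' message
    fresh = []
    for q in queues:
        while q.get() is not RESCAN_MARKER:
            pass  # discard tuples produced before the rescan
        while (item := q.get()) is not DONE:
            fresh.append(item)
    for t in threads:
        t.join()
    return sorted(fresh), shared.reinit_count
```

Calling gather_with_rescan() returns the blocks seen after the marker
and the number of reinitialisations.  However the first pass was
interleaved, every block should appear exactly once after the rescan
and the reset should have run once, which is the consensus that a lone
worker deciding to rescan by itself would lack.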

Am I wrong?

-- 
Thomas Munro
http://www.enterprisedb.com


