Re: explain analyze output with parallel workers - question aboutmeaning of information for explain.depesz.com

Поиск
Список
Период
Сортировка
От Thomas Munro
Тема Re: explain analyze output with parallel workers - question aboutmeaning of information for explain.depesz.com
Дата
Msg-id CAEepm=0OP4aBajUpsBZSs6O-SRTaA8Kt9ip5ABdm1P8ATrKhoA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: explain analyze output with parallel workers - question aboutmeaning of information for explain.depesz.com  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: explain analyze output with parallel workers - question aboutmeaning of information for explain.depesz.com  (Thomas Munro <thomas.munro@enterprisedb.com>)
Re: explain analyze output with parallel workers - question aboutmeaning of information for explain.depesz.com  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
On Tue, Dec 5, 2017 at 6:39 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> I have one another observation in the somewhat related area.  From the
> code, it looks like we might have some problem with displaying sort
> info for workers for rescans.  I think the problem with the sortinfo
> is that it initializes shared info with local memory in
> ExecSortRetrieveInstrumentation after which it won't be able to access
> the values in shared memory changed by workers in rescans.  We might
> be able to fix it by having some local_info same as sahred_info in
> sort node.  But the main problem is how do we accumulate stats for
> workers across rescans.  The type of sort method can change across
> rescans.  We might be able to accumulate the size of Memory though,
> but not sure if that is right.  I think though this appears to be
> somewhat related to the problem being discussed in this thread, it can
> be dealt separately if we want to fix it.

Yeah, that's broken.  ExecSortRetrieveInstrumentation() is run for
each loop, and after the first loop we've lost track of the pointer
into shared memory because we replaced it with palloc'd copy.  We
could do what you said, or we could reinstate the pointer into the DSM
in ExecSortReInitializeDSM() by looking it up in the TOC.

The reason I've popped up out of nowhere on this thread to say this is
that I just realised that my nearby patch that adds support for Hash
instrumentation has the same bug, because I copied the way
ExecSortRetrieveInstrumentation() works to make
ExecHashRetrieveInstrumentation().  I initially assumed this would run
just one after the final loop in a rescan situation but on testing I
see that it runs repeatedly, and of course looses intrumentation
during rescans.  I'll go and fix that.  We should choose one approach
for both cases.  Do you prefer a separate variable for the local copy,
or reinstating the pointer into the DSM during
ExecXXXReInitializeDSM()?

As for how to aggregate the information, isn't it reasonable to show
data from the last loop on the basis that it's representative?
Summing wouldn't make too much sense, because you didn't use that much
memory all at once.

-- 
Thomas Munro
http://www.enterprisedb.com


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Marina Polyakova
Дата:
Сообщение: Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Следующее
От: Rajkumar Raghuwanshi
Дата:
Сообщение: Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables