Re: pgsql: Add parallel-aware hash joins.

From Thomas Munro
Subject Re: pgsql: Add parallel-aware hash joins.
Date
Msg-id CAEepm=0WxwzpHVHt3PcWHBV=L3k3FDb6dvMq1A2Li49LGBa7TA@mail.gmail.com
In reply to Re: pgsql: Add parallel-aware hash joins.  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: pgsql: Add parallel-aware hash joins.  (Andres Freund <andres@anarazel.de>)
List pgsql-committers
On Fri, Dec 22, 2017 at 1:48 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> I don't think that's quite it, because it should never have set
> 'writing' for any batch number >= nbatch.
>
> It's late here, but I'll take this up tomorrow and either find a fix
> or figure out how to avoid antisocial noise levels on the build farm
> in the meantime.

Not there yet but I learned some things and am still working on it.  I
spent a lot of time trying to reproduce the assertion failure, and
succeeded exactly once.  Unfortunately the one time I managed to do
that I'd built with clang -O2 and got a core file that I couldn't get
much useful info out of, and I've been trying to do it again with -O0
ever since without luck.  The time I succeeded, I reproduced it by
creating the tables "simple" and "bigger_than_it_looks" from join.sql
and then doing this in a loop:

  set min_parallel_table_scan_size = 0;
  set parallel_setup_cost = 0;
  set work_mem = '192kB';

  explain analyze select count(*) from simple r join
    bigger_than_it_looks s using (id);
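
For anyone who wants to try the same loop, here is a rough sketch of one way to drive it from a script. The settings and the query are copied from above; everything else is an assumption for illustration: the helper names build_script and run_until_failure are made up, psql is assumed to be on PATH, and the database passed in is assumed to already contain "simple" and "bigger_than_it_looks" as created by join.sql. It is not the exact procedure used for the one successful reproduction.

```python
import subprocess

# Settings and query copied verbatim from the message.
SETUP = [
    "set min_parallel_table_scan_size = 0;",
    "set parallel_setup_cost = 0;",
    "set work_mem = '192kB';",
]
QUERY = (
    "explain analyze select count(*) from simple r join "
    "bigger_than_it_looks s using (id);"
)


def build_script(iterations):
    """Return SQL text: the settings once, then the query repeated."""
    return "\n".join(SETUP + [QUERY] * iterations) + "\n"


def run_until_failure(dbname, iterations=1000, rounds=100):
    """Feed the script to psql repeatedly; stop at the first non-zero exit.

    Hypothetical helper. Assumes psql is on PATH and that `dbname`
    already holds the two tables from join.sql.
    Returns (round_number, stderr) on failure, or None if every round passed.
    """
    script = build_script(iterations)
    for round_no in range(rounds):
        result = subprocess.run(
            ["psql", "-X", "-q", "-v", "ON_ERROR_STOP=1", "-d", dbname],
            input=script,
            text=True,
            capture_output=True,
        )
        # psql exits non-zero if a statement fails or the connection is
        # lost, which is what a backend crash would look like from here.
        if result.returncode != 0:
            return round_no, result.stderr
    return None
```

Calling run_until_failure with the name of a database that has the join.sql tables loaded would run the query up to iterations x rounds times and hand back psql's stderr from the first failing round. Given how rarely the failure showed up, a run of this size may well finish without hitting it.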

The machine that it happened on is resource constrained, and exhibits
another problem: though the above query normally runs in ~20ms,
sometimes it takes several seconds and occasionally much longer.  That
never happens on fast development systems or test servers which run it
quickly every time, and it doesn't happen on my 2 core slow system if
I have only two workers (or one worker + leader).  I dug into that and
figured out what was going wrong and wrote that up separately[1],
because I think it's an independent bug needing to be fixed, not the
root cause here.  However, I think it could easily be contributing to
the timing required to trigger the bug we're looking for.

Andres, your machine francolin crashed -- got a core file?

[1] https://www.postgresql.org/message-id/CAEepm%3D0NWKehYw7NDoUSf8juuKOPRnCyY3vuaSvhrEWsOTAa3w%40mail.gmail.com

-- 
Thomas Munro
http://www.enterprisedb.com

