Re: [HACKERS] parallelize queries containing initplans

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: [HACKERS] parallelize queries containing initplans
Дата
Msg-id CA+TgmoZLmoSF_8HVC6A2a2vE9efGm9BPDFfi7VFohHvuKGu4WQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] parallelize queries containing initplans  (Kuntal Ghosh <kuntalghosh.2007@gmail.com>)
Ответы Re: [HACKERS] parallelize queries containing initplans
Список pgsql-hackers
On Thu, Nov 16, 2017 at 5:23 AM, Kuntal Ghosh
<kuntalghosh.2007@gmail.com> wrote:
> I've tested the above-mentioned scenario with this patch and it is
> working fine. Also, I've created a text column named 'vartext',
> inserted some random length texts(max length 100) and tweaked the
> above query as follows:
> select ten,count(*) from tenk1 a where a.ten in (select
> b.ten from tenk1 b where (select a.vartext from tenk1 c where c.ten =
> a.ten limit 1) = b.vartext limit 1) group by a.ten;
> This query is equivalent to select ten,count(*) from tenk1 group by
> a.ten. It also produced the expected result without throwing any
> error.

Great!  I have committed the patch; thanks for testing.

As I said in the commit message, there's a lot more work that could be
done here.  I think we should consider trying to revise this whole
system so that instead of serializing the values and passing them to
the workers, we allocate an array of slots where each slot has a Datum
flag, an isnull flag, and a dsa_pointer (which maybe could be union'd
to the Datum?).  If we're passing a parameter by value, we could just
store it in the Datum field; if it's null, we just set isnull.  If
it's being passed by reference, we dsa_allocate() space for it, copy
it into that space, and then store the dsa_pointer.

The advantage of this is that it would be possible to imagine the
contents of a slot changing while parallelism is running, which
doesn't really work with the current serialized-blob representation.
That would in turn allow us to imagine letting parallel-safe InitPlans
being evaluated by the first participant that needs the value rather
than before launching workers, which would be good, not only because
of the possibility of deferring work for InitPlans attached at or
above the Gather but also because it could be used for InitPlans below
the Gather (as long as they don't depend on any parameters computed
below the Gather).  I thought about trying to refactor this patch to
use a slot concept like what I just mentioned even before committing
it, but after poking at it a bit I decided that was going to require
too much new development to justify cramming it into the same patch.
Still, it seems like a good idea to do it at some point (when we have
lots of free time?).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fabrízio de Royes Mello
Дата:
Сообщение: Re: pgsql: Add hooks for session start and session end
Следующее
От: Antonin Houska
Дата:
Сообщение: Re: [HACKERS] Proposal: generic WAL compression