Re: Parallel Executors [was RE: Threaded Sorting]

From Curtis Faith
Subject Re: Parallel Executors [was RE: Threaded Sorting]
Msg-id DMEEJMCDOJAKPPFACMPMGEFCCEAA.curtis@galtair.com
In reply to Re: Parallel Executors [was RE: Threaded Sorting]  (Jan Wieck <JanWieck@Yahoo.com>)
List pgsql-hackers
> Curtis Faith wrote:
>
> > The current transaction/user state seems to be stored in process
> > global space. This could be changed to be a pointer to a struct
> > stored in a back-end specific shared memory area which would be
> > accessed by the executor process at execution start. The backend
> > would destroy and recreate the shared memory and restart execution
> > in the case where an executor process dies much like the postmaster
> > does with backends now.
> >
> > To the extent the executor process might make changes to the state,
> > which I'd try to avoid if possible (don't know if it is), the
> > executors could obtain locks, otherwise if the executions were
> > constrained to isolated elements (changes to different indexes for
> > example) it seems like it would be possible using an architecture
> > where you have:

Jan Wieck replied:
> Imagine there is a PL/Tcl function. On the first call in a session, the
> PL/Tcl interpreter gets created (that's during execution, okay?). Now
> the procedure that's called inside of that interpreter creates a
> "global" variable ... a global Tcl variable inside of that interpreter,
> which is totally unknown to the backend since it doesn't know what Tcl
> is at all and that variable is nothing more than an entry in a private hash
> table inside of that interpreter. On a subsequent call to any PL/Tcl
> function during that session, it might be good if that darn hashtable
> entry exists.
>
> How do you propose to let this happen?
>
> And while at it, the Tcl procedure next calls spi_exec, causing the
> PL/Tcl function handler to call SPI_exec(), so your isolated executor
> all of the sudden becomes a fully operational backend, doing the
> parsing, planning and optimizing, or what?

You bring up a good point: we couldn't do what I propose for all
situations. I had never anticipated that splitting things up would be the
rule. For example, the optimizer would have to decide whether it made sense
to split up a query from a strictly performance perspective. So now, if we
consider the fact that some things could not be done with split backend
execution, the logic becomes:

if ( splitting is possible && splitting is faster )
    do the split execution;
else
    do the normal execution;

Since the design already splits the backend internally into a separate
execution phase, it seems like one could keep the current
implementation for the typical case where splitting doesn't buy anything or
cases where there is complex state information that needs to be maintained.
If there are no triggers or functions that will be accessed by a given
query then I don't see your concerns applying.

If there are triggers or other conditions which preclude multi-process
execution, we can keep exactly the same behavior as now. The plan execution
entry could easily be a place where it either A) did the same thing it
currently does or B) passed execution off to a pool as per the original
proposal.
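
To make the A/B choice at the plan execution entry concrete, here is a minimal sketch in C. It is only an illustration under stated assumptions: QuerySketch, ExecMode, choose_exec_mode() and the two cost fields are hypothetical names invented for this example, not existing PostgreSQL structures or functions, and a real test for "splitting is possible" would need to cover more than triggers and function calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a planned query. None of these names exist in
 * PostgreSQL; they are assumptions made for this sketch only. */
typedef struct QuerySketch {
    bool   has_triggers;     /* triggers would fire on the affected tables */
    bool   calls_functions;  /* plan calls user functions (PL/Tcl, SPI, ...) */
    double cost_single;      /* estimated cost in the ordinary backend */
    double cost_split;       /* estimated cost using pooled executors,
                                including hand-off overhead */
} QuerySketch;

typedef enum { EXEC_NORMAL, EXEC_SPLIT } ExecMode;

/* "splitting is possible": no per-session state the executors cannot see. */
static bool splitting_is_possible(const QuerySketch *q)
{
    return !q->has_triggers && !q->calls_functions;
}

/* "splitting is faster": a purely cost-based comparison. */
static bool splitting_is_faster(const QuerySketch *q)
{
    return q->cost_split < q->cost_single;
}

/* Plan execution entry: A) keep today's behavior, or B) hand off to a pool. */
ExecMode choose_exec_mode(const QuerySketch *q)
{
    assert(q != NULL);
    if (splitting_is_possible(q) && splitting_is_faster(q))
        return EXEC_SPLIT;   /* case B */
    return EXEC_NORMAL;      /* case A: exactly the current path */
}
```

The point of the sketch is that the normal path remains the default: anything the check cannot prove safe and cheaper simply falls through to the existing behavior.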

I have to believe that most SELECTs won't be affected by your concerns.
Additionally, even in the case of an UPDATE, many times there are large
portions of the operation's actual work that wouldn't be affected even if
there are lots of triggers on the tables being updated. The computation of
the inside of the WHERE could often be split out without causing any
problems with context or state information. The master executor could
always be the original backend as it is now and this would be the place
where the UPDATE part would be processed after the WHERE tuples had been
identified.
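
The two-phase shape of such an UPDATE might look roughly like the following C sketch. Everything in it is an assumption for illustration: Row, where_matches(), worker_scan() and update_where_split() are made-up names, the WHERE clause is a stand-in (qty > 10), and the "workers" here run one after another in the same process, whereas the proposal would run each slice in a separate executor process.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical in-memory row; not a PostgreSQL tuple. */
typedef struct Row {
    int id;
    int qty;
} Row;

/* Stand-in WHERE clause (qty > 10). It only reads the row, so it needs
 * no session state and has no side effects. */
static bool where_matches(const Row *r)
{
    return r->qty > 10;
}

/* Worker phase: evaluate the WHERE clause over rows [lo, hi) and record
 * which ones matched. Workers never modify rows. */
static void worker_scan(const Row *rows, size_t lo, size_t hi, bool *matched)
{
    for (size_t i = lo; i < hi; i++)
        matched[i] = where_matches(&rows[i]);
}

/* Master phase: hand out disjoint slices, then apply the UPDATE itself,
 * which is where triggers would fire in the original backend.
 * nworkers must be greater than zero; matched must hold n entries.
 * Returns the number of rows updated. */
size_t update_where_split(Row *rows, bool *matched, size_t n, size_t nworkers)
{
    size_t chunk = (n + nworkers - 1) / nworkers;   /* ceiling division */

    for (size_t w = 0; w < nworkers; w++) {
        size_t lo = w * chunk;
        size_t hi = (lo + chunk < n) ? lo + chunk : n;
        if (lo < hi)
            worker_scan(rows, lo, hi, matched);     /* sequential stand-in */
    }

    size_t updated = 0;
    for (size_t i = 0; i < n; i++) {
        if (matched[i]) {
            rows[i].qty = 0;                        /* the SET part */
            updated++;
        }
    }
    return updated;
}
```

Only the master ever writes, so the slices share nothing but read-only input and their own disjoint portions of the match array.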

As with any optimization, it is more complicated and won't handle all the
cases. It's just an idea to handle common cases that would otherwise be
much slower.

That having been said, I'm sure there is much lower-hanging fruit on the
performance tree and likely will be for a little while.

- Curtis


