Re: Parallel queries in single transaction

From Paul Muntyanu
Subject Re: Parallel queries in single transaction
Msg-id CACnYr+geQM1RmGArOrt5bimO6qNVk73zEjAhba7QV72Ez38dRQ@mail.gmail.com
In reply to Re: Parallel queries in single transaction  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers

> Well, sure. But you could just as well open multiple connections and
> make the queries concurrent that way. Or change the GUC to increase the
> number of workers for the nightly ETL.



This is an option right now, using permanent staging tables for a later join. I mistakenly said ETL when I meant ELT, which means that most of the operations happen in the database, so we try to keep all changes in database code instead of changing the execution engine. In PG11 we have parallel CTAS, which is a dramatic improvement for us, but there will still be operations (query plans) that are not parallel.

PostgreSQL being fully ACID is an amazing feature, so having to run the ELT operation outside a single transaction, and to guarantee that the ELT job completed successfully by checking that all steps (multiple transactions with staging tables) succeeded (with graceful rollback + cleanup in case of failure), makes things more complex. Still, I agree that it is possible to work around this at the application level.
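To make the application-level workaround concrete, here is a minimal sketch in Python. It is not our actual job code: `run_step`, `run_elt`, and the `connect` callable are hypothetical names, and `connect` is assumed to return any DB-API style connection (for example something like `lambda: psycopg2.connect(dsn)`). Each step runs on its own connection and commits independently, so the job as a whole is not atomic; the cleanup statements stand in for the "graceful rollback + cleanup" part.

```python
from concurrent.futures import ThreadPoolExecutor


def run_step(connect, statements):
    """Run one ELT step on its own connection, in its own transaction."""
    conn = connect()
    try:
        cur = conn.cursor()
        for sql in statements:
            cur.execute(sql)
        conn.commit()
    except Exception:
        # Undo whatever this step did before re-raising to the caller.
        conn.rollback()
        raise
    finally:
        conn.close()


def run_elt(connect, steps, cleanup):
    """Run all steps concurrently, one connection per step.

    If any step fails, run the cleanup statements (for example dropping
    staging tables left behind by steps that already committed) and
    re-raise the first error.
    """
    with ThreadPoolExecutor(max_workers=len(steps)) as pool:
        futures = [pool.submit(run_step, connect, s) for s in steps]
        # Future.exception() blocks until that step has finished.
        errors = [f.exception() for f in futures if f.exception() is not None]
    if errors:
        run_step(connect, cleanup)
        raise errors[0]
```

The extra bookkeeping (collecting errors, running cleanup on a separate connection) is exactly the complexity that would go away if the same queries could run in parallel inside one transaction.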
-P


On Mon, Jul 16, 2018 at 2:28 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:


On 07/16/2018 12:03 PM, Paul Muntyanu wrote:
> Hi Tomas, thanks for looking into. I am more talking about queries which
> can not be optimized, e.g.
> * fullscan of the table and heavy calculations for another one.
> * query through FDW for both queries(e.g. one query fetches data from
> Kafka and another one is fetching from remote Postgres. There are no
> bounds for both queries for anything except local CPU, network and
> remote machine)
>
> IO bound is not a problem if you have multiple tablespaces.

But it was you who mentioned "query stuck" not me. I merely pointed out
that in such cases running queries concurrently won't help.

> And CPU bound may not be the case when you have 32 cores and 6 max workers
> per query. Then, during nightly ETL (I do not have anything except a single
> query running) == 6 cores are occupied. If I can run queries in
> parallel, I would occupy two IO stacks(two tablespaces) + 12 cores
> instead of sequentially 6 and then again 6.
>

Well, sure. But you could just as well open multiple connections and
make the queries concurrent that way. Or change the GUC to increase the
number of workers for the nightly ETL.


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
