Re: Asynchronous Append on postgres_fdw nodes.

Поиск
Список
Период
Сортировка
От Etsuro Fujita
Тема Re: Asynchronous Append on postgres_fdw nodes.
Дата
Msg-id CAPmGK15CyEmstJyzkGsT_1OmArJwD6vQguBEwiz8oYx4zz6ssA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Asynchronous Append on postgres_fdw nodes.  (Andrey Lepikhov <a.lepikhov@postgrespro.ru>)
Ответы Re: Asynchronous Append on postgres_fdw nodes.  (Andrey Lepikhov <a.lepikhov@postgrespro.ru>)
Список pgsql-hackers
On Tue, May 11, 2021 at 11:58 AM Andrey Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> Your patch fixes the problem. But I found two more problems:
>
> EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF)
> SELECT * FROM (
>         (SELECT * FROM f1)
>                 UNION ALL
>         (SELECT * FROM f2)
>                 UNION ALL
>         (SELECT * FROM l3)
> ) q1 LIMIT 6709;
>                            QUERY PLAN
> --------------------------------------------------------------
>   Limit (actual rows=6709 loops=1)
>     ->  Append (actual rows=6709 loops=1)
>           ->  Async Foreign Scan on f1 (actual rows=1 loops=1)
>           ->  Async Foreign Scan on f2 (actual rows=1 loops=1)
>           ->  Seq Scan on l3 (actual rows=6708 loops=1)
>
> Here we scan 6710 tuples at low level but appended only 6709. Where did
> we lose one tuple?

The extra tuple, which is from f1 or f2, would have been kept in the
Append node's as_asyncresults, not returned from the Append node to
the Limit node.  The async Foreign Scan nodes would fetch tuples
before the Append node ask the tuples, so the fetched tuples may or
may not be used.

> 2.
> SELECT * FROM (
>         (SELECT * FROM f1)
>                 UNION ALL
>         (SELECT * FROM f2)
>                 UNION ALL
>         (SELECT * FROM f3 WHERE a > 0)
> ) q1 LIMIT 3000;
>                            QUERY PLAN
> --------------------------------------------------------------
>   Limit (actual rows=3000 loops=1)
>     ->  Append (actual rows=3000 loops=1)
>           ->  Async Foreign Scan on f1 (actual rows=0 loops=1)
>           ->  Async Foreign Scan on f2 (actual rows=0 loops=1)
>           ->  Foreign Scan on f3 (actual rows=3000 loops=1)
>
> Here we give preference to the synchronous scan. Why?

This would be expected behavior, and the reason is avoid performance
degradation; you might think it would be better to execute the async
Foreign Scan nodes more aggressively, but it would require
waiting/polling for file descriptor events many times, which is
expensive and might cause performance degradation.  I think there is
room for improvement, though.

Thanks!

Best regards,
Etsuro Fujita



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andrey Lepikhov
Дата:
Сообщение: Re: Defer selection of asynchronous subplans until the executor initialization stage
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: Multiple hosts in connection string failed to failover in non-hot standby mode