FDW does not push down LIMIT & ORDER BY with sharding (partitions)

Поиск
Список
Период
Сортировка
От Gert van Dijk
Тема FDW does not push down LIMIT & ORDER BY with sharding (partitions)
Дата
Msg-id CAFT+aqL1Tt0qfYqjHH+shwPoW8qdFjpJ8vBR5ABoXJDUcHyN1w@mail.gmail.com
обсуждение исходный текст
Ответы Re: FDW does not push down LIMIT & ORDER BY with sharding (partitions)  (Amit Langote <amitlangote09@gmail.com>)
Список pgsql-bugs
Hi,

First of all I want to thank Etsuro Fujita for implementing the exact
feature I was missing in
FDW 11.4, but now available in 12: pushing down of LIMIT & ORDER BY to
foreign tables
(commit d50d172e51). Now that I'm using PostgreSQL 12-beta2 from the
official Docker
image I noticed an omission that I wanted to report here which may be
relevant for those
like me, using FDW in a typical sharding setup.

By querying purely foreign tables, I can confirm pushing down LIMIT &
ORDER BY is
working as expected on my installation.
However, when I use a typical sharding setup where the main table is
located on the FDW
node, with partitions of foreign tables, this seems not to activate
the new code path. I can
understand that pushing this down is not possible in cases where
*multiple* foreign tables
are to be scanned. However, it also does not work in the case where my
WHERE clause
condition causes to only connect to a *single* foreign table.

Short version of my situation below.

Table definition, typical 'shard by user':
CREATE TABLE my_big_table (
    user_id bigint NOT NULL,
    [ omitted other columns for brevity ]
) PARTITION BY HASH (user_id) ;

create foreign table my_big_table_mod4_s0 partition of my_big_table
    FOR VALUES WITH (MODULUS 4, REMAINDER 0) server shardA
    OPTIONS (table_name 'my_big_table_mod4_s0');

Running

EXPLAIN VERBOSE
SELECT * from my_big_table
WHERE
  user_id = 12345     -- only 1 user --> single foreign table.
ORDER BY serial DESC
LIMIT 10;

yields

 Limit  (cost=927393.08..927395.58 rows=1000 width=32)
   Output: [...]
   ->  Sort  (cost=927393.08..931177.06 rows=1513592 width=32)
         Output: [...]
         Sort Key: my_big_table_mod4_s0.serial DESC
         ->  Foreign Scan on public.my_big_table_mod4_s0
(cost=5318.35..844404.46 rows=1513592 width=32)
               Output: [...]
               Remote SQL: SELECT [...] FROM
public.my_big_table_mod4_s0 WHERE ((user_id = 4560084))

As you can see this is sub-optimal compared to the case where I
directly query the foreign
table.

This started as a Question on DBA.SE, some more information included there:
https://dba.stackexchange.com/q/242358/13155

Full version string used:
PostgreSQL 12beta2 (Debian 12~beta2-1.pgdg100+1) on
x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit

Please let me know if I should provide more information or in what
other way I could
contribute. I'm very much willing to test patches.

(If this is already being worked on or discussed elsewhere on this or
another list, please
excuse me, it seems a bit hard to find relevant results searching the
mailing list archives,
and I'm fairly new to PostgreSQL in general too.)

Thanks,

Gert van Dijk



В списке pgsql-bugs по дате отправления:

Предыдущее
От: Thomas Munro
Дата:
Сообщение: Re: BUG #15900: `executor could not find named tuplestore` intriggers with transition table and row locks
Следующее
От: Thomas Munro
Дата:
Сообщение: Re: BUG #15900: `executor could not find named tuplestore` intriggers with transition table and row locks