Re: PassDownLimitBound for ForeignScan/CustomScan [take-2]
От | Kouhei Kaigai |
---|---|
Тема | Re: PassDownLimitBound for ForeignScan/CustomScan [take-2] |
Дата | |
Msg-id | 9A28C8860F777E439AA12E8AEA7694F80125F306@BPXM15GP.gisp.nec.co.jp обсуждение исходный текст |
Ответ на | PassDownLimitBound for ForeignScan/CustomScan [take-2] (Kouhei Kaigai <kaigai@ak.jp.nec.com>) |
Список | pgsql-hackers |
Hello, The attached patch is a revised version of pass-down LIMIT to FDW/CSP. Below is the updates from the last version. 'ps_numTuples' of PlanState was declared as uint64, instead of long to avoid problems on 32bits machine when a large LIMIT clause is supplied. 'ps_numTuples' is re-interpreted; 0 means that its upper node wants to fetch all the tuples. It allows to eliminate a boring initialization on ExecInit handler for each executor node. Even though it was not suggested, estimate_path_cost_size() of postgres_fdw adjusts number of rows if foreign-path is located on top-level of the base-relations and LIMIT clause takes a constant value. It will make more adequate plan as follows: * WITHOUT this patch -------------------- postgres=# explain verbose select * from t_a, t_b where t_a.id = t_b.id and t_a.x < t_b.x LIMIT 100; QUERY PLAN ---------------------------------------------------------------------------------------- Limit (cost=261.17..274.43 rows=100 width=88) Output: t_a.id, t_a.x, t_a.y, t_b.id, t_b.x, t_b.y -> Hash Join (cost=261.17..581.50 rows=2416 width=88) Output: t_a.id, t_a.x, t_a.y, t_b.id, t_b.x, t_b.y Hash Cond: (t_a.id = t_b.id) Join Filter: (t_a.x < t_b.x) -> Foreign Scan on public.t_a (cost=100.00..146.12 rows=1204 width=44) Output: t_a.id, t_a.x, t_a.y Remote SQL: SELECT id, x, y FROM public.t -> Hash (cost=146.12..146.12 rows=1204 width=44) Output: t_b.id, t_b.x, t_b.y -> Foreign Scan on public.t_b (cost=100.00..146.12 rows=1204 width=44) Output: t_b.id, t_b.x, t_b.y Remote SQL: SELECT id, x, y FROM public.t (14 rows) * WITH this patch ----------------- postgres=# explain verbose select * from t_a, t_b where t_a.id = t_b.id and t_a.x < t_b.x LIMIT 100; QUERY PLAN -------------------------------------------------------------------------------------------------------------- Limit (cost=100.00..146.58 rows=100 width=88) Output: t_a.id, t_a.x, t_a.y, t_b.id, t_b.x, t_b.y -> Foreign Scan (cost=100.00..146.58 rows=100 width=88) Output: t_a.id, t_a.x, t_a.y, t_b.id, t_b.x, t_b.y Relations: (public.t_a) INNER JOIN (public.t_b) Remote SQL: SELECT r1.id, r1.x, r1.y, r2.id, r2.x, r2.y FROM (public.t r1 INNER JOIN public.t r2 ON (((r1.x < r2.x))AND ((r1.id = r2.id)))) (6 rows) On the other hands, I noticed it is not safe to attach LIMIT clause at the planner stage because root->limit_tuples is declared as double. Even if LIMIT clause takes a constant value, it is potentially larger than 2^53 which is the limitation we can represent accurately with float64 data type but LIMIT clause allows up to 2^63-1. So, postgres_fdw now attaches LIMIT clause on the remote query on execution time only. Thanks, ---- PG-Strom Project / NEC OSS Promotion Center KaiGai Kohei <kaigai@ak.jp.nec.com> > -----Original Message----- > From: Robert Haas [mailto:robertmhaas@gmail.com] > Sent: Thursday, November 10, 2016 3:08 AM > To: Kaigai Kouhei(海外 浩平) <kaigai@ak.jp.nec.com> > Cc: pgsql-hackers@postgresql.org; Jeevan Chalke > <jeevan.chalke@enterprisedb.com>; Etsuro Fujita > <fujita.etsuro@lab.ntt.co.jp>; Andres Freund <andres@anarazel.de> > Subject: ##freemail## Re: PassDownLimitBound for ForeignScan/CustomScan > [take-2] > > On Mon, Oct 31, 2016 at 10:20 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> > wrote: > > As an example, I enhanced postgres_fdw to understand the ps_numTuples > > if it is set. If and when remote ORDER BY is pushed down, the latest > > code tries to sort the entire remote table because it does not know > > how many rows to be returned. Thus, it took larger execution time. > > On the other hands, the patched one runs the remote query with LIMIT > > clause according to the ps_numTuples; which is informed by the Limit > > node on top of the ForeignScan node. > > So there are two cases here. If the user says LIMIT 12, we could in theory > know that at planner time and optimize accordingly. If the user says LIMIT > twelve(), however, we will need to wait until execution time unless twelve() > happens to be capable of being simplified to a constant by the planner. > > Therefore, it's possible to imagine having two mechanisms here. In the > simple case where the LIMIT and OFFSET values are constants, we could > implement a system to get hold of that information during planning and > use it for whatever we like. In addition, we can have an > execution-time system that optimizes based on values available at execution > (regardless of whether those values were also available during planning). > Those are, basically, two separate things, and this patch has enough to > do just focusing on one of them. > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL > Company
Вложения
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Craig RingerДата:
Сообщение: Re: Use procsignal_sigusr1_handler and RecoveryConflictInterrupt() from walsender?
Следующее
От: Craig RingerДата:
Сообщение: Re: Re: BUG #13755: pgwin32_is_service not checking if SECURITY_SERVICE_SID is disabled