Re: Async execution of postgres_fdw.
From | Kyotaro HORIGUCHI |
---|---|
Subject | Re: Async execution of postgres_fdw. |
Date | |
Msg-id | 20150120.174713.187643362.horiguchi.kyotaro@lab.ntt.co.jp |
In reply to | Re: Async execution of postgres_fdw. (Matt Kelly <mkellycs@gmail.com>) |
List | pgsql-hackers |
Hello, thank you for the comment. I added an experimental adaptive fetch size feature in this v6 patch.

At Tue, 20 Jan 2015 04:51:13 +0000, Matt Kelly <mkellycs@gmail.com> wrote in <CA+KcUkhLUo+Vaj4xR8GVsof_nW79uDZTDYhOSdt13CFJkaEEdQ@mail.gmail.com>

> I think its telling that varying the fetch size doubled the performance,
> even on localhost. If you were to repeat this test across a network, the
> performance difference would be far more drastic.

I think so, surely.

> I understand the desire to keep the fetch size small by default, but I
> think your results demonstrate how important the value is. At the very
> least, it is worth reconsidering this "arbitrary" value. However, I think
> the real solution is to make this configurable. It probably should be a
> new option on the foreign server or table, but an argument could be made
> for it to be global across the server just like work_mem.

The optimal fetch_count varies depending on the query. From the performance point of view alone, it should equal the table size for a simple scan on a table. Most joins also do not need to read their target relations simultaneously. (A local merge join on remote sorted results is not available, since the FDW is not aware of the sortedness, but that could change in the near future.) So I have found no appropriate policy for deciding the number.

The other point of view is the memory requirement. This wouldn't matter if we used libpq's single-row mode, but that mode doesn't allow multiple simultaneous queries. The space needed for the fetch buffer varies widely in proportion to the average row length: if it is 1 kilobyte, 10000 rows require over 10 megabytes, which is larger than the default value of work_mem.

In this version I tried an adaptive fetch_size based on the fetch duration and the buffer size required by the previous round. But a hard limit cannot be imposed, since we cannot know the mean row length in advance.
So if, for example, the average row length suddenly grows from 1KB to 10KB while fetch_size is 10000, 100MB is required for that round. I think that for ordinary cases the maximum fetch size cannot exceed 1000.

Attached is the new version implementing the adaptive fetch size. Simple test runs showed the values below. A single scan was boosted by about 5% (no effect?) and a join by 33%. The former case is hard to understand, so I'll examine it tomorrow. This doesn't seem so promising, though..

=====
master=# EXPLAIN (ANALYZE ON, COSTS OFF) SELECT * FROM ft1;
                                QUERY PLAN
-------------------------------------------------------------------------
 Foreign Scan on ft1 (actual time=1.741..10046.272 rows=1000000 loops=1)
 Planning time: 0.084 ms
 Execution time: 10145.730 ms
(3 rows)

patched=# EXPLAIN (ANALYZE ON, COSTS OFF) SELECT * FROM ft1;
                                QUERY PLAN
------------------------------------------------------------------------
 Foreign Scan on ft1 (actual time=1.072..9582.980 rows=1000000 loops=1)
 Planning time: 0.077 ms
 Execution time: 9683.164 ms
(3 rows)

================================
postgres=# EXPLAIN (ANALYZE ON, COSTS OFF) SELECT x.a, x.c, y.c FROM ft1 AS x JOIN ft1 AS y on x.a = y.a;
                                  QUERY PLAN
--------------------------------------------------------------------------------------
 Merge Join (actual time=18191.739..19534.001 rows=1000000 loops=1)
   Merge Cond: (x.a = y.a)
   ->  Sort (actual time=9031.155..9294.465 rows=1000000 loops=1)
         Sort Key: x.a
         Sort Method: external sort  Disk: 142728kB
         ->  Foreign Scan on ft1 x (actual time=1.156..6486.632 rows=1000000 loops=1)
   ->  Sort (actual time=9160.577..9479.076 rows=1000000 loops=1)
         Sort Key: y.a
         Sort Method: external sort  Disk: 146632kB
         ->  Foreign Scan on ft1 y (actual time=0.641..6517.594 rows=1000000 loops=1)
 Planning time: 0.203 ms
 Execution time: 19626.881 ms
(12 rows)

patched=# EXPLAIN (ANALYZE ON, COSTS OFF) SELECT x.a, x.c, y.c FROM ft1 AS x JOIN ft1 AS y on x.a = y.a;
                                  QUERY PLAN
--------------------------------------------------------------------------------------
 Merge Join (actual time=11790.690..13134.071 rows=1000000 loops=1)
   Merge Cond: (x.a = y.a)
   ->  Sort (actual time=8149.225..8413.611 rows=1000000 loops=1)
         Sort Key: x.a
         Sort Method: external sort  Disk: 142728kB
         ->  Foreign Scan on ft1 x (actual time=0.679..3989.160 rows=1000000 loops=1)
   ->  Sort (actual time=3641.457..3957.240 rows=1000000 loops=1)
         Sort Key: y.a
         Sort Method: external sort  Disk: 146632kB
         ->  Foreign Scan on ft1 y (actual time=0.605..1852.655 rows=1000000 loops=1)
 Planning time: 0.203 ms
 Execution time: 13226.414 ms
(12 rows)

> Obviously, this shouldn't block your current patch but its worth revisiting.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
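[Archive editor's note: the configurable value Matt suggests did later materialize; postgres_fdw gained a fetch_size option settable per foreign server or per foreign table in PostgreSQL 9.6. Assuming a foreign server named loopback and the foreign table ft1 from the tests above, the setting looks like:]

```sql
-- Per-server default: applies to every foreign table on this server
ALTER SERVER loopback OPTIONS (ADD fetch_size '10000');

-- Per-table override for one hot table
ALTER FOREIGN TABLE ft1 OPTIONS (ADD fetch_size '50000');
```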