Re: parallel foreign scan

From: Kyotaro HORIGUCHI
Subject: Re: parallel foreign scan
Msg-id: 20180516.110902.216853091.horiguchi.kyotaro@lab.ntt.co.jp
In reply to: parallel foreign scan  (Manuel Kniep <m.kniep@web.de>)
Responses: Re: parallel foreign scan  (Manuel Kniep <m.kniep@web.de>)
List: pgsql-hackers
At Tue, 15 May 2018 23:09:31 +0200, Manuel Kniep <m.kniep@web.de> wrote in
<D84E3D72-2E83-482B-8EF8-D25F93F1CEA8@web.de>
> Dear hackers,
> 
> I’m working on a foreign data wrapper for Kafka [1].
> Now I am trying to make it parallel aware, following
> the documentation [2].
> However it seems that I can’t make it use more than a
> single worker with force_parallel_mode = on.
> 
> I wonder if I need to do more than just implementing the
> needed callback function to benefit from multiple workers.
> 
> Looking at create_foreignscan_path in pathnode.c
> I found that the ForeignPath seems to always set
> 
> pathnode->path.parallel_aware = false;
> pathnode->path.parallel_safe = rel->consider_parallel;
> pathnode->path.parallel_workers = 0;
> 
> Do I need to set these in my GetForeignPaths callback manually?

Right. create_foreignscan_path is only a helper that FDW drivers
use to build the path struct. GetForeignPaths() has to finish the
path itself by overriding those parallel fields and by adding
partial paths.

# I haven't done that myself, so I'm not sure of the details.
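
A minimal sketch of what finishing the path in GetForeignPaths()
could look like. It is self-contained C that uses simplified
stand-in structs, not the real planner definitions.
make_foreign_path() and get_foreign_paths() are hypothetical names
for illustration only; in an actual FDW the two list appends would
presumably be calls to add_path() and add_partial_path().

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PATHS 4

/* Simplified stand-ins; NOT the PostgreSQL struct definitions. */
typedef struct Path
{
    bool parallel_aware;
    bool parallel_safe;
    int  parallel_workers;
} Path;

typedef struct RelOptInfo
{
    bool  consider_parallel;
    Path *pathlist[MAX_PATHS];          /* ordinary paths */
    int   npaths;
    Path *partial_pathlist[MAX_PATHS];  /* paths usable under a Gather */
    int   npartial;
} RelOptInfo;

/* Hypothetical helper: applies the defaults quoted from
 * create_foreignscan_path() above. */
static Path
make_foreign_path(const RelOptInfo *rel)
{
    Path p;

    p.parallel_aware = false;
    p.parallel_safe = rel->consider_parallel;
    p.parallel_workers = 0;
    return p;
}

/* Hypothetical callback body: one serial path, plus one partial
 * path whose parallel fields are overridden by hand. */
static void
get_foreign_paths(RelOptInfo *rel, Path *serial, Path *partial,
                  int wanted_workers)
{
    assert(rel->npaths < MAX_PATHS && rel->npartial < MAX_PATHS);

    /* Serial path: keep the defaults. */
    *serial = make_foreign_path(rel);
    rel->pathlist[rel->npaths++] = serial;

    /* Partial path: only when the planner allows parallelism. */
    if (rel->consider_parallel && wanted_workers > 0)
    {
        *partial = make_foreign_path(rel);
        partial->parallel_aware = true;
        partial->parallel_workers = wanted_workers;
        rel->partial_pathlist[rel->npartial++] = partial;
    }
}
```

The point of the sketch is only that the helper's defaults describe
a serial scan; the parallel variant has to be built and registered
separately by the driver.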

> Is there anything else I need to do?

I think you are trying to collect data from multiple Kafka
servers. That means each server has a dedicated foreign table on
a dedicated foreign server. Parallel execution doesn't fit that
case, since it works on a single base relation (that is, a
table). Parallel Append and Merge Append look a bit different,
but they are actually the same in the sense that one base
relation is scanned by multiple workers. Even if you are trying
to fetch from one Kafka stream with multiple workers, I think the
FDW driver doesn't support parallel scanning anyway.

In any case, you will have to modify the FDW driver.
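
As a rough illustration of the single-stream case: for one base
relation to be scanned by several workers, the workers need some
shared state that hands out disjoint pieces of the work. The
sketch below assumes the unit of work is a Kafka partition and
uses plain C11 atomics. SharedScanState and both functions are
hypothetical; in a real FDW such state would presumably live in
the shared memory set up by InitializeDSMForeignScan and be
attached by InitializeWorkerForeignScan.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical state shared by all workers of one foreign scan. */
typedef struct SharedScanState
{
    atomic_int next_partition;  /* next unclaimed partition */
    int        num_partitions;  /* total partitions of the topic */
} SharedScanState;

/* Called once, before any worker starts. */
static void
shared_scan_init(SharedScanState *s, int num_partitions)
{
    assert(num_partitions >= 0);
    atomic_init(&s->next_partition, 0);
    s->num_partitions = num_partitions;
}

/* Called by each worker whenever it needs more work. Returns the
 * partition to read next, or -1 when nothing is left. The atomic
 * fetch-and-add guarantees that no partition is handed out twice. */
static int
claim_next_partition(SharedScanState *s)
{
    int p = atomic_fetch_add(&s->next_partition, 1);

    return (p < s->num_partitions) ? p : -1;
}
```

Each worker would loop on claim_next_partition() and read the
returned partition until it gets -1.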

If you are trying to collect data from multiple servers, the
following proposed PoC patch, which implements asynchronous
execution for postgres_fdw, might be helpful.

https://www.postgresql.org/message-id/20180515.202945.69332784.horiguchi.kyotaro@lab.ntt.co.jp

The postgres_fdw.c part of it is complicated because it supports
shared connections, but it is not that complex if you ignore that.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
