Re: Parallel Seq Scan

From: Amit Kapila
Subject: Re: Parallel Seq Scan
Date:
Msg-id: CAA4eK1L0dk9D3hARoAb84v2pGvUw4B5YoS4x18ORQREwR+1VCg@mail.gmail.com
In response to: Re: Parallel Seq Scan  (Stephen Frost <sfrost@snowman.net>)
Responses: Re: Parallel Seq Scan  (Stephen Frost <sfrost@snowman.net>)
List: pgsql-hackers
On Fri, Dec 19, 2014 at 7:57 PM, Stephen Frost <sfrost@snowman.net> wrote:
>
>
> There's certainly documentation available from the other RDBMS' which
> already support parallel query, as one source.  Other academic papers
> exist (and once you've linked into one, the references and prior work
> helps bring in others).  Sadly, I don't currently have ACM access (might
> have to change that..), but there are publicly available papers also,

I have gone through a couple of papers describing what some other
databases do for parallel sequential scans.  Here is a brief summary
of the same and how I am planning to handle it in the patch:

Costing:
Below is a summary of one of the papers [1] you suggested:
  a. Startup costs are negligible if processes can be reused
     rather than created afresh.
  b. Communication cost consists of the CPU cost of sending
     and receiving messages.
  c. Communication costs can exceed the cost of operators such
     as scanning, joining, or grouping.
These findings lead to the important conclusion that query
optimization should be concerned with communication costs
but not with startup costs.

In our case, as we currently don't have a mechanism to reuse parallel
workers, we need to account for that cost as well.  Based on that,
I am planning to add three new parameters: cpu_tuple_comm_cost,
parallel_setup_cost, and parallel_startup_cost.
*  cpu_tuple_comm_cost - Cost of CPU time to pass a tuple from a worker
   to the master backend, with default value DEFAULT_CPU_TUPLE_COMM_COST
   of 0.1; this will be multiplied by the number of tuples expected to
   be selected.
*  parallel_setup_cost - Cost of setting up shared memory for parallelism,
   with a default value of 100.0.
*  parallel_startup_cost - Cost of starting up parallel workers, with a
   default value of 1000.0, multiplied by the number of workers decided
   for the scan.

I will do some experiments to finalize the default values, but in general
I feel that developing the cost model on the above parameters is sound.
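
To make the above concrete, here is a rough sketch in C of how these
three parameters could combine into a plan cost.  This is purely
illustrative; the function name, the per-worker run cost argument, and
the exact formula are my own and not what the patch implements:

#include <stdio.h>

/* Sketch only: how the three proposed GUCs might combine in costing.
 * The defaults mirror the values proposed above; the formula itself is
 * an assumption about the intended shape, not the actual patch code. */
typedef double Cost;

static Cost cpu_tuple_comm_cost = 0.1;      /* per tuple sent to master */
static Cost parallel_setup_cost = 100.0;    /* shared-memory setup */
static Cost parallel_startup_cost = 1000.0; /* per worker started */

static Cost
parallel_seqscan_cost(Cost per_worker_run_cost, double tuples_returned,
                      int num_workers)
{
    /* startup: one-time shared-memory setup plus per-worker start cost */
    Cost startup = parallel_setup_cost +
                   parallel_startup_cost * num_workers;
    /* communication: every selected tuple is passed back to the master */
    Cost comm = cpu_tuple_comm_cost * tuples_returned;

    return startup + per_worker_run_cost + comm;
}

int
main(void)
{
    /* e.g. 8 workers, 1,000,000 tuples expected from the scan */
    printf("estimated cost: %.2f\n",
           parallel_seqscan_cost(50000.0, 1000000.0, 8));
    return 0;
}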

Execution:
Most other databases do a partition-level scan, with each individual
parallel worker scanning a partition on a different disk.  However,
it seems Amazon DynamoDB [2] works on something similar to what I
have used in the patch, i.e. on fixed blocks.  I think this kind of
strategy is better than dividing the blocks at runtime, because
handing out blocks randomly among workers could turn a parallel
sequential scan into a random scan.
Also, from whatever I have read (Oracle, DynamoDB), most databases
divide the work among workers and the master backend acts as a
coordinator, at least that's what I could understand.
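
As an illustration of the fixed-block idea, here is a small sketch in C
that splits a relation's blocks into contiguous chunks, one per worker.
Again this is just a sketch; the function and the contiguous-chunk
split are my own simplification, not the patch's actual block
allocation:

#include <stdio.h>

typedef unsigned int BlockNumber;

/* Assign worker `worker_id` the half-open block range [*start, *end),
 * dividing `nblocks` into contiguous chunks so each worker still reads
 * its blocks sequentially rather than at random. */
static void
assign_block_range(BlockNumber nblocks, int num_workers, int worker_id,
                   BlockNumber *start, BlockNumber *end)
{
    BlockNumber chunk = (nblocks + num_workers - 1) / num_workers;

    *start = chunk * worker_id;
    *end = (*start + chunk < nblocks) ? *start + chunk : nblocks;
}

int
main(void)
{
    BlockNumber start, end;
    int         nworkers = 4;

    for (int w = 0; w < nworkers; w++)
    {
        assign_block_range(1000, nworkers, w, &start, &end);
        printf("worker %d scans blocks [%u, %u)\n", w, start, end);
    }
    return 0;
}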

Let me know your opinion about the same.

I am planning to proceed with the above ideas to strengthen the patch,
in the absence of any objections or better ideas.
