Re: SPI Concurrency Precautions? Problems with Parallel Execution ofMultiple CREATE TABLE statements

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: SPI Concurrency Precautions? Problems with Parallel Execution ofMultiple CREATE TABLE statements
Дата
Msg-id CA+TgmoY=91PUrucg+BwJFF=_khUUPPCvPop+67nafrXbfus3uA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: SPI Concurrency Precautions? Problems with Parallel Execution ofMultiple CREATE TABLE statements  (Tom Mercha <mercha_t@hotmail.com>)
Список pgsql-hackers
On Sat, Feb 22, 2020 at 5:50 AM Tom Mercha <mercha_t@hotmail.com> wrote:
> I am spawning a number of dynamic background workers to execute each
> statement. When I spawn 4 workers on a quad-core machine, the resutling
> execution time per statement is {0.158s, 0.216s, 0.399s, 0.396s}.
> However, when I have just one worker, the results are {0.151s, 0.141s,
> 0.146s, 0.136s}.
>
> The way I am executing my statements is through SPI in each worker
> (using a PG extension) as follows:
>   SPI_connect();
>   SPI_exec(queryString, 0);
>   SPI_finish();
> In both test cases, SPI_connect/finish are executed 4 times.
>
> What I expect is that with 4 workers, each statements will take approx
> 0.15s to execute since they are independent from each other. This would
> result in approx a 4x speedup. Despite seeing concurrency, I am seeing
> that each invdividual statement will take longer to execute. I am
> struggling to understand this behavior, what this suggests to me is that
> there is a lock somewhere which completely defeats my purpose.
>
> I was wondering how I could execute my CREATE TABLE statements in a
> parallel fashion given that they are independent from each other. If the
> lock is the problem, what steps could I take to relax it? I would
> greatly appreciate any guidance or insights on this topic.

Well, I'm not altogether sure that your expectations are realistic.
Rarely do things parallelize perfectly. In a case like this, some time
is probably being spent doing disk I/O. When multiple processes do CPU
work at the same time, you should be able to see near-linear speedup,
but when multiple processes do disk I/O at the same time, you may see
no speedup at all, or even a slowdown, because of the way that disks
work. This seems especially likely given how short the queries are and
the fact that they create a new table, which involves an fsync()
operation.

It's possible that if you run a query to select the wait events from
pg_stat_activity, maybe using psql's \watch with a fractional value,
you might be able to see something about what those queries are
actually spending time on. It's also possible that you might get more
interesting results if you have things that run for longer than a few
hundred milliseconds. But in general I would question the assumption
that this ought to scale well.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Vik Fearing
Дата:
Сообщение: Re: progress reporting views and TimestampTz fields
Следующее
От: Konstantin Knizhnik
Дата:
Сообщение: Re: Yet another vectorized engine