Re: Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?

Поиск
Список
Период
Сортировка
От Merlin Moncure
Тема Re: Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?
Дата
Msg-id CAHyXU0zi3YLnKJ9c3Fqd2bwfbXtreLkedEae=bP_a_Vxz5neNw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?  (Craig James <cjames@emolecules.com>)
Список pgsql-performance
On Wed, Jul 8, 2015 at 5:38 PM, Craig James <cjames@emolecules.com> wrote:
> On Wed, Jul 8, 2015 at 1:27 PM, Andres Freund <andres@anarazel.de> wrote:
>>
>> On 2015-07-08 13:46:53 -0500, Merlin Moncure wrote:
>> > On Wed, Jul 8, 2015 at 12:48 PM, Craig James <cjames@emolecules.com>
>> > wrote:
>> > > On Tue, Jul 7, 2015 at 10:31 PM, Joshua D. Drake
>> > > <jd@commandprompt.com>
>> > >> Using Apache Fast-CGI, you are going to fork a process for each
>> > >> instance
>> > >> of the function being executed and that in turn will use all CPUs up
>> > >> to the
>> > >> max available resource.
>> > >>
>> > >> With PostgreSQL, that isn't going to happen unless you are running
>> > >> (at
>> > >> least) 8 functions across 8 connections.
>> > >
>> > >
>> > > Well, right, which is why I mentioned "even with dozens of clients."
>> > > Shouldn't that scale to at least all of the CPUs in use if the
>> > > function is
>> > > CPU intensive (which it is)?
>> >
>> > only in the absence of inter-process locking and cache line bouncing.
>>
>> And addititionally memory bandwidth (shared between everything, even in
>> the numa case), cross socket/bus bandwidth (absolutely performance
>> critical in multi-socket configurations), cache capacity (shared between
>> cores, and sometimes even sockets!).
>
>
> From my admittedly naive point of view, it's hard to see why any of this
> matters. I have functions that do purely CPU-intensive mathematical
> calculations ... you could imagine something like is_prime(N) that
> determines if N is a prime number. I have eight clients that connect to
> eight backends. Each client issues an SQL command like, "select is_prime(N)"
> where N is a simple number.
>
> Are you saying that in order to calculate is_prime(N), all of that stuff
> (inter-process locking, memory bandwith, bus bandwidth, cache capacity,
> etc.) is even relevant? And if so, how is it that Postgres is so different
> from an Apache fast-CGI program that runs the exact same is_prime(N)
> calculation?
>
> Just curious ... as I said, I've already implemented a different solution.

If your is_prime() was written in C and was written so that it did not
utilize the database API, it should scale up quite nicely.  This can
be easily proved.  On my quad core workstation,

postgres=# select 12345! * 0;
 ?column?
──────────
        0
(1 row)

Time: 10435.554 ms


...which is heavily cpu bound, takes about 10 seconds.  scaling out to
4 threads via:

time ~/pgdev/bin/pgbench -n -t1 -c4 -f <(echo "select 12345! * 0;")

yields:
real 0m11.317s
user 0m0.001s
sys 0m0.005s

...I'll call that pretty good scaling.  The reason why this scales so
good is that the numeric code is all operating on local data
structures and is not involving backend componentry with it's various
attached complexity such as having to be checked for being visible to
the current transaction.

I submit that toy benchmarks like factoring or pi digits are not
really good indicators of language scaling and performance because
just about all real world code involves data structures, i/o, memory
allocation, amateur coders, etc.   Java tends to approach C in
benchmark shootouts but woefully underperforms my expectations
relative to C in code that does things that's actually useful (aside:
if you think I'm knocking java, the situation is even worse with most
other languages I come across).

pl/pgsql is simply not optimized for that style of coding although if
you know postgres you can start to tickle the limits of what's
expected from the language.  If that isn't working for you, pl/v8
strikes me as the best alternative due to it's performance and good
integration with postgres data structures (in fact, I'd be arguing for
it to be moved to core if the v8 dependency wasn't so capricious).
Either way, I'll advocate any solution that allows you to code inside
the database environment as opposed to the client side.

merlin


В списке pgsql-performance по дате отправления:

Предыдущее
От: Nicolas Paris
Дата:
Сообщение: pg_stat_all_indexes understand
Следующее
От: Merlin Moncure
Дата:
Сообщение: Re: Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?