Re: Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?

From: Tom Lane
Subject: Re: Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?
Date:
Msg-id: 15702.1436413118@sss.pgh.pa.us
In reply to: Re: Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?  (andres@anarazel.de (Andres Freund))
Responses: Re: Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?  (Andres Freund <andres@anarazel.de>)
Re: Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?  ("Graeme B. Bell" <graeme.bell@nibio.no>)
List: pgsql-performance

andres@anarazel.de (Andres Freund) writes:
> On 2015-07-08 15:38:24 -0700, Craig James wrote:
>> From my admittedly naive point of view, it's hard to see why any of this
>> matters. I have functions that do purely CPU-intensive mathematical
>> calculations ... you could imagine something like is_prime(N) that
>> determines if N is a prime number. I have eight clients that connect to
>> eight backends. Each client issues an SQL command like, "select
>> is_prime(N)" where N is a simple number.

> I mostly replied to Merlin's general point (additionally in the context of
> plpgsql).

> But I have a hard time seeing that postgres would be the bottleneck for an
> is_prime() function (or something with similar characteristics) that's
> written in C where the average runtime is more than, say, a couple
> thousand cycles.  I'd like to see a profile of that.

But that was not the case that Graeme was complaining about.  He's talking
about simple-arithmetic-and-looping written in plpgsql, in a volatile
function that is going to take a new snapshot for every statement, even if
that's only "n := n+1".  So it's going to spend a substantial fraction of
its runtime banging on the ProcArray, and that doesn't scale.  If you
write your is_prime function purely in plpgsql, and don't bother to mark
it nonvolatile, *it will not scale*.  It'll be slow even in single-thread
terms, but it'll be particularly bad if you're saturating a multicore
machine with it.
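
To make the contrast concrete, here is a minimal sketch of an is_prime
written purely in plpgsql and marked IMMUTABLE.  The function body, the
trial-division approach, and the variable names are illustrative
assumptions, not code from this thread.  The only point is the volatility
label: a function created without one defaults to VOLATILE, and then each
statement in the loop takes a new snapshot as described above.

```sql
-- Hypothetical example: trial-division primality test in plpgsql.
-- IMMUTABLE is legitimate here because the result depends only on n
-- and the function never reads or writes the database.
CREATE OR REPLACE FUNCTION is_prime(n bigint)
RETURNS boolean
LANGUAGE plpgsql
IMMUTABLE            -- omit this and the function defaults to VOLATILE
AS $$
DECLARE
    d bigint := 3;   -- current odd trial divisor
BEGIN
    IF n < 2 THEN
        RETURN false;            -- 0, 1 and negatives are not prime
    END IF;
    IF n < 4 THEN
        RETURN true;             -- 2 and 3
    END IF;
    IF n % 2 = 0 THEN
        RETURN false;            -- even numbers greater than 2
    END IF;

    -- Test odd divisors up to sqrt(n).  Writing the bound as d <= n / d
    -- instead of d * d <= n avoids bigint overflow for very large n.
    WHILE d <= n / d LOOP
        IF n % d = 0 THEN
            RETURN false;
        END IF;
        d := d + 2;
    END LOOP;

    RETURN true;
END;
$$;

-- Check how an existing function is labelled:
-- provolatile is 'i' (immutable), 's' (stable) or 'v' (volatile).
SELECT proname, provolatile
FROM pg_proc
WHERE proname = 'is_prime';

-- Example call, as in Craig's "select is_prime(N)" scenario.
SELECT is_prime(1000003);
```

How much the label helps depends on the PostgreSQL version and on the
workload, so it is worth timing both variants with several concurrent
clients before drawing conclusions.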

One of my Salesforce colleagues has been looking into ways that we could
decide to skip the per-statement snapshot acquisition even in volatile
functions, if we could be sure that a particular statement isn't going to
do anything that would need a snapshot.  Now, IMO that doesn't really do
much for properly written plpgsql; but there's an awful lot of bad plpgsql
code out there, and it can make a huge difference for that.

            regards, tom lane

