Re: Memory Accounting v11

From: Tomas Vondra
Subject: Re: Memory Accounting v11
Date:
Msg-id: 55A5B3F2.7070404@2ndquadrant.com
In reply to: Re: Memory Accounting v11  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: Memory Accounting v11  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
Hi,

On 07/14/2015 10:19 PM, Robert Haas wrote:
> On Sat, Jul 11, 2015 at 2:28 AM, Jeff Davis <pgsql@j-davis.com> wrote:
>> After talking with a few people at PGCon, small noisy differences
>> in CPU timings can appear for almost any tweak to the code, and
>> aren't necessarily cause for major concern.
>
> I agree with that in general, but the concern is a lot bigger when the
> function is something that is called everywhere and accounts for a
> measurable percentage of our total CPU usage on almost any workload.
> If memory allocation got slower because, say, you added some code to
> regexp.c and it caused AllocSetAlloc to split a cache line where it
> hadn't previously, I wouldn't be worried about that; the next patch,
> like as not, will buy the cost back again.  But here you really are
> adding code to a hot path.

I don't think Jeff was suggesting that we should ignore changes that
measurably affect performance - I'm one of the people he discussed this
patch with at PGCon, and I can assure you that the impact on performance
was one of the main topics of the discussion.

Firstly, do we really have good benchmarks and measurements? I really
doubt that. We do have some numbers for REINDEX, where we observed a
0.5-1% regression in noisy results from a Power machine (and we've been
unable to reproduce that on x86). I don't think that's a representative
benchmark, and I'd like to see more thorough measurements. I've agreed
to do that once Jeff comes up with a new version of the patch.

Secondly, the question is whether the performance is impacted more by 
the additional instructions, or by other things - say, random padding, 
as was explained by Andrew Gierth in [1].

I don't know whether that's happening with this patch, but if it is, it
seems rather strange to hold it against this patch and not against the
others (there will surely be other patches causing similar issues).

[1] http://www.postgresql.org/message-id/87vbitb2zp.fsf@news-spur.riddles.org.uk

> tuplesort.c does its own accounting, and TBH that seems like the right
> thing to do here, too.  The difficulty is, I think, that some
> transition functions use an internal data type for the transition
> state, which might not be a single palloc'd chunk.  But since we can't
> spill those aggregates to disk *anyway*, that doesn't really matter.
> If the transition is a varlena or a fixed-length type, we can know how
> much space it's consuming without hooking into the memory context
> framework.

I respectfully disagree. Our current inability to dump/load the state 
has little to do with how we measure consumed memory, IMHO.

It's true that we do have two kinds of aggregates, depending on the 
nature of the aggregate state:

(a) fixed-size state (data types passed by value, variable-length
    types that do not grow once allocated, ...)

(b) continuously growing state (as in string_agg/array_agg)

Jeff's HashAgg patch already fixes (a) and can fix (b) once we get a
solution for dump/load of the aggregate states - which we need to
implement anyway for parallel aggregate.

I know there was a proposal to force all aggregates to use regular data
types for their aggregate states, but I can't see how that could work
without a significant performance penalty. For example, array_agg() uses
'internal' to pass an ArrayBuildState - how do you turn that into a
regular data type without effectively serializing/deserializing the
whole array on every transition?

And even if we come up with a solution for array_agg, do we really
believe it's possible for all custom aggregates? Maybe I'm missing
something, but I doubt that. ISTM that designing an ephemeral data
structure allows tweaks that are impossible otherwise.

What might be possible is extending the aggregate API with another
custom function returning the size of the aggregate state. So when
defining an aggregate that uses 'internal' for its state, you'd specify
a transfunc, a finalfunc and a sizefunc. That seems doable, I guess.
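For array_agg such a sizefunc might look roughly like this (just a
sketch - no such hook exists today, the name is made up, and the
estimate ignores by-reference element data):

#include "postgres.h"
#include "fmgr.h"
#include "utils/array.h"

PG_FUNCTION_INFO_V1(array_agg_sizefunc);        /* made-up name */

Datum
array_agg_sizefunc(PG_FUNCTION_ARGS)
{
    ArrayBuildState *astate;
    Size        size;

    if (PG_ARGISNULL(0))
        PG_RETURN_INT64(0);

    astate = (ArrayBuildState *) PG_GETARG_POINTER(0);

    /* the struct itself plus the Datum/null arrays it has allocated */
    size = sizeof(ArrayBuildState)
        + astate->alen * (sizeof(Datum) + sizeof(bool));

    /* by-reference element data is not counted in this rough estimate */

    PG_RETURN_INT64((int64) size);
}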

I find the memory accounting approach way more elegant, though.
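To illustrate what I mean (a toy sketch with made-up names, not the
actual patch): keep one byte counter per context and update it when
whole blocks are obtained from or returned to malloc, so the accounting
cost is paid per block rather than per palloc chunk, and reading the
number is O(1) no matter how the transition state is structured.

#include "postgres.h"

typedef struct ToyContext
{
    Size        mem_allocated;  /* bytes currently obtained from malloc */
} ToyContext;

/* called when the allocator mallocs a new block for this context */
static void
toy_block_allocated(ToyContext *cxt, Size blksize)
{
    cxt->mem_allocated += blksize;
}

/* called when a block is returned to malloc */
static void
toy_block_freed(ToyContext *cxt, Size blksize)
{
    cxt->mem_allocated -= blksize;
}

/* callers (e.g. HashAgg) can read the total at any time */
static Size
toy_bytes_allocated(const ToyContext *cxt)
{
    return cxt->mem_allocated;
}

Whether and how the counter propagates to parent contexts is a separate
design question, which I'm not trying to answer in this sketch.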

kind regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


