Re: Making aggregate deserialization (and WAL receive) functions slightly faster

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Making aggregate deserialization (and WAL receive) functions slightly faster
Дата
Msg-id 770055.1676183953@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Making aggregate deserialization (and WAL receive) functions slightly faster  (David Rowley <dgrowleyml@gmail.com>)
Ответы Re: Making aggregate deserialization (and WAL receive) functions slightly faster
Список pgsql-hackers
David Rowley <dgrowleyml@gmail.com> writes:
> While working on 16fd03e95, I noticed that in each aggregate
> deserialization function, in order to "receive" the bytea value that
> is the serialized aggregate state, appendBinaryStringInfo is used to
> append the bytes of the bytea value onto a temporary StringInfoData.
> Using  appendBinaryStringInfo seems a bit wasteful here. We could
> really just fake up a StringInfoData and point directly to the bytes
> of the bytea value.

Perhaps, but ...

> The best way I could think of to do this was to invent
> initStringInfoFromString() which initialises a StringInfoData and has
> the ->data field point directly at the specified buffer.  This will
> mean that it would be unsafe to do any appendStringInfo* operations on
> the resulting StringInfoData as enlargeStringInfo would try to
> repalloc the data buffer, which might not even point to a palloc'd
> string.

I find this patch horribly dangerous.

It could maybe be okay if we added the capability for StringInfoData
to understand (and enforce) that its "data" buffer is read-only.
However, that'd add overhead to every existing use-case.

> I've attached the benchmark results I got after testing how the
> modification changed the performance of string_agg_deserialize().
> I was hoping this would have a slightly more impressive performance
> impact, especially for string_agg() and array_agg() as the aggregate
> states of those can be large.  However, in the test I ran, there's
> only a very slight performance gain. I may just not have found the
> best case, however.

I do not think we should even consider this without solid evidence
for *major* performance improvements.  As it stands, it's a
quintessential example of a loaded foot-gun, and it seems clear
that making it safe enough to use would add more overhead than
it saves.

            regards, tom lane



В списке pgsql-hackers по дате отправления:

Предыдущее
От: David Rowley
Дата:
Сообщение: Making aggregate deserialization (and WAL receive) functions slightly faster
Следующее
От: Ankit Kumar Pandey
Дата:
Сообщение: Re: Sort optimizations: Making in-memory sort cache-aware