Re: Making aggregate deserialization (and WAL receive) functions slightly faster

Поиск
Список
Период
Сортировка
От David Rowley
Тема Re: Making aggregate deserialization (and WAL receive) functions slightly faster
Дата
Msg-id CAApHDvorfO3iBZ=xpiZvp3uHtJVLyFaPBSvcAhAq2HPLnaNSwQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Making aggregate deserialization (and WAL receive) functions slightly faster  (David Rowley <dgrowleyml@gmail.com>)
Ответы Re: Making aggregate deserialization (and WAL receive) functions slightly faster  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Making aggregate deserialization (and WAL receive) functions slightly faster  (vignesh C <vignesh21@gmail.com>)
Список pgsql-hackers
On Mon, 9 Oct 2023 at 21:17, David Rowley <dgrowleyml@gmail.com> wrote:
> Here are some more thoughts on how we could improve this:
>
> 1. Adjust the definition of StringInfoData.maxlen to define that -1
> means the StringInfoData's buffer is externally managed.
> 2. Adjust enlargeStringInfo() to add a check for maxlen = -1 and have
> it palloc, say, pg_next_pow2(str->len * 2) bytes and memcpy the
> existing (externally managed string) into the newly palloc'd buffer.
> 3. Add a new function along the lines of what I originally proposed to
> allow init of a StringInfoData using an existing allocated string
> which sets maxlen = -1.
> 4. Update all the existing places, including the ones I just committed
> (plus the ones you committed in ba1e066e4) to make use of the function
> added in #3.

I just spent the past few hours playing around with the attached WIP
patch to try to clean up the various places where we manually build
StringInfoDatas around the tree.

While working on this, I added an Assert in the new
initStringInfoFromStringWithLen function to ensure that data[len] ==
'\0' per the "There is guaranteed to be a terminating '\0' at
data[len]" comment in stringinfo.h.  It looks like we have some
existing breakers of this rule.

If you apply the attached patch to 608fd198de~1 and ignore the
rejected hunks from the deserial functions, you'll see an Assert
failure during 023_twophase_stream.pl

023_twophase_stream_subscriber.log indicates:
TRAP: failed Assert("data[len] == '\0'"), File:
"../../../../src/include/lib/stringinfo.h", Line: 97, PID: 1073141
postgres: subscriber: logical replication parallel apply worker for
subscription 16396 (ExceptionalCondition+0x70)[0x56160451e9d0]
postgres: subscriber: logical replication parallel apply worker for
subscription 16396 (ParallelApplyWorkerMain+0x53c)[0x5616043618cc]
postgres: subscriber: logical replication parallel apply worker for
subscription 16396 (StartBackgroundWorker+0x20b)[0x56160434452b]

So it seems like we have some existing issues with
LogicalParallelApplyLoop(). The code there does not properly NUL
terminate the StringInfoData.data field. There are some examples in
exec_bind_message() of how that could be fixed. I've CC'd Amit to let
him know about this.

I'll also need to revert 608fd198 as this also highlights that setting
the StringInfoData.data to point to a bytea Datum can't be done either
as those aren't NUL terminated strings.

If people think it's worthwhile having something like the attached to
try to eliminate our need to manually build StringInfoDatas then I can
spend more time on it once LogicalParallelApplyLoop() is fixed.

David

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: shveta malik
Дата:
Сообщение: Re: Synchronizing slots from primary to standby
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: typo in couple of places