Re: Making aggregate deserialization (and WAL receive) functions slightly faster

Поиск
Список
Период
Сортировка
От David Rowley
Тема Re: Making aggregate deserialization (and WAL receive) functions slightly faster
Дата
Msg-id CAApHDvqi_L68q7+3XRzULfXQNd9g7Hod+n0y8oCkuSrP64uZCA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Making aggregate deserialization (and WAL receive) functions slightly faster  (David Rowley <dgrowleyml@gmail.com>)
Ответы Re: Making aggregate deserialization (and WAL receive) functions slightly faster  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On Tue, 17 Oct 2023 at 20:39, David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Mon, 16 Oct 2023 at 05:56, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I guess the next question is whether we want to stop here or
> > try to relax the requirement about NUL-termination.  I'd be inclined
> > to call that a separate issue deserving a separate commit, so maybe
> > we should go ahead and commit this much anyway.
>
> I am keen to see this relaxed. I agree that a separate effort is best.

I looked at the latest posted patch again today with thoughts about
pushing it but there's something I'm a bit unhappy with that makes me
think we should maybe do the NUL-termination relaxation in the same
commit.

The problem is in LogicalRepApplyLoop() the current patch adjusts the
manual building of the StringInfoData to make use of
initReadOnlyStringInfo() instead. The problem I have with that is that
the string that's given to initReadOnlyStringInfo() comes from
walrcv_receive() and on looking at the API spec for walrcv_receive_fn
I see:

/*
 * walrcv_receive_fn
 *
 * Receive a message available from the WAL stream.  'buffer' is a pointer
 * to a buffer holding the message received.  Returns the length of the data,
 * 0 if no data is available yet ('wait_fd' is a socket descriptor which can
 * be waited on before a retry), and -1 if the cluster ended the COPY.
 */

i.e, no mention that the buffer will be NUL terminated upon return.

Looking at pqGetCopyData3(), is see the buffer does get NUL
terminated, but without the API spec mentioning this I'm not feeling
good about going ahead with wrapping that up in
initReadOnlyStringInfo() which Asserts the buffer will be NUL
terminated.

I've attached a patch which builds on the previous patch and relaxes
the rule that the StringInfo must be NUL-terminated.   The rule is
only relaxed for StringInfos that are initialized with
initReadOnlyStringInfo.  On working on this I went over the locations
where we've added code to add a '\0' char to the buffer.  If you look
at, for example, record_recv() and array_agg_deserialize() in master,
we modify the StringInfo's data to set a \0 at the end of the string.
I've removed that code as I *believe* this isn't required for the
type's receive function.

There's also an existing confusing comment in logicalrep_read_tuple()
which seems to think we're just setting the NUL terminator to conform
to StringInfo's practises. This is misleading as the NUL is required
for LOGICALREP_COLUMN_TEXT mode as we use the type's input function
instead of the receive function.  You don't have to look very hard to
find an input function that needs a NUL terminator.

I'm a bit less confident that the type's receive function will never
need to be NUL terminated. cstring_recv() came to mind as one I should
look at, but on looking I see it's not required as it just reads the
remaining bytes from the input StringInfo.  Is it safe to assume this?
or could there be some UDF receive function which requires this?

David

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: Show version of OpenSSL in ./configure output
Следующее
От: Merlin Moncure
Дата:
Сообщение: Re: MERGE ... RETURNING