Re: Making aggregate deserialization (and WAL receive) functions slightly faster

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Making aggregate deserialization (and WAL receive) functions slightly faster
Дата
Msg-id 441650.1696967525@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Making aggregate deserialization (and WAL receive) functions slightly faster  (David Rowley <dgrowleyml@gmail.com>)
Ответы Re: Making aggregate deserialization (and WAL receive) functions slightly faster  (David Rowley <dgrowleyml@gmail.com>)
Список pgsql-hackers
David Rowley <dgrowleyml@gmail.com> writes:
> I've attached a slightly more worked on patch that makes maxlen == 0
> mean read-only.  Unsure if a macro is worthwhile there or not.

A few thoughts:

* initStringInfoFromStringWithLen() is kind of a mouthful.
How about "initStringInfoWithBuf", or something like that?

* logicalrep_read_tuple is doing something different from these
other callers: it's creating a *fully valid* StringInfo that
could be enlarged via repalloc.  (Whether anything downstream
depends on that, I dunno.)  Is it worth having two new init
functions, one that has that spec and initializes maxlen
appropriately, and the other that sets maxlen to 0?

* I think that this bit in the new enlargeStringInfo code path
is wrong:

+        newlen = pg_nextpower2_32(str->len) * 2;
+        while (needed > newlen)
+            newlen = 2 * newlen;

In the admittedly-unlikely case that str->len is more than half a GB
to start with, pg_nextpower2_32() will round up to 1GB and then the *2
overflows.  I think you should make this just

+        newlen = pg_nextpower2_32(str->len);
+        while (needed > newlen)
+            newlen = 2 * newlen;

It's fairly likely that this path will never be taken at all,
so trying to shave a cycle or two seems unnecessary.

> One thought I had about this is that the memory context behaviour
> might catch someone out at some point.  Right now if you do
> initStringInfo() the memory context of the "data" field will be
> CurrentMemoryContext, but if someone does
> initStringInfoFromStringWithLen() and then changes to some other
> memory context before doing an appendStringInfo on that string, then
> we'll allocate "data" in whatever that memory context is. Maybe that's
> ok if we document it.  Fixing it would mean adding a MemoryContext
> field to StringInfoData which would be set to CurrentMemoryContext
> during initStringInfo() and initStringInfoFromStringWithLen().

I think documenting it is sufficient.  I don't really foresee use-cases
where the string would get enlarged, anyway.

On the whole, I wonder about the value of allowing such a StringInfo to be
enlarged at all.  If we are defining the case as being a "read only"
buffer, under what circumstances would it be useful to enlarge it?
I'm tempted to suggest that we should just Assert(maxlen > 0) in
enlargeStringInfo, and anywhere else in stringinfo.c that modifies
the buffer.  That also removes the concern about which context the
enlargement would happen in.

I'm not really happy with what you did documentation-wise in
stringinfo.h.  I suggest more like

 * StringInfoData holds information about an extensible string.
 *      data    is the current buffer for the string (allocated with palloc).
 *      len     is the current string length.  There is guaranteed to be
 *              a terminating '\0' at data[len], although this is not very
 *              useful when the string holds binary data rather than text.
 *      maxlen  is the allocated size in bytes of 'data', i.e. the maximum
 *              string size (including the terminating '\0' char) that we can
 *              currently store in 'data' without having to reallocate
-*              more space.  We must always have maxlen > len, except
+*              in the read-only case described below.
 *      cursor  is initialized to zero by makeStringInfo or initStringInfo,
 *              but is not otherwise touched by the stringinfo.c routines.
 *              Some routines use it to scan through a StringInfo.
+*
+* As a special case, a StringInfoData can be initialized with a read-only
+* string buffer.  In this case "data" does not necessarily point at a
+* palloc'd chunk, and management of the buffer storage is the caller's
+* responsibility.  maxlen is set to zero to indicate that this is the case.

Also, the following comment block asserting that there are "two ways"
to initialize a StringInfo needs work, and I guess so does the above-
cited comment about the cursor field.

            regards, tom lane



В списке pgsql-hackers по дате отправления:

Предыдущее
От: David Zhang
Дата:
Сообщение: Re: Tab completion for ATTACH PARTITION
Следующее
От: Jeff Davis
Дата:
Сообщение: Re: Is this a problem in GenericXLogFinish()?