Re: Reducing output size of nodeToString

Поиск
Список
Период
Сортировка
От Peter Eisentraut
Тема Re: Reducing output size of nodeToString
Дата
Msg-id 9f8b880d-e040-4c37-9fdf-c7b55ac11892@eisentraut.org
обсуждение исходный текст
Ответ на Re: Reducing output size of nodeToString  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Список pgsql-hackers
On 20.03.24 13:03, Matthias van de Meent wrote:
> On Wed, 20 Mar 2024 at 12:49, Peter Eisentraut <peter@eisentraut.org> wrote:
>>
>> On 19.03.24 17:13, Peter Eisentraut wrote:
>>> On 11.03.24 21:52, Matthias van de Meent wrote:
>>>>> * v7-0003-gen_node_support.pl-Mark-location-fields-as-type-.patch
>>>>>
>>>>> This looks sensible, but maybe making Location a global type is a bit
>>>>> much?  Maybe something more specific like ParseLocation, or ParseLoc, to
>>>>> keep it under 12 characters.
>>>> I've gone with ParseLoc in the attached v8 patchset.
>>>
>>> I have committed this one.
>>
>> Next, I was looking at
>> v8-0003-pg_node_tree-Don-t-store-query-text-locations-in-.patch.
> 
> [...]
> 
>> So anyway, my idea was that we should turn this around and make
>> nodeToString() always drop location information, and instead add
>> nodeToStringWithLocations() for the few debugging uses.  And this would
>> also be nice because then it matches exactly with the existing
>> stringToNodeWithLocations().
> 
> That seems reasonable, yes.

I have committed that one.

This takes care of your patches v8-0002 and v8-0003.

About the rest of your patch set:

As long as we have only one output format, we need to balance several 
uses, including debugging, storage size, (de)serialization performance.

Your patches v8-0005 and up are clearly positive for storage size but 
negative for debugging.  So I think we can't consider them now.

Your patches v8-0001 ("pg_node_tree: Omit serialization of fields with 
default values.") and v8-0004 ("gen_node_support.pl: Add a TypMod type 
for signalling TypMod behaviour") are also good for storage size.  I 
don't know how they affect serialization performance.  I also don't know 
how good they are for debugging.  I have argued here and there that 
omitting unset fields can make node dumps more readable.  But that's 
just me.  I have looked at a lot of Query and RangeTblEntry nodes 
lately, which contain many rarely used fields.  But other people might 
have completely different experiences, with other node and tree types. 
We didn't get much feedback from anyone else in this thread, so I'm very 
hesitant to impose this on everyone without any consensus.

I could see "Omit serialization of fields with default values" as a 
separate toggle for debug node dumps.

Also, there is clearly some lingering interesting in a separate 
binary-ish serialization format for internal use.  This should probably 
also take a look at (de)serialization performance, which we haven't 
really done in this thread.  In a way, with the omit default values 
patch, the serialization and deserialization does more work, so it could 
have an adverse impact.  But we don't know.

I think to proceed we need more buy-in on the "what do I want from my 
node dumps" side, and more performance numbers on the other side. 
Saving a few hundred kilobytes on view storage is fine but isn't by 
itself that useful, I think, if it potentially negatively affects other 
uses.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michał Kłeczek
Дата:
Сообщение: Re: DRAFT: Pass sk_attno to consistent function
Следующее
От: Amit Langote
Дата:
Сообщение: Re: Memory consumed by child SpecialJoinInfo in partitionwise join planning