Thread: [HACKERS] pgstattuple documentation clarification

[HACKERS] pgstattuple documentation clarification

From
Andrew Dunstan
Date:
Recently a client was confused because there was a substantial 
difference between the reported table_len of a table and the sum of the 
corresponding tuple_len, dead_tuple_len and free_space. The docs are 
fairly silent on this point, and I agree that in the absence of 
explanation it is confusing, so I propose that we add a clarification 
note along the lines of:

   The table_len will always be greater than the sum of the tuple_len,
   dead_tuple_len and free_space. The difference is accounted for by
   page overhead and space that is not free but cannot be attributed to
   any particular tuple.
 


Or perhaps we should be more explicit and refer to the item pointers on 
the page.
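
As an illustration of the discrepancy in question, the gap can be computed directly from the columns that pgstattuple reports. This is only a minimal sketch: the helper name unaccounted_bytes is hypothetical, and the figures are made up for the example rather than taken from a real table.

```python
def unaccounted_bytes(table_len, tuple_len, dead_tuple_len, free_space):
    """Return the bytes of table_len not covered by the three reported categories."""
    return table_len - (tuple_len + dead_tuple_len + free_space)


# Hypothetical pgstattuple output for a small table (made-up numbers).
stats = {
    "table_len": 81920,
    "tuple_len": 60000,
    "dead_tuple_len": 5000,
    "free_space": 9000,
}

gap = unaccounted_bytes(**stats)
print(gap)  # 7920
```

A positive result here is the space the proposed note attributes to page overhead and related bookkeeping.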


Thoughts?


cheers


andrew




Re: [HACKERS] pgstattuple documentation clarification

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Recently a client was confused because there was a substantial 
> difference between the reported table_len of a table and the sum of the 
> corresponding tuple_len, dead_tuple_len and free_space. The docs are 
> fairly silent on this point, and I agree that in the absence of 
> explanation it is confusing, so I propose that we add a clarification 
> note along the lines of:

>     The table_len will always be greater than the sum of the tuple_len,
>     dead_tuple_len and free_space. The difference is accounted for by
>     page overhead and space that is not free but cannot be attributed to
>     any particular tuple.

> Or perhaps we should be more explicit and refer to the item pointers on 
> the page.

I find "not free but cannot be attributed to any particular tuple"
to be entirely useless weasel wording, not to mention wrong with
respect to item pointers in particular.

Perhaps we should start counting the item pointers in tuple_len.
We'd still have to explain about page header overhead, but that
would be a pretty small and fixed-size discrepancy.
        regards, tom lane



Re: [HACKERS] pgstattuple documentation clarification

From
Andrew Dunstan
Date:

On 12/20/2016 10:01 AM, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Recently a client was confused because there was a substantial
>> difference between the reported table_len of a table and the sum of the
>> corresponding tuple_len, dead_tuple_len and free_space. The docs are
>> fairly silent on this point, and I agree that in the absence of
>> explanation it is confusing, so I propose that we add a clarification
>> note along the lines of:
>>      The table_len will always be greater than the sum of the tuple_len,
>>      dead_tuple_len and free_space. The difference is accounted for by
>>      page overhead and space that is not free but cannot be attributed to
>>      any particular tuple.
>> Or perhaps we should be more explicit and refer to the item pointers on
>> the page.
> I find "not free but cannot be attributed to any particular tuple"
> to be entirely useless weasel wording, not to mention wrong with
> respect to item pointers in particular.


Well, the reason I put it like that was that in my experimentation, 
after I vacuumed the table after a large delete the item pointer table 
didn't seem to shrink (at least according to the pgstattuple output), so 
we had a page with 0 dead tuples but some non-live line pointer space. 
If that's not what's happening then something is going on that I don't 
understand. (Wouldn't be a first.)


>
> Perhaps we should start counting the item pointers in tuple_len.
> We'd still have to explain about page header overhead, but that
> would be a pretty small and fixed-size discrepancy.
>

Sure, sounds like a good idea. Meanwhile it would be nice to explain to 
people exactly what we currently have. If you have a good formulation 
I'm all ears.

cheers

andrew




Re: [HACKERS] pgstattuple documentation clarification

From
Robert Haas
Date:
On Tue, Dec 20, 2016 at 10:01 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Recently a client was confused because there was a substantial
>> difference between the reported table_len of a table and the sum of the
>> corresponding tuple_len, dead_tuple_len and free_space. The docs are
>> fairly silent on this point, and I agree that in the absence of
>> explanation it is confusing, so I propose that we add a clarification
>> note along the lines of:
>
>>     The table_len will always be greater than the sum of the tuple_len,
>>     dead_tuple_len and free_space. The difference is accounted for by
>>     page overhead and space that is not free but cannot be attributed to
>>     any particular tuple.
>
>> Or perhaps we should be more explicit and refer to the item pointers on
>> the page.
>
> I find "not free but cannot be attributed to any particular tuple"
> to be entirely useless weasel wording, not to mention wrong with
> respect to item pointers in particular.
>
> Perhaps we should start counting the item pointers in tuple_len.
> We'd still have to explain about page header overhead, but that
> would be a pretty small and fixed-size discrepancy.

It's pretty weird to count unused or dead line pointers as part of
tuple_len, and it would screw things up for anybody trying to
calculate the average width of their tuples, which is an entirely
reasonable thing to want to do.  I think if we're going to count item
pointers as anything, it needs to be some new category -- either item
pointers specifically, or an "other stuff" bucket.
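
The effect on an average-width calculation can be shown with a small worked example. This is a sketch under stated assumptions: line pointers are taken to be 4 bytes each, all figures are invented, and average_width is a hypothetical helper, not something pgstattuple provides.

```python
LINE_POINTER_BYTES = 4  # assumed size of one line (item) pointer


def average_width(tuple_len, tuple_count):
    """Average bytes per live tuple; 0.0 when there are no tuples."""
    return tuple_len / tuple_count if tuple_count else 0.0


# Made-up figures: 1000 live tuples totalling 50000 bytes, on pages that
# still carry 3000 line pointers (live plus unused or dead ones).
tuple_count = 1000
tuple_len = 50000
total_line_pointers = 3000

# Average width with tuple_len counting only tuple data.
true_avg = average_width(tuple_len, tuple_count)

# Average width if every line pointer were folded into tuple_len.
skewed_avg = average_width(
    tuple_len + LINE_POINTER_BYTES * total_line_pointers, tuple_count
)

print(true_avg)    # 50.0
print(skewed_avg)  # 62.0
```

With these numbers the reported average would rise from 50 to 62 bytes without any tuple changing size, which is the distortion a separate category would avoid.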

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] pgstattuple documentation clarification

From
Andrew Dunstan
Date:

On 12/20/2016 11:41 PM, Robert Haas wrote:
> On Tue, Dec 20, 2016 at 10:01 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Andrew Dunstan <andrew@dunslane.net> writes:
>>> Recently a client was confused because there was a substantial
>>> difference between the reported table_len of a table and the sum of the
>>> corresponding tuple_len, dead_tuple_len and free_space. The docs are
>>> fairly silent on this point, and I agree that in the absence of
>>> explanation it is confusing, so I propose that we add a clarification
>>> note along the lines of:
>>>      The table_len will always be greater than the sum of the tuple_len,
>>>      dead_tuple_len and free_space. The difference is accounted for by
>>>      page overhead and space that is not free but cannot be attributed to
>>>      any particular tuple.
>>> Or perhaps we should be more explicit and refer to the item pointers on
>>> the page.
>> I find "not free but cannot be attributed to any particular tuple"
>> to be entirely useless weasel wording, not to mention wrong with
>> respect to item pointers in particular.
>>
>> Perhaps we should start counting the item pointers in tuple_len.
>> We'd still have to explain about page header overhead, but that
>> would be a pretty small and fixed-size discrepancy.
> It's pretty weird to count unused or dead line pointers as part of
> tuple_len, and it would screw things up for anybody trying to
> calculate the average width of their tuples, which is an entirely
> reasonable thing to want to do.  I think if we're going to count item
> pointers as anything, it needs to be some new category -- either item
> pointers specifically, or an "other stuff" bucket.
>


Yes, I agree. In any case, before we change anything can we agree on a 
description of what we currently do?

Here's a second attempt:
   The table_len will always be greater than the sum of the tuple_len,
   dead_tuple_len and free_space. The difference is accounted for by
   fixed page overhead, the per-page table of pointers to tuples, and
   padding to ensure that tuples are correctly aligned.
 

I don't think any of that is weaselish :-)
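
To make the three components in that wording concrete, here is a rough sketch for a single heap page. The constants are assumptions: a 24-byte page header, 4-byte line pointers and 8-byte maximum alignment, which are the usual values on a 64-bit build with default settings. The function names and the example tuple lengths are hypothetical, and the sketch ignores finer details of how pgstattuple derives free_space.

```python
PAGE_HEADER_BYTES = 24   # assumed fixed page header size
LINE_POINTER_BYTES = 4   # assumed size of one line (item) pointer
MAXALIGN = 8             # assumed maximum alignment in bytes


def aligned(length, alignment=MAXALIGN):
    """Round length up to the next multiple of alignment."""
    return (length + alignment - 1) // alignment * alignment


def page_overhead(tuple_lengths):
    """Break down the bytes on one page that belong to no tuple and are not free."""
    pointers = LINE_POINTER_BYTES * len(tuple_lengths)
    padding = sum(aligned(n) - n for n in tuple_lengths)
    return {
        "header": PAGE_HEADER_BYTES,
        "line_pointers": pointers,
        "alignment_padding": padding,
        "total": PAGE_HEADER_BYTES + pointers + padding,
    }


# Hypothetical page holding four tuples of these unaligned lengths.
overhead = page_overhead([61, 61, 45, 100])
print(overhead["total"])  # 53
```

Summed over every page of a table, a figure like this is what separates table_len from the other three columns.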

cheers

andrew



Re: [HACKERS] pgstattuple documentation clarification

From
Andrew Dunstan
Date:

On 12/21/2016 09:04 AM, Andrew Dunstan wrote:
>
> Yes, I agree. In any case, before we change anything can we agree on a 
> description of what we currently do?
>
> Here's a second attempt:
>
>    The table_len will always be greater than the sum of the tuple_len,
>    dead_tuple_len and free_space. The difference is accounted for by
>    fixed page overhead, the per-page table of pointers to tuples, and
>    padding to ensure that tuples are correctly aligned.
>


In the absence of further comment I will proceed along these lines.

cheers

andrew