Обсуждение: Fast Default WIP patch for discussion

Поиск
Список
Период
Сортировка

Fast Default WIP patch for discussion

От
Serge Rielau
Дата:
As promised and requested find attached a work in progress patch for fast defaults.
This is my first patch, I hope I used the right format…..

The premise of the feature is to avoid a table rewrite when adding a column with a default.

This is done by remembering the default value in pg_attribute instead of updating all the rows.
When a tuple is read from disk which is shorter than the tuple descriptor mandates the default is “plugged in”
either when filling in the datum/null array of a virtual tuple, or by “expanding” the tuple to a complete heap or minimal tuple
with all attributes.

Example:
CREATE TABLE t (pk INT NOT NULL PRIMARY KEY);
INSERT INTO t VALUES (1);
=> t: (1)

ALTER TABLE t ADD COLUMN c1 INT DEFAULT 5;
=> 1. Stores the default expression in pg_attrdef.adbin (no news)
     2. a. Stores the default values (computed at time of ALTER TABLE) in pg_attribute.att_existdef
         b. Sets pg_attribute.att_hasexistdef to true

SELECT * FROM t;
=> Build a virtual tuple using getAttr() API (no news)
     but since tuple has fewer mats than table signature fill in extra attributes from pg_attribute.att_existdef 
     if att_hasexistdef is true
=> (1) becomes (1, 5)

INSERT INTO t VALUES (2);
=> Fill in DEFAULT for t.c1 from pg_attrdef.adbin as usual: (2, 5)
=> t: (1), (2, 5)

ALTER TABLE t ALTER COLUMN c1 SET DEFAULT -1;
=> 1. Drop row from pg_attrdef for c1   (no news)
     2. Add row to pg_attrdef for c1 DEFAULT -1  (no news) 
     3. Leave pg_atribute.att_existdef alone!!

INSERT INTO t VALUES (3);
=> Fill in DEFAULT for t.c1 from pg_attrdef.adbin as usual: (3, -1)
=> t: (1), (2, 5), (3, -1)

SELECT * FROM t;
=>  Build a virtual tuple using get*Attr() API (no news)
     but since tuple has fewer mats than table signature fill in extra attributes from pg_attribute.att_existdef 
     if att_hasexistdef is true
=> (1) becomes (1, 5)
     (2, 5)
     (3, -1)

ALTER TABLE t ALTER COLUMN c1 DROP DEFAULT;
=> 1. Drop row from pg_attrdef for c1   (no news)
     2. Leave pg_atribute.att_existdef alone!!

INSERT INTO t VALUES (4);
=> Fill in default DEFAULT (4, NULL)
=> t: (1), (2, 5), (3, -1), (4, NULL)

SELECT * FROM t;
=>  Build a virtual tuple using get*Attr() API (no news)
     but since tuple has fewer mats than table signature fill in extra attributes from pg_attribute.att_existdef 
     if att_hasexistdef is true
=> (1) becomes (1, 5)
     (2, 5)
     (3, -1)
     (4, NULL)

You can find a new (incomplete) test file fast_default.sql in regress/sql
 
Some key design points requiring discussion:
1. Storage of the “exist” (working name) default
   Right now the patch stores the default value in its binary form as it would be in the tuple into a BYTEA.
   It would be feasible to store the pretty printed literal, however this requires calling the io functions when a 
   tuple descriptor is built.
2. The exist default is cached alongside the “current” default in the tuple descriptor’s constraint structure.
    Seems most natural too me, but debatable.
3. Delayed vs. early expansion of the tuples.
    To avoid having to decide when to copy tuple descriptors with or without constraints and defaults
    I have opted to expand the tuple at the core entry points.
    How do I know I have them all? An omission means wrong results!
4. attisnull()
    This routine is used in many places, but to give correct result sit must now be accompanied
    by the tuple descriptor. This becomes moderately messy and it’s not always clear where to get that.
    Interestingly most usages are related to catalog lookups.
    Assuming we have no intention to support fast default for catalog tables we could keep using the
    existing attisnull() api for catalog lookups and use a new version (name tbd) for user tables.
5. My head hurts looking at the PK/FK code - it’s not always clear which tuple descriptor belongs 
    to which tuple
6. Performance of the expansion code.
    The current code needs to walk all defaults and then start padding by filling in values.
    But the outcome is always the same. We will produce the same payload and the name null map.
    It would be feasible to cache an “all defaults tuple”, remember the offsets (and VARWIDTH, HASNULL) 
    for each attribute and then simply splice the short and default tuples together.
    This ought to be faster, but the meta data to cache is not insignificant and the expansion code is messy enough
    without this already.
7. Grooming
    Obviously we can remove all exist defaults for a table from pg_attribute whenever the table is rewrittem.
    That’s easy.
    But could we/should we keep track of the short tuples and either eliminate them or drop exist defaults once they 
    become obsolete because there is no tuple short enough for them to matter. 
8. Do we need to worry about toasted defaults?


Cheers
Serge Rielau


Вложения

Re: Fast Default WIP patch for discussion

От
Serge Rielau
Дата:
Hackers,

Posting to this group on a Friday evening was obviously a Bad Idea(tm). :-)

Let me clarify that I’m at this point not looking for any detailed review.
Rather I’m hoping to drive towards design decisions on the below.
So any opining would be much appreciated.
> On Oct 21, 2016, at 4:15 PM, Serge Rielau <serge@rielau.com> wrote:
> ...
>
> Some key design points requiring discussion:
> 1. Storage of the “exist” (working name) default
>    Right now the patch stores the default value in its binary form as it would be in the tuple into a BYTEA.
>    It would be feasible to store the pretty printed literal, however this requires calling the io functions when a
>    tuple descriptor is built.
> 2. The exist default is cached alongside the “current” default in the tuple descriptor’s constraint structure.
>     Seems most natural too me, but debatable.
> 3. Delayed vs. early expansion of the tuples.
>     To avoid having to decide when to copy tuple descriptors with or without constraints and defaults
>     I have opted to expand the tuple at the core entry points.
>     How do I know I have them all? An omission means wrong results!
> 4. attisnull()
>     This routine is used in many places, but to give correct result sit must now be accompanied
>     by the tuple descriptor. This becomes moderately messy and it’s not always clear where to get that.
>     Interestingly most usages are related to catalog lookups.
>     Assuming we have no intention to support fast default for catalog tables we could keep using the
>     existing attisnull() api for catalog lookups and use a new version (name tbd) for user tables.
> 5. My head hurts looking at the PK/FK code - it’s not always clear which tuple descriptor belongs
>     to which tuple
> 6. Performance of the expansion code.
>     The current code needs to walk all defaults and then start padding by filling in values.
>     But the outcome is always the same. We will produce the same payload and the name null map.
>     It would be feasible to cache an “all defaults tuple”, remember the offsets (and VARWIDTH, HASNULL)
>     for each attribute and then simply splice the short and default tuples together.
>     This ought to be faster, but the meta data to cache is not insignificant and the expansion code is messy enough
>     without this already.
> 7. Grooming
>     Obviously we can remove all exist defaults for a table from pg_attribute whenever the table is rewrittem.
>     That’s easy.
>     But could we/should we keep track of the short tuples and either eliminate them or drop exist defaults once they
>     become obsolete because there is no tuple short enough for them to matter.
> 8. Do we need to worry about toasted defaults?
>
Thanks
Serge Rielau
salesforce.com




Re: Fast Default WIP patch for discussion

От
Euler Taveira
Дата:
On 26-10-2016 12:43, Serge Rielau wrote:
> Posting to this group on a Friday evening was obviously a Bad Idea(tm). :-)
> 
Serge, add your patch to the next commitfest [1] so we don't forget to
review it.


[1] https://commitfest.postgresql.org/11/


--   Euler Taveira                   Timbira - http://www.timbira.com.br/  PostgreSQL: Consultoria, Desenvolvimento,
Suporte24x7 e Treinamento
 



Re: Fast Default WIP patch for discussion

От
Serge Rielau
Дата:
Euler,

Thanks, I was previously told I should post a WIP patch here.
There are too many open issues to be near committing.
Anyway, I have created a patch.
https://commitfest.postgresql.org/11/843/#
Since this is my first time I do have a couple of questions:
There are entries for a git and a wiki link.
Should I push the patch to some branch, if so which repository?

Thanks
Serge

The form does ask for a git link rather
> On Oct 26, 2016, at 8:51 AM, Euler Taveira <euler@timbira.com.br> wrote:
>
> On 26-10-2016 12:43, Serge Rielau wrote:
>> Posting to this group on a Friday evening was obviously a Bad Idea(tm). :-)
>>
> Serge, add your patch to the next commitfest [1] so we don't forget to
> review it.
>
>
> [1] https://commitfest.postgresql.org/11/
>
>
> --
>   Euler Taveira                   Timbira - http://www.timbira.com.br/
>   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento




Re: Fast Default WIP patch for discussion

От
Petr Jelinek
Дата:
Hi,

On 26/10/16 18:22, Serge Rielau wrote:
> Euler,
> 
> Thanks, I was previously told I should post a WIP patch here.
> There are too many open issues to be near committing.
> Anyway, I have created a patch. 
> https://commitfest.postgresql.org/11/843/#

Adding to commitfest is act of asking for review, it does not mean it's
committable.

> Since this is my first time I do have a couple of questions:
> There are entries for a git and a wiki link.
> Should I push the patch to some branch, if so which repository?
>

That's optional, mostly useful for bigger (or complex) patches or
patches where there is code cooperation between multiple people. Wiki
page for patch sometimes contains basically what your first email said.

BTW it's customary to bottom post, not top post on this list.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: Fast Default WIP patch for discussion

От
Robert Haas
Дата:
On Wed, Oct 26, 2016 at 11:43 AM, Serge Rielau <serge@rielau.com> wrote:
> Posting to this group on a Friday evening was obviously a Bad Idea(tm). :-)

Not really.  It's just that complex patches often don't get an
immediate response because people are too busy to look at them and
think about them in detail.  If you've added it to the CommitFest, it
will eventually get some attention.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Fast Default WIP patch for discussion

От
Robert Haas
Дата:
On Fri, Oct 21, 2016 at 7:15 PM, Serge Rielau <serge@rielau.com> wrote:
> Some key design points requiring discussion:
> 1. Storage of the “exist” (working name) default
>    Right now the patch stores the default value in its binary form as it
> would be in the tuple into a BYTEA.
>    It would be feasible to store the pretty printed literal, however this
> requires calling the io functions when a
>    tuple descriptor is built.

A pretty-printed literal is a terrible idea because input functions
need not be immutable (e.g. timestamptz).  I would have expected a
pg_node_tree column containing a CONST node, but your bytea thing
might be OK.

> 2. The exist default is cached alongside the “current” default in the tuple
> descriptor’s constraint structure.
>     Seems most natural too me, but debatable.

No opinion (yet).

> 3. Delayed vs. early expansion of the tuples.
>     To avoid having to decide when to copy tuple descriptors with or without
> constraints and defaults
>     I have opted to expand the tuple at the core entry points.
>     How do I know I have them all? An omission means wrong results!

Early expansion seems right.  "How do I know I have them all?" - and
not only all of the current ones but all future ones - seems like a
core sticking point for this patch.

> 4. attisnull()
>     This routine is used in many places, but to give correct result sit must
> now be accompanied
>     by the tuple descriptor. This becomes moderately messy and it’s not
> always clear where to get that.
>     Interestingly most usages are related to catalog lookups.
>     Assuming we have no intention to support fast default for catalog tables
> we could keep using the
>     existing attisnull() api for catalog lookups and use a new version (name
> tbd) for user tables.

attisnull is not a thing.  There's heap_attisnull and slot_attisnull.

> 5. My head hurts looking at the PK/FK code - it’s not always clear which
> tuple descriptor belongs
>     to which tuple

I suggest ibuprofen and a stiff upper lip.

> 6. Performance of the expansion code.
>     The current code needs to walk all defaults and then start padding by
> filling in values.
>     But the outcome is always the same. We will produce the same payload and
> the name null map.
>     It would be feasible to cache an “all defaults tuple”, remember the
> offsets (and VARWIDTH, HASNULL)
>     for each attribute and then simply splice the short and default tuples
> together.
>     This ought to be faster, but the meta data to cache is not insignificant
> and the expansion code is messy enough
>     without this already.

You could experiment to figure out if it makes any difference.

> 7. Grooming
>     Obviously we can remove all exist defaults for a table from pg_attribute
> whenever the table is rewrittem.
>     That’s easy.

But might cause the table to expand a lot, which would suck.

>     But could we/should we keep track of the short tuples and either
> eliminate them or drop exist defaults once they
>     become obsolete because there is no tuple short enough for them to
> matter.

I wouldn't mess with it.

> 8. Do we need to worry about toasted defaults?

Presumably.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Fast Default WIP patch for discussion

От
Serge Rielau
Дата:
> On Oct 28, 2016, at 5:46 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Fri, Oct 21, 2016 at 7:15 PM, Serge Rielau <serge@rielau.com> wrote:
>> Some key design points requiring discussion:
>> 1. Storage of the “exist” (working name) default
>>   Right now the patch stores the default value in its binary form as it
>> would be in the tuple into a BYTEA.
>>   It would be feasible to store the pretty printed literal, however this
>> requires calling the io functions when a
>>   tuple descriptor is built.
>
> A pretty-printed literal is a terrible idea because input functions
> need not be immutable (e.g. timestamptz).
I did not know that! Interesting.

> I would have expected a
> pg_node_tree column containing a CONST node, but your bytea thing
> might be OK.
I was debating following the example given by the current DEFAULT.
But figured it’s overkill given that no-one user will ever have to look at it.
And in the end a the pretty printed CONST node is no more readable.

>
>> 3. Delayed vs. early expansion of the tuples.
>>    To avoid having to decide when to copy tuple descriptors with or without
>> constraints and defaults
>>    I have opted to expand the tuple at the core entry points.
>>    How do I know I have them all? An omission means wrong results!
>
> Early expansion seems right.  "How do I know I have them all?" - and
> not only all of the current ones but all future ones - seems like a
> core sticking point for this patch.
Yes, there is a certain amount of inherent, hard to control, risk towards the future that we must be willing to accept.


>
>> 4. attisnull()
>>    This routine is used in many places, but to give correct result sit must
>> now be accompanied
>>    by the tuple descriptor. This becomes moderately messy and it’s not
>> always clear where to get that.
>>    Interestingly most usages are related to catalog lookups.
>>    Assuming we have no intention to support fast default for catalog tables
>> we could keep using the
>>    existing attisnull() api for catalog lookups and use a new version (name
>> tbd) for user tables.
>
> attisnull is not a thing.  There's heap_attisnull and slot_attisnull.
My apologies.
Obviously slot_attisnull() is a non issue since slots have tuple descriptors.
It’s heap_attisnull() I am struggling with. It’s rather popular.
Yet, in the large majority of cases exist defaults do not apply, so digging up the descriptor
is rather wasteful.

>> 5. My head hurts looking at the PK/FK code - it’s not always clear which
>> tuple descriptor belongs
>>    to which tuple
>
> I suggest ibuprofen and a stiff upper lip.
Sound advise.

>> 6. Performance of the expansion code.
>>    The current code needs to walk all defaults and then start padding by
>> filling in values.
>>    But the outcome is always the same. We will produce the same payload and
>> the name null map.
>>    It would be feasible to cache an “all defaults tuple”, remember the
>> offsets (and VARWIDTH, HASNULL)
>>    for each attribute and then simply splice the short and default tuples
>> together.
>>    This ought to be faster, but the meta data to cache is not insignificant
>> and the expansion code is messy enough
>>    without this already.
>
> You could experiment to figure out if it makes any difference.
I think my first experiment will be to measure the performance impact of “expressed” vs. "non expressed" defaults
with the current design.
If that is considered excessive I will explore the more complex alternative.

>
>> 7. Grooming
>>    Obviously we can remove all exist defaults for a table from pg_attribute
>> whenever the table is rewrittem.
>>    That’s easy.
>
> But might cause the table to expand a lot, which would suck.
Well, a this point I must point back to the original goal which is O(1) ADD COLUMN performance.
The footprint of a rewritten table is no worse after this patch.
The reduced footprint of a table due to a rewrite avoided on ADD COLUMN is boon one must not rely upon.
The existing tuple layout would support an enhancement where trailing defaults may be avoided.
But I believe we would be better served with a more general approach to table compresion.

>
>>    But could we/should we keep track of the short tuples and either
>> eliminate them or drop exist defaults once they
>>    become obsolete because there is no tuple short enough for them to
>> matter.
>
> I wouldn't mess with it.
*phew*

>
>> 8. Do we need to worry about toasted defaults?
>
> Presumably.
Time for me to dig into that then.

Cheers
Serge RIelau
salesforce.com


Re: Fast Default WIP patch for discussion

От
Haribabu Kommi
Дата:


On Sat, Oct 29, 2016 at 2:28 AM, Serge Rielau <serge@rielau.com> wrote:
Time for me to dig into that then.

Closed in 2016-11 commitfest with "returned with feedback" status.
Please feel free to update the status once you submit the updated patch.

Regards,
Hari Babu
Fujitsu Australia

Re: Fast Default WIP patch for discussion

От
Andres Freund
Дата:
Hi Serge,

On 2016-10-28 08:28:11 -0700, Serge Rielau wrote:
> Time for me to dig into that then.

Are you planning to update your POC at some point? This'd be a very
welcome improvement.

Regards,

Andres



Re: Fast Default WIP patch for discussion

От
Serge Rielau
Дата:
Andres,

Yes, I still want to push this in. However I have not had time to get back to it.
I’m embarrassed to say that I don’t even know where the comments that were issued occurred.

Cheers
Serge


On Wed, Apr 5, 2017 at 4:47 PM, Andres Freund <andres@anarazel.de> wrote:
Hi Serge,

On 2016-10-28 08:28:11 -0700, Serge Rielau wrote:
> Time for me to dig into that then.

Are you planning to update your POC at some point? This'd be a very
welcome improvement.

Regards,

Andres

Re: [HACKERS] Fast Default WIP patch for discussion

От
Andres Freund
Дата:
Hi,

On 2017-04-05 22:31:15 -0700, Serge Rielau wrote:
> Andres,
> Yes, I still want to push this in. However I have not had time to get back to it. I’m embarrassed to say that I don’t
evenknow where the comments that were issued occurred.
 
> Cheers Serge

You mean
https://www.postgresql.org/message-id/CA+Tgmoa9hm1e+XB5Tc-ANyyZVc7CHFsA4_dZ_MdSJVpSL643Eg@mail.gmail.com
?

Andres