Обсуждение: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

Поиск

Список

Период

Сортировка

Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

"Jaime Casanova"

Дата:

22 октября 2008 г., 14:44:52

just remembering that -patches is a dead list, so i'm sending this to
-hackers where it will have more visibility...

---------- Forwarded message ----------
From: Jaime Casanova <jcasanov@systemguards.com.ec>
Date: Oct 22, 2008 9:43 AM
Subject: Re: [PATCHES] Auto Partitioning Patch - WIP version 1
To: Nikhil Sontakke <nikhil.sontakke@enterprisedb.com>
Cc: Bruce Momjian <bruce@momjian.us>, NikhilS <nikkhils@gmail.com>,
Simon Riggs <simon@2ndquadrant.com>, pgsql-patches@postgresql.org


On 10/22/08, Nikhil Sontakke <nikhil.sontakke@enterprisedb.com> wrote:
> Hi,
>
> > >
> > >> Thanks for taking a look. But if I am not mistaken Gavin and co. are
> working
> > >> on a much exhaustive proposal. In light of that maybe this patch might
> not
> > >> be needed in the first place?
> > >>
> > >> I will wait for discussion and a subsequent collective consensus here,
> > >> before deciding the further course of actions.
> > >
> > > I think it is unwise to wait on Gavin for a more complex implemention
> > > ---  we might end up with nothing for 8.4.  As long as your syntax is
> > > compatible with whatever Gavin proposed Gavin can add on to your patch
> > > once it is applied.
> > >
> >
> > seems like you're a prophet... or i miss something?
> >
>
> :)
>
> Maybe I will try to summarize the functionality of this patch, rebase it
> against latest CVS head and try to get it on the commitfest queue atleast
> for further feedback to keep the ball rolling on auto-partitioning...
>

yeah! i was thinking on doing that but still have no time... and
frankly you're the best man for the job ;)

one thing i was thinking of is to use triggers instead of rules just
as our current docs recommends
http://www.postgresql.org/docs/8.3/static/ddl-partitioning.html

with the benefit that a trigger can check if the child table exists
for the range being inserted and if not it can create it first...
haven't looked at the code in the detail but seems that your patch is
still missing the "create rule" part so we are in time to change
that... no?

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

Emmanuel Cecchet

Дата:

22 октября 2008 г., 21:49:58

Another advantage of triggers over rules is that it would work with COPY 
which is probably a desired feature.

Emmanuel

Jaime Casanova wrote:
> just remembering that -patches is a dead list, so i'm sending this to
> -hackers where it will have more visibility...
>
> ---------- Forwarded message ----------
> From: Jaime Casanova <jcasanov@systemguards.com.ec>
> Date: Oct 22, 2008 9:43 AM
> Subject: Re: [PATCHES] Auto Partitioning Patch - WIP version 1
> To: Nikhil Sontakke <nikhil.sontakke@enterprisedb.com>
> Cc: Bruce Momjian <bruce@momjian.us>, NikhilS <nikkhils@gmail.com>,
> Simon Riggs <simon@2ndquadrant.com>, pgsql-patches@postgresql.org
>
>
> On 10/22/08, Nikhil Sontakke <nikhil.sontakke@enterprisedb.com> wrote:
>   
>> Hi,
>>
>>     
>>>>> Thanks for taking a look. But if I am not mistaken Gavin and co. are
>>>>>           
>> working
>>     
>>>>> on a much exhaustive proposal. In light of that maybe this patch might
>>>>>           
>> not
>>     
>>>>> be needed in the first place?
>>>>>
>>>>> I will wait for discussion and a subsequent collective consensus here,
>>>>> before deciding the further course of actions.
>>>>>           
>>>> I think it is unwise to wait on Gavin for a more complex implemention
>>>> ---  we might end up with nothing for 8.4.  As long as your syntax is
>>>> compatible with whatever Gavin proposed Gavin can add on to your patch
>>>> once it is applied.
>>>>
>>>>         
>>> seems like you're a prophet... or i miss something?
>>>
>>>       
>> :)
>>
>> Maybe I will try to summarize the functionality of this patch, rebase it
>> against latest CVS head and try to get it on the commitfest queue atleast
>> for further feedback to keep the ball rolling on auto-partitioning...
>>
>>     
>
> yeah! i was thinking on doing that but still have no time... and
> frankly you're the best man for the job ;)
>
> one thing i was thinking of is to use triggers instead of rules just
> as our current docs recommends
> http://www.postgresql.org/docs/8.3/static/ddl-partitioning.html
>
> with the benefit that a trigger can check if the child table exists
> for the range being inserted and if not it can create it first...
> haven't looked at the code in the detail but seems that your patch is
> still missing the "create rule" part so we are in time to change
> that... no?
>   

-- 
Emmanuel Cecchet
FTO @ Frog Thinker 
Open Source Development & Consulting
--
Web: http://www.frogthinker.org
email: manu@frogthinker.org
Skype: emmanuel_cecchet

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

"Nikhil Sontakke"

Дата:

23 октября 2008 г., 12:41:49

Hi,

On Wed, Oct 22, 2008 at 8:14 PM, Jaime Casanova <jcasanov@systemguards.com.ec> wrote:

just remembering that -patches is a dead list, so i'm sending this to
-hackers where it will have more visibility...

---------- Forwarded message ----------
On 10/22/08, Nikhil Sontakke <nikhil.sontakke@enterprisedb.com> wrote:
> Hi,
>
> > >
> > >> Thanks for taking a look. But if I am not mistaken Gavin and co. are
> working
> > >> on a much exhaustive proposal. In light of that maybe this patch might
> not
> > >> be needed in the first place?
> > >>
> > >> I will wait for discussion and a subsequent collective consensus here,
> > >> before deciding the further course of actions.
> > >
> > > I think it is unwise to wait on Gavin for a more complex implemention
> > > --- we might end up with nothing for 8.4. As long as your syntax is
> > > compatible with whatever Gavin proposed Gavin can add on to your patch
> > > once it is applied.
> > >
> >
> > seems like you're a prophet... or i miss something?
> >
>
> :)
>
> Maybe I will try to summarize the functionality of this patch, rebase it
> against latest CVS head and try to get it on the commitfest queue atleast
> for further feedback to keep the ball rolling on auto-partitioning...
>

yeah! i was thinking on doing that but still have no time... and
frankly you're the best man for the job ;)

one thing i was thinking of is to use triggers instead of rules just
as our current docs recommends
http://www.postgresql.org/docs/8.3/static/ddl-partitioning.html

with the benefit that a trigger can check if the child table exists
for the range being inserted and if not it can create it first...
haven't looked at the code in the detail but seems that your patch is
still missing the "create rule" part so we are in time to change
that... no?

Yes triggers should be used instead of rules. Automatic generation of rules/triggers would be kind of hard and needs some looking into. Also there are issues like checking mutual exclusivity of the partition clauses specified too (I have been maintaining that the onus of ensuring sane partition ranges/clauses should rest with the users atleast initially..).

I will take a stab at this again whenever I get some free cycles.

Regards,
Nikhils

--
http://www.enterprisedb.com

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

"Nikhil Sontakke"

Дата:

31 октября 2008 г., 14:50:56

Hi,

> > >
> > >> Thanks for taking a look. But if I am not mistaken Gavin and co. are
> working
> > >> on a much exhaustive proposal. In light of that maybe this patch might
> not
> > >> be needed in the first place?
> > >>
> > >> I will wait for discussion and a subsequent collective consensus here,
> > >> before deciding the further course of actions.
> > >
> > > I think it is unwise to wait on Gavin for a more complex implemention
> > > --- we might end up with nothing for 8.4. As long as your syntax is
> > > compatible with whatever Gavin proposed Gavin can add on to your patch
> > > once it is applied.
> > >
> >
> > seems like you're a prophet... or i miss something?
> >
>
> :)
>
> Maybe I will try to summarize the functionality of this patch, rebase it
> against latest CVS head and try to get it on the commitfest queue atleast
> for further feedback to keep the ball rolling on auto-partitioning...
>

yeah! i was thinking on doing that but still have no time... and
frankly you're the best man for the job ;)

one thing i was thinking of is to use triggers instead of rules just
as our current docs recommends
http://www.postgresql.org/docs/8.3/static/ddl-partitioning.html

with the benefit that a trigger can check if the child table exists
for the range being inserted and if not it can create it first...
haven't looked at the code in the detail but seems that your patch is
still missing the "create rule" part so we are in time to change
that... no?

Yes triggers should be used instead of rules. Automatic generation of rules/triggers would be kind of hard and needs some looking into. Also there are issues like checking mutual exclusivity of the partition clauses specified too (I have been maintaining that the onus of ensuring sane partition ranges/clauses should rest with the users atleast initially..).

I will take a stab at this again whenever I get some free cycles.

I have synced up and modified the patch against latest CVS sources. Am attaching the latest WIP patch here.

Am restating that its a WIP patch, more so because we really need feedback on this before trying to expend any energy trying to come up with a commit-able patch.

As per me, the syntax introduced by this patch should be similar to what was proposed by Gavin quite a while back and this patch essentially tries to bring together a bunch of ddl that would otherwise have been performed step-by-step in a manual fashion earlier. To summarize this patch provides a one-shot mechanism to:

--   * create master table
--   * create several child tables that inherit from this master table
--   * add appropriate constraints to each of the child tables
--   * create a trigger function to redirect insert, updates, deletes to
--     appropriate child tables (plpgsql language)
--   * create the trigger using the trigger function

I have created a new file (src/test/regress/sql/partition.sql) to show a couple of examples of the grammar and the working functionality:

There are TODOs like:
-- logic to ensure unique trigger function and trigger names
-- The trigger function body could raise an exception if the insert/update/delete operation does not fit into any single partition
-- logic to check mutual exclusivity of ranges/lists
-- misc. issues to convert it from wip to commit-ready

If we think this is ok as a first step towards auto-partitioning then we can do something more with this patch.

Regards,
Nikhils

--
http://www.enterprisedb.com

Вложения

partitioning-nov-commitfest-wip-v1.0.patch.tar.gz

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

Emmanuel Cecchet

Дата:

31 октября 2008 г., 18:42:10

Hi Nikhil,

Here are a couple of questions:
- How do you ALTER the table to repartition it?
- The trigger function for inserts could be improved by using ELSE 
instead of independent IFs. This would ensure that the row is inserted 
in at most 1 partition. The last ELSE should raise an exception if there 
was no match (that would solve point 2 of your TODO list).
- Another option is to have a separate trigger per child table and chain 
them to the master table. For example something like:
CREATE OR REPLACE FUNCTION child_insert_trigger()
RETURNS TRIGGER AS $$
BEGIN IF (NEW.date >= DATE(TG_ARGV[1]) AND NEW.date < DATE(TG_ARGV[2]) ) THEN   INSERT INTO TG_ARGV[0] VALUES (NEW.*);
RETURN NULL; END IF; RETURN NEW;
 
END;
$$
LANGUAGE plpgsql;
DROP TRIGGER IF EXISTS insert_child_trigger ON master;
CREATE TRIGGER insert_child_trigger_y2008m01   BEFORE INSERT ON master   FOR EACH ROW EXECUTE PROCEDURE 
child_insert_trigger_date(child_y2008m01, '2008-01-01', '2008-02-01');

CREATE TRIGGER insert_child_trigger_y2008m02   BEFORE INSERT ON master   FOR EACH ROW EXECUTE PROCEDURE 
child_insert_trigger_date(child_y2008m02, '2008-02-01', '2008-03-01');

CREATE TRIGGER insert_child_trigger_y2008m03   BEFORE INSERT ON master   FOR EACH ROW EXECUTE PROCEDURE 
child_insert_trigger_date(child_y2008m03, '2008-03-01', '2008-04-01');

This might make it easier when you want to alter a specific partition 
rather than rewriting the whole trigger. Performance-wise, I am not sure 
how chained triggers will compare to the big if/then/else trigger.

- In the case of an insert, could it be possible to avoid the cost of a 
new INSERT statement (parser, planner, executor, etc...) by moving 
directly the tuple in the right table like the COPY code does? If we had 
an INSERT trigger code in C, given a HeapTuple and a target Relation we 
should be able to call heap_insert_tuple directly, with no parsing, 
planning, etc. required.

Thanks for your time,
Emmanuel

> Hi,
>  
>
>         > > >
>         > > >> Thanks for taking a look. But if I am not mistaken
>         Gavin and co. are
>         > working
>         > > >> on a much exhaustive proposal. In light of that maybe
>         this patch might
>         > not
>         > > >> be needed in the first place?
>         > > >>
>         > > >> I will wait for discussion and a subsequent collective
>         consensus here,
>         > > >> before deciding the further course of actions.
>         > > >
>         > > > I think it is unwise to wait on Gavin for a more complex
>         implemention
>         > > > ---  we might end up with nothing for 8.4.  As long as
>         your syntax is
>         > > > compatible with whatever Gavin proposed Gavin can add on
>         to your patch
>         > > > once it is applied.
>         > > >
>         > >
>         > > seems like you're a prophet... or i miss something?
>         > >
>         >
>         > :)
>         >
>         > Maybe I will try to summarize the functionality of this
>         patch, rebase it
>         > against latest CVS head and try to get it on the commitfest
>         queue atleast
>         > for further feedback to keep the ball rolling on
>         auto-partitioning...
>         >
>
>         yeah! i was thinking on doing that but still have no time... and
>         frankly you're the best man for the job ;)
>
>         one thing i was thinking of is to use triggers instead of
>         rules just
>         as our current docs recommends
>         http://www.postgresql.org/docs/8.3/static/ddl-partitioning.html
>
>         with the benefit that a trigger can check if the child table
>         exists
>         for the range being inserted and if not it can create it first...
>         haven't looked at the code in the detail but seems that your
>         patch is
>         still missing the "create rule" part so we are in time to change
>         that... no?
>
>
>     Yes triggers should be used instead of rules. Automatic generation
>     of rules/triggers would be kind of hard and needs some looking
>     into. Also there are issues like checking mutual exclusivity of
>     the partition clauses specified too (I have been maintaining that
>     the onus of ensuring sane partition ranges/clauses should rest
>     with the users atleast initially..).
>
>     I will take a stab at this again whenever I get some free cycles.
>
>
> I have synced up and modified the patch against latest CVS sources. Am 
> attaching the latest WIP patch here.
>
> Am restating that its a WIP patch, more so because we really need 
> feedback on this before trying to expend any energy trying to come up 
> with a commit-able patch.
>
> As per me, the syntax introduced by this patch should be similar to 
> what was proposed by Gavin quite a while back and this patch 
> essentially tries to bring together a bunch of ddl that would 
> otherwise have been performed step-by-step in a manual fashion 
> earlier. To summarize this patch provides a one-shot mechanism to:
>
> --   * create master table
> --   * create several child tables that inherit from this master table
> --   * add appropriate constraints to each of the child tables
> --   * create a trigger function to redirect insert, updates, deletes to
> --     appropriate child tables (plpgsql language)
> --   * create the trigger using the trigger function
>
> I have created a new file (src/test/regress/sql/partition.sql) to show 
> a couple of examples of the grammar and the working functionality:
>
> There are TODOs like:
> -- logic to ensure unique trigger function and trigger names
> -- The trigger function body could raise an exception if the 
> insert/update/delete operation does not fit into any single partition
> -- logic to check mutual exclusivity of ranges/lists
> -- misc. issues to convert it from wip to commit-ready
>
> If we think this is ok as a first step towards auto-partitioning then 
> we can do something more with this patch.
>
> Regards,
> Nikhils
> -- 
> http://www.enterprisedb.com
> ------------------------------------------------------------------------
>
>
>   


-- 
Emmanuel Cecchet
FTO @ Frog Thinker 
Open Source Development & Consulting
--
Web: http://www.frogthinker.org
email: manu@frogthinker.org
Skype: emmanuel_cecchet

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

"Jaime Casanova"

Дата:

06 ноября 2008 г., 03:47:31

On Fri, Oct 31, 2008 at 7:42 PM, Emmanuel Cecchet <manu@frogthinker.org> wrote:
> Hi Nikhil,
>

i'm looking at this one:
http://archives.postgresql.org/message-id/a301bfd90810310750pf108c69x36499546f406650f@mail.gmail.com

> Here are a couple of questions:
> - How do you ALTER the table to repartition it?

fair question. but the patch was advertized to only automate some
tasks that we do manually... so keeping the same limitations seems
reasonably to me...


> - Another option is to have a separate trigger per child table and chain
> them to the master table. For example something like:

that sounds like a lot of overhead...

---------------

Now, about the patch...

- seems strange the need to create plpgsql language before we can
create any partitioned table but given that the trigger is a plpgsql
function (and a c function can't be used because we could need to add
new partitions) it seems necesary... ideas?

- the update part of the trigger looks very simplistic... if the new
values isn't in the range accepted by the partition it errors out
because of the check constraint... can't we be a little smarter,
delete from the actual partition and insert in the new one...

for the rest, the patch passes all regression tests and seems to work
as advertized....

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

"Jaime Casanova"

Дата:

26 ноября 2008 г., 20:35:27

On Wed, Nov 5, 2008 at 11:47 PM, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:
> On Fri, Oct 31, 2008 at 7:42 PM, Emmanuel Cecchet <manu@frogthinker.org> wrote:
>> Hi Nikhil,
>>
>
> i'm looking at this one:
> http://archives.postgresql.org/message-id/a301bfd90810310750pf108c69x36499546f406650f@mail.gmail.com
>

'cause the great interest this one has (i'm being ironic, just in case
;) can we safely say this was returned with feedback and remove it
from the list of pending patches?

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

"Robert Haas"

Дата:

26 ноября 2008 г., 20:52:11

> 'cause the great interest this one has (i'm being ironic, just in case
> ;) can we safely say this was returned with feedback and remove it
> from the list of pending patches?

Um...  are you referring to lack of interest from the patch author, or
from the community?

If the patch author is no longer interested in the patch, of course it
should be withdrawn.  But as for the community, the patch is on the
commitfest wiki[1] and you are listed as the reviewer, so I wouldn't
necessarily expect anyone else to comment at this point - although, in
fact, Emmanuel Cecchet wrote in as well, so I would say you have
exactly the opposite of a lack of interest.

If you think the patch needs further review from another reviewer, say
so.  I'm sure someone else can be assigned to do an additional review.

If you think the patch is ready to commit, say so, and update the wiki
accordingly.

...Robert

[1] http://wiki.postgresql.org/wiki/CommitFest_2008-11

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

"Jaime Casanova"

Дата:

26 ноября 2008 г., 23:13:46

On Wed, Nov 26, 2008 at 10:52 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> 'cause the great interest this one has (i'm being ironic, just in case
>> ;) can we safely say this was returned with feedback and remove it
>> from the list of pending patches?
>
>  the patch is on the
> commitfest wiki[1] and you are listed as the reviewer, so I wouldn't
> necessarily expect anyone else to comment at this point - although, in
> fact, Emmanuel Cecchet wrote in as well, so I would say you have
> exactly the opposite of a lack of interest.
>

i review it on nov 6, and there were open questions by me and by
Emmanuel none of those has been answered:
http://archives.postgresql.org/pgsql-hackers/2008-11/msg00362.php

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

"Robert Haas"

Дата:

27 ноября 2008 г., 03:09:50

> i review it on nov 6, and there were open questions by me and by
> Emmanuel none of those has been answered:
> http://archives.postgresql.org/pgsql-hackers/2008-11/msg00362.php

Hmm, there's only one actual question in that email, which is a
request for ideas about PL/pgsql vs. C.  I suspect you didn't get any
responses because the rest of the email seems to indicate that the
patch is not very mature at this point: for example, being able to
handle updates that move rows between partitions would seem to me to
be an essential feature for a project of this type, even though there
are many practical scenarios were it's unimportant.  Likewise, being
able to repartition sounds important.

With respect to the specific question about PL/pgsql vs C, I suspect
it's very unlikely that any patch of this type that relies on PL/pgsql
being loaded would be accepted into core.  However, it's possible that
a useful contrib module or pgfoundry project could be spawned on that
basis, and that might be a good place to start.

I think having a useful toolkit, or a core language feature, that
supports table partitioning would be awesome and would find very broad
application...  but it sounds like there is quite a bit of work left
to be done to get there.

...Robert

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

"Nikhil Sontakke"

Дата:

27 ноября 2008 г., 08:12:26

Hi,

i review it on nov 6, and there were open questions by me and by
> Emmanuel none of those has been answered:
> http://archives.postgresql.org/pgsql-hackers/2008-11/msg00362.php

Hmm, there's only one actual question in that email, which is a
request for ideas about PL/pgsql vs. C. I suspect you didn't get any
responses because the rest of the email seems to indicate that the
patch is not very mature at this point: for example, being able to
handle updates that move rows between partitions would seem to me to
be an essential feature for a project of this type, even though there
are many practical scenarios were it's unimportant. Likewise, being
able to repartition sounds important.

With respect to the specific question about PL/pgsql vs C, I suspect
it's very unlikely that any patch of this type that relies on PL/pgsql
being loaded would be accepted into core. However, it's possible that
a useful contrib module or pgfoundry project could be spawned on that
basis, and that might be a good place to start.

I think having a useful toolkit, or a core language feature, that
supports table partitioning would be awesome and would find very broad
application... but it sounds like there is quite a bit of work left
to be done to get there.

This patch does introduce some basic syntax to help create partitions.

The status has always being WIP, because what has not happened is that we have not had consensus on whether this is a logical first baby step ahead with partitioning. I haven't seen core members commenting on whether trying to aggregate the current set of manual operations together via this approach is worth spending further efforts, to get it into commitable shape.

To summarize, the community should decide if this is indeed the first step ahead.

Regards,
Nikhils
--
http://www.enterprisedb.com

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

Alvaro Herrera

Дата:

27 ноября 2008 г., 11:07:47

Nikhil Sontakke escribió:

> The status has always being WIP, because what has not happened is that we
> have not had consensus on whether this is a logical first baby step ahead
> with partitioning. I haven't seen core members commenting on whether trying
> to aggregate the current set of manual operations together via this approach
> is worth spending further efforts, to get it into commitable shape.

There was a lenghty, interesting discussion about this topic in the
developer meeting in Ottawa.
http://wiki.postgresql.org/wiki/PgCon_2008_Developer_Meeting#Partitioning_Roadmap

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

"Robert Haas"

Дата:

27 ноября 2008 г., 12:07:57

On Thu, Nov 27, 2008 at 7:04 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Nikhil Sontakke escribió:
>
>> The status has always being WIP, because what has not happened is that we
>> have not had consensus on whether this is a logical first baby step ahead
>> with partitioning. I haven't seen core members commenting on whether trying
>> to aggregate the current set of manual operations together via this approach
>> is worth spending further efforts, to get it into commitable shape.
>
> There was a lenghty, interesting discussion about this topic in the
> developer meeting in Ottawa.
> http://wiki.postgresql.org/wiki/PgCon_2008_Developer_Meeting#Partitioning_Roadmap

Interesting - too bad there aren't some mode detailed notes.

The semantics of PARTITION ON (<expr>) are unclear to me.  I was
thinking maybe it would make sense to do something like:

CREATE PARTITION <name> ON <table> WHERE <expr>

Then you could:

CREATE PARTITION transaction_2008_11 ON transaction WHERE record_date
BETWEEN '2008-11-01' AND '2008-11-30';

I like the idea of using table inheritance as a foundation for this
feature, but I think it's not going to be very useful for real-world
applications without cross-table indexes.  Suppose for example that I
have five years worth of data (thus, 60 partitions) and each
transaction has a unique identifier of some sort that is unrelated to
the date.  It's bad enough that a query like this has to check every
partition:

SELECT * FROM transaction WHERE uuid = ?

What's even worse (at least IMHO) is that there's no way to use
transaction (uuid) as a reference for a foreign key.

...Robert

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

Gregory Stark

Дата:

27 ноября 2008 г., 12:31:15

"Robert Haas" <robertmhaas@gmail.com> writes:

> CREATE PARTITION transaction_2008_11 ON transaction WHERE record_date
> BETWEEN '2008-11-01' AND '2008-11-30';

I think the main advantage to a better partitioning method would be teaching
Postgres about the partition key. Instead of a collection of different
constraints Postgres would know that "record_date" is *always* the partition
key. So it wouldn't have to be specified every time you declare a partition.

> I like the idea of using table inheritance as a foundation for this
> feature, but I think it's not going to be very useful for real-world
> applications without cross-table indexes.  

Well we could add support for cross-table indexes. It's not hard from the
point of low level implementation -- just include the table oid in the index
pointers. Figuring out how to represent such a thing at the index description
point of view would be quite tricky though.

*But*... in practice I would suggest that cross-table indexes are actually
very rarely useful. Having them defeats much of the advantage of partitioning
in the first place. Suddenly you would not be able to instantly drop and load
whole partitions. They're a big check-list item that people want to have
before they partition in case they need them but then they find out that the
down-sides of actually using them makes them quite useless.

Postgres's current architecture actually has a big advantage over more
methodical partitioning methods in this case. You can always add additional
constraints on other columns even if they aren't the "real" partitioning key.
So for example if you partition the invoice table by month once you close the
books for a previous month you can add a constraint WHERE invoice_id < 'xxx'.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about
EnterpriseDB'sPostgreSQL training!

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

"Nikhil Sontakke"

Дата:

27 ноября 2008 г., 13:01:40

Hi,

> The status has always being WIP, because what has not happened is that we
> have not had consensus on whether this is a logical first baby step ahead
> with partitioning. I haven't seen core members commenting on whether trying
> to aggregate the current set of manual operations together via this approach
> is worth spending further efforts, to get it into commitable shape.

There was a lenghty, interesting discussion about this topic in the
developer meeting in Ottawa.
http://wiki.postgresql.org/wiki/PgCon_2008_Developer_Meeting#Partitioning_Roadmap

The discussion is indeed interesting. But again the notes do not indicate any broad consensus on the roadmap :).

The current inheritance based mechanism has its pros-cons and there seem to be a multitude of requests/expectations around partitioning from different quarters. Even basic consensus about the syntax is missing. What we need is a step-by-step approach (starting with fixing up the syntax - if it can be done like that) and working our way downwards towards the underlying representation/planning for partitions...

Regards,
Nikhils

--
http://www.enterprisedb.com

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

"Jaime Casanova"

Дата:

27 ноября 2008 г., 13:02:07

On Thu, Nov 27, 2008 at 8:07 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> The semantics of PARTITION ON (<expr>) are unclear to me.  I was
> thinking maybe it would make sense to do something like:
>
> CREATE PARTITION <name> ON <table> WHERE <expr>
>

At first look seems nice but s Gregory said the ideal would be to
identify the key partition.

>
> I like the idea of using table inheritance as a foundation for this
> feature, but I think it's not going to be very useful for real-world
> applications without cross-table indexes.  Suppose for example that I
> have five years worth of data (thus, 60 partitions) and each
> transaction has a unique identifier of some sort that is unrelated to
> the date.  It's bad enough that a query like this has to check every
> partition:
>

you haven't. the WHERE clause in your hipotetical CREATE PARTITION
should create a check constraint on the child (inherited) table and if
you have constraint_exclusion to on you will check just the
partition(s) that match with the check constraint.

>
> What's even worse (at least IMHO) is that there's no way to use
> transaction (uuid) as a reference for a foreign key.
>

not directly, but you always can create a trigger instead of the
foreign key constraint...
mmm...the docs says that there is no good workaround, what about
mention a trigger?

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

"Robert Haas"

Дата:

27 ноября 2008 г., 13:41:09

On Thu, Nov 27, 2008 at 8:31 AM, Gregory Stark <stark@enterprisedb.com> wrote:
>> CREATE PARTITION transaction_2008_11 ON transaction WHERE record_date
>> BETWEEN '2008-11-01' AND '2008-11-30';
>
> I think the main advantage to a better partitioning method would be teaching
> Postgres about the partition key. Instead of a collection of different
> constraints Postgres would know that "record_date" is *always* the partition
> key. So it wouldn't have to be specified every time you declare a partition.

Hmm...  I thought the main advantage would be that you wouldn't have
to manually add constraints to all of the child tables, and you
wouldn't have to manually add rules/triggers to the parent table to
redirect DML operations.

What do you see as the advantage of pre-declaring record_date as the
partition key?  The major advantage I can think of is that it should
simplify constraint exclusion calculations considerably.  Also, you
can easily enforce that partitions are non-overlapping.  The
disadvantage is that you can't support more complex partitioning
schemes that can't be expressed in terms of ranges on a single key (an
obvious case is when you want to partition by date AND transaction
type, though that could probably be made to work if you allow
specifying multiple partition keys; less tractable cases are
imaginable).

I guess we could decide we don't care about the more complex
scenarios.  Or we could offer:

CREATE TABLE (...) WITH PARTITIONING;  -- ad-hoc partitioning
CREATE TABLE (...) WITH PARTITIONING ON (...);  -- partition keys must
be non-overlapping slices based only on the given columns

> *But*... in practice I would suggest that cross-table indexes are actually
> very rarely useful. Having them defeats much of the advantage of partitioning
> in the first place. Suddenly you would not be able to instantly drop and load
> whole partitions. They're a big check-list item that people want to have
> before they partition in case they need them but then they find out that the
> down-sides of actually using them makes them quite useless.

That's possible.  My every attempt to use inheritance has been stymied
by lack of this feature, but my attempts may not be representative.
In any case, the projects are severable.

> Postgres's current architecture actually has a big advantage over more
> methodical partitioning methods in this case. You can always add additional
> constraints on other columns even if they aren't the "real" partitioning key.
> So for example if you partition the invoice table by month once you close the
> books for a previous month you can add a constraint WHERE invoice_id < 'xxx'.

That's cool.

...Robert

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

"Robert Haas"

Дата:

27 ноября 2008 г., 13:47:32

>> I like the idea of using table inheritance as a foundation for this
>> feature, but I think it's not going to be very useful for real-world
>> applications without cross-table indexes.  Suppose for example that I
>> have five years worth of data (thus, 60 partitions) and each
>> transaction has a unique identifier of some sort that is unrelated to
>> the date.  It's bad enough that a query like this has to check every
>> partition:
> you haven't. the WHERE clause in your hipotetical CREATE PARTITION
> should create a check constraint on the child (inherited) table and if
> you have constraint_exclusion to on you will check just the
> partition(s) that match with the check constraint.

The problem is that constraint exclusion will not be able to exclude
anything for queries unrelated to the partition key. If my
transactions are identified by UUIDs or similar, there's no way to
predict which table will contain any particular value.  You end up
having to scan them all, and even if they all have individual indices
on the column in question, that's still 60 index scans instead of 1.

>> What's even worse (at least IMHO) is that there's no way to use
>> transaction (uuid) as a reference for a foreign key.
> not directly, but you always can create a trigger instead of the
> foreign key constraint...
> mmm...the docs says that there is no good workaround, what about
> mention a trigger?

I think it's pretty hard to make this bulletproof.  I think the
triggers that enforce ordinary foreign key constraints contain some
magical cross-checks on transaction commit that can't easily be
emulated by user-written triggers.  In any case, it's a long way from
"Oh, yeah, that just works."

...Robert

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

"Jaime Casanova"

Дата:

27 ноября 2008 г., 14:10:43

On Thu, Nov 27, 2008 at 9:41 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Nov 27, 2008 at 8:31 AM, Gregory Stark <stark@enterprisedb.com> wrote:
>>> CREATE PARTITION transaction_2008_11 ON transaction WHERE record_date
>>> BETWEEN '2008-11-01' AND '2008-11-30';
>>
>> I think the main advantage to a better partitioning method would be teaching
>> Postgres about the partition key. Instead of a collection of different
>> constraints Postgres would know that "record_date" is *always* the partition
>> key. So it wouldn't have to be specified every time you declare a partition.
>
> Hmm...  I thought the main advantage would be that you wouldn't have
> to manually add constraints to all of the child tables, and you
> wouldn't have to manually add rules/triggers to the parent table to
> redirect DML operations.
>

ok. what about let CREATE TABLE WITH PARTITIONING to create an entry
in a catalog indicating the key of the partition and install the
triggers and let the trigger decide if it has the partition to insert
the new row (making UPDATE working almost as DELETE+INSERT if it needs
to change of partitions) or create the new partition maybe with an
apropiate CREATE PARTITION...

that way i don't need to create triggers nor inherit tables
manually... and because of that maybe we can make possible to add
<expr> as partition key...

PS: i'm against using CREATE TABLE because we are inventing new syntax
but it seems like using ALTER TABLE is a *lot* of work altough ISTM
more usefull

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

"Jaime Casanova"

Дата:

27 ноября 2008 г., 14:12:18

On Thu, Nov 27, 2008 at 10:10 AM, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:
> On Thu, Nov 27, 2008 at 9:41 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Thu, Nov 27, 2008 at 8:31 AM, Gregory Stark <stark@enterprisedb.com> wrote:
>>>> CREATE PARTITION transaction_2008_11 ON transaction WHERE record_date
>>>> BETWEEN '2008-11-01' AND '2008-11-30';
>>>
>>> I think the main advantage to a better partitioning method would be teaching
>>> Postgres about the partition key. Instead of a collection of different
>>> constraints Postgres would know that "record_date" is *always* the partition
>>> key. So it wouldn't have to be specified every time you declare a partition.
>>
>> Hmm...  I thought the main advantage would be that you wouldn't have
>> to manually add constraints to all of the child tables, and you
>> wouldn't have to manually add rules/triggers to the parent table to
>> redirect DML operations.
>>
>
> ok. what about let CREATE TABLE WITH PARTITIONING to create an entry
> in a catalog indicating the key of the partition and install the
> triggers and let the trigger decide if it has the partition to insert
> the new row (making UPDATE working almost as DELETE+INSERT if it needs
> to change of partitions) or create the new partition maybe with an
> apropiate CREATE PARTITION...
>

i thik i have to clarify this...

i intend to say that, the trigger will insert or create the partition
and insert...

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

Emmanuel Cecchet

Дата:

27 ноября 2008 г., 15:16:29

Hi all,

I have been following that discussion very closely but it seems that we 
are debating solutions without a good specification of the 
problem/requirements.
I would suggest that we collect all the partitioning requirements on a 
dedicated Wiki page. There might not be a one size fits it all solution 
for all requirements. We can also look at what other databases are 
proposing to address these issues.
If we can prioritize features, that should also allow us to stage the 
partitioning implementation.
I have a prototype insert trigger in C that directly move inserts in a 
master table to the appropriate child table (directly moving the tuple). 
Let me know if anyone is interested.

Emmanuel

Jaime Casanova wrote:
> On Thu, Nov 27, 2008 at 10:10 AM, Jaime Casanova
> <jcasanov@systemguards.com.ec> wrote:
>   
>> On Thu, Nov 27, 2008 at 9:41 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>     
>>> On Thu, Nov 27, 2008 at 8:31 AM, Gregory Stark <stark@enterprisedb.com> wrote:
>>>       
>>>>> CREATE PARTITION transaction_2008_11 ON transaction WHERE record_date
>>>>> BETWEEN '2008-11-01' AND '2008-11-30';
>>>>>           
>>>> I think the main advantage to a better partitioning method would be teaching
>>>> Postgres about the partition key. Instead of a collection of different
>>>> constraints Postgres would know that "record_date" is *always* the partition
>>>> key. So it wouldn't have to be specified every time you declare a partition.
>>>>         
>>> Hmm...  I thought the main advantage would be that you wouldn't have
>>> to manually add constraints to all of the child tables, and you
>>> wouldn't have to manually add rules/triggers to the parent table to
>>> redirect DML operations.
>>>
>>>       
>> ok. what about let CREATE TABLE WITH PARTITIONING to create an entry
>> in a catalog indicating the key of the partition and install the
>> triggers and let the trigger decide if it has the partition to insert
>> the new row (making UPDATE working almost as DELETE+INSERT if it needs
>> to change of partitions) or create the new partition maybe with an
>> apropiate CREATE PARTITION...
>>
>>     
>
> i thik i have to clarify this...
>
> i intend to say that, the trigger will insert or create the partition
> and insert...
>
>   


-- 
Emmanuel Cecchet
FTO @ Frog Thinker 
Open Source Development & Consulting
--
Web: http://www.frogthinker.org
email: manu@frogthinker.org
Skype: emmanuel_cecchet

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

"Robert Haas"

Дата:

27 ноября 2008 г., 15:38:44

> ok. what about let CREATE TABLE WITH PARTITIONING to create an entry
> in a catalog indicating the key of the partition and install the
> triggers and let the trigger decide if it has the partition to insert
> the new row (making UPDATE working almost as DELETE+INSERT if it needs
> to change of partitions) or create the new partition maybe with an
> apropiate CREATE PARTITION...
>
> that way i don't need to create triggers nor inherit tables
> manually... and because of that maybe we can make possible to add
> <expr> as partition key...
>
>
> PS: i'm against using CREATE TABLE because we are inventing new syntax
> but it seems like using ALTER TABLE is a *lot* of work altough ISTM
> more usefull

I think that's one of the useful things that could be done in this
area (not the only one, certainly), but I don't think we've defined
the semantics well enough to start talking about exactly which
commands to use.  As to CREATE TABLE and ALTER TABLE, I suspect you'll
need both.  We have to come to some consensus on whether predefining a
partition key is necessary, optional, or not supported.  And we need
to define ways both to set things up and to change them later.

If there is no predefined partition key, there's probably nothing
terribly special that needs to be done to prepare a table for
partitioning.  You could decide that all the data will live in the
parent table except for the partitions that are explicitly created.
When the user creates a partition, you create the new child table, set
it to inherit from the parent, add the necessary constraint,
create/update an automatically generated rule/trigger on the parent
that redirects DML to the appropriate partition, and move any EXISTING
tuples that belong in that partition into it.  You'd also need
operations to merge a partition back into the parent table (moving the
data back), drop a partition (lose the data), and change the
definition of a partition (move data around).

A significant problem with this design is that you don't know that the
partition constraints are mutually exclusive.  What do you do with
data that matches multiple partition constraints?  You'll have to
devise some rule, like maybe picking the first partition
alphabetically, which will complicate the rearrangement of data when
partitions are added or removed.

If there IS a predefined partition key, then you'll need a way to tell
the parent table what it is (and a way to remove it later if you
change your mind).  Then it should be possible to validate that child
partitions are defined only in terms of that key and that they are
mutually exclusive.  You'll still need basically all the same
operations: create partition, modify partition, merge partition back
into parent, drop partition.

...Robert

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

"Robert Haas"

Дата:

02 декабря 2008 г., 21:52:11

On Thu, Nov 27, 2008 at 11:09 AM, Emmanuel Cecchet <manu@frogthinker.org> wrote:
> I have been following that discussion very closely but it seems that we are
> debating solutions without a good specification of the problem/requirements.
> I would suggest that we collect all the partitioning requirements on a
> dedicated Wiki page. There might not be a one size fits it all solution for
> all requirements. We can also look at what other databases are proposing to
> address these issues.
> If we can prioritize features, that should also allow us to stage the
> partitioning implementation.

This might be a good idea.  Want to take a crack at it?

> I have a prototype insert trigger in C that directly move inserts in a
> master table to the appropriate child table (directly moving the tuple). Let
> me know if anyone is interested.

Can't hurt to post it.

...Robert

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

Emmanuel Cecchet

Дата:

09 декабря 2008 г., 23:57:52

Hi all,

While I was trying to find the right place to add a new page on the
wiki, I found the document of Simon on partitioning requirements
(http://wiki.postgresql.org/wiki/Image:Partitioning_Requirements.pdf)
referenced from http://wiki.postgresql.org/wiki/Development_projects
I think  this is a good base to start from. Should we convert the doc
into a wiki page or get the source for the doc and go from there?

I attach what I have come up with so far for the C trigger I was talking
about for efficient automatic auto-partitioning of inserts in child tables.

Emmanuel

Robert Haas wrote:
> On Thu, Nov 27, 2008 at 11:09 AM, Emmanuel Cecchet <manu@frogthinker.org> wrote:
>
>> I have been following that discussion very closely but it seems that we are
>> debating solutions without a good specification of the problem/requirements.
>> I would suggest that we collect all the partitioning requirements on a
>> dedicated Wiki page. There might not be a one size fits it all solution for
>> all requirements. We can also look at what other databases are proposing to
>> address these issues.
>> If we can prioritize features, that should also allow us to stage the
>> partitioning implementation.
>>
>
> This might be a good idea.  Want to take a crack at it?
>
>
>> I have a prototype insert trigger in C that directly move inserts in a
>> master table to the appropriate child table (directly moving the tuple). Let
>> me know if anyone is interested.
>>
>
> Can't hurt to post it.
>
> ...Robert
>
>


--
Emmanuel Cecchet
FTO @ Frog Thinker
Open Source Development & Consulting
--
Web: http://www.frogthinker.org
email: manu@frogthinker.org
Skype: emmanuel_cecchet

### Eclipse Workspace Patch 1.0
#P Postgres-HEAD
Index: src/test/regress/regress.c
===================================================================
RCS file: /root/cvsrepo/pgsql/src/test/regress/regress.c,v
retrieving revision 1.71
diff -u -r1.71 regress.c
--- src/test/regress/regress.c    25 Mar 2008 22:42:46 -0000    1.71
+++ src/test/regress/regress.c    13 Nov 2008 06:11:08 -0000
@@ -10,6 +10,9 @@
 #include "utils/geo_decls.h"    /* includes <math.h> */
 #include "executor/executor.h"    /* For GetAttributeByName */
 #include "commands/sequence.h"    /* for nextval() */
+#include "catalog/namespace.h"
+#include "executor/executor.h"
+#include "executor/tuptable.h"

 #define P_MAXDIG 12
 #define LDELIM            '('
@@ -732,3 +735,90 @@
     *--walk = '\0';
     PG_RETURN_CSTRING(result);
 }
+
+
+/*
+ * Partition trigger test
+ *
+ * The trigger should be used this way:
+ * CREATE TRIGGER child_table_name
+    BEFORE INSERT ON master_table
+    FOR EACH ROW EXECUTE PROCEDURE partition_insert_trigger();
+ */
+
+extern Datum partition_insert_trigger(PG_FUNCTION_ARGS);
+
+PG_FUNCTION_INFO_V1(partition_insert_trigger);
+
+Datum
+partition_insert_trigger(PG_FUNCTION_ARGS)
+{
+    TriggerData *trigdata = (TriggerData *) fcinfo->context;
+    HeapTuple    trigtuple= trigdata->tg_trigtuple;
+    char        *child_table_name;
+    Relation child_table_relation;
+    Oid relation_id;
+
+    /* make sure it's called as a trigger at all */
+    if (!CALLED_AS_TRIGGER(fcinfo))
+        elog(ERROR, "partition_insert_trigger: not called by trigger manager");
+
+    /* Sanity checks */
+    if (!TRIGGER_FIRED_BY_INSERT(trigdata->tg_event) || !TRIGGER_FIRED_BEFORE(trigdata->tg_event))
+        elog(ERROR, "partition_insert_trigger: not called on insert before");
+
+    // Child table name is either given as the unique parameter or it is the name of the trigger
+    if (trigdata->tg_trigger->tgnargs == 1)
+        child_table_name = trigdata->tg_trigger->tgargs[0];
+    else
+        child_table_name = trigdata->tg_trigger->tgname;
+
+    // Lookup the child relation
+    relation_id = RelnameGetRelid(child_table_name);
+    if (relation_id == InvalidOid)
+        elog(ERROR, "partition_insert_trigger: Invalid child table %s", child_table_name);
+    child_table_relation = RelationIdGetRelation(relation_id);
+    if (child_table_relation == NULL)
+        elog(ERROR, "partition_insert_trigger: Failed to locate relation for child table %s", child_table_name);
+
+    { // Check the constraints
+        TupleConstr *constr = child_table_relation->rd_att->constr;
+
+        if (constr->num_check > 0)
+        {
+            ResultRelInfo *resultRelInfo;
+            TupleTableSlot *slot;
+            EState *estate= CreateExecutorState();
+
+            resultRelInfo = makeNode(ResultRelInfo);
+            resultRelInfo->ri_RangeTableIndex = 1;        /* dummy */
+            resultRelInfo->ri_RelationDesc = child_table_relation;
+
+            estate->es_result_relations = resultRelInfo;
+            estate->es_num_result_relations = 1;
+            estate->es_result_relation_info = resultRelInfo;
+
+            /* Set up a tuple slot too */
+            slot = MakeSingleTupleTableSlot(trigdata->tg_relation->rd_att);
+            ExecStoreTuple(trigtuple, slot, InvalidBuffer, false);
+
+            if (ExecRelCheck(resultRelInfo, slot, estate) == NULL)
+            { // Constraints satisfied, insert the row in the child table
+                bool use_wal = true;
+                bool use_fsm=true;
+
+                heap_insert(child_table_relation, trigtuple, GetCurrentCommandId(true), use_wal, use_fsm);
+                RelationClose(child_table_relation);
+                ExecDropSingleTupleTableSlot(slot);
+                return PointerGetDatum(NULL);
+            }
+            ExecDropSingleTupleTableSlot(slot);
+        }
+        else
+            elog(ERROR, "partition_insert_trigger: No constraint found for child table %s", child_table_name);
+    }
+    RelationClose(child_table_relation);
+
+    return PointerGetDatum(trigdata->tg_trigtuple);
+}
+
Index: src/backend/executor/execMain.c
===================================================================
RCS file: /root/cvsrepo/pgsql/src/backend/executor/execMain.c,v
retrieving revision 1.314
diff -u -r1.314 execMain.c
--- src/backend/executor/execMain.c    31 Oct 2008 21:07:54 -0000    1.314
+++ src/backend/executor/execMain.c    13 Nov 2008 06:11:08 -0000
@@ -1947,7 +1947,7 @@
 /*
  * ExecRelCheck --- check that tuple meets constraints for result relation
  */
-static const char *
+const char *
 ExecRelCheck(ResultRelInfo *resultRelInfo,
              TupleTableSlot *slot, EState *estate)
 {
Index: src/test/regress/output/create_function_1.source
===================================================================
RCS file: /root/cvsrepo/pgsql/src/test/regress/output/create_function_1.source,v
retrieving revision 1.34
diff -u -r1.34 create_function_1.source
--- src/test/regress/output/create_function_1.source    31 Oct 2008 19:37:56 -0000    1.34
+++ src/test/regress/output/create_function_1.source    13 Nov 2008 06:11:08 -0000
@@ -47,6 +47,10 @@
         RETURNS int4
         AS '@libdir@/regress@DLSUFFIX@'
         LANGUAGE C STRICT;
+CREATE FUNCTION partition_insert_trigger ()
+        RETURNS trigger
+        AS '@libdir@/regress@DLSUFFIX@'
+        LANGUAGE C STRICT;
 -- Things that shouldn't work:
 CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL
     AS 'SELECT ''not an integer'';';
@@ -80,3 +84,70 @@
 CREATE FUNCTION test1 (int) RETURNS int LANGUAGE internal
     AS 'nosuch';
 ERROR:  there is no built-in function named "nosuch"
+-- Partitioning trigger test
+CREATE TABLE master (
+  id     int not null,
+  date   date not null,
+  value  int
+);
+CREATE TABLE child_y2008m01 (
+    CHECK ( date >= DATE '2008-01-01' AND date < DATE '2008-02-01' )
+) INHERITS (master);
+CREATE TABLE child_y2008m02 (
+    CHECK ( date >= DATE '2008-02-01' AND date < DATE '2008-03-01' )
+) INHERITS (master);
+CREATE TABLE child_y2008m03 (
+    CHECK ( date >= DATE '2008-03-01' AND date < DATE '2008-04-01' )
+) INHERITS (master);
+CREATE TRIGGER insert_child_y2008m01
+    BEFORE INSERT ON master
+    FOR EACH ROW EXECUTE PROCEDURE partition_insert_trigger('child_y2008m01');
+CREATE TRIGGER child_y2008m02
+    BEFORE INSERT ON master
+    FOR EACH ROW EXECUTE PROCEDURE partition_insert_trigger();
+CREATE TRIGGER child_y2008m03
+    BEFORE INSERT ON master
+    FOR EACH ROW EXECUTE PROCEDURE partition_insert_trigger();
+INSERT INTO master VALUES (1, '2008-01-15', 1);
+INSERT INTO master VALUES (2, '2008-02-15', 2);
+INSERT INTO master VALUES (3, '2008-03-15', 3);
+INSERT INTO master VALUES (4, '2008-04-15', 4);
+COPY master FROM '/root/pg_partitions/copy_input.txt';
+select * from master;
+ id |    date    | value
+----+------------+-------
+  4 | 04-15-2008 |     4
+  1 | 01-15-2008 |     1
+ 11 | 01-10-2008 |    11
+  2 | 02-15-2008 |     2
+ 12 | 02-15-2008 |    12
+  3 | 03-15-2008 |     3
+ 13 | 03-15-2008 |    13
+(7 rows)
+
+ select * from child_y2008m01;
+ id |    date    | value
+----+------------+-------
+  1 | 01-15-2008 |     1
+ 11 | 01-10-2008 |    11
+(2 rows)
+
+  select * from child_y2008m02;
+ id |    date    | value
+----+------------+-------
+  2 | 02-15-2008 |     2
+ 12 | 02-15-2008 |    12
+(2 rows)
+
+  select * from child_y2008m03;
+ id |    date    | value
+----+------------+-------
+  3 | 03-15-2008 |     3
+ 13 | 03-15-2008 |    13
+(2 rows)
+
+DROP TABLE master CASCADE;
+NOTICE:  drop cascades to 3 other objects
+DETAIL:  drop cascades to table child_y2008m01
+drop cascades to table child_y2008m02
+drop cascades to table child_y2008m03
Index: src/include/executor/executor.h
===================================================================
RCS file: /root/cvsrepo/pgsql/src/include/executor/executor.h,v
retrieving revision 1.152
diff -u -r1.152 executor.h
--- src/include/executor/executor.h    31 Oct 2008 21:07:55 -0000    1.152
+++ src/include/executor/executor.h    13 Nov 2008 06:11:08 -0000
@@ -155,6 +155,8 @@
 extern bool ExecContextForcesOids(PlanState *planstate, bool *hasoids);
 extern void ExecConstraints(ResultRelInfo *resultRelInfo,
                 TupleTableSlot *slot, EState *estate);
+extern const char *ExecRelCheck(ResultRelInfo *resultRelInfo,
+             TupleTableSlot *slot, EState *estate);
 extern TupleTableSlot *EvalPlanQual(EState *estate, Index rti,
              ItemPointer tid, TransactionId priorXmax);
 extern PlanState *ExecGetActivePlanTree(QueryDesc *queryDesc);
Index: src/test/regress/input/create_function_1.source
===================================================================
RCS file: /root/cvsrepo/pgsql/src/test/regress/input/create_function_1.source,v
retrieving revision 1.19
diff -u -r1.19 create_function_1.source
--- src/test/regress/input/create_function_1.source    1 Oct 2008 22:38:57 -0000    1.19
+++ src/test/regress/input/create_function_1.source    13 Nov 2008 06:11:08 -0000
@@ -52,6 +52,12 @@
         AS '@libdir@/regress@DLSUFFIX@'
         LANGUAGE C STRICT;

+CREATE FUNCTION partition_insert_trigger ()
+        RETURNS trigger
+        AS '@libdir@/regress@DLSUFFIX@'
+        LANGUAGE C STRICT;
+
+
 -- Things that shouldn't work:

 CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL
@@ -77,3 +83,48 @@

 CREATE FUNCTION test1 (int) RETURNS int LANGUAGE internal
     AS 'nosuch';
+
+-- Partitioning trigger test
+
+CREATE TABLE master (
+  id     int not null,
+  date   date not null,
+  value  int
+);
+
+CREATE TABLE child_y2008m01 (
+    CHECK ( date >= DATE '2008-01-01' AND date < DATE '2008-02-01' )
+) INHERITS (master);
+
+CREATE TABLE child_y2008m02 (
+    CHECK ( date >= DATE '2008-02-01' AND date < DATE '2008-03-01' )
+) INHERITS (master);
+
+CREATE TABLE child_y2008m03 (
+    CHECK ( date >= DATE '2008-03-01' AND date < DATE '2008-04-01' )
+) INHERITS (master);
+
+CREATE TRIGGER insert_child_y2008m01
+    BEFORE INSERT ON master
+    FOR EACH ROW EXECUTE PROCEDURE partition_insert_trigger('child_y2008m01');
+
+CREATE TRIGGER child_y2008m02
+    BEFORE INSERT ON master
+    FOR EACH ROW EXECUTE PROCEDURE partition_insert_trigger();
+
+CREATE TRIGGER child_y2008m03
+    BEFORE INSERT ON master
+    FOR EACH ROW EXECUTE PROCEDURE partition_insert_trigger();
+
+INSERT INTO master VALUES (1, '2008-01-15', 1);
+INSERT INTO master VALUES (2, '2008-02-15', 2);
+INSERT INTO master VALUES (3, '2008-03-15', 3);
+INSERT INTO master VALUES (4, '2008-04-15', 4);
+COPY master FROM '/root/pg_partitions/copy_input.txt';
+
+select * from master;
+select * from child_y2008m01;
+select * from child_y2008m02;
+select * from child_y2008m03;
+
+DROP TABLE master CASCADE;

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

Josh Berkus

Дата:

15 декабря 2008 г., 17:16:01

Hackers,

We don't yet seem to have a clear specification for this feature, and 
the Other Open Source DB has shown us how problematic it is to get 
auto-partitioning wrong.

Should we defer auto-partitioning to 8.5?

--Josh

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

Gregory Stark

Дата:

15 декабря 2008 г., 17:48:31

Josh Berkus <josh@agliodbs.com> writes:

> Hackers,
>
> We don't yet seem to have a clear specification for this feature, and the Other
> Open Source DB has shown us how problematic it is to get auto-partitioning
> wrong.
>
> Should we defer auto-partitioning to 8.5?

If we're serious about having a "next generation" partitioning with a concept
of partition keys then it seems to me to make more sense to do that first and
then add on a feature like this. 

This is still very useful. I haven't looked at the actual patch, does it
require core changes or can it be stashed in a pgfoundry or contrib module?

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

"Jaime Casanova"

Дата:

15 декабря 2008 г., 18:01:25

On Mon, Dec 15, 2008 at 1:46 PM, Gregory Stark <stark@enterprisedb.com> wrote:
> Josh Berkus <josh@agliodbs.com> writes:
>
>> Hackers,
>>
>> We don't yet seem to have a clear specification for this feature, and the Other
>> Open Source DB has shown us how problematic it is to get auto-partitioning
>> wrong.
>>
>> Should we defer auto-partitioning to 8.5?
>
> If we're serious about having a "next generation" partitioning with a concept
> of partition keys then it seems to me to make more sense to do that first and
> then add on a feature like this.
>

+1

> This is still very useful. I haven't looked at the actual patch, does it
> require core changes or can it be stashed in a pgfoundry or contrib module?
>

what i don't like about this one is that it creates partitions at
create table time and to manually add all new partitions (inherit
tables and modify the trigger)... and what i want to see is an
automatic creation when it's needed...


--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

Emmanuel Cecchet

Дата:

16 декабря 2008 г., 01:05:26

Hi all,

I will be working on a roadmap for the partitioning features. I think
that there are different needs and that we will not be able to address
them all in 8.5 or even 8.6.
The goal will be to get things done step by step but possibly with a
design that will not require major refactoring to support new features.
I will try to setup the new wiki page tomorrow.

In the meantime, I have made some more tests with the trigger in C (see
attached patch). To prevent duplicating too much code, it requires the
ExecRelCheck method to be exported (that would be nice to have this
function exported in 8.4 so that we can start experimenting in 8.4 and
don't have to wait another year for 8.5). If there is locality in the
inserts (which might be the case if you COPY sorted data), the
performance remains constant regardless the number of child tables.
My initial tests to insert 140k rows are as follows:
- direct inserts in a child table: 2 seconds
- pgplsql trigger (IF ... ELSE IF ... blocks) : 14.5 seconds.
- C trigger: 4 seconds (actually the overhead is in the constraint check)

Right now if the row cannot be inserted in any child table, it is
inserted in the parent. If you want to fail, we can add a 'fail trigger'
(after all other triggers) that generates an error if previous triggers
did not capture the row. If you want to create a new partition, you can
have another trigger to handle that.

So I think that this trigger approach is pretty flexible like people
used AOP in J2EE servers to process requests. It has also the advantage
of allowing fast prototyping. It should also be easy to push that
functionality down in the core as needed.

Is it ok if I move Simon's requirement document under a more generic
'Table partitioning' page on the Wiki?

Thanks for your feedback,
manu

Jaime Casanova wrote:
> On Mon, Dec 15, 2008 at 1:46 PM, Gregory Stark <stark@enterprisedb.com> wrote:
>
>> Josh Berkus <josh@agliodbs.com> writes:
>>
>>
>>> Hackers,
>>>
>>> We don't yet seem to have a clear specification for this feature, and the Other
>>> Open Source DB has shown us how problematic it is to get auto-partitioning
>>> wrong.
>>>
>>> Should we defer auto-partitioning to 8.5?
>>>
>> If we're serious about having a "next generation" partitioning with a concept
>> of partition keys then it seems to me to make more sense to do that first and
>> then add on a feature like this.
>>
>>
>
> +1
>
>
>> This is still very useful. I haven't looked at the actual patch, does it
>> require core changes or can it be stashed in a pgfoundry or contrib module?
>>
>>
>
> what i don't like about this one is that it creates partitions at
> create table time and to manually add all new partitions (inherit
> tables and modify the trigger)... and what i want to see is an
> automatic creation when it's needed...
>
>
>


--
Emmanuel Cecchet
FTO @ Frog Thinker
Open Source Development & Consulting
--
Web: http://www.frogthinker.org
email: manu@frogthinker.org
Skype: emmanuel_cecchet

### Eclipse Workspace Patch 1.0
#P Postgres-HEAD
Index: src/test/regress/regress.c
===================================================================
RCS file: /root/cvsrepo/pgsql/src/test/regress/regress.c,v
retrieving revision 1.71
diff -u -r1.71 regress.c
--- src/test/regress/regress.c    25 Mar 2008 22:42:46 -0000    1.71
+++ src/test/regress/regress.c    16 Dec 2008 01:42:50 -0000
@@ -10,6 +10,9 @@
 #include "utils/geo_decls.h"    /* includes <math.h> */
 #include "executor/executor.h"    /* For GetAttributeByName */
 #include "commands/sequence.h"    /* for nextval() */
+#include "catalog/namespace.h"
+#include "executor/executor.h"
+#include "executor/tuptable.h"

 #define P_MAXDIG 12
 #define LDELIM            '('
@@ -732,3 +735,141 @@
     *--walk = '\0';
     PG_RETURN_CSTRING(result);
 }
+
+
+/*
+ * Partition trigger test
+ *
+ * The trigger should be used this way:
+ * CREATE TRIGGER trigger_name
+    BEFORE INSERT ON master_table
+    FOR EACH ROW EXECUTE PROCEDURE partition_insert_trigger(child_table_name, column_number, min_val, max_val);
+ */
+
+extern Datum partition_insert_trigger(PG_FUNCTION_ARGS);
+
+static Datum
+check_constraints_and_insert_tuple(Relation child_table_relation, TriggerData *trigdata ,HeapTuple trigtuple)
+{ // Check the constraints
+    ResultRelInfo    *resultRelInfo;
+    TupleTableSlot    *slot;
+    EState             *estate = CreateExecutorState();
+    Datum             result;
+
+    result = PointerGetDatum(trigdata->tg_trigtuple);
+
+    resultRelInfo = makeNode(ResultRelInfo);
+    resultRelInfo->ri_RangeTableIndex = 1;        /* dummy */
+    resultRelInfo->ri_RelationDesc = child_table_relation;
+
+    estate->es_result_relations = resultRelInfo;
+    estate->es_num_result_relations = 1;
+    estate->es_result_relation_info = resultRelInfo;
+
+    /* Set up a tuple slot too */
+    slot = MakeSingleTupleTableSlot(trigdata->tg_relation->rd_att);
+    ExecStoreTuple(trigtuple, slot, InvalidBuffer, false);
+
+    if (ExecRelCheck(resultRelInfo, slot, estate) == NULL)
+    { // Constraints satisfied, insert the row in the child table
+        bool use_wal = true;
+        bool use_fsm=true;
+
+        /* BEFORE ROW INSERT Triggers */
+        if (resultRelInfo->ri_TrigDesc &&
+                resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_INSERT] > 0)
+        {
+            HeapTuple    newtuple;
+            newtuple = ExecBRInsertTriggers(estate, resultRelInfo, trigtuple);
+
+            if (newtuple != trigtuple) /* modified by Trigger(s) */
+            {
+                heap_freetuple(trigtuple);
+                trigtuple = newtuple;
+            }
+        }
+
+        /* Perform the insert
+         * TODO: Check that we detect constraint violation if before row insert does something bad */
+        heap_insert(child_table_relation, trigtuple, GetCurrentCommandId(true), use_wal, use_fsm);
+
+        /* Update indices */
+        ExecOpenIndices(resultRelInfo);
+        if (resultRelInfo->ri_NumIndices > 0)
+            ExecInsertIndexTuples(slot, &(trigtuple->t_self), estate, false);
+
+        /* AFTER ROW INSERT Triggers */
+        ExecARInsertTriggers(estate, resultRelInfo, trigtuple);
+
+        result = PointerGetDatum(NULL);
+    }
+    // Free resources
+    FreeExecutorState(estate);
+    ExecDropSingleTupleTableSlot(slot);
+
+    return result;
+}
+
+
+PG_FUNCTION_INFO_V1(partition_insert_trigger);
+
+static Relation    last_inserted_relation;
+
+Datum
+partition_insert_trigger(PG_FUNCTION_ARGS)
+{
+    TriggerData *trigdata = (TriggerData *) fcinfo->context;
+    HeapTuple    trigtuple= trigdata->tg_trigtuple;
+    TupleConstr *constr;
+    char        *child_table_name;
+    Relation    child_table_relation;
+    Oid            relation_id;
+    Datum        result;
+
+    // Try to exploit locality for bulk inserts
+    // We expect consecutive insert to go to the same child table
+    if (last_inserted_relation != NULL)
+    {   // Try the last table we used
+        result = check_constraints_and_insert_tuple(last_inserted_relation, trigdata, trigtuple);
+        if (result == PointerGetDatum(NULL))
+            return result;
+        // We got a miss
+        last_inserted_relation = NULL;
+    }
+
+    /* make sure it's called as a trigger at all */
+    if (!CALLED_AS_TRIGGER(fcinfo))
+        elog(ERROR, "partition_insert_trigger: not called by trigger manager");
+
+    /* Sanity checks */
+    if (!TRIGGER_FIRED_BY_INSERT(trigdata->tg_event) || !TRIGGER_FIRED_BEFORE(trigdata->tg_event))
+        elog(ERROR, "partition_insert_trigger: not called on insert before");
+
+    // Child table name is either given as the unique parameter or it is the name of the trigger
+    if (trigdata->tg_trigger->tgnargs == 1)
+        child_table_name = trigdata->tg_trigger->tgargs[0];
+    else
+        child_table_name = trigdata->tg_trigger->tgname;
+
+    // Lookup the child relation
+    relation_id = RelnameGetRelid(child_table_name);
+    if (relation_id == InvalidOid)
+        elog(ERROR, "partition_insert_trigger: Invalid child table %s", child_table_name);
+    child_table_relation = RelationIdGetRelation(relation_id);
+    if (child_table_relation == NULL)
+        elog(ERROR, "partition_insert_trigger: Failed to locate relation for child table %s", child_table_name);
+
+    constr = child_table_relation->rd_att->constr;
+    if (constr->num_check == 0)
+        elog(ERROR, "partition_insert_trigger: No constraint found for child table %s", child_table_name);
+
+    result = check_constraints_and_insert_tuple(child_table_relation, trigdata, trigtuple);
+
+    if (result == PointerGetDatum(NULL))
+        last_inserted_relation = child_table_relation;
+
+    RelationClose(child_table_relation);
+
+    return result;
+}
+
Index: src/backend/executor/execMain.c
===================================================================
RCS file: /root/cvsrepo/pgsql/src/backend/executor/execMain.c,v
retrieving revision 1.314
diff -u -r1.314 execMain.c
--- src/backend/executor/execMain.c    31 Oct 2008 21:07:54 -0000    1.314
+++ src/backend/executor/execMain.c    16 Dec 2008 01:42:50 -0000
@@ -1947,7 +1947,7 @@
 /*
  * ExecRelCheck --- check that tuple meets constraints for result relation
  */
-static const char *
+const char *
 ExecRelCheck(ResultRelInfo *resultRelInfo,
              TupleTableSlot *slot, EState *estate)
 {
Index: src/test/regress/output/create_function_1.source
===================================================================
RCS file: /root/cvsrepo/pgsql/src/test/regress/output/create_function_1.source,v
retrieving revision 1.34
diff -u -r1.34 create_function_1.source
--- src/test/regress/output/create_function_1.source    31 Oct 2008 19:37:56 -0000    1.34
+++ src/test/regress/output/create_function_1.source    16 Dec 2008 01:42:50 -0000
@@ -47,6 +47,10 @@
         RETURNS int4
         AS '@libdir@/regress@DLSUFFIX@'
         LANGUAGE C STRICT;
+CREATE FUNCTION partition_insert_trigger ()
+        RETURNS trigger
+        AS '@libdir@/regress@DLSUFFIX@'
+        LANGUAGE C STRICT;
 -- Things that shouldn't work:
 CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL
     AS 'SELECT ''not an integer'';';
@@ -80,3 +84,70 @@
 CREATE FUNCTION test1 (int) RETURNS int LANGUAGE internal
     AS 'nosuch';
 ERROR:  there is no built-in function named "nosuch"
+-- Partitioning trigger test
+CREATE TABLE master (
+  id     int not null,
+  date   date not null,
+  value  int
+);
+CREATE TABLE child_y2008m01 (
+    CHECK ( date >= DATE '2008-01-01' AND date < DATE '2008-02-01' )
+) INHERITS (master);
+CREATE TABLE child_y2008m02 (
+    CHECK ( date >= DATE '2008-02-01' AND date < DATE '2008-03-01' )
+) INHERITS (master);
+CREATE TABLE child_y2008m03 (
+    CHECK ( date >= DATE '2008-03-01' AND date < DATE '2008-04-01' )
+) INHERITS (master);
+CREATE TRIGGER insert_child_y2008m01
+    BEFORE INSERT ON master
+    FOR EACH ROW EXECUTE PROCEDURE partition_insert_trigger('child_y2008m01');
+CREATE TRIGGER child_y2008m02
+    BEFORE INSERT ON master
+    FOR EACH ROW EXECUTE PROCEDURE partition_insert_trigger();
+CREATE TRIGGER child_y2008m03
+    BEFORE INSERT ON master
+    FOR EACH ROW EXECUTE PROCEDURE partition_insert_trigger();
+INSERT INTO master VALUES (1, '2008-01-15', 1);
+INSERT INTO master VALUES (2, '2008-02-15', 2);
+INSERT INTO master VALUES (3, '2008-03-15', 3);
+INSERT INTO master VALUES (4, '2008-04-15', 4);
+COPY master FROM '/root/pg_partitions/copy_input.txt';
+select * from master;
+ id |    date    | value
+----+------------+-------
+  4 | 04-15-2008 |     4
+  1 | 01-15-2008 |     1
+ 11 | 01-10-2008 |    11
+  2 | 02-15-2008 |     2
+ 12 | 02-15-2008 |    12
+  3 | 03-15-2008 |     3
+ 13 | 03-15-2008 |    13
+(7 rows)
+
+ select * from child_y2008m01;
+ id |    date    | value
+----+------------+-------
+  1 | 01-15-2008 |     1
+ 11 | 01-10-2008 |    11
+(2 rows)
+
+  select * from child_y2008m02;
+ id |    date    | value
+----+------------+-------
+  2 | 02-15-2008 |     2
+ 12 | 02-15-2008 |    12
+(2 rows)
+
+  select * from child_y2008m03;
+ id |    date    | value
+----+------------+-------
+  3 | 03-15-2008 |     3
+ 13 | 03-15-2008 |    13
+(2 rows)
+
+DROP TABLE master CASCADE;
+NOTICE:  drop cascades to 3 other objects
+DETAIL:  drop cascades to table child_y2008m01
+drop cascades to table child_y2008m02
+drop cascades to table child_y2008m03
Index: src/include/executor/executor.h
===================================================================
RCS file: /root/cvsrepo/pgsql/src/include/executor/executor.h,v
retrieving revision 1.152
diff -u -r1.152 executor.h
--- src/include/executor/executor.h    31 Oct 2008 21:07:55 -0000    1.152
+++ src/include/executor/executor.h    16 Dec 2008 01:42:50 -0000
@@ -155,6 +155,8 @@
 extern bool ExecContextForcesOids(PlanState *planstate, bool *hasoids);
 extern void ExecConstraints(ResultRelInfo *resultRelInfo,
                 TupleTableSlot *slot, EState *estate);
+extern const char *ExecRelCheck(ResultRelInfo *resultRelInfo,
+             TupleTableSlot *slot, EState *estate);
 extern TupleTableSlot *EvalPlanQual(EState *estate, Index rti,
              ItemPointer tid, TransactionId priorXmax);
 extern PlanState *ExecGetActivePlanTree(QueryDesc *queryDesc);
Index: src/test/regress/input/create_function_1.source
===================================================================
RCS file: /root/cvsrepo/pgsql/src/test/regress/input/create_function_1.source,v
retrieving revision 1.19
diff -u -r1.19 create_function_1.source
--- src/test/regress/input/create_function_1.source    1 Oct 2008 22:38:57 -0000    1.19
+++ src/test/regress/input/create_function_1.source    16 Dec 2008 01:42:50 -0000
@@ -52,6 +52,12 @@
         AS '@libdir@/regress@DLSUFFIX@'
         LANGUAGE C STRICT;

+CREATE FUNCTION partition_insert_trigger ()
+        RETURNS trigger
+        AS '@libdir@/regress@DLSUFFIX@'
+        LANGUAGE C STRICT;
+
+
 -- Things that shouldn't work:

 CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL
@@ -77,3 +83,48 @@

 CREATE FUNCTION test1 (int) RETURNS int LANGUAGE internal
     AS 'nosuch';
+
+-- Partitioning trigger test
+
+CREATE TABLE master (
+  id     int not null,
+  date   date not null,
+  value  int
+);
+
+CREATE TABLE child_y2008m01 (
+    CHECK ( date >= DATE '2008-01-01' AND date < DATE '2008-02-01' )
+) INHERITS (master);
+
+CREATE TABLE child_y2008m02 (
+    CHECK ( date >= DATE '2008-02-01' AND date < DATE '2008-03-01' )
+) INHERITS (master);
+
+CREATE TABLE child_y2008m03 (
+    CHECK ( date >= DATE '2008-03-01' AND date < DATE '2008-04-01' )
+) INHERITS (master);
+
+CREATE TRIGGER insert_child_y2008m01
+    BEFORE INSERT ON master
+    FOR EACH ROW EXECUTE PROCEDURE partition_insert_trigger('child_y2008m01');
+
+CREATE TRIGGER child_y2008m02
+    BEFORE INSERT ON master
+    FOR EACH ROW EXECUTE PROCEDURE partition_insert_trigger();
+
+CREATE TRIGGER child_y2008m03
+    BEFORE INSERT ON master
+    FOR EACH ROW EXECUTE PROCEDURE partition_insert_trigger();
+
+INSERT INTO master VALUES (1, '2008-01-15', 1);
+INSERT INTO master VALUES (2, '2008-02-15', 2);
+INSERT INTO master VALUES (3, '2008-03-15', 3);
+INSERT INTO master VALUES (4, '2008-04-15', 4);
+COPY master FROM '/root/pg_partitions/copy_input.txt';
+
+select * from master;
+select * from child_y2008m01;
+select * from child_y2008m02;
+select * from child_y2008m03;
+
+DROP TABLE master CASCADE;

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

"Nikhil Sontakke"

Дата:

16 декабря 2008 г., 05:35:18

Hi,

I will be working on a roadmap for the partitioning features. I think that there are different needs and that we will not be able to address them all in 8.5 or even 8.6.
The goal will be to get things done step by step but possibly with a design that will not require major refactoring to support new features. I will try to setup the new wiki page tomorrow.

+1.

In the meantime, I have made some more tests with the trigger in C (see attached patch). To prevent duplicating too much code, it requires the ExecRelCheck method to be exported (that would be nice to have this function exported in 8.4 so that we can start experimenting in 8.4 and don't have to wait another year for 8.5). If there is locality in the inserts (which might be the case if you COPY sorted data), the performance remains constant regardless the number of child tables.

A similar DELETE trigger should be pretty easy to write up in C. I think the main challenge is with UPDATE triggers especially if the new row will fall into another child table - but we can always throw an error for such a case initially.

Right now if the row cannot be inserted in any child table, it is inserted in the parent. If you want to fail, we can add a 'fail trigger' (after all other triggers) that generates an error if previous triggers did not capture the row. If you want to create a new partition, you can have another trigger to handle that.

One of the work items related to partitioning eventually is to avoid having to APPEND the parent in all queries involving children. Maybe having an overflow child table might help to catch failed triggers for those cases?

Regards,
Nikhils

So I think that this trigger approach is pretty flexible like people used AOP in J2EE servers to process requests. It has also the advantage of allowing fast prototyping. It should also be easy to push that functionality down in the core as needed.

Is it ok if I move Simon's requirement document under a more generic 'Table partitioning' page on the Wiki?

Thanks for your feedback,
manu

Jaime Casanova wrote:
On Mon, Dec 15, 2008 at 1:46 PM, Gregory Stark <stark@enterprisedb.com> wrote:

Josh Berkus <josh@agliodbs.com> writes:

Hackers,

We don't yet seem to have a clear specification for this feature, and the Other
Open Source DB has shown us how problematic it is to get auto-partitioning
wrong.

Should we defer auto-partitioning to 8.5?

If we're serious about having a "next generation" partitioning with a concept
of partition keys then it seems to me to make more sense to do that first and
then add on a feature like this.

+1

This is still very useful. I haven't looked at the actual patch, does it
require core changes or can it be stashed in a pgfoundry or contrib module?

what i don't like about this one is that it creates partitions at
create table time and to manually add all new partitions (inherit
tables and modify the trigger)... and what i want to see is an
automatic creation when it's needed...

--
Emmanuel Cecchet
FTO @ Frog Thinker Open Source Development & Consulting
--
Web: http://www.frogthinker.org
email: manu@frogthinker.org
Skype: emmanuel_cecchet

### Eclipse Workspace Patch 1.0
#P Postgres-HEAD
Index: src/test/regress/regress.c
===================================================================
RCS file: /root/cvsrepo/pgsql/src/test/regress/regress.c,v
retrieving revision 1.71
diff -u -r1.71 regress.c
--- src/test/regress/regress.c 25 Mar 2008 22:42:46 -0000 1.71
+++ src/test/regress/regress.c 16 Dec 2008 01:42:50 -0000
@@ -10,6 +10,9 @@
#include "utils/geo_decls.h" /* includes <math.h> */
#include "executor/executor.h" /* For GetAttributeByName */
#include "commands/sequence.h" /* for nextval() */
+#include "catalog/namespace.h"
+#include "executor/executor.h"
+#include "executor/tuptable.h"

#define P_MAXDIG 12
#define LDELIM '('
@@ -732,3 +735,141 @@
*--walk = '\0';
PG_RETURN_CSTRING(result);
}
+
+
+/*
+ * Partition trigger test
+ *
+ * The trigger should be used this way:
+ * CREATE TRIGGER trigger_name
+ BEFORE INSERT ON master_table
+ FOR EACH ROW EXECUTE PROCEDURE partition_insert_trigger(child_table_name, column_number, min_val, max_val);
+ */
+
+extern Datum partition_insert_trigger(PG_FUNCTION_ARGS);
+
+static Datum
+check_constraints_and_insert_tuple(Relation child_table_relation, TriggerData *trigdata ,HeapTuple trigtuple)
+{ // Check the constraints
+ ResultRelInfo *resultRelInfo;
+ TupleTableSlot *slot;
+ EState *estate = CreateExecutorState();
+ Datum result;
+
+ result = PointerGetDatum(trigdata->tg_trigtuple);
+
+ resultRelInfo = makeNode(ResultRelInfo);
+ resultRelInfo->ri_RangeTableIndex = 1; /* dummy */
+ resultRelInfo->ri_RelationDesc = child_table_relation;
+
+ estate->es_result_relations = resultRelInfo;
+ estate->es_num_result_relations = 1;
+ estate->es_result_relation_info = resultRelInfo;
+
+ /* Set up a tuple slot too */
+ slot = MakeSingleTupleTableSlot(trigdata->tg_relation->rd_att);
+ ExecStoreTuple(trigtuple, slot, InvalidBuffer, false);
+
+ if (ExecRelCheck(resultRelInfo, slot, estate) == NULL)
+ { // Constraints satisfied, insert the row in the child table
+ bool use_wal = true;
+ bool use_fsm=true;
+
+ /* BEFORE ROW INSERT Triggers */
+ if (resultRelInfo->ri_TrigDesc &&
+ resultRelInfo->ri_TrigDesc->n_before_row[TRIGGER_EVENT_INSERT] > 0)
+ {
+ HeapTuple newtuple;
+ newtuple = ExecBRInsertTriggers(estate, resultRelInfo, trigtuple);
+
+ if (newtuple != trigtuple) /* modified by Trigger(s) */
+ {
+ heap_freetuple(trigtuple);
+ trigtuple = newtuple;
+ }
+ }
+
+ /* Perform the insert
+ * TODO: Check that we detect constraint violation if before row insert does something bad */
+ heap_insert(child_table_relation, trigtuple, GetCurrentCommandId(true), use_wal, use_fsm);
+
+ /* Update indices */
+ ExecOpenIndices(resultRelInfo);
+ if (resultRelInfo->ri_NumIndices > 0)
+ ExecInsertIndexTuples(slot, &(trigtuple->t_self), estate, false);
+
+ /* AFTER ROW INSERT Triggers */
+ ExecARInsertTriggers(estate, resultRelInfo, trigtuple);
+
+ result = PointerGetDatum(NULL);
+ }
+ // Free resources
+ FreeExecutorState(estate);
+ ExecDropSingleTupleTableSlot(slot);
+
+ return result;
+}
+
+
+PG_FUNCTION_INFO_V1(partition_insert_trigger);
+
+static Relation last_inserted_relation;
+
+Datum
+partition_insert_trigger(PG_FUNCTION_ARGS)
+{
+ TriggerData *trigdata = (TriggerData *) fcinfo->context;
+ HeapTuple trigtuple= trigdata->tg_trigtuple;
+ TupleConstr *constr;
+ char *child_table_name;
+ Relation child_table_relation;
+ Oid relation_id;
+ Datum result;
+
+ // Try to exploit locality for bulk inserts
+ // We expect consecutive insert to go to the same child table
+ if (last_inserted_relation != NULL)
+ { // Try the last table we used
+ result = check_constraints_and_insert_tuple(last_inserted_relation, trigdata, trigtuple);
+ if (result == PointerGetDatum(NULL))
+ return result;
+ // We got a miss
+ last_inserted_relation = NULL;
+ }
+
+ /* make sure it's called as a trigger at all */
+ if (!CALLED_AS_TRIGGER(fcinfo))
+ elog(ERROR, "partition_insert_trigger: not called by trigger manager");
+
+ /* Sanity checks */
+ if (!TRIGGER_FIRED_BY_INSERT(trigdata->tg_event) || !TRIGGER_FIRED_BEFORE(trigdata->tg_event))
+ elog(ERROR, "partition_insert_trigger: not called on insert before");
+
+ // Child table name is either given as the unique parameter or it is the name of the trigger
+ if (trigdata->tg_trigger->tgnargs == 1)
+ child_table_name = trigdata->tg_trigger->tgargs[0];
+ else
+ child_table_name = trigdata->tg_trigger->tgname;
+
+ // Lookup the child relation
+ relation_id = RelnameGetRelid(child_table_name);
+ if (relation_id == InvalidOid)
+ elog(ERROR, "partition_insert_trigger: Invalid child table %s", child_table_name);
+ child_table_relation = RelationIdGetRelation(relation_id);
+ if (child_table_relation == NULL)
+ elog(ERROR, "partition_insert_trigger: Failed to locate relation for child table %s", child_table_name);
+
+ constr = child_table_relation->rd_att->constr;
+ if (constr->num_check == 0)
+ elog(ERROR, "partition_insert_trigger: No constraint found for child table %s", child_table_name);
+
+ result = check_constraints_and_insert_tuple(child_table_relation, trigdata, trigtuple);
+
+ if (result == PointerGetDatum(NULL))
+ last_inserted_relation = child_table_relation;
+
+ RelationClose(child_table_relation);
+
+ return result;
+}
+
Index: src/backend/executor/execMain.c
===================================================================
RCS file: /root/cvsrepo/pgsql/src/backend/executor/execMain.c,v
retrieving revision 1.314
diff -u -r1.314 execMain.c
--- src/backend/executor/execMain.c 31 Oct 2008 21:07:54 -0000 1.314
+++ src/backend/executor/execMain.c 16 Dec 2008 01:42:50 -0000
@@ -1947,7 +1947,7 @@
/*
* ExecRelCheck --- check that tuple meets constraints for result relation
*/
-static const char *
+const char *
ExecRelCheck(ResultRelInfo *resultRelInfo,
TupleTableSlot *slot, EState *estate)
{
Index: src/test/regress/output/create_function_1.source
===================================================================
RCS file: /root/cvsrepo/pgsql/src/test/regress/output/create_function_1.source,v
retrieving revision 1.34
diff -u -r1.34 create_function_1.source
--- src/test/regress/output/create_function_1.source 31 Oct 2008 19:37:56 -0000 1.34
+++ src/test/regress/output/create_function_1.source 16 Dec 2008 01:42:50 -0000
@@ -47,6 +47,10 @@
RETURNS int4
AS '@libdir@/regress@DLSUFFIX@'
LANGUAGE C STRICT;
+CREATE FUNCTION partition_insert_trigger ()
+ RETURNS trigger
+ AS '@libdir@/regress@DLSUFFIX@'
+ LANGUAGE C STRICT;
-- Things that shouldn't work:
CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL
AS 'SELECT ''not an integer'';';
@@ -80,3 +84,70 @@
CREATE FUNCTION test1 (int) RETURNS int LANGUAGE internal
AS 'nosuch';
ERROR: there is no built-in function named "nosuch"
+-- Partitioning trigger test
+CREATE TABLE master (
+ id int not null,
+ date date not null,
+ value int
+);
+CREATE TABLE child_y2008m01 (
+ CHECK ( date >= DATE '2008-01-01' AND date < DATE '2008-02-01' )
+) INHERITS (master);
+CREATE TABLE child_y2008m02 (
+ CHECK ( date >= DATE '2008-02-01' AND date < DATE '2008-03-01' )
+) INHERITS (master);
+CREATE TABLE child_y2008m03 (
+ CHECK ( date >= DATE '2008-03-01' AND date < DATE '2008-04-01' )
+) INHERITS (master);
+CREATE TRIGGER insert_child_y2008m01
+ BEFORE INSERT ON master
+ FOR EACH ROW EXECUTE PROCEDURE partition_insert_trigger('child_y2008m01');
+CREATE TRIGGER child_y2008m02
+ BEFORE INSERT ON master
+ FOR EACH ROW EXECUTE PROCEDURE partition_insert_trigger();
+CREATE TRIGGER child_y2008m03
+ BEFORE INSERT ON master
+ FOR EACH ROW EXECUTE PROCEDURE partition_insert_trigger();
+INSERT INTO master VALUES (1, '2008-01-15', 1);
+INSERT INTO master VALUES (2, '2008-02-15', 2);
+INSERT INTO master VALUES (3, '2008-03-15', 3);
+INSERT INTO master VALUES (4, '2008-04-15', 4);
+COPY master FROM '/root/pg_partitions/copy_input.txt';
+select * from master;
+ id | date | value
+----+------------+-------
+ 4 | 04-15-2008 | 4
+ 1 | 01-15-2008 | 1
+ 11 | 01-10-2008 | 11
+ 2 | 02-15-2008 | 2
+ 12 | 02-15-2008 | 12
+ 3 | 03-15-2008 | 3
+ 13 | 03-15-2008 | 13
+(7 rows)
+
+ select * from child_y2008m01;
+ id | date | value
+----+------------+-------
+ 1 | 01-15-2008 | 1
+ 11 | 01-10-2008 | 11
+(2 rows)
+
+ select * from child_y2008m02;
+ id | date | value
+----+------------+-------
+ 2 | 02-15-2008 | 2
+ 12 | 02-15-2008 | 12
+(2 rows)
+
+ select * from child_y2008m03;
+ id | date | value
+----+------------+-------
+ 3 | 03-15-2008 | 3
+ 13 | 03-15-2008 | 13
+(2 rows)
+
+DROP TABLE master CASCADE;
+NOTICE: drop cascades to 3 other objects
+DETAIL: drop cascades to table child_y2008m01
+drop cascades to table child_y2008m02
+drop cascades to table child_y2008m03
Index: src/include/executor/executor.h
===================================================================
RCS file: /root/cvsrepo/pgsql/src/include/executor/executor.h,v
retrieving revision 1.152
diff -u -r1.152 executor.h
--- src/include/executor/executor.h 31 Oct 2008 21:07:55 -0000 1.152
+++ src/include/executor/executor.h 16 Dec 2008 01:42:50 -0000
@@ -155,6 +155,8 @@
extern bool ExecContextForcesOids(PlanState *planstate, bool *hasoids);
extern void ExecConstraints(ResultRelInfo *resultRelInfo,
TupleTableSlot *slot, EState *estate);
+extern const char *ExecRelCheck(ResultRelInfo *resultRelInfo,
+ TupleTableSlot *slot, EState *estate);
extern TupleTableSlot *EvalPlanQual(EState *estate, Index rti,
ItemPointer tid, TransactionId priorXmax);
extern PlanState *ExecGetActivePlanTree(QueryDesc *queryDesc);
Index: src/test/regress/input/create_function_1.source
===================================================================
RCS file: /root/cvsrepo/pgsql/src/test/regress/input/create_function_1.source,v
retrieving revision 1.19
diff -u -r1.19 create_function_1.source
--- src/test/regress/input/create_function_1.source 1 Oct 2008 22:38:57 -0000 1.19
+++ src/test/regress/input/create_function_1.source 16 Dec 2008 01:42:50 -0000
@@ -52,6 +52,12 @@
AS '@libdir@/regress@DLSUFFIX@'
LANGUAGE C STRICT;

+CREATE FUNCTION partition_insert_trigger ()
+ RETURNS trigger
+ AS '@libdir@/regress@DLSUFFIX@'
+ LANGUAGE C STRICT;
+
+
-- Things that shouldn't work:

CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL
@@ -77,3 +83,48 @@

CREATE FUNCTION test1 (int) RETURNS int LANGUAGE internal
AS 'nosuch';
+
+-- Partitioning trigger test
+
+CREATE TABLE master (
+ id int not null,
+ date date not null,
+ value int
+);
+
+CREATE TABLE child_y2008m01 (
+ CHECK ( date >= DATE '2008-01-01' AND date < DATE '2008-02-01' )
+) INHERITS (master);
+
+CREATE TABLE child_y2008m02 (
+ CHECK ( date >= DATE '2008-02-01' AND date < DATE '2008-03-01' )
+) INHERITS (master);
+
+CREATE TABLE child_y2008m03 (
+ CHECK ( date >= DATE '2008-03-01' AND date < DATE '2008-04-01' )
+) INHERITS (master);
+
+CREATE TRIGGER insert_child_y2008m01
+ BEFORE INSERT ON master
+ FOR EACH ROW EXECUTE PROCEDURE partition_insert_trigger('child_y2008m01');
+
+CREATE TRIGGER child_y2008m02
+ BEFORE INSERT ON master
+ FOR EACH ROW EXECUTE PROCEDURE partition_insert_trigger();
+
+CREATE TRIGGER child_y2008m03
+ BEFORE INSERT ON master
+ FOR EACH ROW EXECUTE PROCEDURE partition_insert_trigger();
+
+INSERT INTO master VALUES (1, '2008-01-15', 1);
+INSERT INTO master VALUES (2, '2008-02-15', 2);
+INSERT INTO master VALUES (3, '2008-03-15', 3);
+INSERT INTO master VALUES (4, '2008-04-15', 4);
+COPY master FROM '/root/pg_partitions/copy_input.txt';
+
+select * from master;
+select * from child_y2008m01;
+select * from child_y2008m02;
+select * from child_y2008m03;
+
+DROP TABLE master CASCADE;

--
http://www.enterprisedb.com

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

ITAGAKI Takahiro

Дата:

16 декабря 2008 г., 06:02:22

Emmanuel Cecchet <manu@frogthinker.org> wrote:

> In the meantime, I have made some more tests with the trigger in C (see 
> attached patch).

Hmm... The inserting partition is passed by trigger arguments.
Users must replace triggers when the target is changed (ex. every month).
Is it possible to expand all of child paritions from pg_inherits and
determine a suitable parition by checking their constraints?
We can also use it when there are multiple inserting paritions,
something like hash or list paritioning. Fixed target is only applicable
to time-based range paritioning.

BTW, there is another issue in trigger approach. If INSERT commands
are interrupted by triggers, server says "INSERT 0 row" though
rows are inserted into child tables. Since using C, we could
use some back doors to modify a variable counting affected rows.
We could use partitioned tables more transparently if we have it.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

Emmanuel Cecchet

Дата:

16 декабря 2008 г., 06:13:43

Nikhil Sontakke wrote:
> A similar DELETE trigger should be pretty easy to write up in C. I 
> think the main challenge is with UPDATE triggers especially if the new 
> row will fall into another child table - but we can always throw an 
> error for such a case initially.
I agree. A first implementation could be restricted to updates within 
the same partition.
>
>     Right now if the row cannot be inserted in any child table, it is
>     inserted in the parent. If you want to fail, we can add a 'fail
>     trigger' (after all other triggers) that generates an error if
>     previous triggers did not capture the row. If you want to create a
>     new partition, you can have another trigger to handle that.
>
>
> One of the work items related to partitioning eventually is to avoid 
> having to APPEND the parent in all queries involving children. Maybe 
> having an overflow child table might help to catch failed triggers for 
> those cases?
This is a good option too.

Emmanuel

>
> Regards,
> Nikhils
>  
>
>
>     So I think that this trigger approach is pretty flexible like
>     people used AOP in J2EE servers to process requests. It has also
>     the advantage of allowing fast prototyping. It should also be easy
>     to push that functionality down in the core as needed.
>
>     Is it ok if I move Simon's requirement document under a more
>     generic 'Table partitioning' page on the Wiki?
>
>     Thanks for your feedback,
>     manu
>
>
>     Jaime Casanova wrote:
>
>         On Mon, Dec 15, 2008 at 1:46 PM, Gregory Stark
>         <stark@enterprisedb.com <mailto:stark@enterprisedb.com>> wrote:
>          
>
>             Josh Berkus <josh@agliodbs.com <mailto:josh@agliodbs.com>>
>             writes:
>
>                
>
>                 Hackers,
>
>                 We don't yet seem to have a clear specification for
>                 this feature, and the Other
>                 Open Source DB has shown us how problematic it is to
>                 get auto-partitioning
>                 wrong.
>
>                 Should we defer auto-partitioning to 8.5?
>                      
>
>             If we're serious about having a "next generation"
>             partitioning with a concept
>             of partition keys then it seems to me to make more sense
>             to do that first and
>             then add on a feature like this.
>
>                
>
>
>         +1
>
>          
>
>             This is still very useful. I haven't looked at the actual
>             patch, does it
>             require core changes or can it be stashed in a pgfoundry
>             or contrib module?
>
>                
>
>
>         what i don't like about this one is that it creates partitions at
>         create table time and to manually add all new partitions (inherit
>         tables and modify the trigger)... and what i want to see is an
>         automatic creation when it's needed...
>

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

Emmanuel Cecchet

Дата:

16 декабря 2008 г., 06:33:44

ITAGAKI Takahiro wrote:
> Emmanuel Cecchet <manu@frogthinker.org> wrote
>> In the meantime, I have made some more tests with the trigger in C (see 
>> attached patch).
>>     
>
> Hmm... The inserting partition is passed by trigger arguments.
>   
Actually this is just a fallback option. The preferred option is to name 
the trigger after the child table name. This way the trigger retrieve 
the table name directly from the trigger name and no argument has to be 
passed to the trigger.
> Users must replace triggers when the target is changed (ex. every month).
>   
I am not sure what you mean. There is one trigger per child table but 
the trigger is always the same, it is just the name that is given to it 
that changes.
> Is it possible to expand all of child paritions from pg_inherits and
> determine a suitable parition by checking their constraints?
>   
Ideally it would be better to do this way. I have not found yet how to 
automatically get all the child partitions of a parent table from the 
trigger. This would simplify things by having a single trigger.
> We can also use it when there are multiple inserting paritions,
> something like hash or list paritioning. Fixed target is only applicable
> to time-based range paritioning.
>   
I think there is a misunderstanding on how the trigger works. You have 1 
trigger per child table and they are all chained on the parent table.
When a tuple is inserted on the parent table, the first trigger is 
fired, if the constraints of the 1st child table are satisfied, the 
tuple is moved in the 1st child table and that's it. If it is a miss, 
the tuple is passed to the next trigger that checks the constraints of 
the 2nd table. And so on.
This will work with any type of partitioning (hash or even UDF) as long 
as the constraints on the child table reflect the partitioning.
> BTW, there is another issue in trigger approach. If INSERT commands
> are interrupted by triggers, server says "INSERT 0 row" though
> rows are inserted into child tables. Since using C, we could
> use some back doors to modify a variable counting affected rows.
> We could use partitioned tables more transparently if we have it.
>   
Even if you don't abort the query, the query reports 0 row if it has 
been moved to another table (you can COPY 100k lines and the server will 
return 0 if they were all successfully moved to child tables).
Technically this is correct since 0 rows were inserted in the parent 
table. Right now any number >0 is the number of rows that did not 
satisfy any child table constraint and were inserted in the master table 
(useful if you don't want the copy command to fail).

Regards,
Emmanuel
> Regards,
> ---
> ITAGAKI Takahiro
> NTT Open Source Software Center
>   
-- 
Emmanuel Cecchet
FTO @ Frog Thinker 
Open Source Development & Consulting
--
Web: http://www.frogthinker.org
email: manu@frogthinker.org
Skype: emmanuel_cecchet

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

ITAGAKI Takahiro

Дата:

16 декабря 2008 г., 06:58:30

Emmanuel Cecchet <manu@frogthinker.org> wrote:

> I think there is a misunderstanding on how the trigger works. You have 1 
> trigger per child table and they are all chained on the parent table.

Oops, I misunderstand your patch, sorry.

> > Is it possible to expand all of child paritions from pg_inherits and
> > determine a suitable parition by checking their constraints?
> >   
> Ideally it would be better to do this way. I have not found yet how to 
> automatically get all the child partitions of a parent table from the 
> trigger. This would simplify things by having a single trigger.

The chained triggers would have better flexibilty, and the auto expanding
trigger would have better usability. I'm not sure about performance
because expanding child partitions is not always faster than chained
calls of triggers.

I think chained triggers are hard to maintain. If we drop one of partition
tables, we need to reconnect the single-linked-list of the triggers.

> > server says "INSERT 0 row" though rows are inserted into child tables.
> Technically this is correct since 0 rows were inserted in the parent 
> table.

Yes, but users expect non-0 result normally. Some O/R mapping tools
also checks the result exactly and raises errors (it could be turned
off, but default is on).

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

Emmanuel Cecchet

Дата:

16 декабря 2008 г., 07:19:39

ITAGAKI Takahiro wrote:
> The chained triggers would have better flexibilty, and the auto expanding
> trigger would have better usability. I'm not sure about performance
> because expanding child partitions is not always faster than chained
> calls of triggers.
>
> I think chained triggers are hard to maintain. If we drop one of partition
> tables, we need to reconnect the single-linked-list of the triggers.
>   
When you drop one child table, you would also have to drop the trigger 
that has the same name on the parent table.  This does not seem too hard 
but I may be missing something.
>>> server says "INSERT 0 row" though rows are inserted into child tables.
>>>       
>> Technically this is correct since 0 rows were inserted in the parent 
>> table.
>>     
> Yes, but users expect non-0 result normally. Some O/R mapping tools
> also checks the result exactly and raises errors (it could be turned
> off, but default is on).
>   
If the O/R mapping tool is also creating the table it should be aware of 
the semantics specifics to partition. But your comment is well taken, 
this seems counterintuitive and against most API semantics to return 0 
when the number of inserted rows is expected. This would certainly 
require some additional hooks to return the proper value.

Best regards,
Emmanuel
> Regards,
> ---
> ITAGAKI Takahiro
> NTT Open Source Software Center
>   

-- 
Emmanuel Cecchet
FTO @ Frog Thinker 
Open Source Development & Consulting
--
Web: http://www.frogthinker.org
email: manu@frogthinker.org
Skype: emmanuel_cecchet

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

Grzegorz Jaskiewicz

Дата:

16 декабря 2008 г., 16:17:24

On 2008-12-16, at 07:58, ITAGAKI Takahiro wrote:
>
>
>>> server says "INSERT 0 row" though rows are inserted into child  
>>> tables.
>> Technically this is correct since 0 rows were inserted in the parent
>> table.
>
> Yes, but users expect non-0 result normally. Some O/R mapping tools
> also checks the result exactly and raises errors (it could be turned
> off, but default is on).

this is a general problem with triggers on inserts/updates/deletes. To  
be honest, I would love to see someone fixing it in 8.4, cos it is  
quite annoying - that developer is unable to figure out number of rows  
affected - just because there's trigger on that table.

Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

От

Devrim GÜNDÜZ

Дата:

06 января 2009 г., 16:20:13

<snip>

Is there any progress on this patch? I was asked about this feature last
month, during a PostgreSQL talk. I am willing to spend time for testing
this patch, if needed.
--
Devrim GÜNDÜZ, RHCE
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr                  http://www.gunduz.org

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1

Вложения