Обсуждение: Generic function for partitioning function?

Поиск
Список
Период
Сортировка

Generic function for partitioning function?

От
Gregory Haase
Дата:
I am working on a date-based partitioning framework and I would really like to have a single function that could be used as trigger for any table that needs to be partitioned by day. I am working in a rails environment, so every table has a created_at datetime field.

I created my generic function:

create or replace function day_partition_insert_trigger()
returns trigger as $$
declare
    ins_tbl varchar;
begin
    ins_tbl     :=  TG_TABLE_SCHEMA || '.' || TG_TABLE_NAME || '_' || to_char(NEW.created_at,'YYYYMMDD');
    execute 'insert into '|| ins_tbl ||' select ($1).*' using NEW;
    return null;
end;
$$ language plpgsql;

And then I assigned the function to two different test tables to make sure it would work:

create trigger insert_daily_trigger before insert on testdailyone for each row execute procedure day_partition_insert_trigger();
create trigger insert_daily_trigger before insert on testdailytwo for each row execute procedure day_partition_insert_trigger();

Inserts work fine, and I was able to validate records are being inserted into the correct child tables.

I began to wonder if there would be a performance degradation, so I changed the testdailytwo trigger function the typical if, elsif described in the partitioning documentation and then ran pgbench against both tables.

I noticed that with 7 partitions, the if, elsif was slightly faster (~8%). However, when adding 30 partitions, the if, elsif version became slower. I'd sort of expected this.

So, my conclusion is that the generic function will work, and it will make administration (even automated administration) of partitioned tables much simpler.

My question is...  Is there a compelling reason why I should NOT do this. I must confess, it seems so straightforward that I feel like I must be missing something.

Thanks,

Greg Haase

Re: Generic function for partitioning function?

От
Jeff Janes
Дата:
On Tuesday, September 3, 2013, Gregory Haase wrote:
I am working on a date-based partitioning framework and I would really like to have a single function that could be used as trigger for any table that needs to be partitioned by day. I am working in a rails environment, so every table has a created_at datetime field.

I created my generic function:

create or replace function day_partition_insert_trigger()
returns trigger as $$
declare
    ins_tbl varchar;
begin
    ins_tbl     :=  TG_TABLE_SCHEMA || '.' || TG_TABLE_NAME || '_' || to_char(NEW.created_at,'YYYYMMDD');
    execute 'insert into '|| ins_tbl ||' select ($1).*' using NEW;
    return null;
end;
$$ language plpgsql;

...
 
I began to wonder if there would be a performance degradation, so I changed the testdailytwo trigger function the typical if, elsif described in the partitioning documentation and then ran pgbench against both tables.

I noticed that with 7 partitions, the if, elsif was slightly faster (~8%). However, when adding 30 partitions, the if, elsif version became slower. I'd sort of expected this.

Did you try an if, elsif, version structured like a binary search rather than a linear search?

Also, did you try them with a \copy rather than insert in a loop?

Cheers,

Jeff

Re: Generic function for partitioning function?

От
Gudmundur Johannesson
Дата:
On Wed, Sep 4, 2013 at 2:10 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
On Tuesday, September 3, 2013, Gregory Haase wrote:
I am working on a date-based partitioning framework and I would really like to have a single function that could be used as trigger for any table that needs to be partitioned by day. I am working in a rails environment, so every table has a created_at datetime field.

I created my generic function:

create or replace function day_partition_insert_trigger()
returns trigger as $$
declare
    ins_tbl varchar;
begin
    ins_tbl     :=  TG_TABLE_SCHEMA || '.' || TG_TABLE_NAME || '_' || to_char(NEW.created_at,'YYYYMMDD');
    execute 'insert into '|| ins_tbl ||' select ($1).*' using NEW;
    return null;
end;
$$ language plpgsql;

...
 
I began to wonder if there would be a performance degradation, so I changed the testdailytwo trigger function the typical if, elsif described in the partitioning documentation and then ran pgbench against both tables.

I noticed that with 7 partitions, the if, elsif was slightly faster (~8%). However, when adding 30 partitions, the if, elsif version became slower. I'd sort of expected this.

Did you try an if, elsif, version structured like a binary search rather than a linear search?

Also, did you try them with a \copy rather than insert in a loop?

Cheers,

Jeff


I experimented with trigger based inserts and rule based inserts.
In my case I insert many rows at a time and in that case, rule based inserts performed better.

Here is an example from me and it is based on the online postgres documents.

CREATE TABLE test
(
  id integer,
  ts timestamp without time zone,
  value real
);

-- create each partition, example for a single one
CREATE TABLE test_partition_2013_08_16
(
  CONSTRAINT test_partition_2013_08_16_timestamp_check CHECK (ts >= '2013-08-16'::date AND ts < '2013-08-17'::date)
)
INHERITS (test);

CREATE INDEX idx_test_2013_08_16_ts ON test_partition_2013_08_16
  USING btree
  (ts);

-- for each partition create a rule like the following:
CREATE OR REPLACE RULE test_partition_2013_08_16_rule AS
    ON INSERT TO test
   WHERE new.ts >= '2013-08-16'::date AND new.ts < '2013-08-17'::date DO INSTEAD  INSERT INTO test_partition_2013_08_16 (id, ts, value)
  VALUES (new.id, new.ts, new.value);


I have create a function/procedure that creates and drops the partitions for me and run it from crontab.

Hope this helps,
    - Gummi



Re: Generic function for partitioning function?

От
Martin Collins
Дата:
On 09/03/2013 04:38 PM, Gregory Haase wrote:
>
> So, my conclusion is that the generic function will work, and it will
> make administration (even automated administration) of partitioned
> tables much simpler.
>
> My question is...  Is there a compelling reason why I should NOT do
> this. I must confess, it seems so straightforward that I feel like I
> must be missing something.

I do something similar with my OpenERP setup. The only caveat I have is
that foreign key constraints don't work with partitioned tables.

Martin