Обсуждение: About partitioning

Поиск
Список
Период
Сортировка

About partitioning

От
Vincenzo Romano
Дата:
A main issue in my case with the suggested (chapter 5.9)
implementation is that child tables need to be created in advance,
For a number of reasons (complex partitioning schemas, rows also
related to the past and the future) it'd be unpractical
to create all of them in advance.
So I'm thinking about an "on demand" creation. I see two options only:
1. I check the child table existence before inserting the row or
2. I create the missing table as the result of an insert error (no table found).

In case 1 I need to inspect the catalog with at least a select, while
in case 2 I need to trap errors.
In my (little) experience trapping errors is slow, so I would go for option 1.

Unless there is a better advise.

--
Vincenzo Romano
NotOrAnd Information Technologies
NON QVIETIS MARIBVS NAVTA PERITVS

1

Re: About partitioning

От
Adrian von Bidder
Дата:
Hi,

[ creating db partitions on demand ]

On Wednesday 20 January 2010 11.20:21 Vincenzo Romano wrote:
> In case 1 I need to inspect the catalog with at least a select, while
> in case 2 I need to trap errors.
> In my (little) experience trapping errors is slow, so I would go for
>  option 1.

Trapping/handling the error might be slow, but remember that creating a new
partition (presumably) doesn't happen often (and creating the partition is
slow anyway.)  In case 1 (check if the partition exists before insert),
*every* *single* *insert* is slower because you first query the catalog,
while in case 2, the normal case (insert into existing partition) is fast
and only the rare case (how is your partitioning?  1000000 or more rows per
partition on average?  I'd think it's quite a lot because why else partition
the data at all...?) is slow.

cheers
-- vbi


--
Udall's Fourth Law:
    Any change or reform you make is going to have consequences you
    don't like.

Re: About partitioning

От
Vincenzo Romano
Дата:
2010/1/20 Adrian von Bidder <avbidder@fortytwo.ch>:
> Hi,
>
> [ creating db partitions on demand ]
>
> On Wednesday 20 January 2010 11.20:21 Vincenzo Romano wrote:
>> In case 1 I need to inspect the catalog with at least a select, while
>> in case 2 I need to trap errors.
>> In my (little) experience trapping errors is slow, so I would go for
>>  option 1.
>
> Trapping/handling the error might be slow, but remember that creating a new
> partition (presumably) doesn't happen often (and creating the partition is
> slow anyway.)  In case 1 (check if the partition exists before insert),
> *every* *single* *insert* is slower because you first query the catalog,
> while in case 2, the normal case (insert into existing partition) is fast
> and only the rare case (how is your partitioning?  1000000 or more rows per
> partition on average?  I'd think it's quite a lot because why else partition
> the data at all...?) is slow.
>
> cheers
> -- vbi

Hmmm ... also trapping would happen for every single line being inserted ...

--
Vincenzo Romano
NotOrAnd Information Technologies
NON QVIETIS MARIBVS NAVTA PERITVS

Re: About partitioning

От
Grzegorz Jaśkiewicz
Дата:
I submitted a solution here once in the past, please use search before asking.
the exception handling will be quicker in general, cos you won't have
to test for existence of the table before every insert.

Re: About partitioning

От
Adrian von Bidder
Дата:
Hi,

On Wednesday 20 January 2010 11.57:37 Vincenzo Romano wrote:
> 2010/1/20 Adrian von Bidder <avbidder@fortytwo.ch>:

> > [ creating db partitions on demand ]
> >
> > On Wednesday 20 January 2010 11.20:21 Vincenzo Romano wrote:
> >> In case 1 I need to inspect the catalog with at least a select, while
> >> in case 2 I need to trap errors.
> >> In my (little) experience trapping errors is slow, so I would go for
> >>  option 1.
> >
> > Trapping/handling the error might be slow, but remember that creating a
> > new partition (presumably) doesn't happen often (and creating the
> > partition is slow anyway.)

> Hmmm ... also trapping would happen for every single line being inserted

Why?

By "trapping" I mean: reacting to the error if the INSERT statement fails.
If the INSERT does not fail, there is no error, so there is no error
condition to handle.

Compare:

 * query server to see if partition exists
   (!!! slow: this uses the database server)
 * if partition does not exist (this is almost never the case), create it
 * insert row

Against:

 * try inserting (same speed as the final step above)
 * if (error)
   (this is fast, since it only uses the return value from the insert.  No
additional database action)
   -> then create partition (this, again, is slow but almost never happens)
   -> and then re-try the insert.

cheers
-- vbi

--
featured link: Debian Bookmark Collection - http://bookmarks.debian.net/

Вложения

Re: About partitioning

От
Vincenzo Romano
Дата:
2010/1/20 Adrian von Bidder <avbidder@fortytwo.ch>:
> Hi,
>
> On Wednesday 20 January 2010 11.57:37 Vincenzo Romano wrote:
>> 2010/1/20 Adrian von Bidder <avbidder@fortytwo.ch>:
>
>> > [ creating db partitions on demand ]
>> >
>> > On Wednesday 20 January 2010 11.20:21 Vincenzo Romano wrote:
>> >> In case 1 I need to inspect the catalog with at least a select, while
>> >> in case 2 I need to trap errors.
>> >> In my (little) experience trapping errors is slow, so I would go for
>> >>  option 1.
>> >
>> > Trapping/handling the error might be slow, but remember that creating a
>> > new partition (presumably) doesn't happen often (and creating the
>> > partition is slow anyway.)
>
>> Hmmm ... also trapping would happen for every single line being inserted
>
> Why?
>
> By "trapping" I mean: reacting to the error if the INSERT statement fails.
> If the INSERT does not fail, there is no error, so there is no error
> condition to handle.
>
> Compare:
>
>  * query server to see if partition exists
>   (!!! slow: this uses the database server)
>  * if partition does not exist (this is almost never the case), create it
>  * insert row
>
> Against:
>
>  * try inserting (same speed as the final step above)
>  * if (error)
>   (this is fast, since it only uses the return value from the insert.  No
> additional database action)
>   -> then create partition (this, again, is slow but almost never happens)
>   -> and then re-try the insert.

"trapping" should have the same meaning as in "38.6.5. Trapping Errors",
that is the BEGIN...EXCEPTION...END.
In my case:

BEGIN
  INSERT INTO a_child_table SELECT NEW.*;
EXCEPTION
  WHEN the_table_doesn_t_exist THEN
    CREATE TABLE a_child_table ...
END;

"Tip: A block containing an EXCEPTION clause is significantly more
expensive to enter and exit than
a block without one. Therefore, don’t use EXCEPTION without need."

So my fear is that having such a trapping block defined at runtime for
every INSERT
would yield to a slow implementation.

--
Vincenzo Romano
NotOrAnd Information Technologies
NON QVIETIS MARIBVS NAVTA PERITVS

Re: About partitioning

От
Vincenzo Romano
Дата:
2010/1/20 Vincenzo Romano <vincenzo.romano@notorand.it>:
> 2010/1/20 Adrian von Bidder <avbidder@fortytwo.ch>:
>> Hi,
>>
>> On Wednesday 20 January 2010 11.57:37 Vincenzo Romano wrote:
>>> 2010/1/20 Adrian von Bidder <avbidder@fortytwo.ch>:
>>
>>> > [ creating db partitions on demand ]
>>> >
>>> > On Wednesday 20 January 2010 11.20:21 Vincenzo Romano wrote:
>>> >> In case 1 I need to inspect the catalog with at least a select, while
>>> >> in case 2 I need to trap errors.
>>> >> In my (little) experience trapping errors is slow, so I would go for
>>> >>  option 1.
>>> >
>>> > Trapping/handling the error might be slow, but remember that creating a
>>> > new partition (presumably) doesn't happen often (and creating the
>>> > partition is slow anyway.)
>>
>>> Hmmm ... also trapping would happen for every single line being inserted
>>
>> Why?
>>
>> By "trapping" I mean: reacting to the error if the INSERT statement fails.
>> If the INSERT does not fail, there is no error, so there is no error
>> condition to handle.
>>
>> Compare:
>>
>>  * query server to see if partition exists
>>   (!!! slow: this uses the database server)
>>  * if partition does not exist (this is almost never the case), create it
>>  * insert row
>>
>> Against:
>>
>>  * try inserting (same speed as the final step above)
>>  * if (error)
>>   (this is fast, since it only uses the return value from the insert.  No
>> additional database action)
>>   -> then create partition (this, again, is slow but almost never happens)
>>   -> and then re-try the insert.
>
> "trapping" should have the same meaning as in "38.6.5. Trapping Errors",
> that is the BEGIN...EXCEPTION...END.
> In my case:
>
> BEGIN
>  INSERT INTO a_child_table SELECT NEW.*;
> EXCEPTION
>  WHEN the_table_doesn_t_exist THEN
>    CREATE TABLE a_child_table ...
> END;
>
> "Tip: A block containing an EXCEPTION clause is significantly more
> expensive to enter and exit than
> a block without one. Therefore, don’t use EXCEPTION without need."
>
> So my fear is that having such a trapping block defined at runtime for
> every INSERT
> would yield to a slow implementation.

It looks like Adrian is right and my fear is unfounded.
I have created both implementations, one with test-create-insert and
one with insert-trap-create-insert.
The first implementation shows an average execution time longer than
the second with the worse case of the second one
very close to the worse case of the first one.
This means that that only the EXCEPTION...WHEN sub-block is slow, not
also the BEGIN...EXCEPTION.
I would advise to make this thing clearer in the documentation (tip at
chapter "38.6.5. Trapping Errors").

Thanks again to everyone.

--
Vincenzo Romano
NotOrAnd Information Technologies
NON QVIETIS MARIBVS NAVTA PERITVS

Re: About partitioning

От
Filip Rembiałkowski
Дата:
W dniu 20 stycznia 2010 12:01 użytkownik Grzegorz Jaśkiewicz <gryzman@gmail.com> napisał:
please use search before asking.

please use search before advising to use search; it's not so easy to find.

can you share a link to archived post?



Filip

Re: About partitioning

От
Grzegorz Jaśkiewicz
Дата:

Re: About partitioning

От
Vincenzo Romano
Дата:
2010/1/21 Grzegorz Jaśkiewicz <gryzman@gmail.com>:
> http://www.pubbs.net/pgsql/201001/16503/
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

Grzegorz,
Thanks for the reference, which officially is here:
http://archives.postgresql.org/pgsql-general/2010-01/msg00331.php

I confirm it wasn't easy to find that out. Maybe you can add the
snippet into the Postgresql Wiki.

One thing to add here is that I would advise against the CREATE
TABLE...INHERITS.
I find much more useful:

-- code
CREATE TABLE <partition> (
  LIKE <master>
  INCLUDING INDEXES,
  CHECK( <the condition> )
);
ALTER TABLE <partition> INHERIT <master>;
-- end code

because very often I still need the indexes in the partitions.
And in case you need the (other) constraints and the defaults, uyou
can ask for it with the INCLUDING clause.

And, BTW:
EXECUTE 'INSERT INTO '||partition-table-name||' SELECT $1.*' USING NEW;

--
Vincenzo Romano
NotOrAnd Information Technologies
NON QVIETIS MARIBVS NAVTA PERITVS

Re: About partitioning

От
Grzegorz Jaśkiewicz
Дата:
2010/1/21 Vincenzo Romano <vincenzo.romano@notorand.it>:

> And, BTW:
> EXECUTE 'INSERT INTO '||partition-table-name||' SELECT $1.*' USING NEW;

won't work on 8.3 where I need it however :)



--
GJ