Table partitioning based on multiple criteria possible?

From: Ow Mun Heng
Date:
New to Pg and wondering the extent of PG's table partitioning
capability.

I have a huge table of > 18 million rows (growth rate ~8 million a week)
on which I'd like to implement partitioning.

Initially the plan is just to partition it by date, e.g. one partition per
month.

Now I'm wondering whether it's also possible to partition them by
product/item.

Is it possible to do multilevel partitioning, sort of like

if month = July then go to July Partition
    if Product = Shampoo then go to shampoo partition
        elif Product = Soap then go to soap partition

something like that..
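
For illustration: on PostgreSQL of that era, partitioning meant table
inheritance plus CHECK constraints and constraint exclusion, so a minimal
sketch of per-(month, product) children - all names here (sales, sale_date,
product) are made up - might look like:

-- Hypothetical parent table.
CREATE TABLE sales (
    sale_date date NOT NULL,
    product   text NOT NULL,
    qty       integer
);

-- One child per (month, product) combination, constrained so the planner
-- knows exactly what each child can contain.
CREATE TABLE sales_2007_07_shampoo (
    CHECK (sale_date >= DATE '2007-07-01' AND sale_date < DATE '2007-08-01'
           AND product = 'Shampoo')
) INHERITS (sales);

CREATE TABLE sales_2007_07_soap (
    CHECK (sale_date >= DATE '2007-07-01' AND sale_date < DATE '2007-08-01'
           AND product = 'Soap')
) INHERITS (sales);

-- With constraint exclusion enabled, children whose CHECK constraints
-- contradict the WHERE clause are skipped entirely.
SET constraint_exclusion = on;
SELECT * FROM sales
WHERE sale_date >= DATE '2007-07-01' AND sale_date < DATE '2007-08-01'
  AND product = 'Shampoo';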

Thanks


Re: Table partitioning based on multiple criteria possible?

From: Richard Huxton
Date:
Ow Mun Heng wrote:
> New to Pg and wondering the extent of PG's table partitioning
> capability.
>
> I have a huge table of > 18 million rows (growth rate ~8 million a week)
> on which I'd like to implement partitioning.

OK

> Initially the plan is just to partition it by date, e.g. one partition per
> month.

Fair enough - do you do a lot of searching by date?

> Now I'm wondering whether it's also possible to partition them by
> product/item.
>
> Is it possible to do multilevel partitioning, sort of like
>
> if month = July then go to July Partition
>     if Product = Shampoo then go to shampoo partition
>         elif Product = Soap then go to soap partition

Well, I suppose you could partition over (month,product) but you
probably don't want to. Sounds fiddly to manage.
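
To make children like that writable you also need to route inserts into the
right child, e.g. with a trigger. A purely illustrative sketch (same made-up
names as the earlier one) shows where the fiddliness comes in - every new
month or product means another branch:

CREATE OR REPLACE FUNCTION sales_insert_router() RETURNS trigger AS $$
BEGIN
    -- One branch per (month, product) child; this has to grow with every
    -- new partition.  Rows matching no branch are silently dropped here;
    -- a real router would RAISE an error instead.
    IF NEW.sale_date >= DATE '2007-07-01' AND NEW.sale_date < DATE '2007-08-01' THEN
        IF NEW.product = 'Shampoo' THEN
            INSERT INTO sales_2007_07_shampoo VALUES (NEW.*);
        ELSIF NEW.product = 'Soap' THEN
            INSERT INTO sales_2007_07_soap VALUES (NEW.*);
        END IF;
    END IF;
    RETURN NULL;  -- the row was stored (if at all) in a child, not the parent
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER sales_insert_trigger
    BEFORE INSERT ON sales
    FOR EACH ROW EXECUTE PROCEDURE sales_insert_router();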

Have you verified that performance will be a problem without this?

--
   Richard Huxton
   Archonet Ltd

Re: Table partitioning based on multiple criteria possible?

From: Ow Mun Heng
Date:
On Wed, 2007-09-12 at 09:42 +0100, Richard Huxton wrote:
> Ow Mun Heng wrote:
> > New to Pg and wondering the extent of PG's table partitioning
> > capability.
> >
> > I have a huge table of > 18 million rows (growth rate ~8 million a week)
> > on which I'd like to implement partitioning.
>
> OK
>
> > Initially the plan is just to partition it by date, e.g. one partition per
> > month.
>
> Fair enough - do you do a lot of searching by date?

Yeah.. the products we usually query on are filtered by date; the default is
a 1 to 2 week period. So there will be overlaps, e.g. the 3rd week of April
to the 1st week of May, which will hit 2 partitions.
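
For what it's worth, spanning two monthly partitions is fine: with month-only
children (hypothetical names sales_2007_04, sales_2007_05, matching the
made-up sales table above), constraint exclusion still prunes everything
outside the range:

SET constraint_exclusion = on;

EXPLAIN
SELECT *
FROM sales
WHERE sale_date >= DATE '2007-04-21'
  AND sale_date <  DATE '2007-05-08';
-- Expected plan: an Append over just sales_2007_04 and sales_2007_05;
-- all other monthly children are excluded by their CHECK constraints.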

>
> > Now I'm wondering whether it's also possible to partition them by
> > product/item.
> >
> > Is it possible to do multilevel partitioning, sort of like
> >
> > if month = July then go to July Partition
> >     if Product = Shampoo then go to shampoo partition
> >         elif Product = Soap then go to soap partition
>
> Well, I suppose you could partition over (month,product) but you
> probably don't want to. Sounds fiddly to manage.

Yeah.. well, with the current schema - everything in one table, the way
things are ordered, the normalised design, etc. - PG is calculating the
stats wrongly and I get up to a 1:500 ratio between planned and actual
analysed rows, which is causing queries to run > 2 hours without returning
results. (This was tested on tables with only ~3 million rows, joined to
another table with another ~3 million rows - the one-to-many relationship,
I think, is causing the mis-estimates.)
>
> Have you verified that performance will be a problem without this?



Re: Table partitioning based on multiple criteria possible?

From: Richard Huxton
Date:
Ow Mun Heng wrote:
> On Wed, 2007-09-12 at 09:42 +0100, Richard Huxton wrote:
>> Well, I suppose you could partition over (month,product) but you
>> probably don't want to. Sounds fiddly to manage.
>
> Yeah.. well, with the current schema - everything in one table, the way
> things are ordered, the normalised design, etc. - PG is calculating the
> stats wrongly and I get up to a 1:500 ratio between planned and actual
> analysed rows, which is causing queries to run > 2 hours without returning
> results. (This was tested on tables with only ~3 million rows, joined to
> another table with another ~3 million rows - the one-to-many relationship,
> I think, is causing the mis-estimates.)

Well, let's see the query + EXPLAIN. It's not in one of your previous
posts, is it? PG doesn't know about cross-column correlations, but you can
usually do better than a 500x misestimate. Since it's taking > 2 hours to
get a result, we'll have to settle for EXPLAIN rather than EXPLAIN ANALYZE.
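
For reference - the actual query isn't shown in the thread, so the table
names below are invented - the difference is:

-- EXPLAIN only prints the plan with the planner's *estimated* row counts;
-- nothing is executed, so it returns immediately.
EXPLAIN
SELECT *
FROM measurements m
JOIN products p ON p.id = m.product_id
WHERE m.test_date >= DATE '2007-09-01';

-- EXPLAIN ANALYZE also runs the query and prints the actual row counts next
-- to the estimates (which is where a 1:500 mismatch shows up), but here that
-- would take the full > 2 hours, hence plain EXPLAIN.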

--
   Richard Huxton
   Archonet Ltd