Re: [PERFORM] CLUSTER command

Поиск
Список
Период
Сортировка
От Jean-Luc Lachance
Тема Re: [PERFORM] CLUSTER command
Дата
Msg-id 3DFA0DF1.DE3B5462@nsd.ca
обсуждение исходный текст
Ответ на Re: [PERFORM] CLUSTER command  (Stephan Szabo <sszabo@megazone23.bigpanda.com>)
Список pgsql-general
Stephan,

Someone commented earlier about the separation/abstraction of the
storage manager.
I agree that it should not be done at the storage level.

Maybe a better idea, would be to create a new pg_partition table that
would have the functionality of an index on the key field and also be
used to point to a file/table ID.

That would be alot more work to code on thet planner though.

If a newly inherited table could also inherite the constraints and
indecies of its parent maybe things would be easier.

JLL


Stephan Szabo wrote:
>
> On Thu, 12 Dec 2002, johnnnnnn wrote:
>
> > On Thu, Dec 12, 2002 at 05:39:44PM -0500, Jean-Luc Lachance wrote:
> > > Let's create a new command:
> > >
> > > PARTITION <table> ON <attribute>
> > <snip>
> > > Because the table file is already extended (2G limit) using
> > > different files extension (.N)
> > > how complicated (modifying the code) would it be to have the table
> > > files split according to the cluster key?
> >
>
> > I think the code changes would be complicated. Just at a 30-second
> > consideration, this would need to touch:
> > - all sql (selects, inserts, updates, deletes)
> > - vacuuming
> > - indexing
> > - statistics gathering
> > - existing clustering
>
> I think his idea was to treat it similarly to the way that the
> system treats tables >2G with .N files.  The only thing is that
> I believe the code that deals with that wouldn't be particularly
> easy to change to do it though, but I've only taken a cursory look at
> what I think is the place that does that(storage/smgr/md.c). Some sort of
> good partitioning system would be nice though.
>
> > create table u1 (...);
> > create table u2 (...);
> > create table u3 (...);
> >
> > create view uv as (select "A" as partition_key, ... from u1
> >                    union all
> >                    select "B" as partition_key, ... from u2
> >                    union all
> >                    select "C" as partition_key, ... from u3);
> >
> > That keeps the tables in different files on-disk while still allowing
> > you to query against all of them. You need to index them separately
> > and logic is necessary when changing data.
>
> Unfortunately, I think that the optimizer isn't going to do what you'd
> hope here and scan only the appropriate table if you were to say
> partition_key='A' and foo='bar'.  I'd love to be shown that I'm wrong, but
> the best I could see hoping for would be that if partition_key was part of
> u1-u3 and there was an index on partition_key,foo that it could use that
> and do minimal work on the other tables.
>
> In addition, doing something like the above is a nightmare if you don't
> know beforehand what the partitions should be (for example if you know
> there aren't alot of distinct values, but you don't know what they are) or
> for that matter even with 10-15 partitions, writing the rules and such
> would probably be really error prone.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly

В списке pgsql-general по дате отправления:

Предыдущее
От: Stephan Szabo
Дата:
Сообщение: Re: Copy/foreign key contraints
Следующее
От: Thomas Beutin
Дата:
Сообщение: Re: pg_hba.conf authorization question