Re: [PERFORM] CLUSTER command
От | Jean-Luc Lachance |
---|---|
Тема | Re: [PERFORM] CLUSTER command |
Дата | |
Msg-id | 3DFA0DF1.DE3B5462@nsd.ca обсуждение исходный текст |
Ответ на | Re: [PERFORM] CLUSTER command (Stephan Szabo <sszabo@megazone23.bigpanda.com>) |
Список | pgsql-general |
Stephan, Someone commented earlier about the separation/abstraction of the storage manager. I agree that it should not be done at the storage level. Maybe a better idea, would be to create a new pg_partition table that would have the functionality of an index on the key field and also be used to point to a file/table ID. That would be alot more work to code on thet planner though. If a newly inherited table could also inherite the constraints and indecies of its parent maybe things would be easier. JLL Stephan Szabo wrote: > > On Thu, 12 Dec 2002, johnnnnnn wrote: > > > On Thu, Dec 12, 2002 at 05:39:44PM -0500, Jean-Luc Lachance wrote: > > > Let's create a new command: > > > > > > PARTITION <table> ON <attribute> > > <snip> > > > Because the table file is already extended (2G limit) using > > > different files extension (.N) > > > how complicated (modifying the code) would it be to have the table > > > files split according to the cluster key? > > > > > I think the code changes would be complicated. Just at a 30-second > > consideration, this would need to touch: > > - all sql (selects, inserts, updates, deletes) > > - vacuuming > > - indexing > > - statistics gathering > > - existing clustering > > I think his idea was to treat it similarly to the way that the > system treats tables >2G with .N files. The only thing is that > I believe the code that deals with that wouldn't be particularly > easy to change to do it though, but I've only taken a cursory look at > what I think is the place that does that(storage/smgr/md.c). Some sort of > good partitioning system would be nice though. > > > create table u1 (...); > > create table u2 (...); > > create table u3 (...); > > > > create view uv as (select "A" as partition_key, ... from u1 > > union all > > select "B" as partition_key, ... from u2 > > union all > > select "C" as partition_key, ... from u3); > > > > That keeps the tables in different files on-disk while still allowing > > you to query against all of them. You need to index them separately > > and logic is necessary when changing data. > > Unfortunately, I think that the optimizer isn't going to do what you'd > hope here and scan only the appropriate table if you were to say > partition_key='A' and foo='bar'. I'd love to be shown that I'm wrong, but > the best I could see hoping for would be that if partition_key was part of > u1-u3 and there was an index on partition_key,foo that it could use that > and do minimal work on the other tables. > > In addition, doing something like the above is a nightmare if you don't > know beforehand what the partitions should be (for example if you know > there aren't alot of distinct values, but you don't know what they are) or > for that matter even with 10-15 partitions, writing the rules and such > would probably be really error prone. > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly
В списке pgsql-general по дате отправления: