Обсуждение: Feature suggestions (long)

Поиск
Список
Период
Сортировка

Feature suggestions (long)

От
Martijn van Oosterhout
Дата:
[Please CC any replies to me. Thanks.]

I'm going suggest a feature like what Oracle calls "partitions" and later on
something with indexes. The idea is to generate some discussion to see if
they are worthy of being added to the TODO list.

Partitions
==========
The idea is to split data across tables according to the contents of a
tuple. So you could split the contents of a table based on years or some
such. I'm not going to go too much into why it's a good idea, there are
several links on the web discussing it:

http://www.nyoug.org/200212nanda.pdf

Anyway, what I'm proposing is different from what Oracle does but flexible
enough such that what Oracle does could the implemented/emulated easily
enough. I was thinking along the lines of:

CREATE PARTITION sales_2003 ON sales WHERE saledate >= '2003-01-01' AND saledate < '2004-01-01';

This would produce a sequence of events equivalent to:

CREATE TABLE sales_2003 () INHERITS ( sales );
INSERT INTO sales_2003 SELECT * FROM sales WHERE <condition>;
DELETE FROM sales WHERE <condition>;

CREATE RULE rule1 AS ON INSERT TO sales
WHERE <condition> DO INSTEAD
INSERT INTO sales_2003 (NEW.*);

CREATE RULE rule2 AS ON DELETE TO sales
WHERE <condition> DO INSTEAD
DELETE FROM sales_2003 WHERE <???>;

<a pile more RULEs>

err... It gets a bit complex here. My point is that it can be done but with
proper support it can be done more efficiently, reliably and usably.

Internally you could treat them more like real inherited tables. Inserts
would be fairly straight forward (use the conditions to determine the
resulting table). Deletes require no work at all. Updates are trickier. The
problem is that after an update the tuple may end up needing to be in
another table. Adding/dropping/altering columns would have to be handled
similarly to inherited tables.

Looking at heap_update or ExecUpdate it's not clear whether it would be a
problem if the old and new tuples are in a separate tables. As long as the
descriptors are the same it shouldn't be too much of a deal hopefully.

So far the storage doesn't seem too difficult. I'm not sure how to deal with
inheritance and this. I'd suggest making partitions work only on individual
tables and not inherited.

Next stage would be teaching the planner. The conditions would be
pseudo-constraints on the partitions. Hence if the conditions and the
constraints form a non-intersecting set, you can skip that partition
altogether.

Now, semantically, should you be allowed to do inserts, updates and deletes
on partitions directly, or should they be hidden? If/when we get
table-spaces, you should be able to move them around.

Deleting a partition would amount to moving all the tuples back to the main
table. Of course, if there are other partitions the conditions will have to
be re-evaluated to place the tuple into one of the other partitions.

What do you do if multiple partitions have conflicting conditions? I'd be
tempted to do a first-come-first-serve basis, though maybe some kind of
priority? One advantage of the Oracle approach is that this issue doesn't
come up. We certainly can't write a system to detect anything but the
obvious cases.

Of course, making individual indexes for all the partitions could get
tedious, even if the system did it automatically; which me brings to my next
suggestion:

Multi-table indexes
===================

Currently an index only applies to a single table. If you use inheritance
all the indexes can get cumbersome. With something like the above partitions
it could get completely out of hand if used heavily. What I'd like is if
instead of an index mapping <index keys> -> ctid, it does <index keys> ->
(tableoid,ctid). At the same time it lets us build UNIQUE indexes for
inheritance (for primary keys).

Currently, an index is represented by a row in pg_index. To implement this
you would need to arrange a way for an index to list all its constituent
tables so that when doing an update you know which to indexes to update.
Multiple rows seems the easiest way but it does violate a uniqueness
constraint.

Obviously, the columns indexed can only be present in the root table.
However, the index would be usable for queries on subtables too. The planner
would simply have to remember to push down 'tableoid IN (oid,oid,oid,...)'
conditions.

The major problem here is statistics. What you really need is some sort of
aggregate statistics over an entire inheritance tree. Without something like
this the planner can't make sensible decisions on when to use the index.
ANALYZE currently doesn't do this. It's not entirely clear how it would go
about it either. This doesn't apply to UNIQUE indexes though.

Another issue is when a table is deleted. Normally you can just purge the
index when the table is deleted. With these indexes you can't since the
remaining data is still useful. So the index access method would have to be
careful not to return tuples from tables that don't exist anymore. VACUUM
would hopefully eventually clean them up.

There would also be an issue of the IndexScan node returning the right
fields depending on which part of the subtree is being queried. For example,
if B were inherited from A then an IndexScan on A* could return fields not
in A if the query only ever referenced B. This may already be solved as
the IndexScan node can scan multiple indexes.

=

In any case, the second feature seems like it will be required (or at least
something similar) to handle proper referential integrity and primary keys
for inherited tables. But if something like the former is implemented
(hopefully, as using rules would get very cumbersome) then the latter would
be necessary.

Have a good weekend.
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> "the West won the world not by the superiority of its ideas or values or
> religion but rather by its superiority in applying organised violence.
> Westerners often forget this fact, non-Westerners never do."
>   - Samuel P. Huntington

Re: Feature suggestions (long)

От
Tom Lane
Дата:
Martijn van Oosterhout <kleptog@svana.org> writes:
> I'm going suggest a feature like what Oracle calls "partitions" and later on
> something with indexes. The idea is to generate some discussion to see if
> they are worthy of being added to the TODO list.

Why bother?  Make partial indexes corresponding to what you are calling
the partitions of the table, and (I claim) you can get every possible
benefit of a partitioning scheme.  Plus more, because there's nothing
constraining the partial indexes to be nonoverlapping, so you can get
efficient plans for sets of queries that no partitioning scheme would
win for.
        regards, tom lane


Re: Feature suggestions (long)

От
Alvaro Herrera
Дата:
On Sat, May 17, 2003 at 12:05:23PM -0400, Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > I'm going suggest a feature like what Oracle calls "partitions" and later on
> > something with indexes. The idea is to generate some discussion to see if
> > they are worthy of being added to the TODO list.
> 
> Why bother?

Maybe one can put different partitions in different tablespaces?

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
Essentially, you're proposing Kevlar shoes as a solution for the problem
that you want to walk around carrying a loaded gun aimed at your foot.
(Tom Lane)


Re: Feature suggestions (long)

От
Matthew Kirkwood
Дата:
On Sat, 17 May 2003, Alvaro Herrera wrote:

> > > I'm going suggest a feature like what Oracle calls "partitions" and later on
> > > something with indexes. The idea is to generate some discussion to see if
> > > they are worthy of being added to the TODO list.
> >
> > Why bother?
>
> Maybe one can put different partitions in different tablespaces?

One can.  The tablespace a partition is in can even be
offline if Oracle can prove that a query doesn't require
that partition.

Matthew.



Re: Feature suggestions (long)

От
Don Baccus
Дата:
On Saturday 17 May 2003 10:51 am, Matthew Kirkwood wrote:
> On Sat, 17 May 2003, Alvaro Herrera wrote:
> > > > I'm going suggest a feature like what Oracle calls "partitions" and
> > > > later on something with indexes. The idea is to generate some
> > > > discussion to see if they are worthy of being added to the TODO list.
> > >
> > > Why bother?
> >
> > Maybe one can put different partitions in different tablespaces?
>
> One can.  The tablespace a partition is in can even be
> offline if Oracle can prove that a query doesn't require
> that partition.

People use this feature for warehousing old data that they don't want to purge 
from the database.   For very large databases (of course that definition 
changes with each new generation of computer) this can greatly improve the 
performance of queries on the active portion of the data.



Re: Feature suggestions (long)

От
Don Baccus
Дата:
On Saturday 17 May 2003 06:31 pm, Martijn van Oosterhout wrote:

> Hmm, no comments on the UNIQUE-index-over-multiple-tables. I would have
> thought that would've been the more interesting one.

This would enable making PG inheritence one step closer to being truly useful, 
as one could enforce a primary key on a table and all its children.




Re: Feature suggestions (long)

От
Greg Stark
Дата:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Martijn van Oosterhout <kleptog@svana.org> writes:
> > I'm going suggest a feature like what Oracle calls "partitions" and later on
> > something with indexes. The idea is to generate some discussion to see if
> > they are worthy of being added to the TODO list.
> 
> Why bother?  Make partial indexes corresponding to what you are calling
> the partitions of the table, and (I claim) you can get every possible
> benefit of a partitioning scheme.  Plus more, because there's nothing
> constraining the partial indexes to be nonoverlapping, so you can get
> efficient plans for sets of queries that no partitioning scheme would
> win for.

Partial indexes get some of the advantages of partitioned tables, but
certainly not all. 

A big advantage of partitioned tables is being able to manage the data in
whole chunks very efficiently.

. Archive all records for a single year becomes a seqential scan instead of an index scan.

. Delete all records for a single year becomes a truncate table instead of an update.

This will become even more important when postgres gets some other features
like being able to set the physical storage location of each table and the
equivalent of what oracle calls "transportable tablespaces". 

Then you can move individual partitions to slow read-only media and keep the
currently active partition on the fast read-write media. Or back up the old
partition and drop it but load it on the DSS system using a simple binary
copy.

This isn't theoretical. I've done exactly this before. We had a table that
grew by about a million records per day. When we used dml statements to
archive the old records to the DSS system it took over a day and frequently
failed. When we switched to partitioned tables we were able to run it reliably
daily during prime time without impacting performance and have up-to-date data
in the DSS system.

-- 
greg



Re: Feature suggestions (long)

От
"Zeugswetter Andreas SB SD"
Дата:
> Hmm, it's easy when you put it like that. Still, each update will be
> converted into a whole host of inserts and deletes, so your N squared
> complexity is still there. One thing is that the rules are always
expanded,
> whether or not any rows match.

create rule xx_2003 as on update to xx_2003
WHERE year(dat_loaded)=2003
do instead ....

The rule is there, but it need not expand in explain if the update has
a "where dat_loaded = '2002-01-01'". (maybe todo: try to collaps plans
to
false in more scenarios ?)
Any solution will need above comparison, the question is imho where the
work should be done. Doing the work with the rules system and
constraints
system would imho have general positive side effects like eliminating
table
scans for data that cannot be in a table because of a check constraint.

One way to make the use of rules easier would be to enhance the rule
system to
allow a "case when then when then else default" syntax in addition to
the
WHERE xxx ACTION part, so you only need 3 rules ?

> Anyway, the general trend seems to be against the idea so I may as
well go
> think of something else :)

Nooo, keep it up :-) e.g. to use inheritance for the table definition is
prbbly
way better than my view thing. So the solution might be an empty master
table
that all others inherit from, and applications use, but has the rules
for insert
/update/delete . By the way, if you don't mind writing a huge rule the
unique
index can be emulated with rules also, by checking existance in all
other
tables with a function that returns an appropriate elog. To make this
efficient
it would be enough to create separate unique indexes on the partial
tables.

Andreas


Re: Feature suggestions (long)

От
Martijn van Oosterhout
Дата:
On Sat, May 17, 2003 at 04:04:11AM -0700, Don Baccus wrote:
> On Saturday 17 May 2003 10:51 am, Matthew Kirkwood wrote:
> > On Sat, 17 May 2003, Alvaro Herrera wrote:
> > > > > I'm going suggest a feature like what Oracle calls "partitions" and
> > > > > later on something with indexes. The idea is to generate some
> > > > > discussion to see if they are worthy of being added to the TODO list.
> > > >
> > > > Why bother?
> > >
> > > Maybe one can put different partitions in different tablespaces?
> >
> > One can.  The tablespace a partition is in can even be
> > offline if Oracle can prove that a query doesn't require
> > that partition.
>
> People use this feature for warehousing old data that they don't want to purge
> from the database.   For very large databases (of course that definition
> changes with each new generation of computer) this can greatly improve the
> performance of queries on the active portion of the data.

Or can't delete (eg. financial records). The reason I'm looking at it is for
queries where you want a report depending on all the data for 2002. Since
this data is mixed in with all the data for upto 7 previous years, it's too
big for a index scan but doing a seq. scan across the whole table is
very expensive.

What this buys you is being able to seq. scan over portions of a table,
rather than the whole table. Currently we manage this manually by moving
tuples around after the fact.

Hmm, no comments on the UNIQUE-index-over-multiple-tables. I would have
thought that would've been the more interesting one.

Have a nice weekend.
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> "the West won the world not by the superiority of its ideas or values or
> religion but rather by its superiority in applying organized violence.
> Westerners often forget this fact, non-Westerners never do."
>   - Samuel P. Huntington

Re: Feature suggestions (long)

От
Martijn van Oosterhout
Дата:
On Mon, May 19, 2003 at 10:46:04AM +0200, Zeugswetter Andreas SB SD wrote:
> > Partitions
> > ==========
>
> > Next stage would be teaching the planner. The conditions would be
> > pseudo-constraints on the partitions. Hence if the conditions and the
> > constraints form a non-intersecting set, you can skip that partition
> > altogether.
>
> Make that "normal check constraints", and make the planner consider
> constraints,
> and I think that by itself combined with the current featureset will
> be much more powerful than any of the "partitioning" features out there.
> (This is mainly needed to optimize selects on the big union all view)

Very true. This would give you most of the benefits as well as helping in
other areas. I know there are functions within postgresql which attempt to
determine whether a condition is always true or false. All (ha!) you should
need to do would be to alter the SeqScan and IndexScan nodes to build the
expression:

(check_condition_a) AND (check_condition_b) AND ... AND (query_conditions)

and see if it simplifies to either TRUE or FALSE. Note that a simplification
would be more useful to strip irrelevent clauses from the query_conditions.

> Imho if a dba starts to partition, he usually needs to be more involved
> than the average user, so I think he should be able cope with compexity.
> What imho would help, is a tool that generates a suggested rule set,
> indexes and actions, which the dba can review and apply. I do not think
> new SQL syntax would really help, since that would somehow hide the great
> existing power of the rule system. A tool would teach the dba, and empower
> him to use it.

But the RULE system is not really suited to this. I havn't written it all
out but by my calculations the number of rules required is about (N^2+N)/2
where N is the number of partitions. That's one UPDATE rule for each pair of
tables plus a set of INSERT rules (DELETE requires nothing special). And
each of those rules will be used every single time that table is queried
(both inserts and updates). That's not terribly efficient.

Mind you, maybe there's a better way of doing it. I havn't totally gotten my
head around rules. Maybe it indicates that improvements could be made to the
rule system.

That's why what I'm suggesting pushing it down to a storage management
level, where the difficult issues simply go away.

> And yes, creating several smaller tables and adding the appropriate rules
> usually makes the VLDB life much easier compared to growing single tables
> into the TB range.

Amen.
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> "the West won the world not by the superiority of its ideas or values or
> religion but rather by its superiority in applying organized violence.
> Westerners often forget this fact, non-Westerners never do."
>   - Samuel P. Huntington

Re: Feature suggestions (long)

От
"Zeugswetter Andreas SB SD"
Дата:
> But the RULE system is not really suited to this. I havn't written it
all
> out but by my calculations the number of rules required is about
(N^2+N)/2
> where N is the number of partitions. That's one UPDATE rule for each
pair of
> tables plus a set of INSERT rules (DELETE requires nothing special).
And
> each of those rules will be used every single time that table is
queried
> (both inserts and updates). That's not terribly efficient.
> Mind you, maybe there's a better way of doing it. I havn't totally
gotten my
> head around rules. Maybe it indicates that improvements could be made
to the
> rule system.

What you need is one view that is a union all select selecting all your
partial
tables, all applications only accesses this one view.
You need 1 insert 2 update 1 delete rule for each table plus 3 remainder
rules
(=4*N + 3 rules) (the select rule is already done by create view),
plus one check constraint for each partial table.

You can restrict access to the partial tables with normal
grant/revoke's.
The appls only need grants on the view (which is updateable with above
rules).

Andreas


Re: Feature suggestions (long)

От
Martijn van Oosterhout
Дата:
On Sat, May 17, 2003 at 11:42:01AM -0700, Don Baccus wrote:
> On Saturday 17 May 2003 06:31 pm, Martijn van Oosterhout wrote:
>
> > Hmm, no comments on the UNIQUE-index-over-multiple-tables. I would have
> > thought that would've been the more interesting one.
>
> This would enable making PG inheritence one step closer to being truly useful,
> as one could enforce a primary key on a table and all its children.

Actually, what I described in my original post was a bit more general than
just UNIQUE indexes. From what I wrote it doesn't seem as if it'd be too
complicated (though I won't pretend to fully understand the index code).

But there must be a few hurdles to overcome, which is why I expected someone
to point out what I'd miseed.

Have a nice weekend,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> "the West won the world not by the superiority of its ideas or values or
> religion but rather by its superiority in applying organized violence.
> Westerners often forget this fact, non-Westerners never do."
>   - Samuel P. Huntington

Re: Feature suggestions (long)

От
Mark Kirkwood
Дата:

Matthew Kirkwood wrote:

>On Sat, 17 May 2003, Alvaro Herrera wrote:
>
>  
>
>>>>I'm going suggest a feature like what Oracle calls "partitions" and later on
>>>>something with indexes. The idea is to generate some discussion to see if
>>>>they are worthy of being added to the TODO list.
>>>>        
>>>>
>>>Why bother?
>>>      
>>>
>>Maybe one can put different partitions in different tablespaces?
>>    
>>
>
>One can.  The tablespace a partition is in can even be
>offline if Oracle can prove that a query doesn't require
>that partition.
>  
>

Being able to "segment" tables for admin purposes (like archiving) might 
be beneficial.

I guess this could be orthogonal to clustering or partitioning schemes 
that mininize table/index access for queries.

(Oracle has sort of rolled these ideas together with their partitioning)

Mark







Re: Feature suggestions (long)

От
"Jim C. Nasby"
Дата:
On Sat, May 17, 2003 at 04:04:11AM -0700, Don Baccus wrote:
> On Saturday 17 May 2003 10:51 am, Matthew Kirkwood wrote:
> > On Sat, 17 May 2003, Alvaro Herrera wrote:
> > > > > I'm going suggest a feature like what Oracle calls "partitions" and
> > > > > later on something with indexes. The idea is to generate some
> > > > > discussion to see if they are worthy of being added to the TODO list.
> > > >
> > > > Why bother?
> > >
> > > Maybe one can put different partitions in different tablespaces?
> >
> > One can.  The tablespace a partition is in can even be
> > offline if Oracle can prove that a query doesn't require
> > that partition.
> 
> People use this feature for warehousing old data that they don't want to purge 
> from the database.   For very large databases (of course that definition 
> changes with each new generation of computer) this can greatly improve the 
> performance of queries on the active portion of the data.
This would be very useful for http://stats.distributed.net. Right now
I'm trying to 'partition' our main (120M row) table by project_id.
Sybase handled this very well, because it has good support for
clustered indexes/index organized tables. PGSQL lacks this, which means
large reads from that table are much more painful than on sybase,
because you either tablescan (ugh) or go though the overhead of an index
read for every row retrieved (not much better).

Partitioning would fit the bill here perfectly (though in my case
good/true clustering would work just as well).
-- 
Jim C. Nasby (aka Decibel!)                    jim@nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"


Re: Feature suggestions (long)

От
Martijn van Oosterhout
Дата:
On Mon, May 19, 2003 at 01:59:52PM +0200, Zeugswetter Andreas SB SD wrote:
> What you need is one view that is a union all select selecting all your
> partial tables, all applications only accesses this one view.  You need 1
> insert 2 update 1 delete rule for each table plus 3 remainder rules (=4*N
> + 3 rules) (the select rule is already done by create view), plus one
> check constraint for each partial table.

WOW! You *are* good. Care to reveal the trick? I can see the insert and
delete rules but what are you doing for the updates?

> You can restrict access to the partial tables with normal grant/revoke's.
> The appls only need grants on the view (which is updateable with above
> rules).

Good point.

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> "the West won the world not by the superiority of its ideas or values or
> religion but rather by its superiority in applying organized violence.
> Westerners often forget this fact, non-Westerners never do."
>   - Samuel P. Huntington

Re: Feature suggestions (long)

От
Martijn van Oosterhout
Дата:
On Mon, May 19, 2003 at 03:58:24PM +0200, Zeugswetter Andreas SB SD wrote:
> The update needs one for the case where the tuple stays in the same
> partial table, and the second for moving it to another table. The first is
> simple (where old.x = new.x), the second must delete the original and
> insert a new row into the view. The view insert rule handles the
> distribution (might be, that there are TODO's here, my data stays in
> place). The additional problem with the second rule is, that the returned
> UPDATE tag (iirc) does not return the correct number of rows (deleted +
> inserted rows, see lengthy thread about the return tag). Whether that is
> an actual problem will depend on your apps.

Hmm, it's easy when you put it like that. Still, each update will be
converted into a whole host of inserts and deletes, so your N squared
complexity is still there. One thing is that the rules are always expanded,
whether or not any rows match. You'll get lots and lots of little queries.
One of my other wishes was for the EXPLAIN output to remain somewhat
legible.

Also, such a setup could never take advantage of the multi-table indexes I
suggested, since the whole benefit of those comes from treating a whole
inheritence hierarchy as a single virtual table (the UNIQUEness checks are
just a really cool bonus).

Anyway, the general trend seems to be against the idea so I may as well go
think of something else :)
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> "the West won the world not by the superiority of its ideas or values or
> religion but rather by its superiority in applying organized violence.
> Westerners often forget this fact, non-Westerners never do."
>   - Samuel P. Huntington

Re: Feature suggestions (long)

От
Martijn van Oosterhout
Дата:
On Sun, May 18, 2003 at 05:02:39PM +1200, Mark Kirkwood wrote:
> Being able to "segment" tables for admin purposes (like archiving) might
> be beneficial.
>
> I guess this could be orthogonal to clustering or partitioning schemes
> that mininize table/index access for queries.

Segmenting tables is what inheritence can buy you now (although somewhat
suboptimally). You can create subtables of your super-table and just
remember which table is supposed to contain which tuples. By adding CHECK
constraints you can even make the system error out on you if you make a
mistake.

Segmenting tables for archiving is only useful if the planner rarely uses
them. If the planner is still going to generate a seq. scan on the "archive"
every time you query the table, then it's not really archived in any sense.

Which bring another question. If I have a CHECK constraint on a table saying
'id < 5000' and I do a query 'WHERE id = 6000', is the query optimiser
clever enough to notice that there can be no matches. Preliminary testing
indicates no (although I see it's 7.2).

> (Oracle has sort of rolled these ideas together with their partitioning)

My guess is because segmenting without the planner cleverness is a bit
useless.
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> "the West won the world not by the superiority of its ideas or values or
> religion but rather by its superiority in applying organized violence.
> Westerners often forget this fact, non-Westerners never do."
>   - Samuel P. Huntington

Re: Feature suggestions (long)

От
Mark Kirkwood
Дата:

Martijn van Oosterhout wrote:

>>(Oracle has sort of rolled these ideas together with their partitioning)
>>    
>>
>
>My guess is because segmenting without the planner cleverness is a bit
>useless.
>
Yes!  you want the planner to be *aware* of any "segmenting", but there 
are other was to reduce the "amount of table" that is scanned (e.g. 
partial indexes and clustering), so the "Oracle solution" might not be 
the only  (or best) way to approach :

1   query plans for large tables
2   administration of large tables

cheers

Mark




Re: Feature suggestions (long)

От
"Zeugswetter Andreas SB SD"
Дата:
> Partitions
> ==========

> Next stage would be teaching the planner. The conditions would be
> pseudo-constraints on the partitions. Hence if the conditions and the
> constraints form a non-intersecting set, you can skip that partition
> altogether.

Make that "normal check constraints", and make the planner consider
constraints,
and I think that by itself combined with the current featureset will
be much more powerful than any of the "partitioning" features out there.
(This is mainly needed to optimize selects on the big union all view)

Imho if a dba starts to partition, he usually needs to be more involved
than the average user, so I think he should be able cope with compexity.
What imho would help, is a tool that generates a suggested rule set,
indexes
and actions, which the dba can review and apply. I do not think new SQL
syntax
would really help, since that would somehow hide the great existing
power of
the rule system. A tool would teach the dba, and empower him to use it.

And yes, creating several smaller tables and adding the appropriate
rules
usually makes the VLDB life much easier compared to growing single
tables into
the TB range.

Andreas


Re: Feature suggestions (long)

От
"Zeugswetter Andreas SB SD"
Дата:
> > What you need is one view that is a union all select selecting all
your
> > partial tables, all applications only accesses this one view.  You
need 1
> > insert 2 update 1 delete rule for each table plus 3 remainder rules
(=4*N
> > + 3 rules) (the select rule is already done by create view), plus
one
> > check constraint for each partial table.
>
> WOW! You *are* good. Care to reveal the trick? I can see the insert
and
> delete rules but what are you doing for the updates?

The update needs one for the case where the tuple stays in the same
partial table,
and the second for moving it to another table. The first is simple
(where old.x = new.x),
the second must delete the original and insert a new row into the view.
The view insert
rule handles the distribution (might be, that there are TODO's here, my
data stays in
place). The additional problem with the second rule is, that the
returned UPDATE tag (iirc)
does not return the correct number of rows (deleted + inserted rows, see
lengthy thread
about the return tag). Whether that is an actual problem will depend on
your apps.

Andreas


Re: Feature suggestions (long)

От
"Jim C. Nasby"
Дата:
On Tue, May 20, 2003 at 12:40:00AM +1000, Martijn van Oosterhout wrote:
> Anyway, the general trend seems to be against the idea so I may as well go
> think of something else :)

I'm disappointed to hear that. Having no way to effectively partition
data is a real pain in pgsql, and your proposal would adress that. Yes,
you can build it yourself by creating the view and all the rules by
hand, but that has a lot of drawbacks:

It's completely PGSQL specific
It leaves no possibility for performance improvements down the road
It's a lot of code to write
You have to manually maintain it all every time you need to add a new
partition (in your example, at the start of every year).

I don't know what the policies for patches are, but I'd hope that the
core team would consider adding this functionality, especially since a
first-round implimentation can be done entirely with rules (or so it
seems).

I certainly understand that development time is a very limited resource,
and I'm willing to work on this (though I'm not a C coder). Even if no
one can commit to this right now, can't it be added to the todo list?
-- 
Jim C. Nasby (aka Decibel!)                    jim@nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"


Re: Feature suggestions (long)

От
Martijn van Oosterhout
Дата:
On Tue, May 20, 2003 at 10:02:24AM -0500, Jim C. Nasby wrote:
> On Tue, May 20, 2003 at 12:40:00AM +1000, Martijn van Oosterhout wrote:
> > Anyway, the general trend seems to be against the idea so I may as well go
> > think of something else :)
>
> I'm disappointed to hear that. Having no way to effectively partition
> data is a real pain in pgsql, and your proposal would adress that. Yes,
> you can build it yourself by creating the view and all the rules by
> hand, but that has a lot of drawbacks:

I agree, there is a lot of potential here. And I don't beleive it would be
too much work as most of the infrastructure is already there. At this stage
I'm just wondering if it will go on the TODO list. I propose that the
following items be added:
  * Improve the planner to take CHECK constraints into account to prune the plan.  * Allow a single index to index
multipletables (also for inherited PRIMARY KEYS)  * Allow partitioning of table into multiple subtables 

The first two items would be useful in their own right. With them the final
one would be straight forward. I'd be prepared to put some effort into this
if there is some indication it would be accepted.

> I don't know what the policies for patches are, but I'd hope that the
> core team would consider adding this functionality, especially since a
> first-round implimentation can be done entirely with rules (or so it
> seems).

Well, I think the policy is 'if you write the code you have a better chance
to have it accepted' :) So, if it's likely to be accepted then we only need
to find someone to code it. Given the other priorities currently I think
waiting for the core team to write it would be futile (unless you can
convince someone like IBM to give the core team money to write it).

Right now I'd be happy if the anonymous CVS server would talk to me :)

By the way, has anyone given thought to user-defined storage managers? Apart
from allowing backward compatable table access, you could implement a simple
version of partitioning that doesn't take advantage of planner tricks.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> "the West won the world not by the superiority of its ideas or values or
> religion but rather by its superiority in applying organized violence.
> Westerners often forget this fact, non-Westerners never do."
>   - Samuel P. Huntington

Re: Feature suggestions (long)

От
Bruce Momjian
Дата:
This seems related to table spaces --- once we have them, having some
automatic way to spread the data across tablespaces would make sense, so
I have added this discussion to that TODO.detail.

---------------------------------------------------------------------------

Martijn van Oosterhout wrote:
-- Start of PGP signed section.
> On Tue, May 20, 2003 at 10:02:24AM -0500, Jim C. Nasby wrote:
> > On Tue, May 20, 2003 at 12:40:00AM +1000, Martijn van Oosterhout wrote:
> > > Anyway, the general trend seems to be against the idea so I may as well go
> > > think of something else :)
> > 
> > I'm disappointed to hear that. Having no way to effectively partition
> > data is a real pain in pgsql, and your proposal would adress that. Yes,
> > you can build it yourself by creating the view and all the rules by
> > hand, but that has a lot of drawbacks:
> 
> I agree, there is a lot of potential here. And I don't beleive it would be
> too much work as most of the infrastructure is already there. At this stage
> I'm just wondering if it will go on the TODO list. I propose that the
> following items be added:
> 
>    * Improve the planner to take CHECK constraints into account to prune the plan.
>    * Allow a single index to index multiple tables (also for inherited PRIMARY KEYS)
>    * Allow partitioning of table into multiple subtables
> 
> The first two items would be useful in their own right. With them the final
> one would be straight forward. I'd be prepared to put some effort into this
> if there is some indication it would be accepted.
> 
> > I don't know what the policies for patches are, but I'd hope that the
> > core team would consider adding this functionality, especially since a
> > first-round implimentation can be done entirely with rules (or so it
> > seems).
> 
> Well, I think the policy is 'if you write the code you have a better chance
> to have it accepted' :) So, if it's likely to be accepted then we only need
> to find someone to code it. Given the other priorities currently I think
> waiting for the core team to write it would be futile (unless you can
> convince someone like IBM to give the core team money to write it).
> 
> Right now I'd be happy if the anonymous CVS server would talk to me :)
> 
> By the way, has anyone given thought to user-defined storage managers? Apart
> from allowing backward compatable table access, you could implement a simple
> version of partitioning that doesn't take advantage of planner tricks.
> 
> Have a nice day,
> -- 
> Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> > "the West won the world not by the superiority of its ideas or values or
> > religion but rather by its superiority in applying organized violence.
> > Westerners often forget this fact, non-Westerners never do."
> >   - Samuel P. Huntington
-- End of PGP section, PGP failed!

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Feature suggestions (long)

От
Bruce Momjian
Дата:
I was able to merge your ideas into the TODO because they are also items
that relate to other optimizations.  Look for 'subtable' on the web TODO
to see the changes:

> * Allow a single index to index multiple tables (for inheritance and
subtables)
> * Improve the planner to use CHECK constraints to prune the plan (for
subtables)
> * Allow partitioning of table into multiple subtables

---------------------------------------------------------------------------

Martijn van Oosterhout wrote:
-- Start of PGP signed section.
> On Tue, May 20, 2003 at 10:02:24AM -0500, Jim C. Nasby wrote:
> > On Tue, May 20, 2003 at 12:40:00AM +1000, Martijn van Oosterhout wrote:
> > > Anyway, the general trend seems to be against the idea so I may as well go
> > > think of something else :)
> > 
> > I'm disappointed to hear that. Having no way to effectively partition
> > data is a real pain in pgsql, and your proposal would adress that. Yes,
> > you can build it yourself by creating the view and all the rules by
> > hand, but that has a lot of drawbacks:
> 
> I agree, there is a lot of potential here. And I don't beleive it would be
> too much work as most of the infrastructure is already there. At this stage
> I'm just wondering if it will go on the TODO list. I propose that the
> following items be added:
> 
>    * Improve the planner to take CHECK constraints into account to prune the plan.
>    * Allow a single index to index multiple tables (also for inherited PRIMARY KEYS)
>    * Allow partitioning of table into multiple subtables
> 
> The first two items would be useful in their own right. With them the final
> one would be straight forward. I'd be prepared to put some effort into this
> if there is some indication it would be accepted.
> 
> > I don't know what the policies for patches are, but I'd hope that the
> > core team would consider adding this functionality, especially since a
> > first-round implimentation can be done entirely with rules (or so it
> > seems).
> 
> Well, I think the policy is 'if you write the code you have a better chance
> to have it accepted' :) So, if it's likely to be accepted then we only need
> to find someone to code it. Given the other priorities currently I think
> waiting for the core team to write it would be futile (unless you can
> convince someone like IBM to give the core team money to write it).
> 
> Right now I'd be happy if the anonymous CVS server would talk to me :)
> 
> By the way, has anyone given thought to user-defined storage managers? Apart
> from allowing backward compatable table access, you could implement a simple
> version of partitioning that doesn't take advantage of planner tricks.
> 
> Have a nice day,
> -- 
> Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> > "the West won the world not by the superiority of its ideas or values or
> > religion but rather by its superiority in applying organized violence.
> > Westerners often forget this fact, non-Westerners never do."
> >   - Samuel P. Huntington
-- End of PGP section, PGP failed!

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073