Обсуждение: Query plan and Inheritance. Weird behavior

Поиск
Список
Период
Сортировка

Query plan and Inheritance. Weird behavior

От
John Lange
Дата:
I have a database that makes fairly extensive use of table Inheritance.

Structure is one parent table and 5 child tables as follows:

tbl_objects (parent table)
  -> tbl_viewers
  -> tbl_documents
    -> tbl_icons
  -> tbl_massemails
  -> tbl_formats

I have two questions:

First, if I create an index on the parent table will queries to the
child tables use that index?

Secondly, I tried to use explain to find out but I got very strange
results. It appears to read all the child tables even when you specify
only the parent table. In this case this appears to make the select do 6
queries instead of only 1. Obviously a huge performance hit. And none of
them uses the index though the table only has 420 rows at the moment so
that might be why its just doing a scan (though IMHO 'explain' should
explain that it isn't using the available index and why).

I can't say that I'm reading these results properly but here they are:

"EXPLAIN select * from tbl_objects where id = 1;"

Gives:

NOTICE:  QUERY PLAN:

Result  (cost=0.00..27.25 rows=6 width=138)
  ->  Append  (cost=0.00..27.25 rows=6 width=138)
        ->  Seq Scan on tbl_objects  (cost=0.00..12.24 rows=1 width=73)
        ->  Seq Scan on tbl_viewers tbl_objects  (cost=0.00..1.07 rows=1
width=83)
        ->  Seq Scan on tbl_documents tbl_objects  (cost=0.00..11.56
rows=1 width=78)
        ->  Seq Scan on tbl_massemails tbl_objects  (cost=0.00..0.00
rows=1 width=138)
        ->  Seq Scan on tbl_formats tbl_objects  (cost=0.00..1.12 rows=1
width=80)
        ->  Seq Scan on tbl_icons tbl_objects  (cost=0.00..1.25 rows=1
width=89)


Can anyone tell me if these results are making any sense and why
postgres is doing 6 reads when I only need one?

John Lange


Re: Query plan and Inheritance. Weird behavior

От
Stephan Szabo
Дата:
On 22 Jan 2003, John Lange wrote:

> I have a database that makes fairly extensive use of table Inheritance.
>
> Structure is one parent table and 5 child tables as follows:
>
> tbl_objects (parent table)
>   -> tbl_viewers
>   -> tbl_documents
>     -> tbl_icons
>   -> tbl_massemails
>   -> tbl_formats
>
> I have two questions:
>
> First, if I create an index on the parent table will queries to the
> child tables use that index?

AFAIK no since indices aren't inherited.

> Secondly, I tried to use explain to find out but I got very strange
> results. It appears to read all the child tables even when you specify
> only the parent table. In this case this appears to make the select do 6
> queries instead of only 1. Obviously a huge performance hit. And none of
> them uses the index though the table only has 420 rows at the moment so
> that might be why its just doing a scan (though IMHO 'explain' should
> explain that it isn't using the available index and why).

It seems reasonable to me since given the # of rows and the estimated
row width the table is probably only like 5 or 6 pages.  Reading the index
is unlikely to make life much better given an index read, seek in heap
file, read heap file page.

> I can't say that I'm reading these results properly but here they are:
>
> "EXPLAIN select * from tbl_objects where id = 1;"

This gets any rows in tbl_objects that have id=1 and any rows in any
subtables that have id=1.  Is that the intended effect?



Re: Query plan and Inheritance. Weird behavior

От
John Lange
Дата:
> This gets any rows in tbl_objects that have id=1 and any rows in any
> subtables that have id=1.  Is that the intended effect?

It is the intended result, but not the expected implementation.

Doing more investigation I think I figured out why Postgres does what it
does.

Creating child tables by inheriting from another table doesn't really do
what I consider to be 'true' inheritance, at least not in the way I
expected as a programmer.

Postgres seems to create "child" tables by first fully duplicating the
parent table and then adding the new columns to it. It then links the
tables internally some how so that a query on a parent table also
queries the child tables.

IHO this seems like inheritance by 'brute force' and a parent table that
has many children will cause a significant performance hit.

When I say "as a programmer" what I mean is I had expected it to be done
entirely the opposite way. In other words, child tables would simply be
linked internally to the parent table and a new table created which only
contains the new columns.

In this way the parent table would not need to know, nor would it care
about child tables in any way (just like inheritance in most programming
languages). If done this way a select on a parent table would only
require the retrieval of a single row and a select on a child table
would only require the retrieval of two rows (one in the child table and
one in the parent table).

I don't pretend to know the intricacies of Postgres performance but this
is the way I'm interpreting the data from the explains.

At this time, now that I (think I) understand how the inheritance is
implemented I'm considering abandoning it in Postgres and solving the
issue entirely pragmatically.

I hoping someone on the list will tell me where I'm going wrong here or
what wrong assumptions I'm making.

John Lange

On Wed, 2003-01-22 at 18:59, Stephan Szabo wrote:
>
> On 22 Jan 2003, John Lange wrote:
>
> > I have a database that makes fairly extensive use of table Inheritance.
> >
> > Structure is one parent table and 5 child tables as follows:
> >
> > tbl_objects (parent table)
> >   -> tbl_viewers
> >   -> tbl_documents
> >     -> tbl_icons
> >   -> tbl_massemails
> >   -> tbl_formats
> >
> > I have two questions:
> >
> > First, if I create an index on the parent table will queries to the
> > child tables use that index?
>
> AFAIK no since indices aren't inherited.
>
> > Secondly, I tried to use explain to find out but I got very strange
> > results. It appears to read all the child tables even when you specify
> > only the parent table. In this case this appears to make the select do 6
> > queries instead of only 1. Obviously a huge performance hit. And none of
> > them uses the index though the table only has 420 rows at the moment so
> > that might be why its just doing a scan (though IMHO 'explain' should
> > explain that it isn't using the available index and why).
>
> It seems reasonable to me since given the # of rows and the estimated
> row width the table is probably only like 5 or 6 pages.  Reading the index
> is unlikely to make life much better given an index read, seek in heap
> file, read heap file page.
>
> > I can't say that I'm reading these results properly but here they are:
> >
> > "EXPLAIN select * from tbl_objects where id = 1;"
>
> This gets any rows in tbl_objects that have id=1 and any rows in any
> subtables that have id=1.  Is that the intended effect?
>
>


Re: Query plan and Inheritance. Weird behavior

От
Stephan Szabo
Дата:
On 22 Jan 2003, John Lange wrote:

> Creating child tables by inheriting from another table doesn't really do
> what I consider to be 'true' inheritance, at least not in the way I
> expected as a programmer.
>
> Postgres seems to create "child" tables by first fully duplicating the
> parent table and then adding the new columns to it. It then links the
> tables internally some how so that a query on a parent table also
> queries the child tables.

That pretty much sums up my understanding of it.

[snip]
> In this way the parent table would not need to know, nor would it care
> about child tables in any way (just like inheritance in most programming
> languages). If done this way a select on a parent table would only
> require the retrieval of a single row and a select on a child table
> would only require the retrieval of two rows (one in the child table and
> one in the parent table).

As opposed to needing one row from a select on a child table and
effectively a union all when selecting from the parent.  There are up and
down sides of both implementations, and I haven't played with it enough
to speak meaningfully on it.

> I don't pretend to know the intricacies of Postgres performance but this
> is the way I'm interpreting the data from the explains.

As a side note, for a better understanding of timings, explain analyze is
much better than plain explain which only gives the plan and estimates.


Re: Query plan and Inheritance. Weird behavior

От
Tom Lane
Дата:
> On 22 Jan 2003, John Lange wrote:
>> In this way the parent table would not need to know, nor would it care
>> about child tables in any way (just like inheritance in most programming
>> languages). If done this way a select on a parent table would only
>> require the retrieval of a single row and a select on a child table
>> would only require the retrieval of two rows (one in the child table and
>> one in the parent table).

No, it'd require the retrieval of N rows: you're failing to think about
multiple levels of inheritance or multi-parent inheritance, both of
which are supported reasonably effectively by the current model.
My guess is that this scheme would crash and burn just on locking
considerations.  (When you want to update a child row, what locks do you
have to get in what order?  With pieces of the row scattered through
many tables, it'd be pretty messy.)

You may care to look in the pghackers archives for prior discussions.
The variant scheme that's sounded most interesting to me so far is to
store *all* rows of an inheritance hierarchy in a single physical table.
This'd require giving up multiple inheritance, but few people seem to
use that, and the other benefits (like being able to enforce uniqueness
constraints over the whole hierarchy with just a standard unique index)
seem worth it.  No one's stepped up to bat to do the legwork on the idea
yet, though.  One bit that looks pretty tricky is ALTER TABLE ADD
COLUMN.

            regards, tom lane

Re: Query plan and Inheritance. Weird behavior

От
John Lange
Дата:
On Thu, 2003-01-23 at 00:07, Tom Lane wrote:
> > On 22 Jan 2003, John Lange wrote:
> >> In this way the parent table would not need to know, nor would it care
> >> about child tables in any way (just like inheritance in most programming
> >> languages). If done this way a select on a parent table would only
> >> require the retrieval of a single row and a select on a child table
> >> would only require the retrieval of two rows (one in the child table and
> >> one in the parent table).
>
> No, it'd require the retrieval of N rows: you're failing to think about
> multiple levels of inheritance or multi-parent inheritance, both of
> which are supported reasonably effectively by the current model.

Lets not be too nit-picky here. In the case of multiple layers of
inheritance you are still only selecting two rows (at a time), one
child, one parent. However if the parent also has a parent, then the
process repeats, once for every layer.

This is entirely reasonable and efficient compared to the current model
where a select on a parent table requires the same select to be executed
on EVERY child table. If it's a large expensive JOIN of some kind then
this is verging on un-workable.

> My guess is that this scheme would crash and burn just on locking
> considerations.  (When you want to update a child row, what locks do you
> have to get in what order?  With pieces of the row scattered through
> many tables, it'd be pretty messy.)

You lock the parent on down to the last child. I'm not a database
developer but that seems fairly straight forward?

The choice between the schema I've suggested and the way it is currently
implemented is a trade off between more efficient selects vs. more
efficient updates. If you are selecting on the parent table more than
updating then my idea is vastly more efficient. If you INSERT a lot then
the current way is marginally better.

With apologies to the developers, I don't feel the current
implementation is really usable for the simple fact that expensive
operations performed on the parent table causes them to be repeated for
every child table. And, as an added penalty, indexes on parent tables
are NOT inherited to the children so the child operations can be even
more expensive.

This solution is not that large and I've already got 6 child tables. It
just so happens that I do a LOT of selects on the parent so I'm going to
have to make a decision on where to go from here.

Solving this programmatically is not really that hard but I've gone a
ways down this path now so I'm not anxious to redo the entire database
schema since we do have customers already using this.

> You may care to look in the pghackers archives for prior discussions.

I will, thanks.

> The variant scheme that's sounded most interesting to me so far is to
> store *all* rows of an inheritance hierarchy in a single physical table.

Unless I'm not understanding I don't think that works. In my case for
example, a single parent has 4-5 children so the only columns they have
in common are the ones in the parent. Combining them all into a single
table means a big de-normalized table (loads of empty columns). If you
are going to go this route then you might as well just do it. It doesn't
need to be implemented on the DBMS.

Regards,

John Lange

> This'd require giving up multiple inheritance, but few people seem to
> use that, and the other benefits (like being able to enforce uniqueness
> constraints over the whole hierarchy with just a standard unique index)
> seem worth it.  No one's stepped up to bat to do the legwork on the idea
> yet, though.  One bit that looks pretty tricky is ALTER TABLE ADD
> COLUMN.
>
>             regards, tom lane


Re: Query plan and Inheritance. Weird behavior

От
John Lange
Дата:
> I don't see how performance would be significantly better if you stored
> the common columns of all rows (parent and children) in the parent
> table, in contrast with how it is done now, storing entire rows of child
> tables in the child table and omitting them from the parent table.

Well there are a couple of points.

Firstly, from the simple standpoint of database normalization you
shouldn't have tables that have the same columns. The way it is
implemented, child tables are copies of parent tables.

But more importantly it is bad for performance because selecting from a
parent table causes the same select to be done on all the child tables.
In my case selecting from the parent causes six selects to be done (one
for every child table).

I would have assumed that child tables only contained the new columns
unique to it, not duplicates of the columns already in the parent table.

An insert to a child table would actually cause two inserts to be done
(assuming only one level of inheritance), one to the parent, and then
one to the child.

Therefore, selects from the parent table would only require a single
select (because the common data is all stored in the parent table).

Selects to a child would require two selects to get the entire row (one
to the parent, one to the child). Similar to a view.

As I said previously, performance would depend on what operation you
were mostly doing.

I think I have more or less covered this in my previous postings.

John Lange

On Tue, 2003-01-28 at 17:52, Andras Kadinger wrote:
> I see.
>
> I don't see how performance would be significantly better if you stored
> the common columns of all rows (parent and children) in the parent
> table, in contrast with how it is done now, storing entire rows of child
> tables in the child table and omitting them from the parent table.
>
> Hmm, reviewing your posts to pgsql-performance, I must admit I cannot
> really see what you feel you are losing performance-wise.
>
> As the discussion on pgsql-performance seems to have died off, would you
> be willing to explain to me?
>
> Regards,
> Andras
>
> John Lange wrote:
> >
> > No, the keyword ONLY will limit selects to that table ONLY. I need to
> > return the rows which are common to all tables. Postgres is doing the
> > work in the correct way, however, the issue is the underlaying design
> > which is terribly inefficient requiring a separate table scan for every
> > child table.
> >
> > Thanks for the suggestion.
> >
> > John Lange
> >
> > On Fri, 2003-01-24 at 14:30, Andras Kadinger wrote:
> > > Hi John,
> > >
> > > Isn't the keyword ONLY is what you are after?
> > >
> > > "EXPLAIN select * from tbl_objects where id = 1;" - this will select
> > > from table tbl_objects and all it's descendant tables.
> > >
> > > "EXPLAIN select * from tbl_objects ONLY where id = 1;" - this will
> > > select from table tbl_objects only.
> > >
> > > Regards,
> > > Andras Kadinger


Re: Query plan and Inheritance. Weird behavior

От
Tom Lane
Дата:
John Lange <lists@darkcore.net> writes:
> Firstly, from the simple standpoint of database normalization you
> shouldn't have tables that have the same columns. The way it is
> implemented, child tables are copies of parent tables.

There is no copied data though.  Or are you saying that if any table
in the database has, say, a timestamp column, then it's a failure of
normalization for any other one to have a timestamp column?  Don't think
I buy that.

> But more importantly it is bad for performance because selecting from a
> parent table causes the same select to be done on all the child tables.

So?  The same amount of data gets scanned either way.  To the extent
that the planner fails to generate an optimal plan in such cases, we
have a performance problem --- but that's just an implementation
shortcoming, not a fundamental limitation AFAICS.

The only real disadvantage I can see to the current storage scheme is
that we can't easily make an index that covers both a parent and all its
children; the index would have to include a table pointer as well as a
row pointer.  This creates problems for foreign keys and unique constraints.
But there is more than one way to attack that.

            regards, tom lane

Re: Query plan and Inheritance. Weird behavior

От
Andras Kadinger
Дата:
John Lange wrote:
>
> > I don't see how performance would be significantly better if you stored
> > the common columns of all rows (parent and children) in the parent
> > table, in contrast with how it is done now, storing entire rows of child
> > tables in the child table and omitting them from the parent table.
>
> Well there are a couple of points.
>
> Firstly, from the simple standpoint of database normalization you
> shouldn't have tables that have the same columns. The way it is
> implemented, child tables are copies of parent tables.

As Tom pointed out, only the schema is copied, but not the data.

This has the following advantages:
- if you select from child tables, PostgreSQL will only have to scan
rows that belong to that child (and further down), and not all rows in
all tables of the inheritance hierarchy; so if you have 100 million rows
in the whole hierarchy, but only have say 1 million in the child you are
currently interested in, you only have to scan those 1 million rows, and
not the whole 100 million.
- all columns of rows are stored together, so to read a row only one
disk access is needed (your way it would probably need roughly one
random disk access per each inheritance level upwards, both for
reads/selects and writes/inserts/updates; with a sizable inheritance
hierarchy this would be a considerable performance hit)
- it doesn't really cost much in terms of disk space, only some
bookkeeping information is needed

I don't think inheritance really fits into 'database normalization'
itself, but still there are cases where it is more convenient/efficient
than with traditional database normalization, where you would have to
either go create completely separate tables for each type (and do UNIONs
of SELECTs if you are interested in more than one child only), or what's
even more cumbersome, create a table with common columns ('parent' here)
and then go create children and children of children that each link
upwards to their respective parents through some kind of key: in a
select, you would have to explicitly specify all tables upwards the
inheritance hierarchy, and specify the respective joins for them.

So I think whether you should choose more traditional database
normalization or use inheritance depends on what you want to do.

> But more importantly it is bad for performance because selecting from a
> parent table causes the same select to be done on all the child tables.
> In my case selecting from the parent causes six selects to be done (one
> for every child table).

'causes the same select to be done on all the child tables' - I don't
agree with that, and I hope this is where the misunderstanding lies.

Consider this:

CREATE TABLE parent ( id integer NOT NULL, text text);
CREATE TABLE child1 ( child1field text) INHERITS (parent);
CREATE TABLE child2 ( child2field text) INHERITS (parent);
CREATE TABLE child3 ( child3field text) INHERITS (parent);
CREATE TABLE child4 ( child4field text) INHERITS (parent);

CREATE TABLE othertable ( id integer NOT NULL, othertext text);

ALTER TABLE ONLY parent ADD CONSTRAINT parent_pkey PRIMARY KEY (id);
ALTER TABLE ONLY child1 ADD CONSTRAINT child1_pkey PRIMARY KEY (id);
ALTER TABLE ONLY child2 ADD CONSTRAINT child2_pkey PRIMARY KEY (id);
ALTER TABLE ONLY child3 ADD CONSTRAINT child3_pkey PRIMARY KEY (id);
ALTER TABLE ONLY child4 ADD CONSTRAINT child4_pkey PRIMARY KEY (id);

ALTER TABLE ONLY othertable ADD CONSTRAINT othertable_pkey PRIMARY KEY
(id);

Then I filled all tables with 10000 rows of synthetic data and ANALYZEd
just to make sure the optimizer considers indexes to be valuable.

First I tried this:

johnlange=# explain select * from parent where id=13;
                                          QUERY
PLAN
----------------------------------------------------------------------------------------------
 Result  (cost=0.00..15.07 rows=5 width=36)
   ->  Append  (cost=0.00..15.07 rows=5 width=36)
         ->  Index Scan using parent_pkey on parent  (cost=0.00..3.01
rows=1 width=36)
               Index Cond: (id = 13)
         ->  Index Scan using child1_pkey on child1 parent
(cost=0.00..3.01 rows=1 width=36)
               Index Cond: (id = 13)
         ->  Index Scan using child2_pkey on child2 parent
(cost=0.00..3.01 rows=1 width=36)
               Index Cond: (id = 13)
         ->  Index Scan using child3_pkey on child3 parent
(cost=0.00..3.01 rows=1 width=36)
               Index Cond: (id = 13)
         ->  Index Scan using child4_pkey on child4 parent
(cost=0.00..3.01 rows=1 width=36)
               Index Cond: (id = 13)
(12 rows)

The planner has rightly chosen to use indexes, and as a result the query
will be pretty fast.

Also, at first sight this might look like the multiple selects you
mention, but actually it isn't; here's another example to show that:

inh=# explain select * from parent natural join othertable where
parent.id=13;
                                          QUERY
PLAN
----------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..30.20 rows=5 width=72)
   ->  Append  (cost=0.00..15.07 rows=5 width=36)
         ->  Index Scan using parent_pkey on parent  (cost=0.00..3.01
rows=1 width=36)
               Index Cond: (id = 13)
         ->  Index Scan using child1_pkey on child1 parent
(cost=0.00..3.01 rows=1 width=36)
               Index Cond: (id = 13)
         ->  Index Scan using child2_pkey on child2 parent
(cost=0.00..3.01 rows=1 width=36)
               Index Cond: (id = 13)
         ->  Index Scan using child3_pkey on child3 parent
(cost=0.00..3.01 rows=1 width=36)
               Index Cond: (id = 13)
         ->  Index Scan using child4_pkey on child4 parent
(cost=0.00..3.01 rows=1 width=36)
               Index Cond: (id = 13)
   ->  Index Scan using othertable_pkey on othertable  (cost=0.00..3.01
rows=1 width=36)
         Index Cond: ("outer".id = othertable.id)
(14 rows)

As you can see, the planner decided to use the indexes on parent and
children here too, retrieved and then collated the resulting rows first
and only then performed the join against othertable.

In other words, it is not peforming 5 separate selects with their
respective joins; it collects all qualifying rows first from the
inheritance hierarchy, and only then performs the join; so the extra
cost compared to the non-inheriting case is pretty much only the added
cost of consulting five indexes instead of just one - unless you have
inheritance hierarchies consisting of several dozen tables or more (and
even then) I don't think this added cost would be significant.

> This is entirely reasonable and efficient compared to the current model
> where a select on a parent table requires the same select to be executed
> on EVERY child table. If it's a large expensive JOIN of some kind then
> this is verging on un-workable.

Please show us a join that you would like to use and let us see how well
the planner handles it.

Regards,
Andras

PS (John, don't look here :) ): I have found some queries with plans
that are less efficient than I think they could be.

Changing the where clause in the above query to refer to othertable
gives:

johnlange=# explain select * from parent natural join othertable where
othertable.id=13;
                                          QUERY
PLAN
-----------------------------------------------------------------------------------------------
 Hash Join  (cost=3.02..978.08 rows=5 width=72)
   Hash Cond: ("outer".id = "inner".id)
   ->  Append  (cost=0.00..725.00 rows=50000 width=36)
         ->  Seq Scan on parent  (cost=0.00..145.00 rows=10000 width=36)
         ->  Seq Scan on child1 parent  (cost=0.00..145.00 rows=10000
width=36)
         ->  Seq Scan on child2 parent  (cost=0.00..145.00 rows=10000
width=36)
         ->  Seq Scan on child3 parent  (cost=0.00..145.00 rows=10000
width=36)
         ->  Seq Scan on child4 parent  (cost=0.00..145.00 rows=10000
width=36)
   ->  Hash  (cost=3.01..3.01 rows=1 width=36)
         ->  Index Scan using othertable_pkey on othertable
(cost=0.00..3.01 rows=1 width=36)
               Index Cond: (id = 13)
(11 rows)

While:

johnlange=# explain select * from ONLY parent natural join othertable
where othertable.id=13;
                                       QUERY
PLAN
-----------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..6.04 rows=1 width=72)
   ->  Index Scan using othertable_pkey on othertable  (cost=0.00..3.01
rows=1 width=36)
         Index Cond: (id = 13)
   ->  Index Scan using parent_pkey on parent  (cost=0.00..3.01 rows=1
width=36)
         Index Cond: (parent.id = "outer".id)
(5 rows)

Similarly, as a somewhat more real-life example:

johnlange=# explain select * from parent natural join othertable where
othertable.othertext='apple';
                                            QUERY
PLAN
--------------------------------------------------------------------------------------------------
 Hash Join  (cost=131.37..1234.50 rows=250 width=72)
   Hash Cond: ("outer".id = "inner".id)
   ->  Append  (cost=0.00..725.00 rows=50000 width=36)
         ->  Seq Scan on parent  (cost=0.00..145.00 rows=10000 width=36)
         ->  Seq Scan on child1 parent  (cost=0.00..145.00 rows=10000
width=36)
         ->  Seq Scan on child2 parent  (cost=0.00..145.00 rows=10000
width=36)
         ->  Seq Scan on child3 parent  (cost=0.00..145.00 rows=10000
width=36)
         ->  Seq Scan on child4 parent  (cost=0.00..145.00 rows=10000
width=36)
   ->  Hash  (cost=131.25..131.25 rows=50 width=36)
         ->  Index Scan using othertable_text on othertable
(cost=0.00..131.25 rows=50 width=36)
               Index Cond: (othertext = 'alma'::text)
(11 rows)

What's more strange, that it still does it with enable_seqscan set to
off:

johnlange=# explain select * from parent natural join othertable where
othertable.othertext='apple';
                                            QUERY
PLAN
--------------------------------------------------------------------------------------------------
 Hash Join  (cost=100000131.37..500001234.50 rows=250 width=72)
   Hash Cond: ("outer".id = "inner".id)
   ->  Append  (cost=100000000.00..500000725.00 rows=50000 width=36)
         ->  Seq Scan on parent  (cost=100000000.00..100000145.00
rows=10000 width=36)
         ->  Seq Scan on child1 parent  (cost=100000000.00..100000145.00
rows=10000 width=36)
         ->  Seq Scan on child2 parent  (cost=100000000.00..100000145.00
rows=10000 width=36)
         ->  Seq Scan on child3 parent  (cost=100000000.00..100000145.00
rows=10000 width=36)
         ->  Seq Scan on child4 parent  (cost=100000000.00..100000145.00
rows=10000 width=36)
   ->  Hash  (cost=131.25..131.25 rows=50 width=36)
         ->  Index Scan using othertable_text on othertable
(cost=0.00..131.25 rows=50 width=36)
               Index Cond: (othertext = 'apple'::text)
(11 rows)

While:

johnlange=# explain select * from ONLY parent natural join othertable
where othertable.othertext='apple';
                                         QUERY
PLAN
--------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..282.55 rows=50 width=72)
   ->  Index Scan using othertable_text on othertable
(cost=0.00..131.25 rows=50 width=36)
         Index Cond: (othertext = 'apple'::text)
   ->  Index Scan using parent_pkey on parent  (cost=0.00..3.01 rows=1
width=36)
         Index Cond: (parent.id = "outer".id)
(5 rows)

If I try to make it more efficient and get rid of the seq scans by
pushing the condition into a subselect, I get an even more interesting
plan:

johnlange=# explain select * from parent where id in (select id from
othertable where othertext='alma');
                                                  QUERY
PLAN
---------------------------------------------------------------------------------------------------------------
 Result  (cost=0.00..6563171.43 rows=25000 width=36)
   ->  Append  (cost=0.00..6563171.43 rows=25000 width=36)
         ->  Seq Scan on parent  (cost=0.00..1312634.29 rows=5000
width=36)
               Filter: (subplan)
               SubPlan
                 ->  Materialize  (cost=131.25..131.25 rows=50 width=4)
                       ->  Index Scan using othertable_text on
othertable  (cost=0.00..131.25 rows=50 width=4)
                             Index Cond: (othertext = 'alma'::text)
         ->  Seq Scan on child1 parent  (cost=0.00..1312634.29 rows=5000
width=36)
               Filter: (subplan)
               SubPlan
                 ->  Materialize  (cost=131.25..131.25 rows=50 width=4)
                       ->  Index Scan using othertable_text on
othertable  (cost=0.00..131.25 rows=50 width=4)
                             Index Cond: (othertext = 'alma'::text)
         ->  Seq Scan on child2 parent  (cost=0.00..1312634.29 rows=5000
width=36)
               Filter: (subplan)
               SubPlan
                 ->  Materialize  (cost=131.25..131.25 rows=50 width=4)
                       ->  Index Scan using othertable_text on
othertable  (cost=0.00..131.25 rows=50 width=4)
                             Index Cond: (othertext = 'alma'::text)
         ->  Seq Scan on child3 parent  (cost=0.00..1312634.29 rows=5000
width=36)
               Filter: (subplan)
               SubPlan
                 ->  Materialize  (cost=131.25..131.25 rows=50 width=4)
                       ->  Index Scan using othertable_text on
othertable  (cost=0.00..131.25 rows=50 width=4)
                             Index Cond: (othertext = 'alma'::text)
         ->  Seq Scan on child4 parent  (cost=0.00..1312634.29 rows=5000
width=36)
               Filter: (subplan)
               SubPlan
                 ->  Materialize  (cost=131.25..131.25 rows=50 width=4)
                       ->  Index Scan using othertable_text on
othertable  (cost=0.00..131.25 rows=50 width=4)
                             Index Cond: (othertext = 'alma'::text)
(32 rows)

johnlange=# select version();

version
--------------------------------------------------------------------------------------------------------
 PostgreSQL 7.3.1 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2.1
(Mandrake Linux 9.1 3.2.1-2mdk)
(1 row)

Re: Query plan and Inheritance. Weird behavior

От
John Lange
Дата:
On Wed, 2003-01-29 at 01:52, Andras Kadinger wrote:
> John Lange wrote:
> >
> > > I don't see how performance would be significantly better if you stored
> > > the common columns of all rows (parent and children) in the parent
> > > table, in contrast with how it is done now, storing entire rows of child
> > > tables in the child table and omitting them from the parent table.
> >
> > Well there are a couple of points.
> >
> > Firstly, from the simple standpoint of database normalization you
> > shouldn't have tables that have the same columns. The way it is
> > implemented, child tables are copies of parent tables.
>
> As Tom pointed out, only the schema is copied, but not the data.

I guess you are right, strictly speaking this isn't a violation of
normalization since no data is duplicated.

> This has the following advantages:
> - if you select from child tables, PostgreSQL will only have to scan
> rows that belong to that child (and further down), and not all rows in
> all tables of the inheritance hierarchy; so if you have 100 million rows
> in the whole hierarchy, but only have say 1 million in the child you are
> currently interested in, you only have to scan those 1 million rows, and
> not the whole 100 million.
> - all columns of rows are stored together, so to read a row only one
> disk access is needed (your way it would probably need roughly one
> random disk access per each inheritance level upwards, both for
> reads/selects and writes/inserts/updates; with a sizable inheritance
> hierarchy this would be a considerable performance hit)
> - it doesn't really cost much in terms of disk space, only some
> bookkeeping information is needed
>
> I don't think inheritance really fits into 'database normalization'
> itself, but still there are cases where it is more convenient/efficient
> than with traditional database normalization, where you would have to
> either go create completely separate tables for each type (and do UNIONs
> of SELECTs if you are interested in more than one child only), or what's
> even more cumbersome, create a table with common columns ('parent' here)
> and then go create children and children of children that each link
> upwards to their respective parents through some kind of key: in a
> select, you would have to explicitly specify all tables upwards the
> inheritance hierarchy, and specify the respective joins for them.
>
> So I think whether you should choose more traditional database
> normalization or use inheritance depends on what you want to do.
>
> > But more importantly it is bad for performance because selecting from a
> > parent table causes the same select to be done on all the child tables.
> > In my case selecting from the parent causes six selects to be done (one
> > for every child table).
>
> 'causes the same select to be done on all the child tables' - I don't
> agree with that, and I hope this is where the misunderstanding lies.
>
> Consider this:
>
> CREATE TABLE parent ( id integer NOT NULL, text text);
> CREATE TABLE child1 ( child1field text) INHERITS (parent);
> CREATE TABLE child2 ( child2field text) INHERITS (parent);
> CREATE TABLE child3 ( child3field text) INHERITS (parent);
> CREATE TABLE child4 ( child4field text) INHERITS (parent);
>
> CREATE TABLE othertable ( id integer NOT NULL, othertext text);
>
> ALTER TABLE ONLY parent ADD CONSTRAINT parent_pkey PRIMARY KEY (id);
> ALTER TABLE ONLY child1 ADD CONSTRAINT child1_pkey PRIMARY KEY (id);
> ALTER TABLE ONLY child2 ADD CONSTRAINT child2_pkey PRIMARY KEY (id);
> ALTER TABLE ONLY child3 ADD CONSTRAINT child3_pkey PRIMARY KEY (id);
> ALTER TABLE ONLY child4 ADD CONSTRAINT child4_pkey PRIMARY KEY (id);
>
> ALTER TABLE ONLY othertable ADD CONSTRAINT othertable_pkey PRIMARY KEY
> (id);
>
> Then I filled all tables with 10000 rows of synthetic data and ANALYZEd
> just to make sure the optimizer considers indexes to be valuable.
>
> First I tried this:
>
> johnlange=# explain select * from parent where id=13;
>                                           QUERY
> PLAN
> ----------------------------------------------------------------------------------------------
>  Result  (cost=0.00..15.07 rows=5 width=36)
>    ->  Append  (cost=0.00..15.07 rows=5 width=36)
>          ->  Index Scan using parent_pkey on parent  (cost=0.00..3.01
> rows=1 width=36)
>                Index Cond: (id = 13)
>          ->  Index Scan using child1_pkey on child1 parent
> (cost=0.00..3.01 rows=1 width=36)
>                Index Cond: (id = 13)
>          ->  Index Scan using child2_pkey on child2 parent
> (cost=0.00..3.01 rows=1 width=36)
>                Index Cond: (id = 13)
>          ->  Index Scan using child3_pkey on child3 parent
> (cost=0.00..3.01 rows=1 width=36)
>                Index Cond: (id = 13)
>          ->  Index Scan using child4_pkey on child4 parent
> (cost=0.00..3.01 rows=1 width=36)
>                Index Cond: (id = 13)
> (12 rows)
>
> The planner has rightly chosen to use indexes, and as a result the query
> will be pretty fast.
>
> Also, at first sight this might look like the multiple selects you
> mention, but actually it isn't; here's another example to show that:
>
> inh=# explain select * from parent natural join othertable where
> parent.id=13;
>                                           QUERY
> PLAN
> ----------------------------------------------------------------------------------------------
>  Nested Loop  (cost=0.00..30.20 rows=5 width=72)
>    ->  Append  (cost=0.00..15.07 rows=5 width=36)
>          ->  Index Scan using parent_pkey on parent  (cost=0.00..3.01
> rows=1 width=36)
>                Index Cond: (id = 13)
>          ->  Index Scan using child1_pkey on child1 parent
> (cost=0.00..3.01 rows=1 width=36)
>                Index Cond: (id = 13)
>          ->  Index Scan using child2_pkey on child2 parent
> (cost=0.00..3.01 rows=1 width=36)
>                Index Cond: (id = 13)
>          ->  Index Scan using child3_pkey on child3 parent
> (cost=0.00..3.01 rows=1 width=36)
>                Index Cond: (id = 13)
>          ->  Index Scan using child4_pkey on child4 parent
> (cost=0.00..3.01 rows=1 width=36)
>                Index Cond: (id = 13)
>    ->  Index Scan using othertable_pkey on othertable  (cost=0.00..3.01
> rows=1 width=36)
>          Index Cond: ("outer".id = othertable.id)
> (14 rows)
>
> As you can see, the planner decided to use the indexes on parent and
> children here too, retrieved and then collated the resulting rows first
> and only then performed the join against othertable.
>
> In other words, it is not peforming 5 separate selects with their
> respective joins; it collects all qualifying rows first from the
> inheritance hierarchy, and only then performs the join; so the extra
> cost compared to the non-inheriting case is pretty much only the added
> cost of consulting five indexes instead of just one - unless you have
> inheritance hierarchies consisting of several dozen tables or more (and
> even then) I don't think this added cost would be significant.
>
> > This is entirely reasonable and efficient compared to the current model
> > where a select on a parent table requires the same select to be executed
> > on EVERY child table. If it's a large expensive JOIN of some kind then
> > this is verging on un-workable.
>
> Please show us a join that you would like to use and let us see how well
> the planner handles it.

Ok, your reply here is very informative. Firstly, I can see from your
example that I likely don't have my keys and constraints implemented
properly.

However, the issue of indexes is not necessarily that relevant since you
may not be selecting rows based on columns that have indexes.

So the issue of indexes aside, I think some of the misunderstanding is
related to my assumption that the appended operations are relatively
more expensive than scanning the same number of rows in a single select.

Here is the way it looks on my system when I select a single object.

db_drs0001=> explain select * from tbl_objects where id = 1;
NOTICE:  QUERY PLAN:

Result  (cost=0.00..153.70 rows=6 width=111)
 ->  Append  (cost=0.00..153.70 rows=6 width=111)
  ->  Seq Scan on tbl_objects  (cost=0.00..144.35 rows=1 width=107)
  ->  Seq Scan on tbl_viewers tbl_objects  (cost=0.00..1.06 rows=1
width=97)
  ->  Seq Scan on tbl_documents tbl_objects  (cost=0.00..4.91 rows=1
width=111)
  ->  Seq Scan on tbl_formats tbl_objects  (cost=0.00..1.11 rows=1
width=100)
  ->  Seq Scan on tbl_massemails tbl_objects  (cost=0.00..1.01 rows=1
width=90)
  ->  Seq Scan on tbl_icons tbl_objects  (cost=0.00..1.25 rows=1
width=110)

db_drs0001=> select version();
                            version
---------------------------------------------------------------
 PostgreSQL 7.2.1 on i686-pc-linux-gnu, compiled by GCC 2.95.3
(1 row)

The only question here is, if a select requires the scanning of all rows
to return a result set, is it dramatically less efficient to have it
scanning 100,000 rows spread over 6 tables or to scan 100,000 rows in a
single table?

(At the moment I have no where near that amount of data. Side question,
what technique do you use to generate data to fill your tables for
testing?)

I'm now starting to see that it likely isn't that much different either
way so the benefits of the way it's implemented probably out weigh the
negatives. Your end up scanning the same number of rows either way.

On the topic of proper indexes, if you would indulge me, can you show me
where I have gone wrong in that regard? My biggest point of confusion
here is with regards to the sequences that are used in the parent table.

Here is the schema as produced by pg_dump. The original create used the
keyword "serial" or "bigserial" as the case may be. I've edited some of
the columns out just to keep the example shorter:

CREATE SEQUENCE "tbl_objects_id_seq" start 1 increment 1 maxvalue
9223372036854775807 minvalue 1 cache 1;

CREATE TABLE "tbl_objects" (
   "id" bigint DEFAULT nextval('"tbl_objects_id_seq"'::text) NOT NULL,
   "name" text DEFAULT '' NOT NULL,
   "description" text DEFAULT '' NOT NULL,
   "status" smallint DEFAULT '1' NOT NULL,
   "class" text
);

CREATE TABLE "tbl_viewers" (
        "exec" text DEFAULT '' NOT NULL )
INHERITS ("tbl_objects");

CREATE TABLE "tbl_documents" (
        "filename" text DEFAULT '' NOT NULL )
INHERITS ("tbl_objects");

CREATE TABLE "tbl_massemails" (
        "from" text DEFAULT '' NOT NULL,
        "subject" text DEFAULT '' NOT NULL,
        "message" text DEFAULT '' NOT NULL )
INHERITS ("tbl_objects");

CREATE TABLE "tbl_icons" (
        "format_id" bigint DEFAULT '0' NOT NULL )
INHERITS ("tbl_documents");

CREATE TABLE "tbl_formats" (
        "viewer_id" bigint DEFAULT '0' NOT NULL,
        "extension" text DEFAULT '' NOT NULL,
        "contenttype" text DEFAULT '' NOT NULL,
        "upload_class" text )
INHERITS ("tbl_objects");

CREATE UNIQUE INDEX tbl_objects_id_key ON tbl_objects USING btree (id);

Thanks very much for taking the time to look into this with me. It has
been most informative.

John Lange

>
> Regards,
> Andras
>
> PS (John, don't look here :) ): I have found some queries with plans
> that are less efficient than I think they could be.
>
> Changing the where clause in the above query to refer to othertable
> gives:
>
> johnlange=# explain select * from parent natural join othertable where
> othertable.id=13;
>                                           QUERY
> PLAN
> -----------------------------------------------------------------------------------------------
>  Hash Join  (cost=3.02..978.08 rows=5 width=72)
>    Hash Cond: ("outer".id = "inner".id)
>    ->  Append  (cost=0.00..725.00 rows=50000 width=36)
>          ->  Seq Scan on parent  (cost=0.00..145.00 rows=10000 width=36)
>          ->  Seq Scan on child1 parent  (cost=0.00..145.00 rows=10000
> width=36)
>          ->  Seq Scan on child2 parent  (cost=0.00..145.00 rows=10000
> width=36)
>          ->  Seq Scan on child3 parent  (cost=0.00..145.00 rows=10000
> width=36)
>          ->  Seq Scan on child4 parent  (cost=0.00..145.00 rows=10000
> width=36)
>    ->  Hash  (cost=3.01..3.01 rows=1 width=36)
>          ->  Index Scan using othertable_pkey on othertable
> (cost=0.00..3.01 rows=1 width=36)
>                Index Cond: (id = 13)
> (11 rows)
>
> While:
>
> johnlange=# explain select * from ONLY parent natural join othertable
> where othertable.id=13;
>                                        QUERY
> PLAN
> -----------------------------------------------------------------------------------------
>  Nested Loop  (cost=0.00..6.04 rows=1 width=72)
>    ->  Index Scan using othertable_pkey on othertable  (cost=0.00..3.01
> rows=1 width=36)
>          Index Cond: (id = 13)
>    ->  Index Scan using parent_pkey on parent  (cost=0.00..3.01 rows=1
> width=36)
>          Index Cond: (parent.id = "outer".id)
> (5 rows)
>
> Similarly, as a somewhat more real-life example:
>
> johnlange=# explain select * from parent natural join othertable where
> othertable.othertext='apple';
>                                             QUERY
> PLAN
> --------------------------------------------------------------------------------------------------
>  Hash Join  (cost=131.37..1234.50 rows=250 width=72)
>    Hash Cond: ("outer".id = "inner".id)
>    ->  Append  (cost=0.00..725.00 rows=50000 width=36)
>          ->  Seq Scan on parent  (cost=0.00..145.00 rows=10000 width=36)
>          ->  Seq Scan on child1 parent  (cost=0.00..145.00 rows=10000
> width=36)
>          ->  Seq Scan on child2 parent  (cost=0.00..145.00 rows=10000
> width=36)
>          ->  Seq Scan on child3 parent  (cost=0.00..145.00 rows=10000
> width=36)
>          ->  Seq Scan on child4 parent  (cost=0.00..145.00 rows=10000
> width=36)
>    ->  Hash  (cost=131.25..131.25 rows=50 width=36)
>          ->  Index Scan using othertable_text on othertable
> (cost=0.00..131.25 rows=50 width=36)
>                Index Cond: (othertext = 'alma'::text)
> (11 rows)
>
> What's more strange, that it still does it with enable_seqscan set to
> off:
>
> johnlange=# explain select * from parent natural join othertable where
> othertable.othertext='apple';
>                                             QUERY
> PLAN
> --------------------------------------------------------------------------------------------------
>  Hash Join  (cost=100000131.37..500001234.50 rows=250 width=72)
>    Hash Cond: ("outer".id = "inner".id)
>    ->  Append  (cost=100000000.00..500000725.00 rows=50000 width=36)
>          ->  Seq Scan on parent  (cost=100000000.00..100000145.00
> rows=10000 width=36)
>          ->  Seq Scan on child1 parent  (cost=100000000.00..100000145.00
> rows=10000 width=36)
>          ->  Seq Scan on child2 parent  (cost=100000000.00..100000145.00
> rows=10000 width=36)
>          ->  Seq Scan on child3 parent  (cost=100000000.00..100000145.00
> rows=10000 width=36)
>          ->  Seq Scan on child4 parent  (cost=100000000.00..100000145.00
> rows=10000 width=36)
>    ->  Hash  (cost=131.25..131.25 rows=50 width=36)
>          ->  Index Scan using othertable_text on othertable
> (cost=0.00..131.25 rows=50 width=36)
>                Index Cond: (othertext = 'apple'::text)
> (11 rows)
>
> While:
>
> johnlange=# explain select * from ONLY parent natural join othertable
> where othertable.othertext='apple';
>                                          QUERY
> PLAN
> --------------------------------------------------------------------------------------------
>  Nested Loop  (cost=0.00..282.55 rows=50 width=72)
>    ->  Index Scan using othertable_text on othertable
> (cost=0.00..131.25 rows=50 width=36)
>          Index Cond: (othertext = 'apple'::text)
>    ->  Index Scan using parent_pkey on parent  (cost=0.00..3.01 rows=1
> width=36)
>          Index Cond: (parent.id = "outer".id)
> (5 rows)
>
> If I try to make it more efficient and get rid of the seq scans by
> pushing the condition into a subselect, I get an even more interesting
> plan:
>
> johnlange=# explain select * from parent where id in (select id from
> othertable where othertext='alma');
>                                                   QUERY
> PLAN
> ---------------------------------------------------------------------------------------------------------------
>  Result  (cost=0.00..6563171.43 rows=25000 width=36)
>    ->  Append  (cost=0.00..6563171.43 rows=25000 width=36)
>          ->  Seq Scan on parent  (cost=0.00..1312634.29 rows=5000
> width=36)
>                Filter: (subplan)
>                SubPlan
>                  ->  Materialize  (cost=131.25..131.25 rows=50 width=4)
>                        ->  Index Scan using othertable_text on
> othertable  (cost=0.00..131.25 rows=50 width=4)
>                              Index Cond: (othertext = 'alma'::text)
>          ->  Seq Scan on child1 parent  (cost=0.00..1312634.29 rows=5000
> width=36)
>                Filter: (subplan)
>                SubPlan
>                  ->  Materialize  (cost=131.25..131.25 rows=50 width=4)
>                        ->  Index Scan using othertable_text on
> othertable  (cost=0.00..131.25 rows=50 width=4)
>                              Index Cond: (othertext = 'alma'::text)
>          ->  Seq Scan on child2 parent  (cost=0.00..1312634.29 rows=5000
> width=36)
>                Filter: (subplan)
>                SubPlan
>                  ->  Materialize  (cost=131.25..131.25 rows=50 width=4)
>                        ->  Index Scan using othertable_text on
> othertable  (cost=0.00..131.25 rows=50 width=4)
>                              Index Cond: (othertext = 'alma'::text)
>          ->  Seq Scan on child3 parent  (cost=0.00..1312634.29 rows=5000
> width=36)
>                Filter: (subplan)
>                SubPlan
>                  ->  Materialize  (cost=131.25..131.25 rows=50 width=4)
>                        ->  Index Scan using othertable_text on
> othertable  (cost=0.00..131.25 rows=50 width=4)
>                              Index Cond: (othertext = 'alma'::text)
>          ->  Seq Scan on child4 parent  (cost=0.00..1312634.29 rows=5000
> width=36)
>                Filter: (subplan)
>                SubPlan
>                  ->  Materialize  (cost=131.25..131.25 rows=50 width=4)
>                        ->  Index Scan using othertable_text on
> othertable  (cost=0.00..131.25 rows=50 width=4)
>                              Index Cond: (othertext = 'alma'::text)
> (32 rows)
>
> johnlange=# select version();
>
> version
> --------------------------------------------------------------------------------------------------------
>  PostgreSQL 7.3.1 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2.1
> (Mandrake Linux 9.1 3.2.1-2mdk)
> (1 row)


Re: Query plan and Inheritance. Weird behavior

От
Andras Kadinger
Дата:
John Lange wrote:
>
> On Wed, 2003-01-29 at 01:52, Andras Kadinger wrote:
> > As Tom pointed out, only the schema is copied, but not the data.
>
> I guess you are right, strictly speaking this isn't a violation of
> normalization since no data is duplicated.

[...]

> Ok, your reply here is very informative. Firstly, I can see from your
> example that I likely don't have my keys and constraints implemented
> properly.
>
> However, the issue of indexes is not necessarily that relevant since you
> may not be selecting rows based on columns that have indexes.

Granted, now I see that was not strictly related to your point, I just
wanted to avoid most avoidable objections against performance of
inheritance, and I wasn't 100% sure you seeing seq scans was not part of
you thinking performance of this method would be suboptimal so just to
be sure, I explicitly went for an example with indexes.

> So the issue of indexes aside, I think some of the misunderstanding is
> related to my assumption that the appended operations are relatively
> more expensive than scanning the same number of rows in a single select.

I see. Well, the Append step itself I suppose is not doing much else
than iterates over its subnodes and asks each of them to return their
rows, and forwards the rows upwards to the rest of the plan as it
receives them - it doesn't buffer them, or collect them all before
forwarding them upwards (I think the Materialize step that were to be
seen in an example in my last PS is the one that does that).

So I don't think the Append incurs any significant costs much more than
a few CPU cycles for that iteration and row forwarding (pass of one
pointer to in-memory row I guess) steps - I think these are minuscule
compared to the cost of any disk I/O, and in most non-CPU-bound queries
are hidden by disk throughput/latency anyway.

> Here is the way it looks on my system when I select a single object.
>
> db_drs0001=> explain select * from tbl_objects where id = 1;
> NOTICE:  QUERY PLAN:
>
> Result  (cost=0.00..153.70 rows=6 width=111)
>  ->  Append  (cost=0.00..153.70 rows=6 width=111)
>   ->  Seq Scan on tbl_objects  (cost=0.00..144.35 rows=1 width=107)
>   ->  Seq Scan on tbl_viewers tbl_objects  (cost=0.00..1.06 rows=1
> width=97)
>   ->  Seq Scan on tbl_documents tbl_objects  (cost=0.00..4.91 rows=1
> width=111)
>   ->  Seq Scan on tbl_formats tbl_objects  (cost=0.00..1.11 rows=1
> width=100)
>   ->  Seq Scan on tbl_massemails tbl_objects  (cost=0.00..1.01 rows=1
> width=90)
>   ->  Seq Scan on tbl_icons tbl_objects  (cost=0.00..1.25 rows=1
> width=110)
>
> db_drs0001=> select version();
>                             version
> ---------------------------------------------------------------
>  PostgreSQL 7.2.1 on i686-pc-linux-gnu, compiled by GCC 2.95.3
> (1 row)
>
> The only question here is, if a select requires the scanning of all rows
> to return a result set, is it dramatically less efficient to have it
> scanning 100,000 rows spread over 6 tables or to scan 100,000 rows in a
> single table?

I see. Well, the Append step itself I suppose isn't doing much more than
iterates over its subnodes (the seq scans in the case above) and asks
each of them to return rows, and forwards the rows upwards to the rest
of the plan as it receives them - it doesn't buffer them, or collect
them all before forwarding them upwards (I think the Materialize step
that were to be seen in an example in my last PS does that). So aside
for any costs of consulting indexes, I don't think the Append step -
which is the added step when scanning multiple tables versus scanning
one table - incurs any significant costs much more than a few CPU cycles
for those iteration and row forwarding (pass of one pointer to in-memory
row I guess) steps - I think these are minuscule compared to the cost of
any disk I/O, and in most non-CPU-bound queries are hidden by disk
throughput/latency anyway.

> (At the moment I have no where near that amount of data. Side question,
> what technique do you use to generate data to fill your tables for
> testing?)

For this occasion I just went and created a dozen-line PHP script that
simply inserted 10000 rows with consecutive ids into each table.

I suggest you to try to populate your test database with test data on
the order of your expected working data set and use EXPLAIN ANALYZE to
make estimates of expected performance of the database.

> I'm now starting to see that it likely isn't that much different either
> way so the benefits of the way it's implemented probably out weigh the
> negatives. Your end up scanning the same number of rows either way.

Aside from extreme cases where child rows are considerably much more
wider than parent rows and thus result in considerably more data needed
to be read in case of a sequential scan, yes.

> On the topic of proper indexes, if you would indulge me, can you show me
> where I have gone wrong in that regard?

Hmm, I think you should only have gone and created indexes for child
tables by hand, as indexes are not inherited.

Also, don't forget, PostgreSQL has an advanced query cost estimation
subsystem, which decides for or against using an index based on, among
others, statistics collected on distribution of values in the index to
determine its selectivity (so don't forget to ANALYZE/VACUUM ANALYZE
after inserting/changing a lot of rows that significantly change
distribution of values - this includes initial table fillup), and also
it accounts for costs of accessing index pages, so with less than say a
couple of thousand rows or with not very selective indexes it will
(rightly) decide not to use the index but do a seq scan instead -
probably the reason for why you don't see an index scan on tbl_objects
above despite there being an index on the primary key.

> My biggest point of confusion
> here is with regards to the sequences that are used in the parent table.

Child tables inherit the "nextval('...')" default value, so as a result
they will all draw from the same one sequence, which sequence exists
outside of the tables; as a result as long as you use that default
value, it is guaranteed that the column in question will have unique
values among all tables parent and children; they won't be consecutive -
but that's not a drawback of inheritance either, as a sequence is not
guaranteed to provide consecutive numbers with single tables either due
to transaction concurrency (rolled back transactions don't 'put back'
numbers into the sequence, so in the case of rolled back transactions
there will be numbers drawn from the sequence that never actually get
into any table - this is nicely documented with sequences and
transactions).

> Here is the schema as produced by pg_dump. The original create used the
> keyword "serial" or "bigserial" as the case may be. I've edited some of
> the columns out just to keep the example shorter:
>
> CREATE SEQUENCE "tbl_objects_id_seq" start 1 increment 1 maxvalue
> 9223372036854775807 minvalue 1 cache 1;
>
> CREATE TABLE "tbl_objects" (
>    "id" bigint DEFAULT nextval('"tbl_objects_id_seq"'::text) NOT NULL,
>    "name" text DEFAULT '' NOT NULL,
>    "description" text DEFAULT '' NOT NULL,
>    "status" smallint DEFAULT '1' NOT NULL,
>    "class" text
> );
>
> CREATE TABLE "tbl_viewers" (
>         "exec" text DEFAULT '' NOT NULL )
> INHERITS ("tbl_objects");
>
> CREATE TABLE "tbl_documents" (
>         "filename" text DEFAULT '' NOT NULL )
> INHERITS ("tbl_objects");
>
> CREATE TABLE "tbl_massemails" (
>         "from" text DEFAULT '' NOT NULL,
>         "subject" text DEFAULT '' NOT NULL,
>         "message" text DEFAULT '' NOT NULL )
> INHERITS ("tbl_objects");
>
> CREATE TABLE "tbl_icons" (
>         "format_id" bigint DEFAULT '0' NOT NULL )
> INHERITS ("tbl_documents");
>
> CREATE TABLE "tbl_formats" (
>         "viewer_id" bigint DEFAULT '0' NOT NULL,
>         "extension" text DEFAULT '' NOT NULL,
>         "contenttype" text DEFAULT '' NOT NULL,
>         "upload_class" text )
> INHERITS ("tbl_objects");
>
> CREATE UNIQUE INDEX tbl_objects_id_key ON tbl_objects USING btree (id);

Hmm, I wonder whether you have a specific goal with or reason for
explicitly specifying NOT NULL and empty string ('') as default value
for all these text fields? If it's just because your frontend makes it
inconvenient for you to treat a NULL as empty string, you might want to
consider allowing NULLs and using the coalesce() function in your select
- this would incur a few CPU cycles per returned result row, but will
spare you a few bytes in storage - I think 4 or 8 per column - for each
NULL value. Whether this is worth it or not depends on the percentage of
empty/NULL values in your data though.

> Thanks very much for taking the time to look into this with me. It has
> been most informative.

You're welcome!

Regards,
Andras