Обсуждение: Thoughts on inheritance docs

Поиск
Список
Период
Сортировка

Thoughts on inheritance docs

От
Chris Travers
Дата:
A few observations here.  Want to get feedback before considering
proposing a documentation update here.

The first is that the only example given for table inheritance is a
set/subset model, and the example is problematic for the reason that
the caveats section discusses.  Consequently it isn't immediately
apparent how table inheritance (and in particular multiple inheritance
which, unless I have missed something, seems to be unique to
PostgreSQL).

In general, I see table inheritance as useful for a large number of
tasks including:

  * Set/subset modelling (here borrowing techniques from table
partitioning is necessary-- the full set is partitioned according to
interesting subsets, but each subset may extend associated info)

  * Enforcement of consistent semantics of join syntax, including
functions that would allow you to automatically traverse joins.
Consider:  SELECT (bc.inventory_item).sku from barcodes bc where the
inventory_item method is inherited from inventory_item_ref.  Since
inventory_item_ref would add a column called inventory_item_id to the
table, we'd also know join syntax would generally be consistent.

  * Derivative information modelling of sets of columns which are
insufficient for relational modelling by themselves
     (perhaps because they lack necessary join conditions)

To my knowledge, the latter two are PostgreSQL-specific.  They make
very little sense unless multiple inheritance is permitted, and every
other ORDBMS I have looked at (including Informix, DB2, and Oracle)
has only supported single inheritance.  Restricting to single
inheritance makes sense if you are specializing on set/subset
modelling but multiple inheritance is a tremendously useful tool that
may be well beyond cutting edge.  it might be worth showing this off
in the documentation with real, working examples.

So my questions:

1)  Would it be worth coming up with more useful examples for table inheritance?
2)  Would it be worth adding mention of uses of multiple table inheritance?
3)  Would it be worth discussing how to position the limitations of
table inheritance?  I personally recognize the difficulties in
documenting these but I can't help thinking that a simple chart of
"this is/is not inherited" would be very useful as well.

What do people think?

Best Wishes,
Chris Travers


Re: Thoughts on inheritance docs

От
Tom Lane
Дата:
Chris Travers <chris.travers@gmail.com> writes:
> 1)  Would it be worth coming up with more useful examples for table inheritance?
> 2)  Would it be worth adding mention of uses of multiple table inheritance?
> 3)  Would it be worth discussing how to position the limitations of
> table inheritance?  I personally recognize the difficulties in
> documenting these but I can't help thinking that a simple chart of
> "this is/is not inherited" would be very useful as well.

All of those sound sane to me at this level of detail, and in particular
I agree that the docs are short on examples of multiple inheritance.

The arguments will probably start once you get more concrete ...

            regards, tom lane


Re: Thoughts on inheritance docs

От
Chris Travers
Дата:
On Thu, Aug 23, 2012 at 7:09 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Chris Travers <chris.travers@gmail.com> writes:
>> 1)  Would it be worth coming up with more useful examples for table inheritance?
>> 2)  Would it be worth adding mention of uses of multiple table inheritance?
>> 3)  Would it be worth discussing how to position the limitations of
>> table inheritance?  I personally recognize the difficulties in
>> documenting these but I can't help thinking that a simple chart of
>> "this is/is not inherited" would be very useful as well.
>
> All of those sound sane to me at this level of detail, and in particular
> I agree that the docs are short on examples of multiple inheritance.
>
> The arguments will probably start once you get more concrete ...

I am sure.

In the early prototyping stage, then, maybe it is worth sending out a
brief example, asking for technical feedback before figuring out how
best to work it in.

This example demonstrates re-usable interfaces, the need to declare
foreign keys on child tables, and the ability to add functions to a
sets of tables which use these components.  Multiple inheritance then
gives us an ability to create re-usable interfaces for our data which
can be re-used throughout our database without creating interfaces
which particularly problematic for the use of most relational tools.

If the example looks technically solid then I will look at writing it
up in greater detail.

Suppose we have a country table:

CREATE TABLE country (
    id serial not null unique,
    name text primary key,
    short_name varchar(2) not null unique
);

We may want to create a standardized join interface for this table:

CREATE TABLE country_ref (
    country_id int
);

This table is relatively uninteresting itself and is in fact we will
never directly query it.  However it provides an interface for
defining foreign keys in other tables, and as we will show we can make
this simple interface very semantically rich if we want.

Suppose we also have a table for defining an interface for additional
notes fields on tables:

CREATE TABLE note_fields (
    note_subject text,
    note_content text
);

Typically we may want these to be nullable if we don't know how these
will be used down the inheritance tree.  If we want to we can create
inherited subtypes which enforce not null constraints on these fields.
 Descendant tables however cannot remove an inherited NOT NULL
constraint, and so constraints should be added only carefully.  In the
above two cases, other tables might want to allow null foreign keys to
fields, or may not want to require that notes be filled out.

Now we may wish to create a table which stores landmark information:

CREATE TABLE landmark (
    id serial not null unique,
    name text primary key,
    nearest_city text not null,
    foreign key (country_id) references country(id),
    CHECK (country_id IS NOT NULL),
    CHECK (note_content IS NOT NULL)
) INHERITS (note_fields, country_ref);

The inheritance adds the following fields to the table:  note_content,
note_subject, and country_id.  These can now be queried as you would
normally, but the table can be cast to note_fields or country_ref for
processing in functions.  This gives you some extremely powerful
features you can add.  For example we can:

CREATE FUNCTION note_tsvector(note_fields)
RETURNS tsvector IMMUTABLE
LANGUAGE SQL AS $BODY$

SELECT to_tsvector('english', coalesce($1.note_subject, '') || ' ' ||
coalesce($1.note_content, ''));

$BODY$;

This gives us a value which can be calculated on the fly which is an
english language tsvector for the notes.  This can be invoked either
using class.method syntax or function(input).  Note in the version
below we could use note_tsvector(l) instead.  Moreover this interface
will apply to every table in the database which inherits note_fields.
Moreover we can index this value like we would elsewhere (though
CREATE INDEX appears to require the note_tsvector(landmark) syntax.
See [section on functional indexes] for more information.  Keep in
mind that indexes are not inherited so each individual table must be
indexed separately.


or_examples=# select name, note_content from landmark l where
plainto_tsquery('english', 'bridge') @@ l.note_tsvector;
        name        |            note_content
--------------------+------------------------------------
 Eiffel Tower       | Designed by a great bridge builder
 Golden Gate Bridge | Iconic suspension bridge
(2 rows)

Similarly we can add a method to "dereference" the country_id to every
table which inherits country_ref:

CREATE FUNCTION country(country_ref) RETURNS country
LANGUAGE SQL STABLE AS
$BODY$ SELECT * FROM country WHERE id = $1.country_id $BODY$;

If we want to know the names of the landmarks in the database and what
countries they are in we can then:

SELECT name, (l.country).name as country_name FROM landmark l;

or_examples=# SELECT name, (l.country).name as country_name FROM landmark l;
        name        | country_name
--------------------+---------------
 Eiffel Tower       | France
 Borobudur          | Indonesia
 CN Tower           | Canada
 Golden Gate Bridge | United States
(4 rows)

Again if l.country bothers you, you can substitute country(l).name for
(l.country).name.

Note that this will be slow on large result sets because it will
individually pull a country record for each landmark.  It is worth
keeping this in mind when deciding to dereference keys in this way.