Обсуждение: Inheritance

Поиск
Список
Период
Сортировка

Inheritance

От
Jan Johansson
Дата:
Hi,

I've been reading some threads about inheritance, and how complicated it seems to let child (children) to inherit constraints and indexes (behaviors). In fact this seems to have been the issue for years, with no resolution.

However, inheritance is a very good feature, and it would be great to have it feature complete.

To try to unlock the feature, to be more complete, how about introducing restriction to inheritance like:

 - Allow single (behavior) inheritance (model here is quite a few modern languages, such as C#, D, ...)
 - Allow multiple declarative inheritance (interface like, the inheritance almost works like this today though)

If, with these restrictions (or maybe only the first), do you think that it will simplify implementation and make it more feature complete?

Kind regards,
Jan Johansson

Re: Inheritance

От
Jim Nasby
Дата:
On 5/22/16 1:37 AM, Jan Johansson wrote:
>  - Allow single (behavior) inheritance (model here is quite a few modern
> languages, such as C#, D, ...)
>  - Allow multiple declarative inheritance (interface like, the
> inheritance almost works like this today though)
>
> If, with these restrictions (or maybe only the first), do you think that
> it will simplify implementation and make it more feature complete?

I think you'll need to be a bit more specific to elicit a response. What 
exactly are you proposing to change?
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461



Re: Inheritance

От
Merlin Moncure
Дата:
On Mon, May 23, 2016 at 10:21 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 5/22/16 1:37 AM, Jan Johansson wrote:
>>
>>  - Allow single (behavior) inheritance (model here is quite a few modern
>> languages, such as C#, D, ...)
>>  - Allow multiple declarative inheritance (interface like, the
>> inheritance almost works like this today though)
>>
>> If, with these restrictions (or maybe only the first), do you think that
>> it will simplify implementation and make it more feature complete?
>
> I think you'll need to be a bit more specific to elicit a response. What
> exactly are you proposing to change?

I would guess OP is complaining about what everyone complains about.
What people want is for tables to have a base set of shared columns by
a varying set of derived type dependent columns.  Constraints on the
shared columns will be enforced on all the derived columns.

Postgres doesn't work that way, and the documentation disclaims this:
"Note: Although inheritance is frequently useful, it has not been
integrated with unique constraints or foreign keys, which limits its
usefulness. See Section 5.8 for more detail."

Personally, I don't think this will ever be fixed.  The reason why it
doesn't work is due to some foundational implementation decisions that
would have to be revisited.

This feature was very much a product of the time, at the height of the
"Object Relational" fad.  The trend for postgres has been in the exact
opposite direction, towards the SQL standard.  Further complicating
matters, inheritance has been repurposed to be the foundation for
table partitioning, making heavy changes problematic.

The classic SQL approach to the problem, establishing a base table
plus a type and derived tables with a pkey:pkey link isn't a very bad
one from a data modelling perspective and serialization to the
application is increasingly going to be handled by json going forward
as opposed to hacking the postgres type system.  This really reduces
the value proposition of heavy changes to the inheritance features.
If there was consensus on that point, I suppose the way forward is
some documentation restructuring, starting with removing the
increasingly baroque tutorial (trivia: the tutorial was the old manual
at one point).

merlin



Re: Inheritance

От
Jim Nasby
Дата:
On 5/23/16 11:05 AM, Merlin Moncure wrote:
> Postgres doesn't work that way, and the documentation disclaims this:
> "Note: Although inheritance is frequently useful, it has not been
> integrated with unique constraints or foreign keys, which limits its
> usefulness. See Section 5.8 for more detail."
>
> Personally, I don't think this will ever be fixed.  The reason why it
> doesn't work is due to some foundational implementation decisions that
> would have to be revisited.

If the complaint is really about FKs/UNIQUE (and really AFAIK it's only 
UNIQUE that's the problem), then I agree: it should be addressed. It's a 
major impediment to partitioning (and generic inheritance).

> This feature was very much a product of the time, at the height of the
> "Object Relational" fad.  The trend for postgres has been in the exact
> opposite direction, towards the SQL standard.  Further complicating
> matters, inheritance has been repurposed to be the foundation for
> table partitioning, making heavy changes problematic.

I don't see why partitioning complicates fixing these issues. ISTM it's 
the exact same complaint for both inheritance and partitioning.

I also disagree about PK:PK FK's between a bunch of completely 
independent tables being a good way to model this stuff. It doubles the 
complexity of every query against a child table and doesn't perform 
nearly as well, because your data locality goes down the tubes.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461



Re: Inheritance

От
Tom Lane
Дата:
Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
> On 5/23/16 11:05 AM, Merlin Moncure wrote:
>> This feature was very much a product of the time, at the height of the
>> "Object Relational" fad.  The trend for postgres has been in the exact
>> opposite direction, towards the SQL standard.  Further complicating
>> matters, inheritance has been repurposed to be the foundation for
>> table partitioning, making heavy changes problematic.

> I don't see why partitioning complicates fixing these issues. ISTM it's 
> the exact same complaint for both inheritance and partitioning.

My feeling about it is that we need to provide a partitioning feature
that doesn't rely on the current notion of inheritance at all.  We've
heard from multiple users who want to use large numbers of partitions,
enough that simply having a separate relcache entry for each partition
would be a performance problem, never mind the current approach to
planning queries over inheritance trees.  So the partitions need to be
objects much simpler than full-fledged tables.

If we had that, and encouraged people to migrate simple partitioning
use-cases to it, that might take off enough pressure that we could
afford to consider more-complicated inheritance schemes rather than
treating inheritance as an unfortunate legacy design.  But we're
some years away from being able to do that.
        regards, tom lane



Re: Inheritance

От
Joe Conway
Дата:
On 05/23/2016 03:05 PM, Tom Lane wrote:
> Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
>> I don't see why partitioning complicates fixing these issues. ISTM it's
>> the exact same complaint for both inheritance and partitioning.
>
> My feeling about it is that we need to provide a partitioning feature
> that doesn't rely on the current notion of inheritance at all.  We've
> heard from multiple users who want to use large numbers of partitions,
> enough that simply having a separate relcache entry for each partition
> would be a performance problem, never mind the current approach to
> planning queries over inheritance trees.  So the partitions need to be
> objects much simpler than full-fledged tables.

I wonder if it wouldn't make sense to define a partition as a list of
segments within a single table that represent the partition?

But then again, maybe we need to start with a clear notion of what
problems people are trying to solve when they use partitions. At least
some of the historic reasons are no longer valid.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


Re: Inheritance

От
Tom Lane
Дата:
Joe Conway <mail@joeconway.com> writes:
> But then again, maybe we need to start with a clear notion of what
> problems people are trying to solve when they use partitions. At least
> some of the historic reasons are no longer valid.

That's true.  Just because people want to have a gazillion partitions
doesn't necessarily mean it's a good design that we need to support well.
Some investigation would be a smart use of time.
        regards, tom lane



Re: Inheritance

От
Alvaro Herrera
Дата:
Tom Lane wrote:

> My feeling about it is that we need to provide a partitioning feature
> that doesn't rely on the current notion of inheritance at all.  We've
> heard from multiple users who want to use large numbers of partitions,
> enough that simply having a separate relcache entry for each partition
> would be a performance problem, never mind the current approach to
> planning queries over inheritance trees.  So the partitions need to be
> objects much simpler than full-fledged tables.

Sorry to hijack the thread, but I agree on this, and I'm worried that
the patch being floated for partitioning may paint us on a corner from
which it may be difficult to get out.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Inheritance

От
Craig Ringer
Дата:
On 24 May 2016 at 00:05, Merlin Moncure <mmoncure@gmail.com> wrote:
 

This feature was very much a product of the time, at the height of the
"Object Relational" fad.  The trend for postgres has been in the exact
opposite direction, towards the SQL standard.  Further complicating
matters, inheritance has been repurposed to be the foundation for
table partitioning, making heavy changes problematic.

Indeed. 

I find it notable that no popular ORM has bothered adopting PostgreSQL's inheritance features, and instead just use big left joins or repeated SELECTs to implement parent/child relationships, with foreign keys enforcing constraints.

I consider inheritance mostly useless without the ability to have UNIQUE indexes that span a parent relation and all its children. You can use them for partitioning only by sacrificing a bunch of integrity protection or creating messy chains of FKs between individual partitions.

I'd rather like to quietly deprecate inheritance and eventually remove it once we have real partitioning and some time has passed...

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Inheritance

От
Jim Nasby
Дата:
On 5/23/16 10:30 PM, Craig Ringer wrote:
> I find it notable that no popular ORM has bothered adopting PostgreSQL's
> inheritance features, and instead just use big left joins or repeated
> SELECTs to implement parent/child relationships, with foreign keys
> enforcing constraints.

Since when do we consider ORMs to be an example of how to do good 
database design?

In this case, I'm sure no ORM uses the feature because either the ORM's 
authors have no clue it exists (except maybe for partitioning) or 
because we're the only mainstream database that has it.

> I consider inheritance mostly useless without the ability to have UNIQUE
> indexes that span a parent relation and all its children. You can use
> them for partitioning only by sacrificing a bunch of integrity
> protection or creating messy chains of FKs between individual partitions.

There's ways around this issue, but I agree that it's a very unhappy 
situation right now.

> I'd rather like to quietly deprecate inheritance and eventually remove
> it once we have real partitioning and some time has passed...

IMO that's a very unfortunate attitude to have for the "Worlds Most 
Advanced Open Source Database". Now that OO programming isn't treated as 
a magic bullet it's proven to be an extremely powerful tool, especially 
when used to encapsulate complex data. The 
ndarray->Series->DataFrame->Panel inheritance tree is a great example of 
this. Feature-wise, we need to be doing *more* of this kind of work, not 
less. Lack of support for OO paradigms was one of the drivers for NoSQL.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461



Re: Inheritance

От
Konstantin Knizhnik
Дата:
There is one aspect of inheritance support which was not mentioned: 
polymorphic queries.
Actually polymorphism is the fundamental feature of OOP, without it 
there is no behavioral inheritance and inheritance can be considered 
just as "syntax sugar" for sharing some common subset of attributes 
between tables.

The main problem with supporting polymorphic queries is that SQL query 
returns set of tuples, not set of objects.
So there is no nice way to return both instances of based and derived 
tables. There are several alternatives
(for example return joined set of attributes in all derived tables, 
leaving missed as NULLs) but none of them is good.

There is more specific problem - polymorphic function calls, see:

http://postgresql.nabble.com/Polymorphic-function-calls-td5784772.html

which in principle can be solved... But looks like there is no such 
intention.

I do not know how much people will be interested in normal OO support 
from DBMS.
Right now PostgreQSL inheritance is used for different purposes: 
partitioning, FDW-based replication or sharding, ...
But IMHO it is mostly because current inheritance implementation just 
not make it possible to use it for OOP.
As far as I know before there were a lot of discussions whether arrays, 
json and other non-traditional for RDBMS types is needed.
But right now them are quite popular and frequently used. The same thing 
can happen with OO support: it will be popular if provide necessary 
functionality.

Right now OO DBMS applications are mostly using ORMs and them are not 
using PostgreSQL inheritance.
Mostly because ORMs are used to be more flexible in providing different 
approaches for implementing inheritance (for example in most cases 
placing all derived classes in single table works better than approach 
based on appending results of several queries).
Another reason is that ORM usually support more than one DBMS, and since 
most of them have no built-in support for inheritance,
ORM has to implement inheritance itself.
But I do not think that presence of ORM excludes necessity to have 
internal support of OO in DBMS.



On 24.05.2016 16:51, Jim Nasby wrote:
> On 5/23/16 10:30 PM, Craig Ringer wrote:
>> I find it notable that no popular ORM has bothered adopting PostgreSQL's
>> inheritance features, and instead just use big left joins or repeated
>> SELECTs to implement parent/child relationships, with foreign keys
>> enforcing constraints.
>
> Since when do we consider ORMs to be an example of how to do good 
> database design?
>
> In this case, I'm sure no ORM uses the feature because either the 
> ORM's authors have no clue it exists (except maybe for partitioning) 
> or because we're the only mainstream database that has it.
>
>> I consider inheritance mostly useless without the ability to have UNIQUE
>> indexes that span a parent relation and all its children. You can use
>> them for partitioning only by sacrificing a bunch of integrity
>> protection or creating messy chains of FKs between individual 
>> partitions.
>
> There's ways around this issue, but I agree that it's a very unhappy 
> situation right now.
>
>> I'd rather like to quietly deprecate inheritance and eventually remove
>> it once we have real partitioning and some time has passed...
>
> IMO that's a very unfortunate attitude to have for the "Worlds Most 
> Advanced Open Source Database". Now that OO programming isn't treated 
> as a magic bullet it's proven to be an extremely powerful tool, 
> especially when used to encapsulate complex data. The 
> ndarray->Series->DataFrame->Panel inheritance tree is a great example 
> of this. Feature-wise, we need to be doing *more* of this kind of 
> work, not less. Lack of support for OO paradigms was one of the 
> drivers for NoSQL.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Inheritance

От
Craig Ringer
Дата:
On 24 May 2016 at 21:51, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
On 5/23/16 10:30 PM, Craig Ringer wrote:
I find it notable that no popular ORM has bothered adopting PostgreSQL's
inheritance features, and instead just use big left joins or repeated
SELECTs to implement parent/child relationships, with foreign keys
enforcing constraints.

Since when do we consider ORMs to be an example of how to do good database design?

I don't, but I do consider them a useful measure of what the majority of users in the field are seeing and doing. 
 
I'd rather like to quietly deprecate inheritance and eventually remove
it once we have real partitioning and some time has passed...

IMO that's a very unfortunate attitude to have for the "Worlds Most Advanced Open Source Database". Now that OO programming isn't treated as a magic bullet it's proven to be an extremely powerful tool, especially when used to encapsulate complex data.

Absolutely, and I use OO heavily. But a half-assed "object relational" feature in the database that only kind-of works isn't OO, and it's generally less useful than using existing relational-to-oo modelling techniques like FK-related 1:1 child tables for specialisation.

I'm so sick of seeing "object relational" in PostgreSQL's tagline. It's nonsense. We do so much so well, why focus on this bogus half-feature so much?

 
The ndarray->Series->DataFrame->Panel inheritance tree is a great example of this. Feature-wise, we need to be doing *more* of this kind of work, not less. Lack of support for OO paradigms was one of the drivers for NoSQL.

I agree.

Note that there was enough enthusiasm to adopt whole new database engines, but not enough to use PostgreSQL's existing features for that. Partly because they suck. In particular, people looking for this tend to want to be able to create new subtypes without having to mess around making schema changes and modelling everything.

There's much more future in improving document-structured storage like jsonb, and possibly extending in future toward hybrid storage with some normal cols and some dynamic cols, than with Pg's pseudo-object-relational inheritance feature.

I really wish we could just remove it. I know it's impossible because it's used for partitioning and because there's a significant and important, albeit tiny relative to total users, group who're utterly reliant on it for their DB schemas. I know of one health engine provider who uses it heavily for example. It's a pity, because deleting the complexity around inheritance entirely would simplify a number of useful features. I'd certainly be in favour of allowing new PostgreSQL features _not_ to support tables with inheritance, making it a semi-deprecated feature that new functionality doesn't have to support. (I know that goes against the usual way of things, and I'm usually really against introducing things where X works unless you have Y or A enabled or it's a full moon on a Thursday, but IMO the benefit justifies the cost for making inheritance support optional).

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Inheritance

От
Craig Ringer
Дата:
On 24 May 2016 at 22:45, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
There is one aspect of inheritance support which was not mentioned: polymorphic queries.
Actually polymorphism is the fundamental feature of OOP, without it there is no behavioral inheritance and inheritance can be considered just as "syntax sugar" for sharing some common subset of attributes between tables.

The main problem with supporting polymorphic queries is that SQL query returns set of tuples, not set of objects.
So there is no nice way to return both instances of based and derived tables. There are several alternatives
(for example return joined set of attributes in all derived tables, leaving missed as NULLs) but none of them is good.

Exactly. We have a sort-of-object-ish storage option, but none of the surrounding stuff to make it useful for actual OO / object-relational work.

The "joined set of attributes" approach is exactly what ORMs already do, and many direct implementations of the same idea will use too. So we'd offer no advantage over what they already do in a way that works with multiple DBMSes, except we might be able to do it faster. Maybe.

The lack of polymorphism is critical. It's not really usefully OO but it costs you important relational features if you use it. We have some very limited polymorphism in the sense that you can query the parent table and see rows in child tables, but you only get the subset of cols that exists at that level of the heirarchy.

One thing I'd like to explore one day is a nice, user-friendly way to express "SELECT this row and the corresponding sets of rows from [these tables and their children in turn] as a structured object". Right now users have to write series of LEFT JOINs and de-duplicate the left-hand sides. Or do multiple queries (n+1 selects), possibly expensively with repeated join work involved. Or they have to write pretty baroque queries to construct a jsonb object with jsonb_agg with multiple levels of group-by in subqueries-in-from. We should be able to do this for them, so they can say

SELECTOBJECT customer
  CHILD JOIN invoice ON (customer.customer_id = invoice.customer_id AND invoice_date > $1)
  CHILD JOIN invoiceline USING (invoice_id)
  CHILD JOIN address USING (customer_id)
WHERE customer.in_debt_collections;

instead of the current nested mess of aggregation and subqueries needed, like:

SELECT
    to_jsonb(customer) || (
      SELECT jsonb_build_object('invoices', jsonb_agg(invoice_obj))
      FROM (
        SELECT to_jsonb(invoice) || jsonb_build_object('invoice_lines', jsonb_agg(invoice_line))
        FROM invoice
        LEFT OUTER JOIN invoice_line ON (invoice.invoice_id = invoice_line.invoice_id)
        WHERE invoice.customer_id = customer.customer_id AND invoice_date >= current_date
        GROUP BY invoice.invoice_id
      ) invoice_obj
    ) || (
      SELECT jsonb_build_object('addresses', jsonb_agg(address))
      FROM address
      WHERE address.customer_id = customer.customer_id
    )
FROM customer
WHERE customer.in_debt_collections

which is just pure joy to read, and gets even more contorted as the layers of parent/child relationships get deeper. The resulting query plan clearly expresses the desired result, but writing it in SQL is horrible:

                                                             QUERY PLAN                                                              
-------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on customer  (cost=0.00..130226.27 rows=1270 width=64)
   SubPlan 1
     ->  Aggregate  (cost=77.47..77.48 rows=1 width=24)
           ->  Subquery Scan on invoice_obj  (cost=77.37..77.46 rows=3 width=24)
                 ->  HashAggregate  (cost=77.37..77.43 rows=3 width=104)
                       Group Key: invoice.invoice_id
                       ->  Hash Right Join  (cost=50.84..77.36 rows=3 width=104)
                             Hash Cond: (invoice_line.invoice_id = invoice.invoice_id)
                             ->  Seq Scan on invoice_line  (cost=0.00..22.00 rows=1200 width=68)
                             ->  Hash  (cost=50.80..50.80 rows=3 width=40)
                                   ->  Seq Scan on invoice  (cost=0.00..50.80 rows=3 width=40)
                                         Filter: ((customer_id = customer.customer_id) AND (invoice_date >= ('now'::cstring)::date))
   SubPlan 2
     ->  Aggregate  (cost=25.02..25.03 rows=1 width=32)
           ->  Seq Scan on address  (cost=0.00..25.00 rows=6 width=32)
                 Filter: (customer_id = customer.customer_id)
(16 rows)


Maybe grouping sets can help avoid the nested joins, but I couldn't figure out how without wrapping the grouping set output query in another query to aggregate the produced objects into a top level one.

Inheritance does nothing to help with this.
 
 
But I do not think that presence of ORM excludes necessity to have internal support of OO in DBMS.


I'd agree if it was a finished and general OO feature. I just don't think what we have now is.


--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Inheritance

От
Merlin Moncure
Дата:
On Tue, May 24, 2016 at 9:47 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 24 May 2016 at 22:45, Konstantin Knizhnik <k.knizhnik@postgrespro.ru>
> wrote:
>>
>> There is one aspect of inheritance support which was not mentioned:
>> polymorphic queries.
>> Actually polymorphism is the fundamental feature of OOP, without it there
>> is no behavioral inheritance and inheritance can be considered just as
>> "syntax sugar" for sharing some common subset of attributes between tables.
>>
>> The main problem with supporting polymorphic queries is that SQL query
>> returns set of tuples, not set of objects.
>> So there is no nice way to return both instances of based and derived
>> tables. There are several alternatives
>> (for example return joined set of attributes in all derived tables,
>> leaving missed as NULLs) but none of them is good.
>
>
> Exactly. We have a sort-of-object-ish storage option, but none of the
> surrounding stuff to make it useful for actual OO / object-relational work.
>
> The "joined set of attributes" approach is exactly what ORMs already do, and
> many direct implementations of the same idea will use too. So we'd offer no
> advantage over what they already do in a way that works with multiple
> DBMSes, except we might be able to do it faster. Maybe.
>
> The lack of polymorphism is critical. It's not really usefully OO but it
> costs you important relational features if you use it. We have some very
> limited polymorphism in the sense that you can query the parent table and
> see rows in child tables, but you only get the subset of cols that exists at
> that level of the heirarchy.
>
> One thing I'd like to explore one day is a nice, user-friendly way to
> express "SELECT this row and the corresponding sets of rows from [these
> tables and their children in turn] as a structured object". Right now users
> have to write series of LEFT JOINs and de-duplicate the left-hand sides. Or
> do multiple queries (n+1 selects), possibly expensively with repeated join
> work involved. Or they have to write pretty baroque queries to construct a
> jsonb object with jsonb_agg with multiple levels of group-by in
> subqueries-in-from. We should be able to do this for them, so they can say
>
> SELECTOBJECT customer
>   CHILD JOIN invoice ON (customer.customer_id = invoice.customer_id AND
> invoice_date > $1)
>   CHILD JOIN invoiceline USING (invoice_id)
>   CHILD JOIN address USING (customer_id)
> WHERE customer.in_debt_collections;
>
> instead of the current nested mess of aggregation and subqueries needed,
> like:
>
> SELECT
>     to_jsonb(customer) || (
>       SELECT jsonb_build_object('invoices', jsonb_agg(invoice_obj))
>       FROM (
>         SELECT to_jsonb(invoice) || jsonb_build_object('invoice_lines',
> jsonb_agg(invoice_line))
>         FROM invoice
>         LEFT OUTER JOIN invoice_line ON (invoice.invoice_id =
> invoice_line.invoice_id)
>         WHERE invoice.customer_id = customer.customer_id AND invoice_date >=
> current_date
>         GROUP BY invoice.invoice_id
>       ) invoice_obj
>     ) || (
>       SELECT jsonb_build_object('addresses', jsonb_agg(address))
>       FROM address
>       WHERE address.customer_id = customer.customer_id
>     )
> FROM customer
> WHERE customer.in_debt_collections

Well, I don't know. There's a lot of ways to write that type of thing.
Personally, I tend to prefer to delay the serialization to json as
long as possible (although it's sometimes unavoidable) because it
keeps the query cleaner.  I also sometimes use the array() subquery
syntax for sake of brevity, but this query could be restructured to
use proper aggregation on all levels if you're concerned about
performance (this query would tend to underperform yours for very
large compositions because of the second subquery scan vs the hash
join OTOH, it's a faster serialization model).  I didn't test the
syntax, but you get the idea.

SELECT to_json(q)
FROM
( SELECT   c.*,   array(     SELECT       i.*,       array(         SELECT il         FROM invoice_line il
WHEREil.invoice_id = i.invoice_id       ) AS invoice_lines     FROM invoice i     WHERE i.customer_id = c.customer_id
ANDinvoice_date >= current_date   ) AS invoices FROM customer c WHERE c.in_debt_collections
 
) q

The point is this: the postgresql type system is flexible enough that
you can do arbitrary constructions pretty easy and the situation has
been one of continuous improvement over the last several releases.  It
isn't perfect, but json enhancements FWICT have made syntactical
approaches to the problem a dead end; json gets the job done is less
likely to cause problems with the SQL standard down the road.  For the
same set of reasons I no longer use crosstab.

In the 15+ years I've been watching postgres inheritance has gone
precisely nowhere, and there other ways to do the things it can do
that also supply a much broader range of use cases.  Plus, I'm biased:
I happen to think the 90's OO style of inheritance is pretty dumb :-).

merlin



Re: Inheritance

От
Jim Nasby
Дата:
On 5/24/16 8:52 PM, Craig Ringer wrote:> Absolutely, and I use OO heavily. But a half-assed "object relational">
featurein the database that only kind-of works isn't OO, and it's> generally less useful than using existing
relational-to-oomodelling> techniques like FK-related 1:1 child tables for specialisation.
 

How is it less useful than that? To me, the FK "solution" is the 
absolute worst of everything: you still have all the separate child 
tables that you must explicitly query *and* you have to get all the 
joins correct as well. And hope it doesn't have horrible performance.

> Note that there was enough enthusiasm to adopt whole new database
> engines, but not enough to use PostgreSQL's existing features for that.
> Partly because they suck. In particular, people looking for this tend to
> want to be able to create new subtypes without having to mess around
> making schema changes and modelling everything.

Which is a decision people have come to regret, because then your 
codebase somehow has to deal with 38 different versions of what a 
"customer" is.

> There's much more future in improving document-structured storage like
> jsonb, and possibly extending in future toward hybrid storage with some
> normal cols and some dynamic cols, than with Pg's
> pseudo-object-relational inheritance feature.

I don't see why we can't do both. There's many cases where more 
flexibility in what output tuples look like would be very valuable. The 
JSON features are one aspect; crosstab is another.

Postgres is well past the point where our relational features are the 
big selling point. It's now about scale, an incredibly robust storage 
engine, and all the extensiblity opportunities. We've moved from being 
an RDBMS to being a "Data Platform". Improving our OO capabilities just 
continues that.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461



Re: Inheritance

От
Craig Ringer
Дата:
On 26 May 2016 at 01:56, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
On 5/24/16 8:52 PM, Craig Ringer wrote:
> Absolutely, and I use OO heavily. But a half-assed "object relational"
> feature in the database that only kind-of works isn't OO, and it's
> generally less useful than using existing relational-to-oo modelling
> techniques like FK-related 1:1 child tables for specialisation.

How is it less useful than that? To me, the FK "solution" is the absolute worst of everything: you still have all the separate child tables that you must explicitly query *and* you have to get all the joins correct as well. And hope it doesn't have horrible performance.

Note that there was enough enthusiasm to adopt whole new database
engines, but not enough to use PostgreSQL's existing features for that.
Partly because they suck. In particular, people looking for this tend to
want to be able to create new subtypes without having to mess around
making schema changes and modelling everything.

Which is a decision people have come to regret, because then your codebase somehow has to deal with 38 different versions of what a "customer" is.

Oh, I totally agree there. It's almost as bad as people serialising Java objects into the DB. Ugh. It's a bad, bad idea.

It's also what people seem to want to do, and I understand that somewhat given the pain involved in full table rewrites under extended locks and the hoop-jumping required to avoid them. It's particularly painful with something app devs tend to need, but RDBMS designers prefer to ignore: user-defined columns/attributes. Find me someone who *doesn't* want the ability for their app users/deployers/etc to add arbitrary attributes to customer records etc w/o code changes. jsonb helps a lot there, but you lose Pg's type system and have to use a different query syntax etc.

Lower-pain ways to make schema changes and blend dynamic columns (application-user-defined columns)  with normal columns would help a lot there. A pseudo-column that can store a TOASTable record extension that's a set of colname/coltype/coltypmod . Storage is the relatively easy bit though, the problem is how to work with that though the planner and executor and output useful results...

There's much more future in improving document-structured storage like
jsonb, and possibly extending in future toward hybrid storage with some
normal cols and some dynamic cols, than with Pg's
pseudo-object-relational inheritance feature.

I don't see why we can't do both. There's many cases where more flexibility in what output tuples look like would be very valuable. The JSON features are one aspect; crosstab is another.

Agreed. A real PIVOT, especially one we can use as a source of tuples for subqueries etc, would be really useful and is the sort of thing we'd need some degree of dynamic column handling for.
 
Postgres is well past the point where our relational features are the big selling point. It's now about scale, an incredibly robust storage engine, and all the extensiblity opportunities. We've moved from being an RDBMS to being a "Data Platform". Improving our OO capabilities just continues that.

Right. I'm just not convinced table inheritance is a useful part of that picture. 


--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Inheritance

От
Jim Nasby
Дата:
On 5/25/16 8:19 PM, Craig Ringer wrote:
>     Postgres is well past the point where our relational features are
>     the big selling point. It's now about scale, an incredibly robust
>     storage engine, and all the extensiblity opportunities. We've moved
>     from being an RDBMS to being a "Data Platform". Improving our OO
>     capabilities just continues that.
>
>
> Right. I'm just not convinced table inheritance is a useful part of that
> picture.

In it's current state it's certainly not the best tool in the toolbox, 
but IMHO there's still plenty of room for the ability to support 
restricted polymorphism, because of all the benefits of not allowing 
willy-nilly stuff in the schema. There's certainly other cases (ie: 
per-customer custom fields) where willy-nilly is what you actually need.

And certainly things that reduce table modification pain for *any* 
operations are most welcome! I think allowing queued backgrounded 
processes would be a huge win there. If we had real stored procedures 
(IE: ability to control transaction state) and a modest background 
scheduler then it wouldn't be hard to build that.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461