Обсуждение: Proposed feature: Selective Foreign Keys

Поиск

Список

Период

Сортировка

Proposed feature: Selective Foreign Keys

От

Tom Dunstan

Дата:

28 ноября 2013 г., 01:15:25

Hi all!

The Problem
-----------------
One case that traditional SQL doesn't handle very well is when you have a child entity which can be attached to a
numberof different parent entities. Examples might be comments, tags or file attachments - we might have 20 different
entitiesin the system that we would like our users to be able add comments to, but the existing solutions for mapping
thisall have downsides. 

Existing solution 1: Join tables ahoy
If I can have a list of comments on every other object in the system, and I want to have referrential integrity, then
theobvious thing to do is create a join table between each entity and the comments table. 
Pros:
 - Straight forward, traditional object-with-collection-of-child SQL structure
Cons:
 - If a parent object gets deleted here, we can't use foreign keys to delete e.g. a child comment, so we'll have to
eitherexplicitly do it as part of our delete logic or have a cleanup process to catch orphans. Or do a dance with
deletetriggers on the join tables deleting the comment. 
 - For n entities requiring comments in the system, we need n join tables.If we want both comments and e.g. tags and
likeson all of our entities, we now have 3n join tables for what should be some relatively self-contained on-the-side
data- this is could be more tables than the entire rest of the system 
 - It's difficult to create any kind of self-contained component for building applications in this scenario, as it will
needto know about every other entity in the system, or be able to poke around inside whatever ORM or data access system
thatyou have to work out what join tables it needs when running queries. 

Existing solution 2: Enter the matrix
Instead of having n join tables, let's just mash them all together, with a column per parent object, and a check
constraintto force exactly one of those columns to be set. 
Pros:
 - Less bloat in the number of tables
Cons:
 - Doesn't solve orphan problem
 - Addition of a new entity which needs comments and we now have to add another column onto it, potentially rewriting
thewhole thing 
 - Ugly

Existing solution 3: Embed the matrix
Kinda like the dependency matrix table, except that all the columns referencing potential parent objects we put into
thecomment table instead. 
Pros:
 - Everything contained in column table
 - No orphans, since cascaded deletes will now delete the actual comment
Cons:
 - Comment table now has references to every single type that it may be attached to
 - Addition of a new entity and we probably have to rewrite the comment table now

Existing solution 4: Abandon ye all referential integrity
Have a column indicating parent type and another one for the id. In the case of comments this would be directly on the
commenttable itself. In the case of something like tags that we might expect to be shared between entities, it would be
ina single join table.  
Pros:
 - Pretty self-contained
 - Data model which has neither lots of empty columns or lots of tables
 - Can make new entities "commentable" without rewriting anything
 - Because it's self-contained, can build application components that don't need to know much about the rest of your
system.For example this is the approach that the grails taggable and commentable plugins take. 
Cons:
 - No referential integrity, since we can't have a single column pointing to different tables with existing foreign key
infrastructure
 - Since there's no real db support for doing things this way, existing ORMs etc don't really know how use a single
columnto join against multiple different tables based on a discriminator or 'switch' column. 

Existing solution 5: Everything's a thing
Make your entity hierarchy have a base level object which can have comments attached, and then everything that you need
tobe "commentable" has to extend that. You can do that in an ORM, or with table inheritance in the database. 
Pros:
 - Single top-level thing to hang your data on
Cons:
 - You've polluted your object hierarchy just to hang some stuff off of the end of it rather than it being driven by
behaviours
 - You're going to be paying a performance penalty - everything that extends that base level object will now join
againstit incessantly, and you now have a global id sequence or whatever that you may not want. 

Basically none of the above handle the situation very well. The cleanest is solution 4, but lack of RI sucks.

Feature Proposal: Selective foreign keys.
-------------------------------------------------
Allow foreign keys to have where clauses. The above comment example using solution 4 might then look like then
following:

CREATE TABLE comment as (
  id bigserial primary key,
  content text not null,
  parent_entity regclass not null,
  parent_id int8
);
ALTER TABLE comment ADD CONSTRAINT comment_blog_fk FOREIGN KEY (parent_id) REFERENCES blog(id) WHERE (parent_entity =
‘blog');
ALTER TABLE comment ADD CONSTRAINT comment_event_fk FOREIGN KEY (parent_id) REFERENCES event(id) WHERE (parent_entity =
‘event');

At this point, the following things should work:
insert into blog(id, title, content) values (10, 'i hate mondays', 'so sad');
insert into event(id, title, location, date) values (20, 'my birthday party', 'local pub', now());
insert into comment(content, parent_entity, parent_id) values ('me too', 'blog', 10);
insert into comment(content, parent_entity, parent_id) values ('I love that pub', 'event', 20);

But the following would barf:
insert into comment(content, parent_entity, parent_id) values ('bad comment', 'blog', 20);
ERROR: insert or update on table "comment" violates foreign key constraint "comment_blog_fk"
  Detail: Key (parent_id)=(20) is not present in table "blog".:

Patch
-----
I have attached a proof-of-concept patch for this feature.
What it currently does:
 - Allows where clause on FK declaration
 - Validates and stores the expression into pg_constraint a la check constraints, using the same validity checks
 - Implements logic on insertion to FK table, and update / delete on PK tables as expected.
 - Regression tests handling all of the above

Things I know are still to be done, if there is consensus that this feature is worth having:
 - Handle update of columns affecting the selection expression. Currently the RI trigger doesn’t fire if the FK column
isn’tmodified. I’m not sure where the logic for this is implemented. 
 - pg_dump support
 - Update some comments in a few places referring to conbin/consrc columns of pg_constraint as relevant to check
constraintsonly 
 - Documentation
 - There are probably a few scenarios that I’m not testing in the regression tests that I should be.

Implementation Comments
----------------------------
I borrowed the exclusion constraint where clause in the grammar, which results in the where expression needing
parentheses.With a bit more work this could possibly be removed, but I’m not too upset about it tbh. 

Currently I’m creating a new executor for each call of RI_FKey_check when there is a where clause, and doing other bits
ofwork in there. It’s not 100% clear to me how expensive doing that stuff in there is, and whether it’s worth it or
possibleto cache some of it. 

Stuff added in tablecmds.c was copied from similar stuff elsewhere, but honestly without a huge amount of
understanding.Don’t know if I could do without some of it.  


Comments?


Cheers

Tom



PS Thanks to Noah Misch who answered a question on the list a while ago which got me over a hump, and to the authors of
thewiki pages around submitting a patch.

Вложения

selective-foreign-keys-v1.patch

Re: Proposed feature: Selective Foreign Keys

От

Albe Laurenz

Дата:

02 декабря 2013 г., 08:57:08

Tom Dunstan wrote:
> The Problem
> -----------------
> One case that traditional SQL doesn't handle very well is when you have a child entity which can be
> attached to a number of different parent entities. Examples might be comments, tags or file
> attachments - we might have 20 different entities in the system that we would like our users to be
> able add comments to, but the existing solutions for mapping this all have downsides.
> 
> Existing solution 1: Join tables ahoy
[...]
> 
> Existing solution 2: Enter the matrix
[...]
> 
> Existing solution 3: Embed the matrix
[...]
> 
> Existing solution 4: Abandon ye all referential integrity
[...]
> 
> Existing solution 5: Everything's a thing
[...]
> 
> Basically none of the above handle the situation very well. The cleanest is solution 4, but lack of RI
> sucks.

I personally think that options 3 and 1 are the cleanest ones, but I
agree that they are not entirely satisfying.

I could think of a sixth option:
add a nullable column to each table that need a comment, tag or whatever.
You could use the same application logic for each of these columns, but
particularly for things that are more complicated than mere comments it
might be nice to have them centralized in one table.

> Feature Proposal: Selective foreign keys.
> -------------------------------------------------
> Allow foreign keys to have where clauses. The above comment example using solution 4 might then look
> like then following:
> 
> CREATE TABLE comment as (
>   id bigserial primary key,
>   content text not null,
>   parent_entity regclass not null,
>   parent_id int8
> );
> ALTER TABLE comment ADD CONSTRAINT comment_blog_fk FOREIGN KEY (parent_id) REFERENCES blog(id) WHERE
> (parent_entity = ‘blog');
> ALTER TABLE comment ADD CONSTRAINT comment_event_fk FOREIGN KEY (parent_id) REFERENCES event(id) WHERE
> (parent_entity = ‘event');

> Comments?

I didn't read the patch and I cannot comment on how easy it would be
to implement this and what the performance impact might be.

What strikes me is that since foreign key constraints are implemented
as triggers in PostgreSQL, this solution would probably not have many
performance benefits over a self-written trigger that implements the
same functionality.  Since you need two triggers for your example,
the performance might even be worse than a single self-written trigger.

Now performance isn't everything, but that would mean that the benefit
of your proposal is entirely on the usability side.

I personally don't think that it is so difficult to write a trigger
for that functionality yourself, but I guess that the argument for
this feature rests on how coveted such a functionality would be
(to justify the trade-off in code complexity).

Maybe one measure would be to figure out if any other relational
database system has implemented such a functionality.  If there is
more than one, it might show that there is a certain demand for such
a feature.

Of course that's only circumstantial evidence; I guess that a
better measure would be how many people speak up and say
"I have always wanted that".

Yours,
Laurenz Albe

Re: Proposed feature: Selective Foreign Keys

От

Andres Freund

Дата:

02 декабря 2013 г., 10:06:56

On 2013-12-02 08:57:01 +0000, Albe Laurenz wrote:
> What strikes me is that since foreign key constraints are implemented
> as triggers in PostgreSQL, this solution would probably not have many
> performance benefits over a self-written trigger that implements the
> same functionality.  Since you need two triggers for your example,
> the performance might even be worse than a single self-written trigger.

Note that you cannot really write correct RI triggers without playing
very low level games, i.e. writing C and using special kinds of
snapshots and such.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: Proposed feature: Selective Foreign Keys

От

Florian Pflug

Дата:

02 декабря 2013 г., 12:10:44

On Dec2, 2013, at 10:06 , Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-12-02 08:57:01 +0000, Albe Laurenz wrote:
>> What strikes me is that since foreign key constraints are implemented
>> as triggers in PostgreSQL, this solution would probably not have many
>> performance benefits over a self-written trigger that implements the
>> same functionality.  Since you need two triggers for your example,
>> the performance might even be worse than a single self-written trigger.
>
> Note that you cannot really write correct RI triggers without playing
> very low level games, i.e. writing C and using special kinds of
> snapshots and such.

Very true. I'm unsure whether that's an argument in favour of extending
the built-in FK triggers, or to expose the necessary functionality at the
SQL level, though ;-)

I once tried to do the latter, in a way, by removing the need for the
cross-checking logic (which is the only real low-level game that the
built-in FK triggers play) altogether. That, unfortunately, didn't pan
out - it would have required enlarging the on-disk tuple size to be
able to record to last transaction which locked a tuple even after the
transaction completes.

A simpler way would be to provide a special command which enabled the
re-checking logic for ordinary query. Something like
 CONSTRAINT UPDATE table SET … WHERE ... CONSTRAINT DELETE FROM table WHERE …

which would execute the command with a cross-check snapshot just like
ri_trigger.c's ri_PerformCheck() does.

best regards,
Florian Pflug

Re: Proposed feature: Selective Foreign Keys

От

Tom Dunstan

Дата:

02 декабря 2013 г., 12:48:54

Hi Laurenz!

On 2 Dec 2013, at 19:27, Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
> What strikes me is that since foreign key constraints are implemented
> as triggers in PostgreSQL, this solution would probably not have many
> performance benefits over a self-written trigger that implements the
> same functionality.  Since you need two triggers for your example,
> the performance might even be worse than a single self-written trigger.

Well, the main cost on insert in the FK table should be looking for matching rows in the referenced tables, which the
patchavoids for non-matching rows. So while you’ll get the overhead of N triggers firing, you only pay the expected
querycost (which will even use a partial index if you’ve got one set up). Each of the referenced tables is only
involvedin one FK, so there’s no difference in cost there. 

> Now performance isn't everything, but that would mean that the benefit
> of your proposal is entirely on the usability side.

Well, I don’t think there’s much of a performance hit, and I don’t think any of the alternatives would perform much
betterin practice, but certainly performance wasn’t  a motivating factor for this feature, it was a) correctness and b)
avoidingthe ugliness of the existing solutions.  

> I personally don't think that it is so difficult to write a trigger
> for that functionality yourself, but I guess that the argument for
> this feature rests on how coveted such a functionality would be
> (to justify the trade-off in code complexity).

The patch is pretty small so far - and more than half of it is regression tests. So there’s not much extra code
complexityIMO. I wouldn’t want to touch the FK system with anything but the lightest touch. 

Cheers

Tom

Re: Proposed feature: Selective Foreign Keys

От

Andres Freund

Дата:

02 декабря 2013 г., 12:48:57

On 2013-12-02 12:10:32 +0000, Florian Pflug wrote:
> On Dec2, 2013, at 10:06 , Andres Freund <andres@2ndquadrant.com> wrote:
> > On 2013-12-02 08:57:01 +0000, Albe Laurenz wrote:
> >> What strikes me is that since foreign key constraints are implemented
> >> as triggers in PostgreSQL, this solution would probably not have many
> >> performance benefits over a self-written trigger that implements the
> >> same functionality.  Since you need two triggers for your example,
> >> the performance might even be worse than a single self-written trigger.
> > 
> > Note that you cannot really write correct RI triggers without playing
> > very low level games, i.e. writing C and using special kinds of
> > snapshots and such.
> 
> Very true. I'm unsure whether that's an argument in favour of extending
> the built-in FK triggers, or to expose the necessary functionality at the
> SQL level, though ;-)

Pretty much neither ;). I was just commenting on the fact that I don't
think Albe's argument has much merit in the current state of
postgresql. I haven't thought sufficiently thought about the issue to
have a clear plan what I think is right.

> I once tried to do the latter, in a way, by removing the need for the
> cross-checking logic (which is the only real low-level game that the
> built-in FK triggers play) altogether. That, unfortunately, didn't pan
> out - it would have required enlarging the on-disk tuple size to be
> able to record to last transaction which locked a tuple even after the
> transaction completes.

That infrastructure kinda is there now though, in the form of multixacts
we have now. I haven't followed the idea back then, but maybe that could
be reused.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: Proposed feature: Selective Foreign Keys

От

Andrew Dunstan

Дата:

02 декабря 2013 г., 15:04:45

On 12/02/2013 05:06 AM, Andres Freund wrote:
> On 2013-12-02 08:57:01 +0000, Albe Laurenz wrote:
>> What strikes me is that since foreign key constraints are implemented
>> as triggers in PostgreSQL, this solution would probably not have many
>> performance benefits over a self-written trigger that implements the
>> same functionality.  Since you need two triggers for your example,
>> the performance might even be worse than a single self-written trigger.
> Note that you cannot really write correct RI triggers without playing
> very low level games, i.e. writing C and using special kinds of
> snapshots and such.

Yeah, I really don't think that's a feasible way to to this.

The only way I have thought of as an alternative to this proposal is to 
use a partitioned table with different FK constraints for each child. 
That's certainly doable, but not without a deal of work, and even then 
you'd be giving up certain things, such as guaranteeing the uniqueness 
of the object key, at least without a lot more work.

You can think of it this way: we currently enforce FK constraints except 
when the value being constrained is NULL (or part of it is NULL in the 
MATCH SIMPLE case). This is really a user-defined extension of the 
exception condition. I have at least one case where I could have used 
this feature and saved a significant amount of work. We wanted to apply 
FK constraints to a very large table, but grandfather in certain cases 
that didn't meet the constraint. That could have been done very simply 
using this feature.

cheers

andrew

Re: Proposed feature: Selective Foreign Keys

От

Robert Haas

Дата:

02 декабря 2013 г., 17:07:24

On Mon, Dec 2, 2013 at 10:04 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
> The only way I have thought of as an alternative to this proposal is to use
> a partitioned table with different FK constraints for each child. That's
> certainly doable, but not without a deal of work, and even then you'd be
> giving up certain things, such as guaranteeing the uniqueness of the object
> key, at least without a lot more work.
>
> You can think of it this way: we currently enforce FK constraints except
> when the value being constrained is NULL (or part of it is NULL in the MATCH
> SIMPLE case). This is really a user-defined extension of the exception
> condition. I have at least one case where I could have used this feature and
> saved a significant amount of work. We wanted to apply FK constraints to a
> very large table, but grandfather in certain cases that didn't meet the
> constraint. That could have been done very simply using this feature.

I also like this feature.   It would be really neat if a FOREIGN KEY
constraint with a WHERE clause could use a *partial* index on the
foreign table provided that the index would be guaranteed to be predOK
for all versions of the foreign key checking query.  That might be
hard to implement, though.

Whether that works or not, it seems to me that a good deal of thought
will need to be given to what dependencies get created when creating a
constraint of this type.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Proposed feature: Selective Foreign Keys

От

Tom Dunstan

Дата:

02 декабря 2013 г., 22:37:23

On 3 Dec 2013, at 01:34, Andrew Dunstan <andrew@dunslane.net> wrote:

> We wanted to apply FK constraints to a very large table, but grandfather in certain cases that didn't meet the
constraint.That could have been done very simply using this feature. 

Yeah, references to old data is the other obvious case for conditionally turning foreign keys off. I decided against
usingit as an example when trying to sell the feature as there are much easier workarounds than in the different-parent
case,and as a solution it really is a bit more ugly. But I suspect that it’s one that people have to deal with fairly
regularly.

It might encourage people to keep e.g. old ids in the same column when they should be putting them in a separate one,
butthen sometimes you don’t get to just rewrite the schema as there are legacy systems etc etc. 

Cheers

Tom

Re: Proposed feature: Selective Foreign Keys

От

Tom Dunstan

Дата:

02 декабря 2013 г., 23:08:53

On 3 Dec 2013, at 03:37, Robert Haas <robertmhaas@gmail.com> wrote:
> I also like this feature.   It would be really neat if a FOREIGN KEY
> constraint with a WHERE clause could use a *partial* index on the
> foreign table provided that the index would be guaranteed to be predOK
> for all versions of the foreign key checking query.  That might be
> hard to implement, though.

Well, with this patch, under the hood the FK query is doing (in the case of RESTRICT):

SELECT 1 FROM ONLY "public"."comment" x WHERE (the id) OPERATOR(pg_catalog.=) "parent_id" AND (parent_entity = 'event')
FORKEY SHARE OF x; 

If we stick a partial index on the column, disable seq scans and run the query, we get:

tom=# create index comment_event_id on comment (parent_id) where parent_entity = 'event';
CREATE INDEX
tom=# set enable_seqscan = off;
SET
tom=# explain SELECT 1 FROM ONLY "public"."comment" x WHERE 20 OPERATOR(pg_catalog.=) "parent_id" AND (parent_entity =
'event')FOR KEY SHARE OF x;                                      QUERY PLAN                                        
----------------------------------------------------------------------------------------LockRows  (cost=0.12..8.15
rows=1width=6)  ->  Index Scan using comment_event_id on comment x  (cost=0.12..8.14 rows=1 width=6)        Index Cond:
(20= parent_id)        Filter: (parent_entity = 'event'::commentable_entity) 
(4 rows)

Is that what you had in mind?

> Whether that works or not, it seems to me that a good deal of thought
> will need to be given to what dependencies get created when creating a
> constraint of this type.

Hmm, yeah I hadn’t thought about that. OTOH, it seems that at least some of the expected functionality works anyway:

tom=# alter table comment drop column parent_entity ;
ERROR:  cannot drop table comment column parent_entity because other objects depend on it
DETAIL:  constraint comment_blog_fk on table comment depends on table comment column parent_entity
constraint comment_event_fk on table comment depends on table comment column parent_entity

I guess those bits that I copied from the check constraint code must have included creating the appropriate pg_depend
entries.:) 

I’ll add some more checks to the regression tests.

Did you have other scenarios in mind?

Thanks

Tom

Re: Proposed feature: Selective Foreign Keys

От

Craig Ringer

Дата:

03 декабря 2013 г., 01:48:04

On 11/28/2013 09:15 AM, Tom Dunstan wrote:
> Feature Proposal: Selective foreign keys.
> -------------------------------------------------
> Allow foreign keys to have where clauses.

I haven't caught up on the discussion yet, but: if you can swing this,
it'd also be very useful for RLS, providing a remotely sane way to
handle foreign key constraints in data models where the RLS constraint
is the same on all tables and doesn't get changed (e.g. multi-tenant).

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services

Re: Proposed feature: Selective Foreign Keys

От

Tom Lane

Дата:

03 декабря 2013 г., 02:07:38

Tom Dunstan <pgsql@tomd.cc> writes:
> Well, with this patch, under the hood the FK query is doing (in the case of RESTRICT):

> SELECT 1 FROM ONLY "public"."comment" x WHERE (the id) OPERATOR(pg_catalog.=) "parent_id" AND (parent_entity =
'event')FOR KEY SHARE OF x;
 

Hm.  The RI trigger code goes to extreme lengths to make sure that the
query strings it generates will invoke exactly the operators it intends
--- thus the OPERATOR(pg_catalog.=) syntax, which would otherwise be
merely overkill.  The added text you are showing above seems trivially
vulnerable to unwanted behavior and even security bugs, if executed in
say an unexpected search_path context.  I am not sure that we have the
technology to automatically make arbitrary expressions proof against that
sort of hazard, but in any case you don't seem to be trying very hard.

Another issue that would need to be thought about is trojan-horse
functions in the WHERE clause.  IIRC, RI trigger queries might run as the
owner of either the referencing or the referenced tables.  If those two
don't fully trust each other then this is opening the gates for mischief.
        regards, tom lane

Re: Proposed feature: Selective Foreign Keys

От

Tom Dunstan

Дата:

03 декабря 2013 г., 03:17:24

On 3 Dec 2013, at 12:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Tom Dunstan <pgsql@tomd.cc> writes:
>> Well, with this patch, under the hood the FK query is doing (in the case of RESTRICT):
>
>> SELECT 1 FROM ONLY "public"."comment" x WHERE (the id) OPERATOR(pg_catalog.=) "parent_id" AND (parent_entity =
'event')FOR KEY SHARE OF x; 
>
> Hm.  The RI trigger code goes to extreme lengths to make sure that the
> query strings it generates will invoke exactly the operators it intends
> --- thus the OPERATOR(pg_catalog.=) syntax, which would otherwise be
> merely overkill.  The added text you are showing above seems trivially
> vulnerable to unwanted behavior and even security bugs, if executed in
> say an unexpected search_path context.  I am not sure that we have the
> technology to automatically make arbitrary expressions proof against that
> sort of hazard, but in any case you don't seem to be trying very hard.

Not trying at all.

The source for that clause has come out of deparse_expression - it doesn’t seem like it would be totally impossible to
createa deparse_expression_qualified which could get get_rule_expr to emit fully qualified operators, functions etc. I
cansee that my regression testing to test for different expressions is going to get quite large to guard against this
stuff.Thanks for the pointer. 

> Another issue that would need to be thought about is trojan-horse
> functions in the WHERE clause.  IIRC, RI trigger queries might run as the
> owner of either the referencing or the referenced tables.  If those two
> don't fully trust each other then this is opening the gates for mischief.

On insert / update in the table with the fk, the where clause is evaluated separately first against the current trigger
tuple.If the check passed, we then do the check against the referenced table (as that table’s owner), which is
unchangedfrom now and does not have a where clause. 

For updates / deletes on the referenced table, the queries are run against the FK table, as that table’s owner, with
theextra where clause. But given that the FK is defined on that table, I’m not sure that I see a problem - presumably
thetable’s owner was the one who created the FK, right? 

Am I missing a scenario here?

Thanks

Tom

Re: Proposed feature: Selective Foreign Keys

От

Robert Haas

Дата:

03 декабря 2013 г., 14:54:58

On Mon, Dec 2, 2013 at 6:08 PM, Tom Dunstan <pgsql@tomd.cc> wrote:
> On 3 Dec 2013, at 03:37, Robert Haas <robertmhaas@gmail.com> wrote:
>> I also like this feature.   It would be really neat if a FOREIGN KEY
>> constraint with a WHERE clause could use a *partial* index on the
>> foreign table provided that the index would be guaranteed to be predOK
>> for all versions of the foreign key checking query.  That might be
>> hard to implement, though.
>
> Well, with this patch, under the hood the FK query is doing (in the case of RESTRICT):
>
> SELECT 1 FROM ONLY "public"."comment" x WHERE (the id) OPERATOR(pg_catalog.=) "parent_id" AND (parent_entity =
'event')FOR KEY SHARE OF x;
 
>
> If we stick a partial index on the column, disable seq scans and run the query, we get:
>
> tom=# create index comment_event_id on comment (parent_id) where parent_entity = 'event';
> CREATE INDEX
> tom=# set enable_seqscan = off;
> SET
> tom=# explain SELECT 1 FROM ONLY "public"."comment" x WHERE 20 OPERATOR(pg_catalog.=) "parent_id" AND (parent_entity
='event') FOR KEY SHARE OF x;
 
>                                        QUERY PLAN
> ----------------------------------------------------------------------------------------
>  LockRows  (cost=0.12..8.15 rows=1 width=6)
>    ->  Index Scan using comment_event_id on comment x  (cost=0.12..8.14 rows=1 width=6)
>          Index Cond: (20 = parent_id)
>          Filter: (parent_entity = 'event'::commentable_entity)
> (4 rows)
>
> Is that what you had in mind?

Yeah, more or less, but the key is ensuring that it wouldn't let you
create the constraint in the first place if the partial index
specified *didn't* match the WHERE clause.  For example, suppose the
partial index says WHERE parent_entity = 'event' but the constraint
definition is WHERE parent_event = 'somethingelse'.  That ought to
fail, just as creating a regular foreign constraint will fail if
there's no matching unique index.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Proposed feature: Selective Foreign Keys

От

Tom Dunstan

Дата:

03 декабря 2013 г., 22:58:02

On 4 December 2013 01:24, Robert Haas <robertmhaas@gmail.com> wrote:
> Yeah, more or less, but the key is ensuring that it wouldn't let you
> create the constraint in the first place if the partial index
> specified *didn't* match the WHERE clause.  For example, suppose the
> partial index says WHERE parent_entity = 'event' but the constraint
> definition is WHERE parent_event = 'somethingelse'.  That ought to
> fail, just as creating a regular foreign constraint will fail if
> there's no matching unique index.

The where clause only applies to queries against the FK table, and we
don’t currently fail if there isn’t a matching index on the fk column
when creating a FK (I’ve been bitten by that before).

We fail if there isn’t a unique index on the referenced
table/column(s), but queries against that table on insert/update not
the FK table are unchanged (save that we don’t bother with them at all
if the where clause expression fails for the given tuple).

Cheers

Tom

Re: Proposed feature: Selective Foreign Keys

От

Robert Haas

Дата:

04 декабря 2013 г., 16:25:26

On Tue, Dec 3, 2013 at 5:57 PM, Tom Dunstan <pgsql@tomd.cc> wrote:
> On 4 December 2013 01:24, Robert Haas <robertmhaas@gmail.com> wrote:
>> Yeah, more or less, but the key is ensuring that it wouldn't let you
>> create the constraint in the first place if the partial index
>> specified *didn't* match the WHERE clause.  For example, suppose the
>> partial index says WHERE parent_entity = 'event' but the constraint
>> definition is WHERE parent_event = 'somethingelse'.  That ought to
>> fail, just as creating a regular foreign constraint will fail if
>> there's no matching unique index.
>
> The where clause only applies to queries against the FK table, and we
> don’t currently fail if there isn’t a matching index on the fk column
> when creating a FK (I’ve been bitten by that before).
>
> We fail if there isn’t a unique index on the referenced
> table/column(s), but queries against that table on insert/update not
> the FK table are unchanged (save that we don’t bother with them at all
> if the where clause expression fails for the given tuple).

Oh.  I misinterpreted what this feature was about, then.  I thought it
was about restricting the reference to a subset of the *referenced*
table, but it seems to be about restricting the constraint to a subset
of the *referencing* table.  I guess they're both useful, but the
syntax...

REFERENCES tab(col) WHERE (stuff)

...sure looks like the WHERE clause is syntactically associated with
the table being referenced.  What would we do if we eventually wanted
to support both variants?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Proposed feature: Selective Foreign Keys

От

Andrew Dunstan

Дата:

04 декабря 2013 г., 16:45:11

On 12/04/2013 11:25 AM, Robert Haas wrote:
> On Tue, Dec 3, 2013 at 5:57 PM, Tom Dunstan <pgsql@tomd.cc> wrote:
>> On 4 December 2013 01:24, Robert Haas <robertmhaas@gmail.com> wrote:
>>> Yeah, more or less, but the key is ensuring that it wouldn't let you
>>> create the constraint in the first place if the partial index
>>> specified *didn't* match the WHERE clause.  For example, suppose the
>>> partial index says WHERE parent_entity = 'event' but the constraint
>>> definition is WHERE parent_event = 'somethingelse'.  That ought to
>>> fail, just as creating a regular foreign constraint will fail if
>>> there's no matching unique index.
>> The where clause only applies to queries against the FK table, and we
>> don’t currently fail if there isn’t a matching index on the fk column
>> when creating a FK (I’ve been bitten by that before).
>>
>> We fail if there isn’t a unique index on the referenced
>> table/column(s), but queries against that table on insert/update not
>> the FK table are unchanged (save that we don’t bother with them at all
>> if the where clause expression fails for the given tuple).
> Oh.  I misinterpreted what this feature was about, then.  I thought it
> was about restricting the reference to a subset of the *referenced*
> table, but it seems to be about restricting the constraint to a subset
> of the *referencing* table.  I guess they're both useful, but the
> syntax...
>
> REFERENCES tab(col) WHERE (stuff)
>
> ...sure looks like the WHERE clause is syntactically associated with
> the table being referenced.  What would we do if we eventually wanted
> to support both variants?
>


Well I guess we could say something like:
   FOREIGN KEY (a-col) WHERE (a-condition) REFERENCES b(b-col) WHERE   (b-condition)


But it's somewhat ugly.

The case of restricting the allowed referent rows does look slightly
like a solution in search of a problem, but maybe that's just because I
haven't thought of a use for it yet.

cheers

andrew

Re: Proposed feature: Selective Foreign Keys

От

Robert Haas

Дата:

04 декабря 2013 г., 17:00:13

On Wed, Dec 4, 2013 at 11:44 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> Oh.  I misinterpreted what this feature was about, then.  I thought it
>> was about restricting the reference to a subset of the *referenced*
>> table, but it seems to be about restricting the constraint to a subset
>> of the *referencing* table.  I guess they're both useful, but the
>> syntax...
>>
>> REFERENCES tab(col) WHERE (stuff)
>>
>> ...sure looks like the WHERE clause is syntactically associated with
>> the table being referenced.  What would we do if we eventually wanted
>> to support both variants?
>>
>
> Well I guess we could say something like:
>
>    FOREIGN KEY (a-col) WHERE (a-condition) REFERENCES b(b-col) WHERE
>    (b-condition)
>
> But it's somewhat ugly.
>
> The case of restricting the allowed referent rows does look slightly like a
> solution in search of a problem, but maybe that's just because I haven't
> thought of a use for it yet.

Interestingly, the variant for which you can't think of a use case is
the one I've missed most.  Typical examples in my experience are
things like project.project_manager_id references person (id) where
person.is_project_manager, or alert (device_id) references device (id)
where not device.deleted.

Different strokes for different folks, I guess.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Proposed feature: Selective Foreign Keys

От

Andrew Dunstan

Дата:

04 декабря 2013 г., 17:19:19

On 12/04/2013 12:00 PM, Robert Haas wrote:
> On Wed, Dec 4, 2013 at 11:44 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
>>> Oh.  I misinterpreted what this feature was about, then.  I thought it
>>> was about restricting the reference to a subset of the *referenced*
>>> table, but it seems to be about restricting the constraint to a subset
>>> of the *referencing* table.  I guess they're both useful, but the
>>> syntax...
>>>
>>> REFERENCES tab(col) WHERE (stuff)
>>>
>>> ...sure looks like the WHERE clause is syntactically associated with
>>> the table being referenced.  What would we do if we eventually wanted
>>> to support both variants?
>>>
>> Well I guess we could say something like:
>>
>>     FOREIGN KEY (a-col) WHERE (a-condition) REFERENCES b(b-col) WHERE
>>     (b-condition)
>>
>> But it's somewhat ugly.
>>
>> The case of restricting the allowed referent rows does look slightly like a
>> solution in search of a problem, but maybe that's just because I haven't
>> thought of a use for it yet.
> Interestingly, the variant for which you can't think of a use case is
> the one I've missed most.  Typical examples in my experience are
> things like project.project_manager_id references person (id) where
> person.is_project_manager, or alert (device_id) references device (id)
> where not device.deleted.
>

OK, those make sense. I wonder whether this should be done via a USING 
clause on the constraint that pointed to the partial unique index. Or 
would that be too obscure?

cheers

andrew

Re: Proposed feature: Selective Foreign Keys

От

Robert Haas

Дата:

04 декабря 2013 г., 17:55:44

On Wed, Dec 4, 2013 at 12:18 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> Interestingly, the variant for which you can't think of a use case is
>> the one I've missed most.  Typical examples in my experience are
>> things like project.project_manager_id references person (id) where
>> person.is_project_manager, or alert (device_id) references device (id)
>> where not device.deleted.
>>
> OK, those make sense. I wonder whether this should be done via a USING
> clause on the constraint that pointed to the partial unique index. Or would
> that be too obscure?

I wondered that, too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Proposed feature: Selective Foreign Keys

От

Tom Lane

Дата:

04 декабря 2013 г., 19:40:09

Andrew Dunstan <andrew@dunslane.net> writes:
>>> Well I guess we could say something like:
>>> 
>>> FOREIGN KEY (a-col) WHERE (a-condition) REFERENCES b(b-col) WHERE
>>> (b-condition)
>>> 
>>> But it's somewhat ugly.

> OK, those make sense. I wonder whether this should be done via a USING 
> clause on the constraint that pointed to the partial unique index. Or 
> would that be too obscure?

I like what you have above.  Yeah, it requires the more verbose syntax
for declaring a foreign key, but this feature is not going to be so
heavily used that anyone will be in danger of worsening their carpal
tunnel syndrome.
        regards, tom lane

Re: Proposed feature: Selective Foreign Keys

От

Andrew Dunstan

Дата:

04 декабря 2013 г., 20:11:45

On 12/04/2013 02:40 PM, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>>>> Well I guess we could say something like:
>>>>
>>>> FOREIGN KEY (a-col) WHERE (a-condition) REFERENCES b(b-col) WHERE
>>>> (b-condition)
>>>>
>>>> But it's somewhat ugly.
>> OK, those make sense. I wonder whether this should be done via a USING
>> clause on the constraint that pointed to the partial unique index. Or
>> would that be too obscure?
> I like what you have above.  Yeah, it requires the more verbose syntax
> for declaring a foreign key, but this feature is not going to be so
> heavily used that anyone will be in danger of worsening their carpal
> tunnel syndrome.
>
>             


Fair enough. I guess in terms of *this* feature TomD would then need to 
adjust the location of his WHERE clause so it's before the REFERENCES 
clause.

cheers

andrew

Re: Proposed feature: Selective Foreign Keys

От

Tom Dunstan

Дата:

04 декабря 2013 г., 23:20:06

On 5 Dec 2013, at 06:10, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>>>> Well I guess we could say something like:
>>>>
>>>> FOREIGN KEY (a-col) WHERE (a-condition) REFERENCES b(b-col) WHERE
>>>> (b-condition)
>>>>
>
> I like what you have above.

Yeah. Given both the apparent ambiguity of the current placement, and the fact that the current placement would be
rightwhere you’d put a where clause on the referenced table, that’s the only sane way to do it. And it’s not so bad. 

Cheers

Tom

Re: Proposed feature: Selective Foreign Keys

От

Tom Dunstan

Дата:

04 декабря 2013 г., 23:26:54

On 5 Dec 2013, at 03:48, Andrew Dunstan <andrew@dunslane.net> wrote:
>>> Well I guess we could say something like:
>>>
>>>    FOREIGN KEY (a-col) WHERE (a-condition) REFERENCES b(b-col) WHERE
>>>    (b-condition)
>>>
>
> OK, those make sense. I wonder whether this should be done via a USING clause on the constraint that pointed to the
partialunique index. Or would that be too obscure? 

Well you could put a USING clause on the end and it would read pretty unambiguously. Requiring that the user specify it
ratherthan trying to guess which index to use would also probably be an easier path to getting that feature in, at
leastfor a first cut. 

I won’t be doing work towards putting a where clause on the referenced side just yet, though. One thing at a time.

Cheers

Tom

Re: Proposed feature: Selective Foreign Keys

От

Gavin Wahl

Дата:

20 декабря 2013 г., 03:29:54

This a great solution to this problem, one I've found to be very common in web development. The technique will work to add RI to Django's generic foreign keys[1], which are implemented with an id column and a type-flag column.

[1]: https://docs.djangoproject.com/en/dev/ref/contrib/contenttypes/#generic-relations

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Proposed feature: Selective Foreign Keys

Вложения