Обсуждение: FK v.s unique indexes

Поиск
Список
Период
Сортировка

FK v.s unique indexes

От
Rafal Pietrak
Дата:
Hi,

For some time now, I'm withholding new features in my DB application as
I wasn't able to have unique constraints on partitioned tables. PG-v11
now has it and I've given it a try, but to my surprise it does not give
it fully to the application. Those indexes don't support FK! At this
point I've also checked partial indexes to see if they could support a
sort of "FK duality" I have in my datasets, but they don't either (see
below EXPLAINING).

I'd like to understand why.

I'd appreciate it if somebody could shred some light on the technical
reasons/background behind those restrictions.

EXPLAINING:
----------------------------------
psql (11beta2 (Debian 11~beta2-1))
Type "help" for help.

tst=# create table test1(load bigint, a int, b int, c bool) partition by
list (c);
CREATE TABLE
tst=# create table test1_true  partition of test1 for values in (true);
CREATE TABLE
tst=# create table test1_false  partition of test1 for values in (false);
CREATE TABLE
tst=# create unique index load ON test1 (load,a,b,c);
CREATE INDEX
tst=# create table info_text1 (load text, a int, b int, c bool, info
text,  foreign key (load,a,b,c) references test1(load,a,b,c)) ;
ERROR:  cannot reference partitioned table "test1"
----------------------------------

Why is this forbidden?

For my application I could live without partitioning. Just using partial
indexes would be enough. Still, this does not work either:
-------------------------------
tst=# create table test2(load bigint, a int, b int, c bool) ;
CREATE TABLE
tst=# create unique index test2_true ON test2 (load,a) where c is true ;
CREATE INDEX
tst=# create unique index test2_false ON test2 (load,b) where c is false;
CREATE INDEX
tst=# create table info_text2 (load text, a int, info text, more_info
text, foreign key (load,a) references test2(load,a)) ;
ERROR:  there is no unique constraint matching given keys for referenced
table "test2"
----------------------------

I cannot see any reasons why this functionality is blocked.

In particular, contrary to what the ERROR says, the target table *does
have* a "unique constraint matching given keys", admittedly only
partial. Yet, why should that matter at all? A unique index, partial or
not, always yield a single row, and that's all what matters for FK. Right?

I would very much like to understand the reasoning behind the above
restrictions (on the use of indexes for FK targets), as this would
probably help me avoid poor decisions in my database schema design. So I
would appreciate it very very much if somebody could point me to
reasoning behind such implementation.

Regards,

-R


Re: FK v.s unique indexes

От
David Rowley
Дата:
On 3 July 2018 at 19:30, Rafal Pietrak <rafal@ztk-rp.eu> wrote:
> tst=# create table test1(load bigint, a int, b int, c bool) partition by
> list (c);
> CREATE TABLE
> tst=# create table test1_true  partition of test1 for values in (true);
> CREATE TABLE
> tst=# create table test1_false  partition of test1 for values in (false);
> CREATE TABLE
> tst=# create unique index load ON test1 (load,a,b,c);
> CREATE INDEX
> tst=# create table info_text1 (load text, a int, b int, c bool, info
> text,  foreign key (load,a,b,c) references test1(load,a,b,c)) ;
> ERROR:  cannot reference partitioned table "test1"
> ----------------------------------
>
> Why is this forbidden?

I don't think there were any actual roadblocks, it was more of just
not enough time in the cycle to make it work due to a few technical
details that required extra effort to make work.

Alvaro managed to simplify the problem and allow foreign keys to be
defined on partitioned tables and get that into PG11.

So it was a case of 50% is better than 0%, which I very agree with.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: FK v.s unique indexes

От
Tim Cross
Дата:
Rafal Pietrak <rafal@ztk-rp.eu> writes:

>
> In particular, contrary to what the ERROR says, the target table *does
> have* a "unique constraint matching given keys", admittedly only
> partial. Yet, why should that matter at all? A unique index, partial or
> not, always yield a single row, and that's all what matters for FK. Right?
>
Is that correct? I would have thought that if you have a multi-key
unique index and you only provide values for some of the keys in the
index, you would have no guarantee of a single row being returned. If
this was true, then the additional keys are superfluous.

Have you tried doing the same thing where the fk keys and remote unique
index keys are equal in number?

-- 
Tim Cross


Re: FK v.s unique indexes

От
"David G. Johnston"
Дата:
On Tuesday, July 3, 2018, Rafal Pietrak <rafal@ztk-rp.eu> wrote:

ERROR:  there is no unique constraint matching given keys for referenced
table "test2"
----------------------------

I cannot see any reasons why this functionality is blocked.

In particular, contrary to what the ERROR says, the target table *does
have* a "unique constraint matching given keys", admittedly only
partial.

You are making the common error of confusing the distinct concepts of constraints and indexs.  Table constraints cannot be partial by definition, and are a logical concept constraining the data model.  Indexes are physical objects that only aid in the execution of queries.  The only crossover is that the implementation of a unique table constraint uses a full unique index as an implementation detail.

The error says 'constraint' and indeed you have nit defined a relevant constraint in your schema, just indexes.

David J.

Re: FK v.s unique indexes

От
Rafal Pietrak
Дата:

W dniu 04.07.2018 o 00:55, David G. Johnston pisze:
> On Tuesday, July 3, 2018, Rafal Pietrak <rafal@ztk-rp.eu
> <mailto:rafal@ztk-rp.eu>> wrote:
> 
> 
>     ERROR:  there is no unique constraint matching given keys for referenced
>     table "test2"
>     ----------------------------
> 
>     I cannot see any reasons why this functionality is blocked.
> 
>     In particular, contrary to what the ERROR says, the target table *does
>     have* a "unique constraint matching given keys", admittedly only
>     partial.
> 
> 
> You are making the common error of confusing the distinct concepts of
> constraints and indexs.  Table constraints cannot be partial by
> definition, and are a logical concept constraining the data model. 

Hmmm..

This does not match "my reality". Naturally I may be wrong, but the
example I've posted reflects my actual data I'm putting into the RDBMS.
That is:
1. the data has unique constraint on (load,a,b,c)
2. and the data have additional unique constraints on (load,a), provided
c is true, and (load,b) whenever c is false.

Pls consider in real life: load (a person), can have either a (a kind of
brest cancer); or b (a kind of prostrate) - this is only a cooked
example attemping to illustrate, that one may need to put additional
constraints on the entire dataset.

I'm creating partial indexes *meaning* enforcing constraints to:
1. avoid accumulation of dataset errors due to illegal data entry, and...
2. to allow for other tables to reference only parts of the main table.
Those other tables contain data relevant only to the parts of main table
they are referring. As of now, to have it FK to main table, I have to
unnecessarily keep in those other tables constant data, which is
"obvious" to them (like b & c values, which are irrelevant for unique
<load,a> cases).

But. IMHO, my usage of indexes in this case is actually setting up a
partial constraint. As I cannot use them for FK, there is no real use
for them as query execution support. And if I drop them, I'm actually
allowing data in my table, which is inconsistent with reality .... so
they actually do play a role of constraints.

So I do have data which require partial constraints. I could have make
them as sort of "checks". But it is much easier for me to just have a
partial unique index. And it's very naturally meaningful to anybody
"reading the schema" in the future. I don't think there is any harm in it.


> Indexes are physical objects that only aid in the execution of queries. 
> The only crossover is that the implementation of a unique table
> constraint uses a full unique index as an implementation detail.
> 
> The error says 'constraint' and indeed you have nit defined a relevant
> constraint in your schema, just indexes.

Yes, but my goal isn't just enforcing constraints. My goal is to make
other tables refer to the main datatable with as short FK as possible.
To make those references (in postgresql) I have to have a unique index.
Still, I cannot do that on partials..... My question is: Why? Why
postgresql have introduced such restriction. It serves no purpose.
Without it, everything would just work (I think :).


-R


Re: FK v.s unique indexes

От
Rafal Pietrak
Дата:

W dniu 03.07.2018 o 11:23, David Rowley pisze:
> On 3 July 2018 at 19:30, Rafal Pietrak <rafal@ztk-rp.eu> wrote:
[---------------]
>>
>> Why is this forbidden?
> 
> I don't think there were any actual roadblocks, it was more of just
> not enough time in the cycle to make it work due to a few technical
> details that required extra effort to make work.
> 
> Alvaro managed to simplify the problem and allow foreign keys to be
> defined on partitioned tables and get that into PG11.
> 
> So it was a case of 50% is better than 0%, which I very agree with.
> 

I see.

It's great and usefull, yes. And I'm looking forward to see the other 50% :)

Thenx!!

-R


Re: FK v.s unique indexes

От
Rob Sargent
Дата:

> On Jul 5, 2018, at 1:30 AM, Rafal Pietrak <rafal@ztk-rp.eu> wrote:
>
>
>
> W dniu 04.07.2018 o 00:55, David G. Johnston pisze:
>> On Tuesday, July 3, 2018, Rafal Pietrak <rafal@ztk-rp.eu
>> <mailto:rafal@ztk-rp.eu>> wrote:
>>
>>
>>    ERROR:  there is no unique constraint matching given keys for referenced
>>    table "test2"
>>    ----------------------------
>>
>>    I cannot see any reasons why this functionality is blocked.
>>
>>    In particular, contrary to what the ERROR says, the target table *does
>>    have* a "unique constraint matching given keys", admittedly only
>>    partial.
>>
>>
>> You are making the common error of confusing the distinct concepts of
>> constraints and indexs.  Table constraints cannot be partial by
>> definition, and are a logical concept constraining the data model.
>
> Hmmm..
>
> This does not match "my reality". Naturally I may be wrong, but the
> example I've posted reflects my actual data I'm putting into the RDBMS.
> That is:
> 1. the data has unique constraint on (load,a,b,c)
> 2. and the data have additional unique constraints on (load,a), provided
> c is true, and (load,b) whenever c is false.
>
> Pls consider in real life: load (a person), can have either a (a kind of
> brest cancer); or b (a kind of prostrate) - this is only a cooked
> example attemping to illustrate, that one may need to put additional
> constraints on the entire dataset.
>

It’s difficult enough to define a unique person (without mother and father) and certainly this weeks definition of
burdenis not likely to help matters.  If you’re main worry is data consistency you might be better off normalizing your
structure- either with separate tables per cancer type (person id, cancer specifics; unique on person) or in a single
tableone per line (person id, cancer type, cancer description; unique on person). You can reconstitue
person,breast,prostatefrom either of those.  We won’t quibble on one person having both (though remotely possible, men
doget breast cancer). 



Re: FK v.s unique indexes

От
pinker
Дата:
David Rowley-3 wrote
> I don't think there were any actual roadblocks, it was more of just
> not enough time in the cycle to make it work due to a few technical
> details that required extra effort to make work.
> 
> Alvaro managed to simplify the problem and allow foreign keys to be
> defined on partitioned tables and get that into PG11.
> 
> So it was a case of 50% is better than 0%, which I very agree with.

That's a really great news. I was waiting for this feature for many years.
Finally! Thank you guys!



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html


Re: FK v.s unique indexes

От
"David G. Johnston"
Дата:
On Thursday, July 5, 2018, Rafal Pietrak <rafal@ztk-rp.eu> wrote:
W dniu 04.07.2018 o 00:55, David G. Johnston pisze:
> On Tuesday, July 3, 2018, Rafal Pietrak <rafal@ztk-rp.eu
> <mailto:rafal@ztk-rp.eu>> wrote:
>
>
>     ERROR:  there is no unique constraint matching given keys for referenced
>     table "test2"
>     ----------------------------
>
>     I cannot see any reasons why this functionality is blocked.
>
>     In particular, contrary to what the ERROR says, the target table *does
>     have* a "unique constraint matching given keys", admittedly only
>     partial.
>
>
> You are making the common error of confusing the distinct concepts of
> constraints and indexs.  Table constraints cannot be partial by
> definition, and are a logical concept constraining the data model.

Hmmm..

This does not match "my reality". Naturally I may be wrong, but the
example I've posted reflects my actual data I'm putting into the RDBMS.
That is:
1. the data has unique constraint on (load,a,b,c)
2. and the data have additional unique constraints on (load,a), provided
c is true, and (load,b) whenever c is false.

I probably generalized too much, especially since exclusion constrains can be partial, so I'll be specific here.  Foreign Keys don't have where clauses and thus cannot target partial constraints.  If you want to overcome that limitation you can write a custom trigger.

I'm sure at least some of this is simply due to desirability as opposed to some fundamental limitation, but in the end that is how the system works today.  Integer-table FK relationships are defined over the entire PK table, not a subset.

David J.

Re: FK v.s unique indexes

От
Rafal Pietrak
Дата:

W dniu 05.07.2018 o 15:18, David G. Johnston pisze:
> On Thursday, July 5, 2018, Rafal Pietrak <rafal@ztk-rp.eu
> <mailto:rafal@ztk-rp.eu>> wrote:
> 
>     W dniu 04.07.2018 o 00:55, David G. Johnston pisze:
>     > On Tuesday, July 3, 2018, Rafal Pietrak <rafal@ztk-rp.eu
>     <mailto:rafal@ztk-rp.eu>
>     > <mailto:rafal@ztk-rp.eu <mailto:rafal@ztk-rp.eu>>> wrote:
>     >
>     >
>     >     ERROR:  there is no unique constraint matching given keys for
>     referenced
>     >     table "test2"
>     >     ----------------------------
>     >
>     >     I cannot see any reasons why this functionality is blocked.
>     >
>     >     In particular, contrary to what the ERROR says, the target
>     table *does
>     >     have* a "unique constraint matching given keys", admittedly only
>     >     partial.
>     >
>     >
>     > You are making the common error of confusing the distinct concepts of
>     > constraints and indexs.  Table constraints cannot be partial by
>     > definition, and are a logical concept constraining the data model.
> 
>     Hmmm..
> 
>     This does not match "my reality". Naturally I may be wrong, but the
>     example I've posted reflects my actual data I'm putting into the RDBMS.
>     That is:
>     1. the data has unique constraint on (load,a,b,c)
>     2. and the data have additional unique constraints on (load,a), provided
>     c is true, and (load,b) whenever c is false.
> 
> 
> I probably generalized too much, especially since exclusion constrains
> can be partial, so I'll be specific here.  Foreign Keys don't have where
> clauses and thus cannot target partial constraints.  If you want to
> overcome that limitation you can write a custom trigger.

Actually. It looks like I'm a really really slow learner :(

Only now I realized how should I code the scenario in question. For
those interested, I'm currently implementing it like this: Instead of
bool column "C", I'm putting there a column, which will keep a *copy* of
value from column A or B depending on "traditional value" (true/false)
of earlier column C. Now I can have a plain ordinary unique index over
(load,C), and have it as FK target for other tables. win-win.

I haven't realized it for years (this is how long the design stays with
me to this day).

> 
> I'm sure at least some of this is simply due to desirability as opposed
> to some fundamental limitation, but in the end that is how the system
> works today.  Integer-table FK relationships are defined over the entire
> PK table, not a subset.

In my simplistic view of postgresql internal, I was thinking, that the
engine (e.g. the set of internal triggers maintaining all FK consistency
as layed down by application schema), having a row of data (freshly
inserted or updated) just looks up an index it has associated with that
particilar FK, and uses it to see if the other end "is comliant", or
"has to change, too", or whatever else.

So I was thinking, that when FK has an index to use, it shouldn't matter
if it's complete or partial.

I was thinking, that when "add constraint" cannot choose appropriate
index, may be some explicit help (like ... using <index_name>;) would be
due.

But form the above explanation I fear that there is significantly more
to the full picture than I though. I only king of hoped those
"fundamental limitations" would be something I could grasp.

Anyway, although indireclty, this thread brought me a solution. This is
good.

thenx,

-R



Re: FK v.s unique indexes

От
Rafal Pietrak
Дата:

W dniu 05.07.2018 o 10:11, Rob Sargent pisze:
[---------------]
>>
>> Pls consider in real life: load (a person), can have either a (a kind of
>> brest cancer); or b (a kind of prostrate) - this is only a cooked
>> example attemping to illustrate, that one may need to put additional
>> constraints on the entire dataset.
>>
>  
> It’s difficult enough to define a unique person (without mother and father) and certainly this weeks definition of
burdenis not likely to help matters.  If you’re main worry is data consistency you might be better off normalizing your
structure- either with separate tables per cancer type (person id, cancer specifics; unique on person) or in a single
tableone per line (person id, cancer type, cancer description; unique on person). You can reconstitue
person,breast,prostatefrom either of those.  We won’t quibble on one person having both (though remotely possible, men
doget breast cancer).
 
> 

No, no.

This was just cooked example, my reality (messaged between parties,
whose identity is stored in columns A and B, and column C keeping a flag
indicating who was the sender). I've just used gender example to limit
the discussion of whether A can be linked to something, or if B cannot
be linked to something else. It cannot, the constraints I described are
as real, as prostrate cancer never happening to a women. I tried to
eliminate from the discussion all but mathematical relations between
values of those columns. Which are:

(load,a,b,c) is unique
(load,a,true) is unique
(load,b,false) is unique

Nothing else matters.

And I need FK to such dataset. Partial.

There is one table with data only referencing (load,a, true)... and I'd
prefere that table NOT NEED to have the spurous column (c) containing
value "true" for every row in that table.

Same goes for (load,b,false).

Now it occured to me, I can do:
create table (load text, a int, b int, c int);
instead of...
create table (load text, a int, b int, c bool);

With the new layout I'll just do:
insert (load,a,b,c) values (<load>, <a>, <b>, <a>); for the cases of
"true", and...
insert (load,a,b,c) values (<load>, <a>, <b>, <b>); for other cases

(load,c) will be unique over the entire dataset.

Now I can FK to (a,c) from whenever I want. A computed column (d bool) =
(a = c) could be helpfull, but I don't think it'll be indispensable.

Sorry to mislead you.

-R


Re: FK v.s unique indexes

От
"David G. Johnston"
Дата:
On Thu, Jul 5, 2018 at 1:45 PM, Rafal Pietrak <rafal@ztk-rp.eu> wrote:
I was thinking, that when "add constraint" cannot choose appropriate
index, may be some explicit help (like ... using <index_name>;) would be
due.

​Basically all the FK trigger does is:

SELECT EXISTS(SELECT 1 FROM pk_table WHERE pk_col1 = val1 AND pk_col2 = val2)

And fails if query returns false.  The planner is still free to use the index or not to execute the query just as if you had written it by hand.  For a small table fully in memory it likely would prefer a sequential scan and it would be perfectly within its rights to do so.

David J.

Re: FK v.s unique indexes

От
Rafal Pietrak
Дата:

W dniu 05.07.2018 o 23:04, David G. Johnston pisze:
> On Thu, Jul 5, 2018 at 1:45 PM, Rafal Pietrak <rafal@ztk-rp.eu
> <mailto:rafal@ztk-rp.eu>>wrote:
> 
>     I was thinking, that when "add constraint" cannot choose appropriate
>     index, may be some explicit help (like ... using <index_name>;) would be
>     due.
> 
> 
> ​Basically all the FK trigger does is:
> 
> SELECT EXISTS(SELECT 1 FROM pk_table WHERE pk_col1 = val1 AND pk_col2 =
> val2)
> ​
> And fails if query returns false.  The planner is still free to use the
> index or not to execute the query just as if you had written it by
> hand.  For a small table fully in memory it likely would prefer a
> sequential scan and it would be perfectly within its rights to do so.
> 

I must assume, that the above "WHERE pk_col1 - val1..." is not the same
place as the one you've mentioned earlier, where "FK don't have where
clausures".

Thus, the bulk of code that maintains FK consistency "does not use where
clausures", would break on partial indexes. But in the above SELECT,
with partial indexed, that select will never fail. So it is not the
blocker... on the other hand, I would rather rewrite it along the lines of:
SELECT 1 = (SELECT count(1) FROM pk_table WHERE pk_col1 = val1 AND
pk_col2 = val2)

Naturally, if the planner choses to do a seq scan, everything would
break down - inappropriate rows would get hit; but that's a different story:
1. one may think, doing a "SELECT * FROM pk_table WHERE... INTO current"
and making all constraint trigger functions use that "current" instead
of making more lookups, could solve the problem.
2. or with some syntax help during FK creation (aka: ... add constraint
... using <index_name>) one could force the planner to always use
indicated index.

Still, whatever way to go, it is well beyond my level.

-R


Re: FK v.s unique indexes

От
"David G. Johnston"
Дата:
On Thu, Jul 5, 2018 at 2:36 PM, Rafal Pietrak <rafal@ztk-rp.eu> wrote:


W dniu 05.07.2018 o 23:04, David G. Johnston pisze:
> On Thu, Jul 5, 2018 at 1:45 PM, Rafal Pietrak <rafal@ztk-rp.eu
> <mailto:rafal@ztk-rp.eu>>wrote:
>
>     I was thinking, that when "add constraint" cannot choose appropriate
>     index, may be some explicit help (like ... using <index_name>;) would be
>     due.
>
>
> ​Basically all the FK trigger does is:
>
> SELECT EXISTS(SELECT 1 FROM pk_table WHERE pk_col1 = val1 AND pk_col2 =
> val2)
> ​
> And fails if query returns false.  The planner is still free to use the
> index or not to execute the query just as if you had written it by
> hand.  For a small table fully in memory it likely would prefer a
> sequential scan and it would be perfectly within its rights to do so.
>

I must assume, that the above "WHERE pk_col1 - val1..." is not the same
place as the one you've mentioned earlier, where "FK don't have where
clausures".

​The FK definition doesn't have a WHERE clause so the only (and all) columns used in the trigger are those defined by the constraint itself.  So, if a partial index for the above was:

UNIQUE INDEX (pk_col1, pk_col2) WHERE pktblcol3 = false;

and

FK (col1, col2) REFERENCES pk_table (pk_col1, pk_col2)​

​There is no place on the FK to reference "pktblcol3​" so that the effective trigger query would become:

WHERE pk_col1 = val2 AND pk_col2 = val2 AND pktblcol3 = ???

And without pktblcol3 more than one row could be returned (so, not really EXISTS...)
 

Thus, the bulk of code that maintains FK consistency "does not use where
clausures", would break on partial indexes. But in the above SELECT,
with partial indexed, that select will never fail. So it is not the
blocker... on the other hand, I would rather rewrite it along the lines of:
SELECT 1 = (SELECT count(1) FROM pk_table WHERE pk_col1 = val1 AND
pk_col2 = val2)

​Yeah, that's closer to reality

Naturally, if the planner choses to do a seq scan, everything would
break down - inappropriate rows would get hit; but that's a different story:
1. one may think, doing a "SELECT * FROM pk_table WHERE... INTO current"
and making all constraint trigger functions use that "current" instead
of making more lookups, could solve the problem.
2. or with some syntax help during FK creation (aka: ... add constraint
... using <index_name>) one could force the planner to always use
indicated index.

​This comes back to constraints don't directly target indexes even though indexes are used in implementation (it would be violation of scope).  They target other constraints at a logical level.  Whether the code is written exactly like that without evidence to the contrary it provides a sound mental model to operate from.

I think we might get away from this for INSERT ON CONFLICT but I'm not that well versed nor have time to look into it right now.

David J.