Обсуждение: Proposal: Add a UNIQUE NOT ENFORCED constraint

Поиск
Список
Период
Сортировка

Proposal: Add a UNIQUE NOT ENFORCED constraint

От
Jacob Jackson
Дата:
I am working through a case that requires very high throughput inserts
that would be slowed down by a unique index, but that has externally
guaranteed unique values and could benefit from the query planning and
documentation provided by a unique constraint.

To fill this need, I propose adding a UNIQUE NOT ENFORCED constraint,
which would tell Postgres to assume uniqueness without creating a
unique index to enforce it, leaving ensuring uniqueness to the
developer, similar to unenforced CHECK and foreign key constraints
(https://commitfest.postgresql.org/patch/5285/).

-- Reasoning --

In many cases, the idiomatic/generally best way to write a query
requires a uniqueness check (a SELECT WHERE ... = ANY()/IN or really
any semi-join when optimized to an inner join, UNION, etc), meaning a
Unique/HashAggregate node will be added, increasing overhead unless
there is an explicit unique constraint. An unenforced unique
constraint would allow developers to use their knowledge of the
data/previous validation procedures to eliminate the extra node. This
would also help with documentation/observability tools by providing
more context on the data without adding overhead.

A number of SQL databases already implement this, not only including
most data warehouses but also many more traditional databases (Db2
(https://www.ibm.com/docs/en/ias?topic=constraints-informational),
SingleStore (https://docs.singlestore.com/cloud/create-a-database/specifying-unenforced-unique-constraints/),
etc).

-- Implementation --

The hardest part of implementing this (as far as I can tell) would be
updating the query planner to consider these unenforced constraints. I
see two ways to do that:

1. Modify relation/query_has_unique_index (or some other function in
the chain to determine uniqueness) to consider constraints instead of
just indexes. At plan time, Postgres could include unenforced unique
constraints in RelOptInfo to be retrieved by the relevant function.

2. Create a new "dummy index" index type. This would not include any
update triggers and would have an infinite cost to prevent usage in
query planning, but it would still serve the purpose of proving the
existence of a unique index.

I am leaning towards the first solution because it would be much
simpler and less hacky to implement, although it would lose the
flexibility given by custom unique indexes. On the other hand, a dummy
index as described in the 2nd solution would likely require much more
code. It also might be less intuitive because it redefines what an
index means in that specific case, and would expand unique indexes
into their own type instead of being a type of B-Tree index.

Most other implementation details should be fairly similar to other
unenforced constraints, with some differences when altering
constraints. Unlike other unenforced constraints, we can't mark the
constraint as NOT VALID, because a NOT VALID constraint doesn't
validate any data already inserted, and an enforced UNIQUE must be
able to scan all data to build the unique index and properly validate
uniqueness.

Instead, converting a unique constraint will follow the same logic as
adding a unique constraint and either block for an extended period to
create the index when altering the constraint, or create the index in
parallel and then alter the constraint.

Unenforced unique constraints also differ from other unenforced
constraints in that their use in query planning can yield poor
results, rather than just worsening planning estimates. This problem
should be clearly documented, and error handling will likely need some
changes, but ultimately, it is not fundamentally different from the
many other ways that developers can write faulty queries.

Let me know what you think.

Thanks!
Jacob



Re: Proposal: Add a UNIQUE NOT ENFORCED constraint

От
Jacob Jackson
Дата:
If anyone has thoughts on the feasibility, outward design, or
implementation plans for this proposal, I would appreciate any
feedback you have.

Thanks again,
Jacob


On Sun, Jan 4, 2026 at 12:34 PM Jacob Jackson <jej.jackson.08@gmail.com> wrote:
>
> I am working through a case that requires very high throughput inserts
> that would be slowed down by a unique index, but that has externally
> guaranteed unique values and could benefit from the query planning and
> documentation provided by a unique constraint.
>
> To fill this need, I propose adding a UNIQUE NOT ENFORCED constraint,
> which would tell Postgres to assume uniqueness without creating a
> unique index to enforce it, leaving ensuring uniqueness to the
> developer, similar to unenforced CHECK and foreign key constraints
> (https://commitfest.postgresql.org/patch/5285/).
>
> -- Reasoning --
>
> In many cases, the idiomatic/generally best way to write a query
> requires a uniqueness check (a SELECT WHERE ... = ANY()/IN or really
> any semi-join when optimized to an inner join, UNION, etc), meaning a
> Unique/HashAggregate node will be added, increasing overhead unless
> there is an explicit unique constraint. An unenforced unique
> constraint would allow developers to use their knowledge of the
> data/previous validation procedures to eliminate the extra node. This
> would also help with documentation/observability tools by providing
> more context on the data without adding overhead.
>
> A number of SQL databases already implement this, not only including
> most data warehouses but also many more traditional databases (Db2
> (https://www.ibm.com/docs/en/ias?topic=constraints-informational),
> SingleStore (https://docs.singlestore.com/cloud/create-a-database/specifying-unenforced-unique-constraints/),
> etc).
>
> -- Implementation --
>
> The hardest part of implementing this (as far as I can tell) would be
> updating the query planner to consider these unenforced constraints. I
> see two ways to do that:
>
> 1. Modify relation/query_has_unique_index (or some other function in
> the chain to determine uniqueness) to consider constraints instead of
> just indexes. At plan time, Postgres could include unenforced unique
> constraints in RelOptInfo to be retrieved by the relevant function.
>
> 2. Create a new "dummy index" index type. This would not include any
> update triggers and would have an infinite cost to prevent usage in
> query planning, but it would still serve the purpose of proving the
> existence of a unique index.
>
> I am leaning towards the first solution because it would be much
> simpler and less hacky to implement, although it would lose the
> flexibility given by custom unique indexes. On the other hand, a dummy
> index as described in the 2nd solution would likely require much more
> code. It also might be less intuitive because it redefines what an
> index means in that specific case, and would expand unique indexes
> into their own type instead of being a type of B-Tree index.
>
> Most other implementation details should be fairly similar to other
> unenforced constraints, with some differences when altering
> constraints. Unlike other unenforced constraints, we can't mark the
> constraint as NOT VALID, because a NOT VALID constraint doesn't
> validate any data already inserted, and an enforced UNIQUE must be
> able to scan all data to build the unique index and properly validate
> uniqueness.
>
> Instead, converting a unique constraint will follow the same logic as
> adding a unique constraint and either block for an extended period to
> create the index when altering the constraint, or create the index in
> parallel and then alter the constraint.
>
> Unenforced unique constraints also differ from other unenforced
> constraints in that their use in query planning can yield poor
> results, rather than just worsening planning estimates. This problem
> should be clearly documented, and error handling will likely need some
> changes, but ultimately, it is not fundamentally different from the
> many other ways that developers can write faulty queries.
>
> Let me know what you think.
>
> Thanks!
> Jacob



Re: Proposal: Add a UNIQUE NOT ENFORCED constraint

От
"Joel Jacobson"
Дата:
On Sun, Jan 4, 2026, at 18:34, Jacob Jackson wrote:
> I am working through a case that requires very high throughput inserts
> that would be slowed down by a unique index, but that has externally
> guaranteed unique values and could benefit from the query planning and
> documentation provided by a unique constraint.
>
> To fill this need, I propose adding a UNIQUE NOT ENFORCED constraint,
> which would tell Postgres to assume uniqueness without creating a
> unique index to enforce it, leaving ensuring uniqueness to the
> developer, similar to unenforced CHECK and foreign key constraints
> (https://commitfest.postgresql.org/patch/5285/).

Can you please share some more details on your use-case?
I'm curious to learn more about this "unique" column.
Don't you ever need to do lookups/joins on it (e.g. WHERE col = ?)?

/Joel



Re: Proposal: Add a UNIQUE NOT ENFORCED constraint

От
Neil Chen
Дата:
Hi Jacob,

On Mon, Jan 5, 2026 at 1:34 AM Jacob Jackson <jej.jackson.08@gmail.com> wrote:

In many cases, the idiomatic/generally best way to write a query
requires a uniqueness check (a SELECT WHERE ... = ANY()/IN or really
any semi-join when optimized to an inner join, UNION, etc), meaning a
Unique/HashAggregate node will be added, increasing overhead unless
there is an explicit unique constraint. An unenforced unique
constraint would allow developers to use their knowledge of the
data/previous validation procedures to eliminate the extra node. This
would also help with documentation/observability tools by providing
more context on the data without adding overhead.


I'm not very familiar with the UNIQUE NOT ENFORCED constraint, so apologies if I make any mistakes here. If we want the query planner to generate an execution plan as if a column were unique, would setting n_distinct = -1 in the table statistics achieve the same effect? 

Re: Proposal: Add a UNIQUE NOT ENFORCED constraint

От
Matthias van de Meent
Дата:
On Sun, 4 Jan 2026, 18:34 Jacob Jackson, <jej.jackson.08@gmail.com> wrote:
>
> I am working through a case that requires very high throughput inserts
> that would be slowed down by a unique index, but that has externally
> guaranteed unique values and could benefit from the query planning and
> documentation provided by a unique constraint.
>
> To fill this need, I propose adding a UNIQUE NOT ENFORCED constraint,
> which would tell Postgres to assume uniqueness without creating a
> unique index to enforce it, leaving ensuring uniqueness to the
> developer, similar to unenforced CHECK and foreign key constraints
> (https://commitfest.postgresql.org/patch/5285/).
>
> -- Reasoning --
>
> In many cases, the idiomatic/generally best way to write a query
> requires a uniqueness check (a SELECT WHERE ... = ANY()/IN or really
> any semi-join when optimized to an inner join, UNION, etc), meaning a
> Unique/HashAggregate node will be added, increasing overhead unless
> there is an explicit unique constraint. An unenforced unique
> constraint would allow developers to use their knowledge of the
> data/previous validation procedures to eliminate the extra node.

Unenforced constraints should be considered invalid (how would we
prove they're valid?), and because the planner is not allowed to rely
on invalid constraints, why would the planner be allowed to remove
these nodes for unenforced-and-therefore-invalid unique constraints?

It's quite likely that the query output would be incorrect when we
expected the data to be unique for unenforced constraints, while the
data in the table isn't actually unique for that unenforced unique
constraint; and knowingly giving incorrect results is not something I
can agree to.

> This
> would also help with documentation/observability tools by providing
> more context on the data without adding overhead.

Adding such constraints as metadata (without backing index) seems
fine, but we can't use the constraint in the planner, nor in the
executor, for the reason mentioned above.

> A number of SQL databases already implement this, not only including
> most data warehouses but also many more traditional databases (Db2
> (https://www.ibm.com/docs/en/ias?topic=constraints-informational),
> SingleStore (https://docs.singlestore.com/cloud/create-a-database/specifying-unenforced-unique-constraints/),
> etc).

I'm concerned they don't actually comply with the SQL standard when
they process data from those tables, if they assume unenforced unique
constraints to always be valid.

> 2. Create a new "dummy index" index type. This would not include any
> update triggers and would have an infinite cost to prevent usage in
> query planning, but it would still serve the purpose of proving the
> existence of a unique index.

It would be faster than a btree, but it'd still have the issue of
adding the overhead of column projection costs, and it'd be a (false?)
target for LR's replication identity, which I'm concerned about.

> Unenforced unique constraints also differ from other unenforced
> constraints in that their use in query planning can yield poor
> results, rather than just worsening planning estimates.

Does a table with unenforced constraints produce different plans than
one without those unenforced constraints? That seems wrong to me;
there is no material difference to the information we have on such a
table that we can consider true.

> This problem
> should be clearly documented, and error handling will likely need some
> changes, but ultimately, it is not fundamentally different from the
> many other ways that developers can write faulty queries.
>
> Let me know what you think.

I think this feature would be a loaded footgun, especially if the
planner starts to consider unenforced constraints as valid.

Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)



Re: Proposal: Add a UNIQUE NOT ENFORCED constraint

От
Peter Eisentraut
Дата:
On 07.01.26 14:21, Matthias van de Meent wrote:
>> In many cases, the idiomatic/generally best way to write a query
>> requires a uniqueness check (a SELECT WHERE ... = ANY()/IN or really
>> any semi-join when optimized to an inner join, UNION, etc), meaning a
>> Unique/HashAggregate node will be added, increasing overhead unless
>> there is an explicit unique constraint. An unenforced unique
>> constraint would allow developers to use their knowledge of the
>> data/previous validation procedures to eliminate the extra node.
> Unenforced constraints should be considered invalid (how would we
> prove they're valid?), and because the planner is not allowed to rely
> on invalid constraints, why would the planner be allowed to remove
> these nodes for unenforced-and-therefore-invalid unique constraints?
> 
> It's quite likely that the query output would be incorrect when we
> expected the data to be unique for unenforced constraints, while the
> data in the table isn't actually unique for that unenforced unique
> constraint; and knowingly giving incorrect results is not something I
> can agree to.

Some other SQL implementations have optional implementation-specific 
modes on top of NOT ENFORCED that mean, the constraint is not enforced, 
but you can assume it holds for query planning purposes.  This is 
probably what the thinking here was, but that is not how PostgreSQL 
currently works, so additional work would need to be done to add this 
additional mode.

There are probably also multiple levels to this.  It's one thing to use 
an unenforced-but-probably-true constraint for things like selectivity 
estimation, where you just get a worse plan if it's not quite true, but 
it's another to assume something is unique when it might not be.  So 
there is some research to be done here.




Re: Proposal: Add a UNIQUE NOT ENFORCED constraint

От
Jacob Jackson
Дата:
On Tue, Jan 6, 2026 at 1:44 AM Joel Jacobson <joel@compiler.org> wrote:
> Can you please share some more details on your use-case?
> I'm curious to learn more about this "unique" column.
> Don't you ever need to do lookups/joins on it (e.g. WHERE col = ?)

The unique column (technically 2 columns that combined are unique)
functions as a pretty typical id. Yes, I do run joins on it, but the
performance advantage of an additional index is outweighed by the
ingest performance hit due to the cardinality of the outer relation
(hash joins are typically used) and the relatively small row size.



Re: Proposal: Add a UNIQUE NOT ENFORCED constraint

От
Jacob Jackson
Дата:
On Wed, Jan 7, 2026 at 5:22 AM Neil Chen <carpenter.nail.cz@gmail.com> wrote:
> If we want the query planner to generate an execution plan as if a column were unique, would setting n_distinct = -1
inthe table statistics achieve the same effect? 

Setting n_distinct is less clear in describing the data (it isn't tied
to the table schema itself and can be ambiguous in whether values are
actually totally unique or just close enough, which can complicate
things), and, because it only impacts statistics estimations, it still
doesn't help in queries where uniqueness is required. Postgres will
still add a node to ensure uniqueness in that case. There are also
some other limitations with n_distinct (e.g. extended stats are
required for multi-column pairs, which further complicates
documentation and adds complexity/overhead).



Re: Proposal: Add a UNIQUE NOT ENFORCED constraint

От
Jacob Jackson
Дата:
On Wed, Jan 7, 2026 at 8:21 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
> I'm concerned they don't actually comply with the SQL standard when
> they process data from those tables, if they assume unenforced unique
> constraints to always be valid.

If you are referring to how they violate rules around consistent query
results, some of them include a TRUSTED/RELY option (see the
SingleStore link) that allows developers to enable the
documentation/statistics features without potentially changing results
if the data isn't truly unique. That could be implemented similarly in
Postgres to force developers to opt in to potentially inconsistent
behaviour while still giving them the chance to take advantage of the
other features of UNIQUE NOT ENFORCED.

> > 2. Create a new "dummy index" index type. This would not include any
> > update triggers and would have an infinite cost to prevent usage in
> > query planning, but it would still serve the purpose of proving the
> > existence of a unique index.
>
> It would be faster than a btree, but it'd still have the issue of
> adding the overhead of column projection costs, and it'd be a (false?)
> target for LR's replication identity, which I'm concerned about.

Yeah, I am thinking a constraint only would be the better approach. I
don't see any reason why the LR index search couldn't be rewritten to
exclude any new flagged dummy index, but it does seem like unnecessary
added complexity, and, of course, the additional overhead of the index
in general isn't great.

> I think this feature would be a loaded footgun, especially if the
> planner starts to consider unenforced constraints as valid.

As mentioned above, there are ways we could mitigate the potential
risks by isolating the more risky functionality in the constraint
interface.



Re: Proposal: Add a UNIQUE NOT ENFORCED constraint

От
Jacob Jackson
Дата:
On Wed, Jan 7, 2026 at 2:12 PM Peter Eisentraut <peter@eisentraut.org> wrote:
> Some other SQL implementations have optional implementation-specific
> modes on top of NOT ENFORCED that mean, the constraint is not enforced,
> but you can assume it holds for query planning purposes.  This is
> probably what the thinking here was, but that is not how PostgreSQL
> currently works, so additional work would need to be done to add this
> additional mode.
>
> There are probably also multiple levels to this.  It's one thing to use
> an unenforced-but-probably-true constraint for things like selectivity
> estimation, where you just get a worse plan if it's not quite true, but
> it's another to assume something is unique when it might not be.  So
> there is some research to be done here.

Yeah, after reading through Matthias's and your comments, I am
increasingly convinced that the best option is to add a separate
trusted/untrusted flag that would decide whether Postgres allows any
query plans that could change results if the data isn't actually
unique. It seems feasible, but I haven't worked on implementing it
yet.



Re: Proposal: Add a UNIQUE NOT ENFORCED constraint

От
Robert Treat
Дата:
On Wed, Jan 7, 2026 at 8:39 PM Jacob Jackson <jej.jackson.08@gmail.com> wrote:
> On Wed, Jan 7, 2026 at 5:22 AM Neil Chen <carpenter.nail.cz@gmail.com> wrote:
> > If we want the query planner to generate an execution plan as if a column were unique, would setting n_distinct =
-1in the table statistics achieve the same effect? 
>
> Setting n_distinct is less clear in describing the data (it isn't tied
> to the table schema itself and can be ambiguous in whether values are
> actually totally unique or just close enough, which can complicate
> things), and, because it only impacts statistics estimations, it still
> doesn't help in queries where uniqueness is required. Postgres will
> still add a node to ensure uniqueness in that case. There are also
> some other limitations with n_distinct (e.g. extended stats are
> required for multi-column pairs, which further complicates
> documentation and adds complexity/overhead).
>

Just thinking out loud here, but I wonder if you might be able to
modify pg_hint_plan to have a "NoUnique" (or "NoOpUnique") hint which
causes the planner to either skip any unique nodes, or have them
function as just a passthrough. I'm not entirely sure that's doable
actually; we don't currently have hints for other types of aggregation
nodes, and at the moment I don't think we have any hints that can
produce wrong answers, and this one feels like it could...

Robert Treat
https://xzilla.net



Re: Proposal: Add a UNIQUE NOT ENFORCED constraint

От
Jacob Jackson
Дата:
On Wed, Jan 7, 2026 at 11:13 PM Robert Treat <rob@xzilla.net> wrote:
> Just thinking out loud here, but I wonder if you might be able to
> modify pg_hint_plan to have a "NoUnique" (or "NoOpUnique") hint which
> causes the planner to either skip any unique nodes, or have them
> function as just a passthrough.

This might have its own use cases, but I wouldn't consider that a true
replacement for a unique constraint. It still doesn't solve the table
metadata documentation problem, and it would require modifying all
relevant queries. It also feels like the mental model of everything
then becomes a little convoluted (instead of associating unique
constraints with the data itself, we are now associating it with
queries).

> I'm not entirely sure that's doable
> actually; we don't currently have hints for other types of aggregation
> nodes, and at the moment I don't think we have any hints that can
> produce wrong answers, and this one feels like it could...

It does seem like the ability of these nodes to change results would
introduce difficulties. As I understand, pg_hint_plan's negative hints
primarily operate by removing plans with the node and, therefore,
forcing Postgres to choose a plan without the node. In this case, I
don't think Postgres would present plans without the node in the first
place (or, at least, if the planner did, the plans wouldn't be
otherwise equivalent) because the query planner, not understanding the
unique constraint, treats the plan without the unique node as a
fundamentally different query that will return different results.