Обсуждение: Remove inner joins based on foreign keys

Поиск
Список
Период
Сортировка

Remove inner joins based on foreign keys

От
Richard Guo
Дата:
Currently, the planner can remove useless left joins if the join
condition cannot match more than one RHS row, and the RHS rel is not
referenced above the join.  I'd like to propose a similar optimization
for inner joins.

For an inner join to be removable, we need to make sure that the join
condition matches exactly one (no more, no less) RHS row.  We can
leverage foreign key constraints to achieve that.  But we need to be
careful about the case where the referencing relation's foreign key
columns are null.  We can inject an IS NOT NULL filter for any foreign
key columns that are not defined as NOT NULL.

Here is an example to show this removal:

CREATE TABLE users (id int primary key, name text);
CREATE TABLE orders (id int, user_id int references users(id), amount int);

EXPLAIN (COSTS OFF)
SELECT o.* FROM orders o JOIN users u ON o.user_id = u.id;
           QUERY PLAN
---------------------------------
 Seq Scan on orders o
   Filter: (user_id IS NOT NULL)
(2 rows)

One interesting aspect of this patch is that we can actually allow the
usage of the referenced key columns in ECs even above the join, which
can help bridge multi-hop joins.  This is pretty useful for the
brand-new graph queries.

As an example, consider:

CREATE TABLE knows (
    edge_id int primary key,
    src_id int not null references users(id),
    dst_id int not null references users(id),
    creation_date date
);

CREATE PROPERTY GRAPH social_graph
    VERTEX TABLES (users)
    EDGE TABLES (knows
        SOURCE KEY (src_id) REFERENCES users(id)
        DESTINATION KEY (dst_id) REFERENCES users(id)
    );

Suppose we only want to find the creation dates of a 3-hop connection
chain:

EXPLAIN (COSTS OFF)
SELECT e1_date, e2_date, e3_date
FROM GRAPH_TABLE (social_graph
    MATCH (u1 IS users)
          -[e1 IS knows]-> (u2 IS users)
          -[e2 IS knows]-> (u3 IS users)
          -[e3 IS knows]-> (u4 IS users)
    COLUMNS (
        e1.creation_date AS e1_date,
        e2.creation_date AS e2_date,
        e3.creation_date AS e3_date
    )
);
                     QUERY PLAN
----------------------------------------------------
 Hash Join
   Hash Cond: (knows_1.dst_id = knows_2.src_id)
   ->  Hash Join
         Hash Cond: (knows.dst_id = knows_1.src_id)
         ->  Seq Scan on knows
         ->  Hash
               ->  Seq Scan on knows knows_1
   ->  Hash
         ->  Seq Scan on knows knows_2
(9 rows)

As a result, all user nodes are removed, and this query is reduced
from a 7-table join down to a 3-table join.

Please see the draft commit message in the attached patch for the
implementation details.  Any thoughts and feedback are very welcome.

- Richard

Вложения

Re: Remove inner joins based on foreign keys

От
Michael Paquier
Дата:
On Sat, Mar 21, 2026 at 11:47:03AM +0900, Richard Guo wrote:
> Please see the draft commit message in the attached patch for the
> implementation details.  Any thoughts and feedback are very welcome.

You did not mention anything about the version that would be impacted
by this change.  At this stage of the release cycle, is it right to
imply that this would be for v20 and not v19?
--
Michael

Вложения

Re: Remove inner joins based on foreign keys

От
Richard Guo
Дата:
On Sat, Mar 21, 2026 at 4:42 PM Michael Paquier <michael@paquier.xyz> wrote:
> You did not mention anything about the version that would be impacted
> by this change.  At this stage of the release cycle, is it right to
> imply that this would be for v20 and not v19?

Right.  This is for v20.

- Richard



Re: Remove inner joins based on foreign keys

От
David Rowley
Дата:
On Sat, 21 Mar 2026 at 15:47, Richard Guo <guofenglinux@gmail.com> wrote:
> Currently, the planner can remove useless left joins if the join
> condition cannot match more than one RHS row, and the RHS rel is not
> referenced above the join.  I'd like to propose a similar optimization
> for inner joins.

I tried this many years ago and it was pretty much a dead end with how
the current foreign key implementation deferring the cascade of the
foreign key until the end of the query.

There's plenty of discussion. See [1] and [2]. In particular, read and
follow along from what Heikki mentions in [3]. You should read all of
that and understand it to prevent prompting the same discussions all
over again.

It doesn't look like your patch has anything to protect against any of
the issues mentioned in [1], so I assume you weren't aware of that
work.

David

[1]
https://www.postgresql.org/message-id/flat/CAApHDvpDXXvKE%2B%3Dug1kA--nKfa%3DbjrjvK8Gp9G8UYwv6nHckVg%40mail.gmail.com#2f33cb1e196e863eba0e1376084cdf4d
[2] https://www.postgresql.org/message-id/CAApHDvocUEYdt1uT%2BDLDPs2xEu%3Dv3qJGT6HeXKonQM4rY_OsSA%40mail.gmail.com
[3] https://www.postgresql.org/message-id/54240C51.8040304%40vmware.com



Re: Remove inner joins based on foreign keys

От
Richard Guo
Дата:
On Sun, Mar 22, 2026 at 6:09 PM David Rowley <dgrowleyml@gmail.com> wrote:
> On Sat, 21 Mar 2026 at 15:47, Richard Guo <guofenglinux@gmail.com> wrote:
> > Currently, the planner can remove useless left joins if the join
> > condition cannot match more than one RHS row, and the RHS rel is not
> > referenced above the join.  I'd like to propose a similar optimization
> > for inner joins.

> I tried this many years ago and it was pretty much a dead end with how
> the current foreign key implementation deferring the cascade of the
> foreign key until the end of the query.

Thanks for pointing this out!  I failed to find the prior work, and I
missed the fatal flaw introduced by the AFTER ROW trigger mechanism
for foreign key constraints.  I had been making a mental analogy to
UNIQUE and NOT NULL constraints, but those are enforced immediately at
the heap/B-tree level.

Just for the sake of archives, the timeline of the trap during a
cascading delete looks like this:

T0: DELETE FROM users WHERE id = 1;

T1: The executor finds users row 1 and sets its xmax, physically
marking it as dead.

T2: [The Gap] The executor pushes the RI trigger into a queue to deal
with orders later.  Right now, the orders row still exists, but its
referenced row in users is dead.

T3: The statement finishes, the trigger fires, and the orders row is
finally deleted.

The "T2 Gap" is small, but there are several ways to execute user code
inside that window, such as RETURNING clauses, volatile functions, or
user-defined AFTER ROW triggers.

Since the planner operates on static logical schema guarantees and
cannot predict dynamic execution-time trigger queues, it seems any
plan-time optimization that relies on foreign keys for correctness is
effectively a dead end.  Maybe the only solution would be to handle
the join removal in the executor (where the trigger state is known),
but I noticed you explored that a decade ago and it seems far too
invasive.

Thanks again for the save.  You saved me a lot of time and effort
chasing a dead end.

- Richard



Re: Remove inner joins based on foreign keys

От
Pavel Stehule
Дата:
Hi

po 23. 3. 2026 v 7:13 odesílatel Richard Guo <guofenglinux@gmail.com> napsal:
On Sun, Mar 22, 2026 at 6:09 PM David Rowley <dgrowleyml@gmail.com> wrote:
> On Sat, 21 Mar 2026 at 15:47, Richard Guo <guofenglinux@gmail.com> wrote:
> > Currently, the planner can remove useless left joins if the join
> > condition cannot match more than one RHS row, and the RHS rel is not
> > referenced above the join.  I'd like to propose a similar optimization
> > for inner joins.

> I tried this many years ago and it was pretty much a dead end with how
> the current foreign key implementation deferring the cascade of the
> foreign key until the end of the query.

Thanks for pointing this out!  I failed to find the prior work, and I
missed the fatal flaw introduced by the AFTER ROW trigger mechanism
for foreign key constraints.  I had been making a mental analogy to
UNIQUE and NOT NULL constraints, but those are enforced immediately at
the heap/B-tree level.

Just for the sake of archives, the timeline of the trap during a
cascading delete looks like this:

T0: DELETE FROM users WHERE id = 1;

T1: The executor finds users row 1 and sets its xmax, physically
marking it as dead.

T2: [The Gap] The executor pushes the RI trigger into a queue to deal
with orders later.  Right now, the orders row still exists, but its
referenced row in users is dead.

T3: The statement finishes, the trigger fires, and the orders row is
finally deleted.

The "T2 Gap" is small, but there are several ways to execute user code
inside that window, such as RETURNING clauses, volatile functions, or
user-defined AFTER ROW triggers.

Since the planner operates on static logical schema guarantees and
cannot predict dynamic execution-time trigger queues, it seems any
plan-time optimization that relies on foreign keys for correctness is
effectively a dead end.  Maybe the only solution would be to handle
the join removal in the executor (where the trigger state is known),
but I noticed you explored that a decade ago and it seems far too
invasive.

Thanks again for the save.  You saved me a lot of time and effort
chasing a dead end.

Maybe you can push this analysis to some README in the code.  

Regards

Pavel


- Richard


Re: Remove inner joins based on foreign keys

От
Richard Guo
Дата:
On Mon, Mar 23, 2026 at 3:12 PM Richard Guo <guofenglinux@gmail.com> wrote:
> Just for the sake of archives, the timeline of the trap during a
> cascading delete looks like this:
>
> T0: DELETE FROM users WHERE id = 1;
>
> T1: The executor finds users row 1 and sets its xmax, physically
> marking it as dead.
>
> T2: [The Gap] The executor pushes the RI trigger into a queue to deal
> with orders later.  Right now, the orders row still exists, but its
> referenced row in users is dead.
>
> T3: The statement finishes, the trigger fires, and the orders row is
> finally deleted.
>
> The "T2 Gap" is small, but there are several ways to execute user code
> inside that window, such as RETURNING clauses, volatile functions, or
> user-defined AFTER ROW triggers.

I've been exploring ways to detect whether a query is executing within
the RI trigger gap.  It seems one promising approach is to leverage
the lock manager.

Every DML statement acquires RowExclusiveLock on its target table
before modifying any rows, so if the trigger gap is active, the lock
is guaranteed to be held.  We can verify this during planning by
calling CheckRelationOidLockedByMe(), which is a quite efficient
check.  (As a point of precedent, the planner already relies on the
local lock manager's state in several existing code paths.)

For prepared statements and cached plans, a generic plan that was
built with FK join removal could be reused in a context where the
trigger gap is active.  To handle this, we can modify
choose_custom_plan() to check whether any relation involved in an
FK-based join removal currently holds RowExclusiveLock, and choose
custom plan if so.

The trade-off is false positives: because RowExclusiveLock persists
for the entire transaction while the trigger gap is intra-statement,
the optimization is also skipped after the DML completes but within
the same transaction, after ROLLBACK TO a savepoint, or when
RowExclusiveLock is held for other reasons (e.g., LOCK TABLE).  These
seem like uncommon cases, and the query simply falls back to executing
the join normally as it would without the optimization at all, so I
think this is an acceptable trade-off.

Please see the v2 patch for the implementation details.

I didn't find any mention of this approach in the 2014 thread.  I'd
appreciate any thoughts or feedback on this direction.

- Richard

Вложения

Re: Remove inner joins based on foreign keys

От
Richard Guo
Дата:
On Wed, Apr 1, 2026 at 6:45 PM Richard Guo <guofenglinux@gmail.com> wrote:
> Please see the v2 patch for the implementation details.
>
> I didn't find any mention of this approach in the 2014 thread.  I'd
> appreciate any thoughts or feedback on this direction.

Here is a new rebase over 20efbdffe.  It also tightens up some checks
in inner_join_is_removable().

I've reconsidered the trigger-gap issue and I think I now have a
cleaner understanding.  A snapshot captured inside the trigger-gap
window can outlive the window's closure -- for example via a STABLE
function that inherits an older snapshot from its caller, as Tom
pointed out back in 2015 [1].  A query executed against such a stale
snapshot would still observe the inconsistency even after the trigger
queue has drained.  So the predicate guarding the optimization has to
remain positive at least as long as any in-transaction snapshot could
still be referenced.

I went through several alternatives (a per-statement counter
incremented in ExecInitModifyTable, AfterTriggerPendingOnRel, etc.)
and convinced myself that all of them go silent before the relevant
snapshots are gone.  The lock-based predicate is still the best
correct approach I can think of: RowExclusiveLock is released only at
end of transaction, which trivially exceeds the lifetime of any
in-transaction snapshot.  By the time the lock is released, every
in-transaction snapshot has been released, so no stale gap-window
snapshot can still be referenced.  Maybe the false positives are just
the price we need to pay for that lifetime guarantee.

[1] https://postgr.es/m/32139.1427667410@sss.pgh.pa.us

- Richard

Вложения

Re: Remove inner joins based on foreign keys

От
Tender Wang
Дата:
Hi, Richard,

Richard Guo <guofenglinux@gmail.com> 于2026年4月28日周二 18:09写道:
>
> On Wed, Apr 1, 2026 at 6:45 PM Richard Guo <guofenglinux@gmail.com> wrote:
> > Please see the v2 patch for the implementation details.

Thanks for this optimization. I look through this patch, and I have
two questions about this implementation.
This is my test case:
CREATE TABLE users  (id int primary key, name text);
CREATE TABLE orders (id int primary key, user_id int references
users(id), amount int);
create table nation (id int, name text);

postgres=# explain select n.* from nation n  join (orders o join users
u on o.user_id = u.id) on n.id = o.id;
                               QUERY PLAN
------------------------------------------------------------------------
 Hash Join  (cost=55.78..81.82 rows=1264 width=36)
   Hash Cond: (n.id = o.id)
   ->  Seq Scan on nation n  (cost=0.00..22.70 rows=1270 width=36)
   ->  Hash  (cost=30.40..30.40 rows=2030 width=8)
         ->  Seq Scan on orders o  (cost=0.00..30.40 rows=2030 width=8)
               Filter: (user_id IS NOT NULL)
(6 rows)

The above plan is what we want. The inner join between orders and
users is removed thanks to the FK constraint.
But if I change the most left-side inner join to an outer join, the
plan doesn't seem to be what we want. Maybe I'm wrong.
postgres=# explain select n.* from nation n left join (orders o join
users u on o.user_id = u.id) on n.id = o.id;
                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Hash Left Join  (cost=99.85..140.01 rows=1270 width=36)
   Hash Cond: (n.id = o.id)
   ->  Seq Scan on nation n  (cost=0.00..22.70 rows=1270 width=36)
   ->  Hash  (cost=74.35..74.35 rows=2040 width=4)
         ->  Hash Join  (cost=38.58..74.35 rows=2040 width=4)
               Hash Cond: (o.user_id = u.id)
               ->  Seq Scan on orders o  (cost=0.00..30.40 rows=2040 width=8)
               ->  Hash  (cost=22.70..22.70 rows=1270 width=4)
                     ->  Seq Scan on users u  (cost=0.00..22.70
rows=1270 width=4)
(9 rows)

The inner join between orders and users is not removed. It appears
this can be removed in this query. This is my first question.

The second question may not be related to this $SUBJECT.
I did some research about why the inner join cannot be removed in the
second query.
In the inner_join_is_removable(), we have the following check:
...
/*
     * If the referenced relation has any restriction clauses or non-equality
     * join clauses, they act as explicit filters.  Since we cannot perform
     * variable substitution to rewrite these clauses, we must abort.
     */
    if (ref_rel->baserestrictinfo || ref_rel->joininfo)
        return false;
...
The ref_rel is the users relation, and ref_rel->baserestrictinfo is
NIL, but the ref_rel->joininfo is not NIL.
In this query, the ref_rel->joininfo is (n.id = o.id), because it is a
left join, so n.id may not equal o.id.
So we have  "non-equality join clauses" in the above comments.
My question is that the (n.id = o.id) clause seems not related to
users. Why does the planner add it to the users' RelOptInfo's
joininfo?

I only know that the thing happens in the
reconsider_outer_join_clauses(), the restrictinfo->required_relids
contains {1,2,3}, where the rtindex = 3 is the users relation.



--
Thanks,
Tender Wang



Re: Remove inner joins based on foreign keys

От
Richard Guo
Дата:
On Thu, Apr 30, 2026 at 4:50 PM Tender Wang <tndrwang@gmail.com> wrote:
> The inner join between orders and users is not removed. It appears
> this can be removed in this query. This is my first question.

Right.  This is a mis-optimization case I was aware of.  The inner
join can be removed in this case.  inner_join_is_removable() is
currently overly conservative about ref_rel->joininfo: it bails as
long as the list is non-empty.  We can relax this to bail only when a
clause actually references ref_rel via clause_relids.  A clause can
appear in ref_rel->joininfo without its expression referencing
ref_rel, which connects to your second question.

> The second question may not be related to this $SUBJECT.
> My question is that the (n.id = o.id) clause seems not related to
> users. Why does the planner add it to the users' RelOptInfo's
> joininfo?
>
> I only know that the thing happens in the
> reconsider_outer_join_clauses(), the restrictinfo->required_relids
> contains {1,2,3}, where the rtindex = 3 is the users relation.

Because the clause "n.id = o.id" is an outer join ON clause that is
non-degenerate (one that references the non-nullable side of the join)
and we need to force it to be evaluated exactly at the level of the
outer join.  To handle that, we add the join's minimum input relid set
to required_relids.  That's why it appears in users' joininfo without
referencing users.

Attached patch relaxes the check against ref_rel->joininfo.  Nothing
else has changed.

- Richard

Вложения

Re: Remove inner joins based on foreign keys

От
Tender Wang
Дата:
Richard Guo <guofenglinux@gmail.com> 于2026年5月1日周五 16:25写道:
> Because the clause "n.id = o.id" is an outer join ON clause that is
> non-degenerate (one that references the non-nullable side of the join)
> and we need to force it to be evaluated exactly at the level of the
> outer join.  To handle that, we add the join's minimum input relid set
> to required_relids.  That's why it appears in users' joininfo without
> referencing users.

Thanks for the explanation.

>
> Attached patch relaxes the check against ref_rel->joininfo.  Nothing
> else has changed.

CREATE TABLE users  (id int primary key, name text);
CREATE TABLE orders (id int primary key, user_id int not null
references users(id), amount int);
CREATE TABLE nation (id int primary key, name text);
postgres=# explain select  n.* from nation n left join (orders o join
users u on o.user_id = u.id) on n.id = o.id;
                               QUERY PLAN
-------------------------------------------------------------------------
 Hash Right Join  (cost=38.58..74.34 rows=1270 width=36)
   Hash Cond: (o.id = n.id)
   ->  Seq Scan on orders o  (cost=0.00..30.40 rows=2040 width=8)
   ->  Hash  (cost=22.70..22.70 rows=1270 width=36)
         ->  Seq Scan on nation n  (cost=0.00..22.70 rows=1270 width=36)
(5 rows)

postgres=# explain select n.* from nation n left join orders o on n.id = o.id;
                         QUERY PLAN
-------------------------------------------------------------
 Seq Scan on nation n  (cost=0.00..22.70 rows=1270 width=36)
(1 row)

Recently, I encountered the two plans above.  The first plan can
continue to transform into the second plan after inner-join removing.
But in current logic, we cannot do this. Because we do left-join
removable first. Then we do other join(semi-join,
self-join,inner-join) removable.
We can only remove outer-join, whose min-righthand is single.
Maybe we can call remove_useless_joins() again to remove the outer
join that the function couldn't remove in the first call.
I'm not sure it is worth doing this.
Any thoughts?



Re: Remove inner joins based on foreign keys

От
Alexandra Wang
Дата:
Hi Richard,

On Fri, May 1, 2026 at 1:25 AM Richard Guo <guofenglinux@gmail.com> wrote:
> Attached patch relaxes the check against ref_rel->joininfo.  Nothing
> else has changed.

Thanks for the patch! I've tested and reviewed v4. Overall, this looks
good to me. Please find individual feedback below.

On Wed, Apr 1, 2026 at 2:46 AM Richard Guo <guofenglinux@gmail.com> wrote:
> On Mon, Mar 23, 2026 at 3:12 PM Richard Guo <guofenglinux@gmail.com> wrote:
> > Just for the sake of archives, the timeline of the trap during a
> > cascading delete looks like this:
> >
> > T0: DELETE FROM users WHERE id = 1;
> >
> > T1: The executor finds users row 1 and sets its xmax, physically
> > marking it as dead.
> >
> > T2: [The Gap] The executor pushes the RI trigger into a queue to deal
> > with orders later.  Right now, the orders row still exists, but its
> > referenced row in users is dead.
> >
> > T3: The statement finishes, the trigger fires, and the orders row is
> > finally deleted.
> >
> > The "T2 Gap" is small, but there are several ways to execute user code
> > inside that window, such as RETURNING clauses, volatile functions, or
> > user-defined AFTER ROW triggers.
>
> I've been exploring ways to detect whether a query is executing within
> the RI trigger gap.  It seems one promising approach is to leverage
> the lock manager.
>
> Every DML statement acquires RowExclusiveLock on its target table
> before modifying any rows, so if the trigger gap is active, the lock
> is guaranteed to be held.  We can verify this during planning by
> calling CheckRelationOidLockedByMe(), which is a quite efficient
> check.  (As a point of precedent, the planner already relies on the
> local lock manager's state in several existing code paths.)
>
> For prepared statements and cached plans, a generic plan that was
> built with FK join removal could be reused in a context where the
> trigger gap is active.  To handle this, we can modify
> choose_custom_plan() to check whether any relation involved in an
> FK-based join removal currently holds RowExclusiveLock, and choose
> custom plan if so.
>
> The trade-off is false positives: because RowExclusiveLock persists
> for the entire transaction while the trigger gap is intra-statement,
> the optimization is also skipped after the DML completes but within
> the same transaction, after ROLLBACK TO a savepoint, or when
> RowExclusiveLock is held for other reasons (e.g., LOCK TABLE).  These
> seem like uncommon cases, and the query simply falls back to executing
> the join normally as it would without the optimization at all, so I
> think this is an acceptable trade-off.

On Tue, Apr 28, 2026 at 3:09 AM Richard Guo <guofenglinux@gmail.com> wrote:
> On Wed, Apr 1, 2026 at 6:45 PM Richard Guo <guofenglinux@gmail.com> wrote:
> > Please see the v2 patch for the implementation details.
> >
> > I didn't find any mention of this approach in the 2014 thread.  I'd
> > appreciate any thoughts or feedback on this direction.
>
> Here is a new rebase over 20efbdffe.  It also tightens up some checks
> in inner_join_is_removable().
>
> I've reconsidered the trigger-gap issue and I think I now have a
> cleaner understanding.  A snapshot captured inside the trigger-gap
> window can outlive the window's closure -- for example via a STABLE
> function that inherits an older snapshot from its caller, as Tom
> pointed out back in 2015 [1].  A query executed against such a stale
> snapshot would still observe the inconsistency even after the trigger
> queue has drained.  So the predicate guarding the optimization has to
> remain positive at least as long as any in-transaction snapshot could
> still be referenced.
>
> I went through several alternatives (a per-statement counter
> incremented in ExecInitModifyTable, AfterTriggerPendingOnRel, etc.)
> and convinced myself that all of them go silent before the relevant
> snapshots are gone.  The lock-based predicate is still the best
> correct approach I can think of: RowExclusiveLock is released only at
> end of transaction, which trivially exceeds the lifetime of any
> in-transaction snapshot.  By the time the lock is released, every
> in-transaction snapshot has been released, so no stale gap-window
> snapshot can still be referenced.  Maybe the false positives are just
> the price we need to pay for that lifetime guarantee. 

Checking RowExclusiveLock makes sense to me.

One nitpick on wording: the RowExclusiveLock acquired at the beginning
of the DML command can be released early by ROLLBACK TO a savepoint
established before that command, so it doesn't strictly "exceed the
lifetime of any in-transaction snapshot." In practice this doesn't
matter, though, because the lock is guaranteed to outlive any snapshot
taken during the trigger gap of the DML command, which is the only
window where the FK invariant can be violated.

IIUC, as long as we ensure that the snapshot of the join statement --
whether it is an after-row trigger, RETURNING clause, cursor, etc. --
is taken outside of the trigger gap, which IS guarded by
RowExclusiveLock, removing the join from its plan should be safe.

So, setting aside the false positives you mentioned, which I think are
acceptable, I believe this approach is correct.

I then tried to find edge cases similar to the 2014 threads and found
one bug with TABLESAMPLE:

-- If the patch removes the join when the referenced table uses TABLESAMPLE,
-- it will return MORE rows than correct (all child rows instead of only
-- those whose parent appeared in the sample).

CREATE TABLE exp_parent (id int PRIMARY KEY, data text);
CREATE TABLE exp_child (
    id int PRIMARY KEY,
    parent_id int NOT NULL REFERENCES exp_parent(id)
);

INSERT INTO exp_parent SELECT g, 'row_' || g FROM generate_series(1, 100) g;
INSERT INTO exp_child SELECT g, g FROM generate_series(1, 100) g;
ANALYZE exp_parent;
ANALYZE exp_child;

-- This join should NOT be removed because TABLESAMPLE limits the parent rows
EXPLAIN (COSTS OFF)
SELECT c.id
FROM exp_child c
INNER JOIN exp_parent p TABLESAMPLE BERNOULLI(1) REPEATABLE(42) ON c.parent_id = p.id;

-- Verify: if join removal fires, this would return ~100 rows.
-- With the join, it should return ~1 row (1% sample).
SELECT count(*)
FROM exp_child c
INNER JOIN exp_parent p TABLESAMPLE BERNOULLI(1) REPEATABLE(42) ON c.parent_id = p.id;

We should add a condition for TABLESAMPLE in inner_join_is_removable()
to disallow join removal.

Minor things I think might be worth mentioning, but I don't have
strong opinions:

1. Self-join removal has a GUC: enable_self_join_elimination. I didn't
investigate why that was needed, and I don't know in what cases people
would want to turn it off, maybe for debugging purposes? Do you think
we need a similar GUC for inner joins? Disclaimer: I'm not a fan of
unnecessary GUCs, just putting the question out there.

2. Nitpick: should we consider renaming remove_useless_joins() to
remove_useless_left_joins() to reflect what it actually does?

Best,
Alex

--
Alexandra Wang