Обсуждение: Question about nested loops..

Поиск
Список
Период
Сортировка

Question about nested loops..

От
Frits Jalvingh
Дата:
Dear list,

We have databases with rather large tables, and we do bulk queries on them, i.e. those queries always have a large result set.
We execute those queries with enable_nestloop set to false, because having nested loops in those queries always cause large performance issues. This usually leads to perfectly acceptable execution plans which use table scans, hash joins and merge joins. We have work_mem set to 64MB to 512MB (depending on the actual database size) to help with these.

I have a query that does not want to work at all without a nested loop and a materialize, while a trivial (but for me strange) change will allow it to make a good plan, and I'm wondering why that is.
The below examples are stripped versions of the real queries (which have a lot more filter conditions) but they exhibit the issue. I tested them on both Postgres 16 and 18, on Ubuntu 25.04, with the same results.

The slow query is this one:

select
    eenheid.id_s as id_s
,   eenheid.identificatie as id
,   exists(
    select
        1
    from datavault.l_cluster_eenheid_ssm
    inner join datavault.s_h_huurovereenkomst_ssm huurovereenkomst_s
        on eenheid.id_h_eenheid = huurovereenkomst_s._l_eenheid
    where eenheid.id_h_eenheid = l_cluster_eenheid_ssm.id_h_eenheid
    and l_cluster_eenheid_ssm.dv_start_dts <= timestamp '2025-10-08 00:00:00'
    and l_cluster_eenheid_ssm.dv_end_dts > timestamp '2025-10-08 00:00:00'
) as qd001_e
from datavault.s_h_eenheid_ssm eenheid

It produces the following execution plan:
Seq Scan on s_h_eenheid_ssm eenheid  (cost=0.00..172110004562434.50 rows=17211 width=21)
  SubPlan 1
    ->  Nested Loop  (cost=10000000000.00..10000007950.48 rows=30 width=0)
          ->  Seq Scan on l_cluster_eenheid_ssm  (cost=0.00..6768.46 rows=15 width=0)
                Filter: ((dv_start_dts <= '2025-10-08 00:00:00'::timestamp without time zone) AND (dv_end_dts > '2025-10-08 00:00:00'::timestamp without time zone) AND (id_h_eenheid = eenheid.id_h_eenheid))
          ->  Materialize  (cost=0.00..1181.65 rows=2 width=0)
                ->  Seq Scan on s_h_huurovereenkomst_ssm huurovereenkomst_s  (cost=0.00..1181.64 rows=2 width=0)
                      Filter: (_l_eenheid = eenheid.id_h_eenheid)

I can make the query very fast by adding a self-join to the root table (s_h_eenheid_ssm) inside the subquery:

select
    eenheid.id_s as id_s
,   eenheid.identificatie as id
,   exists(
    select
        1
    from datavault.s_h_eenheid_ssm eenheid_s
    inner join datavault.l_cluster_eenheid_ssm l_cluster_eenheid_ssm
        on eenheid_s.id_h_eenheid = l_cluster_eenheid_ssm.id_h_eenheid
    inner join datavault.s_h_huurovereenkomst_ssm huurovereenkomst_s
        on eenheid_s.id_h_eenheid = huurovereenkomst_s._l_eenheid
    where eenheid.id_h_eenheid = eenheid_s.id_h_eenheid
    and l_cluster_eenheid_ssm.dv_start_dts <= timestamp '2025-10-08 00:00:00'
    and l_cluster_eenheid_ssm.dv_end_dts > timestamp '2025-10-08 00:00:00'
) as DQ87_e
from datavault.s_h_eenheid_ssm eenheid

which produces a way better plan:

QUERY PLAN
Seq Scan on s_h_eenheid_ssm eenheid  (cost=0.00..344220005300951.31 rows=17211 width=21)
  SubPlan 2
    ->  Hash Join  (cost=3244.40..14039.76 rows=371258 width=4)
          Hash Cond: (l_cluster_eenheid_ssm.id_h_eenheid = eenheid_s.id_h_eenheid)
          ->  Seq Scan on l_cluster_eenheid_ssm  (cost=0.00..6139.82 rows=251455 width=4)
                Filter: ((dv_start_dts <= '2025-10-08 00:00:00'::timestamp without time zone) AND (dv_end_dts > '2025-10-08 00:00:00'::timestamp without time zone))
          ->  Hash  (cost=2926.76..2926.76 rows=25411 width=8)
                ->  Hash Join  (cost=1459.25..2926.76 rows=25411 width=8)
                      Hash Cond: (huurovereenkomst_s._l_eenheid = eenheid_s.id_h_eenheid)
                      ->  Seq Scan on s_h_huurovereenkomst_ssm huurovereenkomst_s  (cost=0.00..1118.11 rows=25411 width=4)
                      ->  Hash  (cost=1244.11..1244.11 rows=17211 width=4)
                            ->  Seq Scan on s_h_eenheid_ssm eenheid_s  (cost=0.00..1244.11 rows=17211 width=4)

The difference in execution speed is large; the slow query takes about 30 seconds on a test system while the fast one is under a second.

I do not understand why the simpler query (without the self join) produces a plan that seems to require nested loops, I hope someone can explain.

Re: Question about nested loops..

От
Andrei Lepikhov
Дата:
On 9/10/2025 10:52, Frits Jalvingh wrote:
> I do not understand why the simpler query (without the self join) 
> produces a plan that seems to require nested loops, I hope someone can 
> explain.
It seems obvious. You have a join clause:
'enheid.id_h_eenheid = huurovereenkomst_s._l_eenheid'

One side of this clause fits the underlying relation, but another one 
does not (references the external relation).
According to the HashJoin rules, the hash join clause is quite strict: 
each side should depend on only the left or right side of the join. So, 
it is just impossible here.
What can be done here without rewriting the query? This subject requires 
investigation to determine if the outer part of the hash clause can be 
parameterised or not.

-- regards, Andrei Lepikhov
pgEdge