Discussion: Add rows removed by hash join clause to instrumentation


Add rows removed by hash join clause to instrumentation

From:
Andrei Lepikhov
Date:
Hi,

Playing around with [1] to understand how, in practice, an engineer should 
identify potential indexes, I found that nfiltered1 and nfiltered2 are 
sufficient to detect issues with leaf table scan operators. But the 
situation is worse when it comes to joins.

The idea behind JOIN optimisation is that sometimes a highly selective, 
parameterised NestLoop is more performant than a HashJoin. What we need 
is to identify that only a tiny part of the hash table or a sorted 
MergeJoin input has been used to produce the JOIN result.

Thanks to [2 - 5], we have metrics showing how many tuples are removed 
by joinqual and otherquals in a JOIN operator. That’s good for starters. 
But some cases aren’t covered yet: how many tuples are filtered out by 
hashclauses or mergeclauses.

In the attached file, you can see that for the same query, NestLoop 
exposes 100k filtered tuples, but HashJoin shows nothing. The original 
threads argued that ‘Filtered out’ characteristics should expose extra 
work done by the operator. The hashing operation, especially on a set of 
variable-length columns, is sometimes quite expensive. Further filtering 
involving hashclauses looks quite similar to joinqual filtering.

To me, the ‘filtered’ value represents a flag that some massive part of 
the input is not needed at all, and that with proper parameterisation and 
indexing we could optimise such a JOIN with a NestLoop or MergeJoin.

From this point of view, it seems logical to add an nfiltered3 
instrumentation field and account for rows filtered out by a ‘special’ 
join clause like hashclauses or mergeclauses.

In the attachment, I propose a sketch of how to calculate these metrics. 
MergeJoin looks more complicated, so I don't propose it for now, but 
HashJoin is quite trivial.

Known issues:
- Hash clause failures are counted in nfiltered1, which is shared with
   join filter removals. If both are present, counts are combined.
- The metric only counts outer tuples with zero hash-value matches,
   not hash collisions within buckets.
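To make the counting rule concrete, here is a toy Python model of a hash 
join that tracks the two counters separately. All names and data below are 
made up for illustration; this is neither the patch nor PostgreSQL code:

```python
# Toy model of the proposed counter: a hash join over (key, payload)
# rows that counts outer tuples whose hash key finds no match in the
# hash table ("removed by hash matching"), separately from tuples
# removed by an extra join filter. Illustrative only.
from collections import defaultdict

def hash_join(outer, inner, join_filter=lambda o, i: True):
    table = defaultdict(list)
    for row in inner:                 # build phase: hash inner rows on key
        table[row[0]].append(row)

    joined = []
    removed_by_hash = 0               # analogue of the proposed metric
    removed_by_filter = 0             # analogue of 'Rows Removed by Join Filter'
    for orow in outer:                # probe phase
        matches = table.get(orow[0])
        if not matches:               # zero hash-value matches for this key
            removed_by_hash += 1
            continue
        matched = False
        for irow in matches:
            if join_filter(orow, irow):
                joined.append((orow, irow))
                matched = True
        if not matched:               # hash key matched, filter removed all
            removed_by_filter += 1
    return joined, removed_by_hash, removed_by_filter

outer = [(1, 'a'), (2, 'b'), (3, 'c'), (5, 'e')]
inner = [(1, 'x'), (3, 'z'), (5, 'y')]
res, n_hash, n_filter = hash_join(outer, inner,
                                  join_filter=lambda o, i: i[1] != 'y')
# key 2 never matches by hash; key 5 matches but is removed by the filter
```

The first counter is the quantity the proposal wants to expose; the second 
corresponds to the existing ‘Rows Removed by Join Filter’. Note that, as in 
the known issues above, this model counts only outer tuples with zero 
hash-value matches, not collisions within a bucket.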

Thoughts?

[1] Proposal: Add rows_filtered column to pg_stat_statements for index 
opportunity detection 
https://www.postgresql.org/message-id/CAM527d-r%2BRsaAeYsyAPmYtnmWB3rJFJtixUq4bnJW59nN%3DZo3w%40mail.gmail.com

[2] RFD: Discarded tuple count for SeqScan nodes in EXPLAIN ANALYZE 
https://www.postgresql.org/message-id/flat/4A16A8AF.2080508@anarazel.de

[3] EXPLAIN and nfiltered 
https://www.postgresql.org/message-id/4CE54A13.7090609@cs.helsinki.fi

[4] Re: REVIEW: EXPLAIN and nfiltered
https://www.postgresql.org/message-id/9053.1295888538%40sss.pgh.pa.us

[5] EXPLAIN and nfiltered, take two
https://www.postgresql.org/message-id/flat/4E68B108.1090907%402ndquadrant.com

-- 
regards, Andrei Lepikhov,
pgEdge

Attachments

Re: Add rows removed by hash join clause to instrumentation

From:
Andrei Lepikhov
Date:
On 17/1/26 00:51, Andrei Lepikhov wrote:
> Thoughts?
> 
In this message, I want to demonstrate how to use the proposed feature.

Install PostgreSQL 18 and the Join-Order-Benchmark [1]. We need just one 
pass, executing each query one by one. To identify issues, I use the 
pg_track_optimizer extension [2] to examine the ‘bad join’ criterion. 
For each JOIN node, the following is calculated:

jf_max = (nfiltered1 + nfiltered2 + nfiltered3) / nloops
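For illustration, the criterion can be computed from the per-node counters 
like this (a toy Python sketch with invented counter values, not the 
extension's code):

```python
# Toy computation of the 'bad join' criterion from per-node
# instrumentation counters; all counter values are invented.
def jf_max(nodes):
    # nodes: list of (nfiltered1, nfiltered2, nfiltered3, nloops) tuples
    return max((n1 + n2 + n3) / loops for n1, n2, n3, loops in nodes)

nodes = [
    (1000,   0,      0, 10),  # join filter removes 100 rows per loop
    (   0, 500, 250000,  1),  # hash clauses remove 250k rows in one loop
]
worst = jf_max(nodes)         # 250500.0 flags the second join as 'bad'
```

Dividing by nloops averages the filtered rows over executions, so a join 
inside a deeply repeated NestLoop does not dominate the metric just by 
being executed many times.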

The extension stores MAX(jf_max) for each query to detect potentially 
optimisable ones. So, let’s benchmark both with and without the proposed 
change. The TOP-5 bad queries expose quite a different picture:

SELECT queryid, LEFT(query, 12) AS query, floor(jf_max) AS jf_max
FROM pg_track_optimizer WHERE jf_max > 0
ORDER BY jf_max DESC LIMIT 5;

Without the patch:

       queryid       |    query     | jf_max
---------------------+--------------+--------
 4335842597099666660 | /* 27c.sql * | 490348

With patched Postgres and the extension:

       queryid        |    query     | jf_max
----------------------+--------------+--------
 -3135863217603061370 | /* 32a.sql * | 871591
  6389747766960672879 | /* 27c.sql * | 491129
 -6029921280260605067 | /* 6a.sql */ | 218912
  5025464644910963332 | /* 25b.sql * |  25753
  1798119524484989875 | /* 10c.sql * |  22939

In the patched variant, we see more potential cases. Let’s take a look 
at the first three of them.

Query 27c.sql (see analysis-27c.sql in the attachment) is detected as 
filtering too much in both cases because, as it happens, the intensive 
filtering occurs in a NestLoop node. Index identification is a separate 
topic, but here I used a quite simple approach: I just asked Claude to 
collect join clauses that filter out many rows and propose potential 
indexes. It is not production-ready, but it worked with such a simple and 
limited set of queries.

So, without the patch, we have extra indexes:

CREATE INDEX ON movie_info(movie_id);
CREATE INDEX ON movie_companies(movie_id);

The patched version finds one more problematic join, and the recommendation 
changes a little:

CREATE INDEX ind_3 ON movie_companies(movie_id,company_id,company_type_id);

After applying these indexes, the query speeds up from 3340 ms to 32 ms (x100).

Now 32a.sql, undetected without the patch, exposes that we can employ the 
following indexes:

CREATE INDEX ON movie_keyword (keyword_id,movie_id);
CREATE INDEX ON movie_link(movie_id);

It speeds up the query: 250 ms → 16 ms (x15).

And the last one, 6a.sql, needs two iterations:

The first one demonstrates that we need the index:
CREATE INDEX ON cast_info(movie_id,person_id);

An additional pass shows we can reduce the number of fetched tuples with 
the following index:
CREATE INDEX ON movie_keyword(keyword_id,movie_id);

Finally, that speeds up the query: 150-220 ms → 26 ms (x7).

As you can see, the main idea is to draw attention to specific queries 
and, with a specific methodology, propose indexes that may reduce the 
amount of initially fetched data.

[1] https://github.com/danolivo/jo-bench
[2] https://github.com/danolivo/pg_track_optimizer/tree/nfiltered3

-- 
regards, Andrei Lepikhov,
pgEdge
Attachments

Re: Add rows removed by hash join clause to instrumentation

From:
Alena Rybakina
Date:
Hi Andrei,

On 17.01.2026 02:51, Andrei Lepikhov wrote:
> Hi,
>
> Playing around with [1] to understand how, in practice, an engineer should 
> identify potential indexes, I found that nfiltered1 and nfiltered2 are 
> sufficient to detect issues with leaf table scan operators. But the 
> situation is worse when it comes to joins.
>
> The idea behind JOIN optimisation is that sometimes a highly 
> selective, parameterised NestLoop is more performant than a HashJoin. 
> What we need is to identify that only a tiny part of the hash table or 
> a sorted MergeJoin input has been used to produce the JOIN result.
>
> Thanks to [2 - 5], we have metrics showing how many tuples are removed 
> by joinqual and otherquals in a JOIN operator. That’s good for 
> starters. But some cases aren’t covered yet: how many tuples are 
> filtered out by hashclauses or mergeclauses.
>
> In the attached file, you can see that for the same query, NestLoop 
> exposes 100k filtered tuples, but HashJoin shows nothing. The original 
> threads argued that ‘Filtered out’ characteristics should expose extra 
> work done by the operator. The hashing operation, especially on a set of 
> variable-length columns, is sometimes quite expensive. Further filtering 
> involving hashclauses looks quite similar to joinqual filtering.
>
> To me, the ‘filtered’ value represents a flag that some massive part of 
> the input is not needed at all, and that with proper parameterisation 
> and indexing we could optimise such a JOIN with a NestLoop or MergeJoin.
>
> From this point of view, it seems logical to add an nfiltered3 
> instrumentation field and account for rows filtered out by a ‘special’ 
> join clause like hashclauses or mergeclauses.
Thanks for the detailed write-up and examples - I totally agree that the 
proposed metric makes sense, especially for HashJoin, where a large 
amount of work is currently invisible in EXPLAIN ANALYZE and can create 
a misleading impression of effectiveness.
>
> In the attachment, I propose a sketch on how to calculate these 
> metrics. MergeJoin looks more complicated and I don't propose it for 
> now, but HashJoin is quite trivial.
>
> Known issues:
> - Hash clause failures are counted in nfiltered1, which is shared with
>   join filter removals. If both are present, counts are combined.
> - The metric only counts outer tuples with zero hash-value matches,
>   not hash collisions within buckets.
I’ve looked through the patch and didn’t spot any obvious issues. 
Instrumentation of this kind is not something we usually test in 
regression tests, but I’ve added a small patch with some test coverage 
and attached it here. For now, I don't have any more suggestions.

Thanks for working on this.


Best regards,
Alena Rybakina
Attachments

Re: Add rows removed by hash join clause to instrumentation

From:
Andrei Lepikhov
Date:
On 19/1/26 20:33, Alena Rybakina wrote:
> I’ve looked through the patch and didn’t spot any obvious issues. 
> Instrumentation of this kind is not something we usually test in 
> regression tests, but I’ve added a small patch with some test coverage 
> and attached it here. For now, I don't have any more suggestions.

I doubt that 'Rows Removed by Hash Matching' has a platform-stable 
value, and it makes 'Rows Removed by Join Filter' unstable, too. So I'm 
not sure specific values can be used in regression tests. Am I wrong?

-- 
regards, Andrei Lepikhov,
pgEdge