Обсуждение: pg_trgm upgrade to 1.6 led to load average increase

Поиск
Список
Период
Сортировка

pg_trgm upgrade to 1.6 led to load average increase

От
Nicolas Seinlet
Дата:
Hello,

we've upgraded the pg_trgm extension from 1.0 to 1.6 on our production database, while sticking our postgresql cluster version to 16. This led to an increase in the load average of the server (twice the load average on our use case). After investigation, we found our issue was linked to :
https://github.com/postgres/postgres/commit/935f6666502250abde8615bc7805a6e5aa05a066

We issue queries like :
SELECT model, res_id FROM ir_model_data WHERE module='base' AND name='public_user';

With 1.0 extension, the query is planned with a matching btree index:
"ir_model_data_module_name_uniq_index" UNIQUE, btree (module, name)

With 1.6 extension, the query is planned with a gist index:
"ir_model_data_name_idx2" gist (name gist_trgm_ops)

1.0 extension executes the query in 0.1ms, while 1.6 in 100ms

Our solution was to revert to pg_trgm 1.5, so remove operation 11 from gist_trgm_ops. After the removal, the load average was back to normal. 

Is there another way of preventing PostgreSQL to use the gist index when a btree exactly match the condition? Is it viable to stick with the extension in 1.6, but with the operation 11 removed from gist_trgm_ops?

PostgreSQL 16 contains https://github.com/postgres/postgres/commit/cd9479af2af25d7fa9bfd24dd4dcf976b360f077 , but is this applicable to gist?

Thanks in advance,

Nicolas


Re: pg_trgm upgrade to 1.6 led to load average increase

От
Adrian Klaver
Дата:
On 1/20/26 00:50, Nicolas Seinlet wrote:
> Hello,
> 
> we've upgraded the pg_trgm extension from 1.0 to 1.6 on our production 
> database, while sticking our postgresql cluster version to 16. This led 
> to an increase in the load average of the server (twice the load average 
> on our use case). After investigation, we found our issue was linked to :
> https://github.com/postgres/postgres/ 
> commit/935f6666502250abde8615bc7805a6e5aa05a066 <https://github.com/ 
> postgres/postgres/commit/935f6666502250abde8615bc7805a6e5aa05a066>
> 
> We issue queries like :
> SELECT model, res_id FROM ir_model_data WHERE module='base' AND 
> name='public_user';


You need to provide the table definition, including indexes.

> Thanks in advance,
> 
> Nicolas
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: pg_trgm upgrade to 1.6 led to load average increase

От
Nicolas Seinlet
Дата:
On Tuesday, January 20th, 2026 at 17:34, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

>
>
> On 1/20/26 00:50, Nicolas Seinlet wrote:
>
> > Hello,
> >
> > we've upgraded the pg_trgm extension from 1.0 to 1.6 on our production
> > database, while sticking our postgresql cluster version to 16. This led
> > to an increase in the load average of the server (twice the load average
> > on our use case). After investigation, we found our issue was linked to :
> > https://github.com/postgres/postgres/
> > commit/935f6666502250abde8615bc7805a6e5aa05a066 <https://github.com/
> > postgres/postgres/commit/935f6666502250abde8615bc7805a6e5aa05a066>
> >
> > We issue queries like :
> > SELECT model, res_id FROM ir_model_data WHERE module='base' AND
> > name='public_user';
>
>
>
> You need to provide the table definition, including indexes.
>

Hello,

here is the table definition:

                                         Table "public.ir_model_data"
   Column    |            Type             | Collation | Nullable |                  Default
-------------+-----------------------------+-----------+----------+-------------------------------------------
 id          | integer                     |           | not null | nextval('ir_model_data_id_seq'::regclass)
 create_uid  | integer                     |           |          |
 create_date | timestamp without time zone |           |          | timezone('UTC'::text, now())
 write_date  | timestamp without time zone |           |          | timezone('UTC'::text, now())
 write_uid   | integer                     |           |          |
 noupdate    | boolean                     |           |          | false
 name        | character varying           |           | not null |
 module      | character varying           |           | not null |
 model       | character varying           |           | not null |
 res_id      | integer                     |           |          |
Indexes:
    "ir_model_data_pkey" PRIMARY KEY, btree (id)
    "ir_model_data_create_uid_idx" btree (create_uid) WHERE create_uid IS NOT NULL AND (create_uid <> ALL (ARRAY[1,
208196]))
    "ir_model_data_model_name_multilang_index" btree (name) WHERE model::text = ANY (ARRAY['account.account'::character
varying::text,'account.group'::character varying::text, 'account.tax'::character varying::text,
'account.fiscal.position'::charactervarying::text]) 
    "ir_model_data_model_res_id_index" btree (model, res_id)
    "ir_model_data_module_name_uniq_index" UNIQUE, btree (module, name)
    "ir_model_data_name_idx2" gist (name gist_trgm_ops)
    "ir_model_data_write_uid_idx" btree (write_uid) WHERE write_uid IS NOT NULL AND (write_uid <> ALL (ARRAY[1,
208196]))
Check constraints:
    "ir_model_data_name_nospaces" CHECK (name::text !~~ '% %'::text)

> > Thanks in advance,
> >
> > Nicolas
>
>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com





Re: pg_trgm upgrade to 1.6 led to load average increase

От
Erik Wienhold
Дата:
On 2026-01-20 09:50 +0100, Nicolas Seinlet wrote:
> we've upgraded the pg_trgm extension from 1.0 to 1.6 on our production
> database, while sticking our postgresql cluster version to 16. This
> led to an increase in the load average of the server (twice the load
> average on our use case). After investigation, we found our issue was
> linked to :
> https://github.com/postgres/postgres/commit/935f6666502250abde8615bc7805a6e5aa05a066
> 
> We issue queries like :
> SELECT model, res_id FROM ir_model_data WHERE module='base' AND name='public_user';
> 
> With 1.0 extension, the query is planned with a matching btree index:
> "ir_model_data_module_name_uniq_index" UNIQUE, btree (module, name)
> 
> With 1.6 extension, the query is planned with a gist index:
> "ir_model_data_name_idx2" gist (name gist_trgm_ops)
> 
> 1.0 extension executes the query in 0.1ms, while 1.6 in 100ms
> 
> Our solution was to revert to pg_trgm 1.5, so remove operation 11 from
> gist_trgm_ops. After the removal, the load average was back to normal.
> 
> Is there another way of preventing PostgreSQL to use the gist index
> when a btree exactly match the condition?

It's been suggested to move the GiST index to another tablespace
with increased random_page_cost. [1]

> Is ivt viable to stick with the extension in 1.6, but with the
> operation 11 removed from gist_trgm_ops?

This would be the same as sticking with 1.5 since the new equality
operator is the only change in 1.6.

[1] https://www.postgresql.org/message-id/CAApHDvp3W7G8Oo4=wjt0ceTbic35SHJ=qfoD_CnPXSnZVzCkhQ@mail.gmail.com

-- 
Erik Wienhold



Re: pg_trgm upgrade to 1.6 led to load average increase

От
Tom Lane
Дата:
Nicolas Seinlet <nicolas@seinlet.com> writes:
> We issue queries like :
> SELECT model, res_id FROM ir_model_data WHERE module='base' AND name='public_user';

> With 1.0 extension, the query is planned with a matching btree index:
> "ir_model_data_module_name_uniq_index" UNIQUE, btree (module, name)

> With 1.6 extension, the query is planned with a gist index:
> "ir_model_data_name_idx2" gist (name gist_trgm_ops)

> 1.0 extension executes the query in 0.1ms, while 1.6 in 100ms

It seems quite odd that the planner would prefer an index
using only one of the two clauses.  It must be concluding that
the gist indexscan will be cheaper, but it's hard to see why
that would be.

Could we see EXPLAIN ANALYZE output for those two cases?
Are the statistics for the table up-to-date?

            regards, tom lane



Re: pg_trgm upgrade to 1.6 led to load average increase

От
Nicolas Seinlet
Дата:
On Wednesday, January 21st, 2026 at 17:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:

>
>
> Nicolas Seinlet nicolas@seinlet.com writes:
>
> > We issue queries like :
> > SELECT model, res_id FROM ir_model_data WHERE module='base' AND name='public_user';
>
> > With 1.0 extension, the query is planned with a matching btree index:
> > "ir_model_data_module_name_uniq_index" UNIQUE, btree (module, name)
>
> > With 1.6 extension, the query is planned with a gist index:
> > "ir_model_data_name_idx2" gist (name gist_trgm_ops)
>
> > 1.0 extension executes the query in 0.1ms, while 1.6 in 100ms
>
>
> It seems quite odd that the planner would prefer an index
> using only one of the two clauses. It must be concluding that
> the gist indexscan will be cheaper, but it's hard to see why
> that would be.
>
> Could we see EXPLAIN ANALYZE output for those two cases?
> Are the statistics for the table up-to-date?
>
> regards, tom lane

That table is not that much updated, so we could expect the statistics are quite up-to-date, especially because our
analysetrigger parameters are quite aggressive. But, to ensure they are decently correct, I'll include a vacuum analyse
inmy test. 

With pg_trgm=1.5:
QUERY PLAN
-----------
 Index Scan using ir_model_data_module_name_uniq_index on ir_model_data  (cost=0.56..2.58 rows=1 width=17) (actual
time=0.027..0.028rows=1 loops=1) 
   Index Cond: (((module)::text = 'base'::text) AND ((name)::text = 'public_user'::text))
 Planning Time: 0.093 ms
 Execution Time: 0.050 ms
(4 rows)

With pg_trgm=1.6:
QUERY PLAN
----------
 Index Scan using ir_model_data_name_idx2 on ir_model_data  (cost=0.42..2.44 rows=1 width=17) (actual
time=8403.936..9847.983rows=1 loops=1) 
   Index Cond: ((name)::text = 'public_user'::text)
   Rows Removed by Index Recheck: 10
   Filter: ((module)::text = 'base'::text)
 Planning Time: 1.564 ms
 Execution Time: 9848.027 ms
(6 rows)

For the tests, the random_page_cost=1.

Let's try with random_page_cost=2
QUERY PLAN
----------
 Index Scan using ir_model_data_name_idx2 on ir_model_data  (cost=0.42..4.44 rows=1 width=17) (actual
time=106.136..191.606rows=1 loops=1) 
   Index Cond: ((name)::text = 'public_user'::text)
   Rows Removed by Index Recheck: 10
   Filter: ((module)::text = 'base'::text)
 Planning Time: 0.096 ms
 Execution Time: 191.623 ms
(6 rows)

And finally, after a vacuum analyse and set random_page_cost to 1:
QUERY PLAN
-----------
 Index Scan using ir_model_data_name_idx2 on ir_model_data  (cost=0.42..4.44 rows=1 width=17) (actual
time=104.866..189.119rows=1 loops=1) 
   Index Cond: ((name)::text = 'public_user'::text)
   Rows Removed by Index Recheck: 10
   Filter: ((module)::text = 'base'::text)
 Planning Time: 0.352 ms
 Execution Time: 189.134 ms
(6 rows)

Thanks for all,

Nicolas.



Re: pg_trgm upgrade to 1.6 led to load average increase

От
Nicolas Seinlet
Дата:
On Thursday, January 22nd, 2026 at 13:11, Nicolas Seinlet <nicolas@seinlet.com> wrote:

>
>
> On Wednesday, January 21st, 2026 at 17:28, Tom Lane tgl@sss.pgh.pa.us wrote:
>
> > Nicolas Seinlet nicolas@seinlet.com writes:
> >
> > > We issue queries like :
> > > SELECT model, res_id FROM ir_model_data WHERE module='base' AND name='public_user';
> >
> > > With 1.0 extension, the query is planned with a matching btree index:
> > > "ir_model_data_module_name_uniq_index" UNIQUE, btree (module, name)
> >
> > > With 1.6 extension, the query is planned with a gist index:
> > > "ir_model_data_name_idx2" gist (name gist_trgm_ops)
> >
> > > 1.0 extension executes the query in 0.1ms, while 1.6 in 100ms
> >
> > It seems quite odd that the planner would prefer an index
> > using only one of the two clauses. It must be concluding that
> > the gist indexscan will be cheaper, but it's hard to see why
> > that would be.
> >
> > Could we see EXPLAIN ANALYZE output for those two cases?
> > Are the statistics for the table up-to-date?
> >
> > regards, tom lane
>
>
> That table is not that much updated, so we could expect the statistics are quite up-to-date, especially because our
analysetrigger parameters are quite aggressive. But, to ensure they are decently correct, I'll include a vacuum analyse
inmy test. 
>
> With pg_trgm=1.5:
> QUERY PLAN
> -----------
> Index Scan using ir_model_data_module_name_uniq_index on ir_model_data (cost=0.56..2.58 rows=1 width=17) (actual
time=0.027..0.028rows=1 loops=1) 
> Index Cond: (((module)::text = 'base'::text) AND ((name)::text = 'public_user'::text))
> Planning Time: 0.093 ms
> Execution Time: 0.050 ms
> (4 rows)
>
> With pg_trgm=1.6:
> QUERY PLAN
> ----------
> Index Scan using ir_model_data_name_idx2 on ir_model_data (cost=0.42..2.44 rows=1 width=17) (actual
time=8403.936..9847.983rows=1 loops=1) 
> Index Cond: ((name)::text = 'public_user'::text)
> Rows Removed by Index Recheck: 10
> Filter: ((module)::text = 'base'::text)
> Planning Time: 1.564 ms
> Execution Time: 9848.027 ms
> (6 rows)
>
> For the tests, the random_page_cost=1.
>
> Let's try with random_page_cost=2
> QUERY PLAN
> ----------
> Index Scan using ir_model_data_name_idx2 on ir_model_data (cost=0.42..4.44 rows=1 width=17) (actual
time=106.136..191.606rows=1 loops=1) 
> Index Cond: ((name)::text = 'public_user'::text)
> Rows Removed by Index Recheck: 10
> Filter: ((module)::text = 'base'::text)
> Planning Time: 0.096 ms
> Execution Time: 191.623 ms
> (6 rows)
>
> And finally, after a vacuum analyse and set random_page_cost to 1:
> QUERY PLAN
> -----------
> Index Scan using ir_model_data_name_idx2 on ir_model_data (cost=0.42..4.44 rows=1 width=17) (actual
time=104.866..189.119rows=1 loops=1) 
> Index Cond: ((name)::text = 'public_user'::text)
> Rows Removed by Index Recheck: 10
> Filter: ((module)::text = 'base'::text)
> Planning Time: 0.352 ms
> Execution Time: 189.134 ms
> (6 rows)
>
> Thanks for all,
>
> Nicolas.

I've also tried to put the random_page_cost to a huge value, and it does not change anything in index selection,
despitethe second cost increase. 
QUERY PLAN
----------
 Index Scan using ir_model_data_name_idx2 on ir_model_data  (cost=0.42..400.44 rows=1 width=17) (actual
time=141.930..256.600rows=1 loops=1) 
   Index Cond: ((name)::text = 'public_user'::text)
   Rows Removed by Index Recheck: 10
   Filter: ((module)::text = 'base'::text)
 Planning Time: 1.588 ms
 Execution Time: 256.640 ms
(6 rows)



Re: pg_trgm upgrade to 1.6 led to load average increase

От
Tom Lane
Дата:
Nicolas Seinlet <nicolas@seinlet.com> writes:
> With pg_trgm=1.5:
> QUERY PLAN
> -----------
>  Index Scan using ir_model_data_module_name_uniq_index on ir_model_data  (cost=0.56..2.58 rows=1 width=17) (actual
time=0.027..0.028rows=1 loops=1) 
>    Index Cond: (((module)::text = 'base'::text) AND ((name)::text = 'public_user'::text))
>  Planning Time: 0.093 ms
>  Execution Time: 0.050 ms
> (4 rows)

> With pg_trgm=1.6:
> QUERY PLAN
> ----------
>  Index Scan using ir_model_data_name_idx2 on ir_model_data  (cost=0.42..2.44 rows=1 width=17) (actual
time=8403.936..9847.983rows=1 loops=1) 
>    Index Cond: ((name)::text = 'public_user'::text)
>    Rows Removed by Index Recheck: 10
>    Filter: ((module)::text = 'base'::text)
>  Planning Time: 1.564 ms
>  Execution Time: 9848.027 ms
> (6 rows)

Hmm, so the cost estimates are nearly the same for both index types.
I can reproduce that locally if the gist-indexed column is estimated
to have only one match; then the additional index condition isn't
believed to improve the selectivity any, so the planner has no
motivation to prefer using the two-column index.

What I don't reproduce here is the huge differential in actual cost.
The gist index is a bit slower for me, but not five orders of
magnitude slower.  However, my test table isn't huge --- about
80K rows --- and I suppose from these numbers that yours must be
far larger.

So I'm thinking that the default signature length for gist_trgm_ops
isn't large enough for your data set and what you want to do is
make it bigger:

CREATE INDEX ... USING GIST (name gist_trgm_ops(siglen = X));

The default value is 12 bytes, so maybe try 24 or 32.  That will
result in an index that's physically larger, but with luck
less of it will need to be scanned for any one query.

            regards, tom lane