Обсуждение: PgSQL 15.3: Execution plan not using index as expected

Поиск
Список
Период
Сортировка

PgSQL 15.3: Execution plan not using index as expected

От
Dürr Software
Дата:
Dear list,

i have a strange problem when migrating a DB from version 9.3.4 to 15.3:
An index which seems perfect for the query and is used in 9.3.4 as 
expected is not used in 15.3.
I just wonder, whether the bug is on my side or on PgSQL's..
The details:
===========================================================================
PostgreSQL 9.3.4:

test=# select version();
version
----------------------------------------------------------------------------------------------
  PostgreSQL 9.3.4 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 
4.8.3-2) 4.8.3, 64-bit
(1 row)

test=> \d client_session
                                             Table "client_session"
     Column     |              Type |                             Modifiers
---------------+--------------------------------+-------------------------------------------------------------------
  id            | bigint                         | not null default 
nextval('client_session_id_seq'::regclass)
  tstamp_start  | timestamp(3) without time zone | not null default now()
  permit_id     | character varying(63)          | not null default 
"current_user"()
  user_id       | character varying(63)          | not null default 
"session_user"()
Indexes:
     "client_session_pkey" PRIMARY KEY, btree (id)
     "client_session_user_id_idx" btree (user_id, tstamp_start DESC)

vdws=# explain analyze SELECT permit_id FROM client_session WHERE 
user_id=SESSION_USER::VARCHAR ORDER BY tstamp_start DESC LIMIT 1;
QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=0.57..2.37 rows=1 width=23) (actual time=0.134..0.134 
rows=1 loops=1)
    ->  Index Scan using client_session_user_id_idx on client_session  
(cost=0.57..52337.99 rows=29181 width=23) (actual time=0.133..0.133 
rows=1 loops=1)
          Index Cond: ((user_id)::text = (("session_user"())::character 
varying)::text)
  Total runtime: 0.165 ms
(4 rows)

===========================================================================
PostgreSQL 15.3:

test=# select version();
version
---------------------------------------------------------------------------------------------------------------------
  PostgreSQL 15.3 (Debian 15.3-1.pgdg120+1) on x86_64-pc-linux-gnu, 
compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
(1 Zeile)

test=# \d client_session
                                                   Tabelle »client_session«
     Spalte     |              Typ               | Sortierfolge | NULL 
erlaubt? |                   Vorgabewert

---------------+--------------------------------+--------------+---------------+--------------------------------------------------
  id            | bigint                         |              | not 
null      | nextval('client_session_id_seq'::regclass)
  tstamp_start  | timestamp(3) without time zone |              | not 
null      | now()
  permit_id     | character varying(63)          |              | not 
null      | "current_user"()
  user_id       | character varying(63)          |              | not 
null      | "session_user"()
Indexe:
     "client_session_pkey" PRIMARY KEY, btree (id)
     "client_session_user_id_idx" btree (user_id, tstamp_start DESC)

test=# explain analyze SELECT permit_id FROM client_session WHERE 
user_id=SESSION_USER::VARCHAR ORDER BY tstamp_start DESC LIMIT 1;
QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=3778568.38..3778568.50 rows=1 width=152) (actual 
time=8431.320..8437.169 rows=1 loops=1)
    ->  Gather Merge  (cost=3778568.38..3853392.64 rows=641306 
width=152) (actual time=8383.774..8389.622 rows=1 loops=1)
          Workers Planned: 2
          Workers Launched: 2
          ->  Sort  (cost=3777568.36..3778369.99 rows=320653 width=152) 
(actual time=8372.263..8372.263 rows=0 loops=3)
                Sort Key: tstamp_start DESC
                Sort Method: quicksort  Memory: 25kB
                Worker 0:  Sort Method: quicksort  Memory: 25kB
                Worker 1:  Sort Method: quicksort  Memory: 25kB
                ->  Parallel Seq Scan on client_session 
(cost=0.00..3775965.09 rows=320653 width=152) (actual 
time=6150.412..8372.191 rows=1 loops=3)
                      Filter: ((user_id)::text = 
((SESSION_USER)::character varying)::text)
                      Rows Removed by Filter: 51303778
  Planning Time: 0.203 ms
  JIT:
    Functions: 13
    Options: Inlining true, Optimization true, Expressions true, 
Deforming true
    Timing: Generation 1.644 ms, Inlining 120.073 ms, Optimization 
70.361 ms, Emission 28.476 ms, Total 220.554 ms
  Execution Time: 8438.307 ms
(18 rows)

Thanks a lot for your help

-- 
======================================
Dürr Software Entw.
Guggenberg 26, DE-82380 Peißenberg
fon: +49-8803-4899016  fax: +49-8803-4899017
info@fduerr.de

Вложения

Re: PgSQL 15.3: Execution plan not using index as expected

От
Adrian Klaver
Дата:
On 8/9/23 01:14, Dürr Software wrote:
> Dear list,
> 
> i have a strange problem when migrating a DB from version 9.3.4 to 15.3:
> An index which seems perfect for the query and is used in 9.3.4 as 
> expected is not used in 15.3.

Did you run ANALYZE on the 15.3 database after the migration?


-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: PgSQL 15.3: Execution plan not using index as expected

От
Adrian Klaver
Дата:
On 8/11/23 03:11, Dürr Software wrote:

Please reply to list also
Ccing list
> Dear Adrian,
> 
> thanks for the reply. Of course i ran ANALYZE on the 15.3 system, its in 
> the second part of my post, but here again, FYI:

That is EXPLAIN ANALYZE where it is an option to the command:

https://www.postgresql.org/docs/current/sql-explain.html

ANALYZE

     Carry out the command and show actual run times and other 
statistics. This parameter defaults to FALSE.


What I was talking about was the ANALYZE command:

https://www.postgresql.org/docs/current/sql-analyze.html

ANALYZE collects statistics about the contents of tables in the 
database, and stores the results in the pg_statistic system catalog. 
Subsequently, the query planner uses these statistics to help determine 
the most efficient execution plans for queries.

> 
> test=# \d client_session
>                                                   Tabelle »client_session«
>     Spalte     |              Typ               | Sortierfolge | NULL 
> erlaubt? |                   Vorgabewert
>
---------------+--------------------------------+--------------+---------------+--------------------------------------------------
> id            | bigint                         |              | not null 
>       | nextval('admin.client_session_id_seq'::regclass)
> tstamp_start  | timestamp(3) without time zone |              | not null 
>       | now()
> permit_id     | character varying(63)          |              | not null 
>       | "current_user"()
> user_id       | character varying(63)          |              | not null 
>       | "session_user"()
> 
> Indexe:
>      "client_session_pkey" PRIMARY KEY, btree (id)
>      "client_session_user_id_idx" btree (user_id, tstamp_start DESC)
> 
> test=# explain analyze SELECT permit_id FROM client_session WHERE 
> user_id::character varying(63)=SESSION_USER::character varying(63) ORDER 
> BY tstamp_start DESC LIMIT 1;
>                                                                     QUERY PLAN
>
---------------------------------------------------------------------------------------------------------------------------------------------------
> Limit  (cost=2852336.36..2852336.48 rows=1 width=23) (actual 
> time=5994.540..6000.702 rows=1 loops=1)
>    ->  Gather Merge  (cost=2852336.36..2852697.59 rows=3096 width=23) 
> (actual time=5946.422..5952.583 rows=1 loops=1)
>          Workers Planned: 2
>          Workers Launched: 2
>          ->  Sort  (cost=2851336.34..2851340.21 rows=1548 width=23) 
> (actual time=5934.963..5934.964 rows=1 loops=3)
>                Sort Key: tstamp_start DESC
>                Sort Method: quicksort  Memory: 25kB
>                Worker 0:  Sort Method: quicksort  Memory: 25kB
>                Worker 1:  Sort Method: quicksort  Memory: 25kB
>                ->  Parallel Seq Scan on client_session 
>   (cost=0.00..2851328.60 rows=1548 width=23) (actual 
> time=3885.774..5934.915 rows=1 loops=3)
>                      Filter: ((user_id)::text = 
> ((SESSION_USER)::character varying(63))::text)
>                      Rows Removed by Filter: 37163374
> Planning Time: 0.167 ms
> JIT:
>    Functions: 13
>    Options: Inlining true, Optimization true, Expressions true, 
> Deforming true
>    Timing: Generation 0.940 ms, Inlining 119.027 ms, Optimization 79.333 
> ms, Emission 29.624 ms, Total 228.924 ms
> Execution Time: 6001.014 ms
> (18 Zeilen)
> 
> Funny thing: if i create an index on tstamp_start alone, it is used just 
> perfectly:
> 
> Indexe:
>      "client_session_pkey" PRIMARY KEY, btree (id)
>      "client_session_tstamp_start" btree (tstamp_start)
>      "client_session_user_id_idx" btree (user_id, tstamp_start DESC)
> 
> test=# explain analyze SELECT permit_id FROM admin.client_session WHERE 
> user_id::character varying(63)=SESSION_USER::character varying(63) ORDER 
> BY tstamp_start DESC LIMIT 1;
> 
> QUERY PLAN
>
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
>   Limit  (cost=0.57..1787.85 rows=1 width=23) (actual time=0.721..0.723 
> rows=1 loops=1)
>     ->  Index Scan Backward using client_session_tstamp_start on 
> client_session  (cost=0.57..6639766.39 rows=3715 width=23) (actual 
> time=0.719..0.719 rows=1 loops=1)
>           Filter: ((user_id)::text = ((SESSION_USER)::character 
> varying(63))::text)
>   Planning Time: 0.227 ms
>   Execution Time: 0.761 ms
> (5 Zeilen)
> 
> ======================================
> Dürr Software Entw.
> Guggenberg 26, DE-82380 Peißenberg
> fon: +49-8803-4899016  fax: +49-8803-4899017
> info@fduerr.de
> 
> Am 10.08.23 um 16:41 schrieb Adrian Klaver:
>> On 8/9/23 01:14, Dürr Software wrote:
>>> Dear list,
>>>
>>> i have a strange problem when migrating a DB from version 9.3.4 to 15.3:
>>> An index which seems perfect for the query and is used in 9.3.4 as 
>>> expected is not used in 15.3.
>>
>> Did you run ANALYZE on the 15.3 database after the migration?
>>
>>
> 

-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: PgSQL 15.3: Execution plan not using index as expected

От
rob stone
Дата:
Hello,

> ---------------------------------------------------------------------
> -------------------------
>   PostgreSQL 9.3.4 on x86_64-unknown-linux-gnu, compiled by gcc
> (Debian
> 4.8.3-2) 4.8.3, 64-bit
>
>
> ---------------------------------------------------------------------
> ------------------------------------------------
>   PostgreSQL 15.3 (Debian 15.3-1.pgdg120+1) on x86_64-pc-linux-gnu,
> compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
> (1 Zeile)
>
>
9.3 plan
->  Index Scan using client_session_user_id_idx on client_session

Looks like a collation issue given the difference in compilers used.
In the 9.3 plan the index is used.
Maybe try a reindex of the table.

HTH,
Rob




Re: PgSQL 15.3: Execution plan not using index as expected

От
Dürr Software
Дата:
Thanks Adrian,

sorry for the misunderstanding.
I ran ANALYZE, it didn't change a thing (as expected).
Anyway, I pinned the problem down now: It's the use of CURRENT_USER (or 
SESSION_USER etc.) in the WHERE condition.
If i replace it with 'postgres' (the result of CURRENT_USER) the planner 
works as expected..
The old 9.x - version of PgSQL didn't have that problem.

Test case:

-- our test table with index on user_id
CREATE TABLE tt (
  user_id VARCHAR(63) NOT NULL DEFAULT SESSION_USER
);
CREATE INDEX tt_user_id_idx ON tt(user_id);

-- fill with test data
INSERT INTO tt(user_id) select 'U' || i from generate_series(1,100000) as i;
INSERT INTO tt(user_id) select SESSION_USER from generate_series(1,100);

-- query using CURRENT_USER as WHERE-condition - doesn't use index
EXPLAIN ANALYZE SELECT * FROM tt WHERE user_id::character 
varying(63)=CURRENT_USER::character varying(63) LIMIT 1;
                                                QUERY PLAN
---------------------------------------------------------------------------------------------------------
  Limit  (cost=0.00..21.65 rows=1 width=6) (actual time=18.143..18.143 
rows=1 loops=1)
    ->  Seq Scan on tt  (cost=0.00..2446.00 rows=113 width=6) (actual 
time=18.141..18.141 rows=1 loops=1)
          Filter: ((user_id)::text = ((CURRENT_USER)::character 
varying(63))::text)
          Rows Removed by Filter: 100000
  Planning Time: 0.154 ms
  Execution Time: 18.163 ms
(6 Zeilen)

SELECT CURRENT_USER;
  current_user
--------------
  postgres
(1 Zeile)

-- query using result of CURRENT_USER as WHERE-condition - uses index
EXPLAIN ANALYZE SELECT * FROM tt WHERE user_id::character 
varying(63)='postgres'::character varying(63) LIMIT 1;
                                                            QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=0.42..0.47 rows=1 width=6) (actual time=0.018..0.019 
rows=1 loops=1)
    ->  Index Only Scan using tt_user_id_idx on tt (cost=0.42..6.39 
rows=113 width=6) (actual time=0.017..0.018 rows=1 loops=1)
          Index Cond: (user_id = 'postgres'::text)
          Heap Fetches: 0
  Planning Time: 0.081 ms
  Execution Time: 0.026 ms
(6 Zeilen)

-- CURRENT_USER is not expensive..
EXPLAIN ANALYZE SELECT CURRENT_USER;
                                      QUERY PLAN
-------------------------------------------------------------------------------------
  Result  (cost=0.00..0.01 rows=1 width=64) (actual time=0.005..0.006 
rows=1 loops=1)
  Planning Time: 0.031 ms
  Execution Time: 0.025 ms
(3 Zeilen)

I hope that this should clarify the problem.
Thanks and kind regards

======================================
Dürr Software Entw.
info@fduerr.de

Am 11.08.23 um 16:32 schrieb Adrian Klaver:
> On 8/11/23 03:11, Dürr Software wrote:
>
> Please reply to list also
> Ccing list
>> Dear Adrian,
>>
>> thanks for the reply. Of course i ran ANALYZE on the 15.3 system, its 
>> in the second part of my post, but here again, FYI:
>
> That is EXPLAIN ANALYZE where it is an option to the command:
>
> https://www.postgresql.org/docs/current/sql-explain.html
>
> ANALYZE
>
>     Carry out the command and show actual run times and other 
> statistics. This parameter defaults to FALSE.
>
>
> What I was talking about was the ANALYZE command:
>
> https://www.postgresql.org/docs/current/sql-analyze.html
>
> ANALYZE collects statistics about the contents of tables in the 
> database, and stores the results in the pg_statistic system catalog. 
> Subsequently, the query planner uses these statistics to help 
> determine the most efficient execution plans for queries.
>
>>
>> test=# \d client_session
>>                                                   Tabelle 
>> »client_session«
>>     Spalte     |              Typ               | Sortierfolge | NULL 
>> erlaubt? |                   Vorgabewert
>>
---------------+--------------------------------+--------------+---------------+--------------------------------------------------

>>
>> id            | bigint                         |              | not 
>> null       | nextval('admin.client_session_id_seq'::regclass)
>> tstamp_start  | timestamp(3) without time zone |              | not 
>> null       | now()
>> permit_id     | character varying(63)          |              | not 
>> null       | "current_user"()
>> user_id       | character varying(63)          |              | not 
>> null       | "session_user"()
>>
>> Indexe:
>>      "client_session_pkey" PRIMARY KEY, btree (id)
>>      "client_session_user_id_idx" btree (user_id, tstamp_start DESC)
>>
>> test=# explain analyze SELECT permit_id FROM client_session WHERE 
>> user_id::character varying(63)=SESSION_USER::character varying(63) 
>> ORDER BY tstamp_start DESC LIMIT 1;
>>                                                                     QUERY 
>> PLAN
>>
---------------------------------------------------------------------------------------------------------------------------------------------------

>>
>> Limit  (cost=2852336.36..2852336.48 rows=1 width=23) (actual 
>> time=5994.540..6000.702 rows=1 loops=1)
>>    ->  Gather Merge  (cost=2852336.36..2852697.59 rows=3096 width=23) 
>> (actual time=5946.422..5952.583 rows=1 loops=1)
>>          Workers Planned: 2
>>          Workers Launched: 2
>>          ->  Sort  (cost=2851336.34..2851340.21 rows=1548 width=23) 
>> (actual time=5934.963..5934.964 rows=1 loops=3)
>>                Sort Key: tstamp_start DESC
>>                Sort Method: quicksort  Memory: 25kB
>>                Worker 0:  Sort Method: quicksort  Memory: 25kB
>>                Worker 1:  Sort Method: quicksort  Memory: 25kB
>>                ->  Parallel Seq Scan on client_session 
>>   (cost=0.00..2851328.60 rows=1548 width=23) (actual 
>> time=3885.774..5934.915 rows=1 loops=3)
>>                      Filter: ((user_id)::text = 
>> ((SESSION_USER)::character varying(63))::text)
>>                      Rows Removed by Filter: 37163374
>> Planning Time: 0.167 ms
>> JIT:
>>    Functions: 13
>>    Options: Inlining true, Optimization true, Expressions true, 
>> Deforming true
>>    Timing: Generation 0.940 ms, Inlining 119.027 ms, Optimization 
>> 79.333 ms, Emission 29.624 ms, Total 228.924 ms
>> Execution Time: 6001.014 ms
>> (18 Zeilen)
>>
>> Funny thing: if i create an index on tstamp_start alone, it is used 
>> just perfectly:
>>
>> Indexe:
>>      "client_session_pkey" PRIMARY KEY, btree (id)
>>      "client_session_tstamp_start" btree (tstamp_start)
>>      "client_session_user_id_idx" btree (user_id, tstamp_start DESC)
>>
>> test=# explain analyze SELECT permit_id FROM admin.client_session 
>> WHERE user_id::character varying(63)=SESSION_USER::character 
>> varying(63) ORDER BY tstamp_start DESC LIMIT 1;
>>
>> QUERY PLAN
>>
---------------------------------------------------------------------------------------------------------------------------------------------------------------------

>>
>>   Limit  (cost=0.57..1787.85 rows=1 width=23) (actual 
>> time=0.721..0.723 rows=1 loops=1)
>>     ->  Index Scan Backward using client_session_tstamp_start on 
>> client_session (cost=0.57..6639766.39 rows=3715 width=23) (actual 
>> time=0.719..0.719 rows=1 loops=1)
>>           Filter: ((user_id)::text = ((SESSION_USER)::character
>> varying(63))::text)
>>   Planning Time: 0.227 ms
>>   Execution Time: 0.761 ms
>> (5 Zeilen)
>>
>> ======================================
>> Dürr Software Entw.
>> info@fduerr.de
>>
>> Am 10.08.23 um 16:41 schrieb Adrian Klaver:
>>> On 8/9/23 01:14, Dürr Software wrote:
>>>> Dear list,
>>>>
>>>> i have a strange problem when migrating a DB from version 9.3.4 to 
>>>> 15.3:
>>>> An index which seems perfect for the query and is used in 9.3.4 as 
>>>> expected is not used in 15.3.
>>>
>>> Did you run ANALYZE on the 15.3 database after the migration?
>>>
>>>
>>
>

Вложения

Re: PgSQL 15.3: Execution plan not using index as expected

От
Dürr Software
Дата:
Thanks Rob,

no it's not a problem with the index. It's a problem with the use of 
CURRENT_USER in the WHERE
I submitted a new post on this matter with a test case.

Kind regards

======================================
Dürr Software Entw.
Guggenberg 26, DE-82380 Peißenberg
fon: +49-8803-4899016  fax: +49-8803-4899017
info@fduerr.de

Am 12.08.23 um 04:25 schrieb rob stone:
> Hello,
>
>> ---------------------------------------------------------------------
>> -------------------------
>>    PostgreSQL 9.3.4 on x86_64-unknown-linux-gnu, compiled by gcc
>> (Debian
>> 4.8.3-2) 4.8.3, 64-bit
>>
>>
>> ---------------------------------------------------------------------
>> ------------------------------------------------
>>    PostgreSQL 15.3 (Debian 15.3-1.pgdg120+1) on x86_64-pc-linux-gnu,
>> compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
>> (1 Zeile)
>>
>>
> 9.3 plan
> ->  Index Scan using client_session_user_id_idx on client_session
>
> Looks like a collation issue given the difference in compilers used.
> In the 9.3 plan the index is used.
> Maybe try a reindex of the table.
>
> HTH,
> Rob
>

Вложения