Query performance

Поиск

Список

Период

Сортировка

От	Dirschel, Steve-CW
Тема	Query performance
Дата	27 апреля 18:52:06
Msg-id	BYAPR06MB447199DE1FCF363EF389EF1A8A362@BYAPR06MB4471.namprd06.prod.outlook.com обсуждение
Ответ на	Re: how to switch user in postgres ("David G. Johnston" <david.g.johnston@gmail.com>)
Ответы	Re: Query performance
Список	pgsql-performance

Дерево обсуждения

Aurora Postgres version 17.4.

Table in question:

\d poslog_publisher_rms_stage

Table "public.poslog_publisher_rms_stage"

Column | Type | Collation | Nullable | Default

-------------------+--------------------------+-----------+----------+---------

stage_id | uuid | | not null |

status | character varying(100) | | |

message_body | text | | not null |

error_code | character varying(100) | | |

error_category | character varying(100) | | |

error_message | text | | |

error_retry_count | integer | | | 0

create_date | timestamp with time zone | | not null | now()

modified_date | timestamp with time zone | | not null | now()

Indexes:

"poslog_publisher_rms_stage_pkey" PRIMARY KEY, btree (stage_id)

"idx_poslog_publisher_stage_create_date_col" btree (create_date)

"idx_poslog_publisher_stage_status_error_retry_count_modi_date_c" btree (status, error_retry_count, modified_date)

Referenced by:

TABLE "poslog_publisher_rms_detail" CONSTRAINT "fk_poslog_publisher_detail_stage_id" FOREIGN KEY (stage_id) REFERENCES poslog_publisher_rms_stage(stage_id)

Publications:

“sashpsrms_publication"

The table is constantly getting loaded into. Rows are inserted with a status ready and then there is a different process looking for that status and will update to processed after processing the row. We have multiple tables like this.

Then every 2 hours a different process runs this query looking for failed or unprocessed rows:

select

ppse.stage_id as stageId,

ppse.status as status,

ppse.message_body as messageBody

from

poslog_publisher_rms_stage ppse

where

ppse.status in ('UNPROCESSED','FAILED')

and ppse.error_retry_count < 3

order by

ppse.create_date

limit 100;

If I run that query with explain it is doing a ton of work to find 0 rows. The index it uses seems appropriate for the query.

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Limit (cost=10.92..10.93 rows=1 width=1019) (actual time=66566.823..66566.824 rows=0 loops=1)

Buffers: shared hit=1509768 read=2011479

I/O Timings: shared read=79792.017

-> Sort (cost=10.92..10.93 rows=1 width=1019) (actual time=66566.821..66566.821 rows=0 loops=1)

Sort Key: create_date

Sort Method: quicksort Memory: 25kB

Buffers: shared hit=1509768 read=2011479

I/O Timings: shared read=79792.017

-> Index Scan using idx_poslog_publisher_stage_status_error_retry_count_modi_date_c on poslog_publisher_rms_stage ppse (cost=0.57..10.91 rows=1 width=1019) (actual time=66566.761..66566.761 rows=0 loops=1)

Index Cond: (((status)::text = ANY ('{UNPROCESSED,FAILED}'::text[])) AND (error_retry_count < 3))

Buffers: shared hit=1509765 read=2011479

I/O Timings: shared read=79792.017

Planning:

Buffers: shared hit=195 read=1

I/O Timings: shared read=1.038

Planning Time: 2.909 ms

Execution Time: 66581.498 ms

The query did 3.5 million block reads when scanning the index of which 1.5 million were in memory and 2 million were from disk. 5 seconds later I ran the exact same query again:

QUERY PLAN

Limit (cost=10.92..10.93 rows=1 width=1019) (actual time=23.589..23.591 rows=0 loops=1)

Buffers: shared hit=18736

-> Sort (cost=10.92..10.93 rows=1 width=1019) (actual time=23.588..23.589 rows=0 loops=1)

Sort Key: create_date

Sort Method: quicksort Memory: 25kB

Buffers: shared hit=18736

-> Index Scan using idx_poslog_publisher_stage_status_error_retry_count_modi_date_c on poslog_publisher_rms_stage ppse (cost=0.57..10.91 rows=1 width=1019) (actual time=23.583..23.584 rows=0 loops=1)

Index Cond: (((status)::text = ANY ('{UNPROCESSED,FAILED}'::text[])) AND (error_retry_count < 3))

Buffers: shared hit=18736

Planning Time: 0.118 ms

Execution Time: 23.628 ms

Now it only did 18k block reads all in memory. It used the same index, it also returned 0 rows. Between those 2 runs I looked at pg_stat_user_tables and could see the n_tup_ins increased by 13, n_tup_del increased by 13, n_live_tup increased by 13, and n_dead_tup increased by 13. n_live_tup was 96 million and n_dead_tup was 8.4 million. 18k logical reads to find 0 rows is still high but I believe that is most likely caused by the 8.4 million n_dead_tups.

15 minutes later I ran the query again:

QUERY PLAN

Limit (cost=10.92..10.93 rows=1 width=1019) (actual time=59431.795..59431.796 rows=0 loops=1)

Buffers: shared hit=1286676 read=1734000

I/O Timings: shared read=70153.578

-> Sort (cost=10.92..10.93 rows=1 width=1019) (actual time=59431.794..59431.794 rows=0 loops=1)

Sort Key: create_date

Sort Method: quicksort Memory: 25kB

Buffers: shared hit=1286676 read=1734000

I/O Timings: shared read=70153.578

Index Cond: (((status)::text = ANY ('{UNPROCESSED,FAILED}'::text[])) AND (error_retry_count < 3))

Buffers: shared hit=1286676 read=1734000

I/O Timings: shared read=70153.578

Planning Time: 0.114 ms

Execution Time: 59431.839 ms

Total blocks reads increased from 18k to 3 million. n_tup_ins, n_tup_del, n_live_tup, and n_dead_tup all increased by 10,320 over that 15 minute period of time.

What is going on here where this query has to do 3+ million block reads to find 0 rows? And how is it possible when I run the query 2 times in a row the logical reads from the 2nd run comes down significantly? Is this somehow related to determining if rows are visible or something like that? When I waited 15 minutes between runs the inserted/updated rows only increased by 10.3k yet total block reads increased by 3 million.

Thanks

В списке pgsql-performance по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Query performance