In progress INSERT wrecks plans on table
От | Mark Kirkwood |
---|---|
Тема | In progress INSERT wrecks plans on table |
Дата | |
Msg-id | 5179E77B.7050601@catalyst.net.nz обсуждение исходный текст |
Список | pgsql-performance |
Recently we encountered the following unhappy sequence of events: 1/ system running happily 2/ batch load into table begins 3/ very quickly (some) preexisting queries on said table go orders of magnitude slower 4/ database instance becomes unresponsive 5/ application outage After looking down a few false leads, We've isolated the cause to the following: The accumulating in-progress row changes are such that previously optimal plans are optimal no longer. Now this situation will fix itself when the next autoanalyze happens (and new plan will be chosen) - however that cannot occur until the batch load is completed and committed (approx 70 seconds). However during that time there is enough of a performance degradation for queries still using the old plan to cripple the server. Now that we know what is happening we can work around it. But I'm wondering - is there any way (or if not should there be one) to let postgres handle this automatically? I experimented with a quick hack to src/backend/commands/analyze.c (attached) that lets another session's ANALYZE see in progress rows - which works but a) may cause other problems and b) does not help autoaanalyze which has to wait for COMMIT + stats message. I've attached a (synthetic) test case that shows the issue, I'll reproduce the output below to hopefully make the point obvious: Table "public.plan" Column | Type | Modifiers --------+-----------------------------+----------- id | integer | not null typ | integer | not null dat | timestamp without time zone | val | text | not null Indexes: "plan_id" UNIQUE, btree (id) "plan_dat" btree (dat) "plan_typ" btree (typ) [Session 1] EXPLAIN ANALYZE SELECT * FROM plan WHERE typ = 3 AND dat IS NOT NULL; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Index Scan using plan_dat on plan (cost=0.00..265.47 rows=55 width=117) (actual time=0.130..4.409 rows=75 loops=1) Index Cond: (dat IS NOT NULL) Filter: (typ = 3) Rows Removed by Filter: 5960 Total runtime: 4.440 ms (5 rows) [Session 2] BEGIN; INSERT INTO plan SELECT id + 2000001,typ,current_date + id * '1 seconds'::interval ,val FROM plan ; [Session 1] EXPLAIN ANALYZE SELECT * FROM plan WHERE typ = 3 AND dat IS NOT NULL; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------- Index Scan using plan_dat on plan (cost=0.00..551.35 rows=91 width=117) (actual time=0.131..202.699 rows=75 loops=1) Index Cond: (dat IS NOT NULL) Filter: (typ = 3) Rows Removed by Filter: 5960 Total runtime: 202.729 ms (5 rows) [Session 2] COMMIT; [Session 1...wait for autoanalyze to finish then] EXPLAIN ANALYZE SELECT * FROM plan WHERE typ = 3 AND dat IS NOT NULL; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on plan (cost=407.87..44991.95 rows=10116 width=117) (actual time=2.692..6.582 rows=75 loops=1) Recheck Cond: (typ = 3) Filter: (dat IS NOT NULL) Rows Removed by Filter: 19925 -> Bitmap Index Scan on plan_typ (cost=0.00..405.34 rows=20346 width=0) (actual time=2.573..2.573 rows=20000 loops=1) Index Cond: (typ = 3) Total runtime: 6.615 ms Regards Mark
Вложения
В списке pgsql-performance по дате отправления:
Предыдущее
От: Misa SimicДата:
Сообщение: different plans for the same query - different filter values