Обсуждение: How to deal with analyze gathering irrelevant stats
Hi,
Thanks in advance for your help. I'm putting as much context and details as possible, but let me know if you have any questions.
What?
We are experiencing some slow queries due to the query planner using an incorrect index. It is using an unoptimized index because the stats are computed during the night when the data is not the same as during the day.
Context
We have a table conversations like that
|id|status|user_id|
and 2 indexes:
CREATE INDEX index_conversations_on_user_id_and_status ON public.conversations USING btree (user_id, status);
CREATE INDEX index_conversations_on_status ON public.conversations USING btree (status)
The slow query is the following:
SELECT id FROM conversations WHERE status = 'in_progress' AND user_id = 123
We expect the query planner to use the index_conversations_on_user_id_and_status but it sometimes uses the other one.
What's happening ?
There are hundreds of conversations with a status 'in_progress' at a given time during the day but virtually none during the night.
So when the analyze is run during the night, PG then thinks that using the index_conversations_on_status will return almost no rows and so it uses this index instead of the combined one.
When the analyze is run during the day, PG correctly uses the right index (index_conversations_on_user_id_and_status)
[With an analyze run during the day]
Limit (cost=0.43..8.45 rows=1 width=8) (actual time=1.666..1.666 rows=0 loops=1)
-> Index Scan using index_conversations_on_user_id_and_status on conversations (cost=0.43..8.45 rows=1 width=8) (actual_time=1.665..1.665 rows:0 loops:1)
Index Cond: ((user_id = 123) AND ((status)::text = 'in_progress'::text))
Filter: (id <> 1)
Planning Time: 8.642 ms
Execution Time: 1.693 ms
[With an analyze run during the night]
Limit (cost=0.43..8.46 rows=1 width=8) (actual time=272.812..272.812 rows=0 loops=1)
-> Index Scan using index_conversations_on_status on conversations (cost=0.43..8.46 rows=1 width=8) (actual_time=272.812..272.812 rows:0 loops:1)
Index Cond: ((status)::text = 'in_progress'::text))
Filter: (id <> 1) AND (user_id = 123)
Rows Removed by Filter: 559
Planning Time: 0.133 ms
Execution Time: 272.886 ms
The question
We currently run a manual weekly vacuum analyze during the night. I'm wondering what are our possible solutions. One is to manually run the analyze during the day but is there a way to tell PG to run the auto analyze at a given time of the day for example ? I guess we are not the first ones to have data patterns that differ between when the analyze is run and the query is run.
Config
Postgres version: 11
Table Metadata
SELECT relname, relpages, reltuples, relallvisible, relkind, relnatts, relhassubclass, reloptions, pg_table_size(oid) FROM pg_class WHERE relname='conversations';
relname | relpages | reltuples | relallvisible | relkind | relnatts | relhassubclass | reloptions | pg_table_size
-------------+----------+------------+---------------+---------+----------+----------------+------------+---------------
conversations | 930265 | 7.3366e+06 | 902732 | r | 16 | f | | 7622991872
Maintenance Setup
We have manual vacuum analyze every week during the night.
GUC Settings
"autovacuum_analyze_threshold" = "50"
"autovacuum_max_workers" = "3",
"autovacuum_naptime" = "60"
"autovacuum_vacuum_threshold" = "50"
Statistics: n_distinct, MCV, histogram
SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename, attname, inherited, null_frac, n_distinct, array_length(most_common_vals,1) n_mcv, array_length(histogram_bounds,1) n_hist, correlation FROM pg_stats WHERE attname='status' AND tablename='conversations' ORDER BY 1 DESC;
frac_mcv | tablename | attname | inherited | null_frac | n_distinct | n_mcv | n_hist | correlation
----------+-------------+---------+-----------+-----------+------------+-------+--------+-------------
0.999967 | conversations | status  | f         |         0 |          6 |     5 |        |    0.967121
On Mon, Jan 11, 2021 at 04:50:12PM +0100, Rémi Chatenay wrote: > We are experiencing some slow queries due to the query planner using an > incorrect index. It is using an unoptimized index because the stats are > computed during the night when the data is not the same as during the day. > > CREATE INDEX index_conversations_on_user_id_and_status ON > public.conversations USING btree (user_id, status); > > CREATE INDEX index_conversations_on_status ON public.conversations USING > btree (status) > > The slow query is the following: > > SELECT id FROM conversations WHERE status = 'in_progress' AND user_id = 123 > > There are hundreds of conversations with a status 'in_progress' at a given > time during the day but virtually none during the night. > > So when the analyze is run during the night, PG then thinks that using the > index_conversations_on_status will return almost no rows and so it uses > this index instead of the combined one. > > When the analyze is run during the day, PG correctly uses the right index > (index_conversations_on_user_id_and_status) > We currently run a manual weekly vacuum analyze during the night. I'm > wondering what are our possible solutions. One is to manually run the > analyze during the day but is there a way to tell PG to run the auto > analyze at a given time of the day for example ? I guess we are not the > first ones to have data patterns that differ between when the analyze is > run and the query is run. I think you could run manual ANALYZE during the day just for this one column: ANALYZE conversations (status); If it takes too long or causes a performance issue, you could do: SET default_statistics_target=10; ANALYZE conversations (status); You could also change to make autovacuum do this on its own, by setting: ALTER TABLE conversations SET (autovacuum_analyze_scale_factor=0.005); If that works but too slow, then maybe ALTER TABLE .. SET STATISTICS 10. -- Justin
Why have an index on the status column at all? My guess would be that there are 2-10 statuses, but many many rows in the table for most of those statuses. Having a low cardinality index that changes frequently seems prone to mis-use by the system.
What is the usage pattern of the conversations table? Is getting many inserts during the day, or updates of status mostly?
Why have an index on the status column at all? My guess would be that there are 2-10 statuses, but many many rows in the table for most of those statuses. Having a low cardinality index that changes frequently seems prone to mis-use by the system.
I'd say it's a 1 insert for 5 - 10 updates.As for the index on the status, it's because we have a job that runs every night that deals with conversations in specific statuses. Having a low cardinality index that changes frequently seems prone to mis-use by the system. -> What would be an alternative ?