On Mon, Jan 11, 2021 at 04:50:12PM +0100, Rémi Chatenay wrote:
> We are experiencing some slow queries due to the query planner using an
> incorrect index. It is using an unoptimized index because the stats are
> computed during the night when the data is not the same as during the day.
>
> CREATE INDEX index_conversations_on_user_id_and_status ON
> public.conversations USING btree (user_id, status);
>
> CREATE INDEX index_conversations_on_status ON public.conversations USING
> btree (status)
>
> The slow query is the following:
>
> SELECT id FROM conversations WHERE status = 'in_progress' AND user_id = 123
>
> There are hundreds of conversations with a status 'in_progress' at a given
> time during the day but virtually none during the night.
>
> So when the analyze is run during the night, PG then thinks that using the
> index_conversations_on_status will return almost no rows and so it uses
> this index instead of the combined one.
>
> When the analyze is run during the day, PG correctly uses the right index
> (index_conversations_on_user_id_and_status)
> We currently run a manual weekly vacuum analyze during the night. I'm
> wondering what are our possible solutions. One is to manually run the
> analyze during the day but is there a way to tell PG to run the auto
> analyze at a given time of the day for example ? I guess we are not the
> first ones to have data patterns that differ between when the analyze is
> run and the query is run.
I think you could run manual ANALYZE during the day just for this one column:
ANALYZE conversations (status);
If it takes too long or causes a performance issue, you could do:
SET default_statistics_target=10;
ANALYZE conversations (status);
You could also change to make autovacuum do this on its own, by setting:
ALTER TABLE conversations SET (autovacuum_analyze_scale_factor=0.005);
If that works but too slow, then maybe ALTER TABLE .. SET STATISTICS 10.
--
Justin