Обсуждение: Query planner wants to use seq scan

Поиск
Список
Период
Сортировка

Query planner wants to use seq scan

От
Bertrand Paquet
Дата:
Hi all,

We have a slow query. After analyzing, the planner decision seems to be discutable : the query is faster when disabling seqscan. See below the two query plan, and an extract from pg_stats.

Any idea about what to change to help the planner ?

An information which can be useful : the number on distinct value on organization_id is very very low, may be the planner does not known that, and take the wrong decision.

Regards,

Bertrand

# explain analyze SELECT  1 AS one FROM "external_sync_messages"  WHERE "external_sync_messages"."organization_id" = 1612 AND ("external_sync_messages"."status" NOT IN ('sent_to_proxy', 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS NULL LIMIT 1;

                                                                 QUERY PLAN                                                                 

--------------------------------------------------------------------------------------------------------------------------------------------

 Limit  (cost=0.00..12.39 rows=1 width=0) (actual time=232.212..232.213 rows=1 loops=1)

   ->  Seq Scan on external_sync_messages  (cost=0.00..79104.69 rows=6385 width=0) (actual time=232.209..232.209 rows=1 loops=1)

         Filter: ((handled_by IS NULL) AND (organization_id = 1612) AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))

         Rows Removed by Filter: 600140

 Planning time: 0.490 ms

 Execution time: 232.246 ms

(6 rows)

# set enable_seqscan = off;

SET

# explain analyze SELECT  1 AS one FROM "external_sync_messages"  WHERE "external_sync_messages"."organization_id" = 1612 AND ("external_sync_messages"."status" NOT IN ('sent_to_proxy', 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS NULL LIMIT 1;

                                                                                      QUERY PLAN                                                                                      

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

 Limit  (cost=0.42..39.88 rows=1 width=0) (actual time=0.030..0.030 rows=1 loops=1)

   ->  Index Scan using index_external_sync_messages_on_organization_id on external_sync_messages  (cost=0.42..251934.05 rows=6385 width=0) (actual time=0.028..0.028 rows=1 loops=1)

         Index Cond: (organization_id = 1612)

         Filter: ((handled_by IS NULL) AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))

 Planning time: 0.103 ms

 Execution time: 0.052 ms

(6 rows)

# SELECT attname, inherited, n_distinct, array_to_string(most_common_vals, E'\n') as most_common_vals FROM pg_stats WHERE tablename = 'external_sync_messages' and attname IN ('status', 'organization_id', 'handled_by');

     attname     | inherited | n_distinct | most_common_vals 

-----------------+-----------+------------+------------------

 handled_by      | f         |          3 | 3               +

                 |           |            | 236140          +

                 |           |            | 54413

 organization_id | f         |         22 | 1612            +

                 |           |            | 287             +

                 |           |            | 967             +

                 |           |            | 1223            +

                 |           |            | 1123            +

                 |           |            | 1930            +

                 |           |            | 841             +

                 |           |            | 1814            +

                 |           |            | 711             +

                 |           |            | 1513            +

                 |           |            | 1794            +

                 |           |            | 1246            +

                 |           |            | 1673            +

                 |           |            | 1552            +

                 |           |            | 1747            +

                 |           |            | 2611            +

                 |           |            | 2217            +

                 |           |            | 2448            +

                 |           |            | 2133            +

                 |           |            | 1861            +

                 |           |            | 2616            +

                 |           |            | 2796

 status          | f         |          6 | ok              +

                 |           |            | ignored         +

                 |           |            | channel_error   +

                 |           |            | in_progress     +

                 |           |            | error           +

                 |           |            | sent_to_proxy

(3 rows)

# select count(*) from external_sync_messages;

 count  

--------

 992912

(1 row)


Re: Query planner wants to use seq scan

От
Alex Ignatov
Дата:
On 27.10.2015 12:35, Bertrand Paquet wrote:
> Hi all,
>
> We have a slow query. After analyzing, the planner decision seems to
> be discutable : the query is faster when disabling seqscan. See below
> the two query plan, and an extract from pg_stats.
>
> Any idea about what to change to help the planner ?
>
> An information which can be useful : the number on distinct value on
> organization_id is very very low, may be the planner does not known
> that, and take the wrong decision.
>
> Regards,
>
> Bertrand
>
> # explain analyze SELECT  1 AS one FROM "external_sync_messages"
> WHERE "external_sync_messages"."organization_id" = 1612 AND
> ("external_sync_messages"."status" NOT IN ('sent_to_proxy',
> 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS
> NULL LIMIT 1;
>
>                               QUERY PLAN
>
>
--------------------------------------------------------------------------------------------------------------------------------------------
>
>  Limit  (cost=0.00..12.39 rows=1 width=0) (actual
> time=232.212..232.213 rows=1 loops=1)
>
>    ->  Seq Scan on external_sync_messages  (cost=0.00..79104.69
> rows=6385 width=0) (actual time=232.209..232.209 rows=1 loops=1)
>
>          Filter: ((handled_by IS NULL) AND (organization_id = 1612)
> AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))
>
>          Rows Removed by Filter: 600140
>
>  Planning time: 0.490 ms
>
>  Execution time: 232.246 ms
>
> (6 rows)
>
> # set enable_seqscan = off;
>
> SET
>
> # explain analyze SELECT  1 AS one FROM "external_sync_messages"
> WHERE "external_sync_messages"."organization_id" = 1612 AND
> ("external_sync_messages"."status" NOT IN ('sent_to_proxy',
> 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS
> NULL LIMIT 1;
>
>                                                     QUERY PLAN
>
>
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>  Limit  (cost=0.42..39.88 rows=1 width=0) (actual time=0.030..0.030
> rows=1 loops=1)
>
>    ->  Index Scan using
> index_external_sync_messages_on_organization_id on
> external_sync_messages  (cost=0.42..251934.05 rows=6385 width=0)
> (actual time=0.028..0.028 rows=1 loops=1)
>
>          Index Cond: (organization_id = 1612)
>
>          Filter: ((handled_by IS NULL) AND ((status)::text <> ALL
> ('{sent_to_proxy,in_progress,ok}'::text[])))
>
>  Planning time: 0.103 ms
>
>  Execution time: 0.052 ms
>
> (6 rows)
>
> # SELECT attname, inherited,
> n_distinct, array_to_string(most_common_vals, E'\n') as
> most_common_vals FROM pg_stats WHERE tablename =
> 'external_sync_messages' and attname IN ('status', 'organization_id',
> 'handled_by');
>
>      attname     | inherited | n_distinct | most_common_vals
>
> -----------------+-----------+------------+------------------
>
>  handled_by      | f         |       3 | 3               +
>
>                  |           |         | 236140          +
>
>                  |           |         | 54413
>
>  organization_id | f         |     22 | 1612            +
>
>                  |           |         | 287             +
>
>                  |           |         | 967             +
>
>                  |           |         | 1223            +
>
>                  |           |         | 1123            +
>
>                  |           |         | 1930            +
>
>                  |           |         | 841             +
>
>                  |           |         | 1814            +
>
>                  |           |         | 711             +
>
>                  |           |         | 1513            +
>
>                  |           |         | 1794            +
>
>                  |           |         | 1246            +
>
>                  |           |         | 1673            +
>
>                  |           |         | 1552            +
>
>                  |           |         | 1747            +
>
>                  |           |         | 2611            +
>
>                  |           |         | 2217            +
>
>                  |           |         | 2448            +
>
>                  |           |         | 2133            +
>
>                  |           |         | 1861            +
>
>                  |           |         | 2616            +
>
>                  |           |         | 2796
>
>  status          | f         |       6 | ok              +
>
>                  |           |         | ignored         +
>
>                  |           |         | channel_error   +
>
>                  |           |         | in_progress     +
>
>                  |           |         | error           +
>
>                  |           |         | sent_to_proxy
>
> (3 rows)
>
> # select count(*) from external_sync_messages;
>
>  count
>
> --------
>
>  992912
>
> (1 row)
>
>
Hello, Bertrand!
May be statistics on external_sync_messages is wrong? i.e planner give
us rows=6385 but seq scan give us Rows Removed by Filter: 600140
Maybe you should recalc it by VACUUM ANALYZE it?

--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Query planner wants to use seq scan

От
Bertrand Paquet
Дата:
Yes, I have run VACUUM ANALYZE, no effect.

Bertrand

2015-10-27 12:08 GMT+01:00 Alex Ignatov <a.ignatov@postgrespro.ru>:
On 27.10.2015 12:35, Bertrand Paquet wrote:
Hi all,

We have a slow query. After analyzing, the planner decision seems to be discutable : the query is faster when disabling seqscan. See below the two query plan, and an extract from pg_stats.

Any idea about what to change to help the planner ?

An information which can be useful : the number on distinct value on organization_id is very very low, may be the planner does not known that, and take the wrong decision.

Regards,

Bertrand

# explain analyze SELECT  1 AS one FROM "external_sync_messages"  WHERE "external_sync_messages"."organization_id" = 1612 AND ("external_sync_messages"."status" NOT IN ('sent_to_proxy', 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS NULL LIMIT 1;

                              QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------

 Limit  (cost=0.00..12.39 rows=1 width=0) (actual time=232.212..232.213 rows=1 loops=1)

   ->  Seq Scan on external_sync_messages  (cost=0.00..79104.69 rows=6385 width=0) (actual time=232.209..232.209 rows=1 loops=1)

         Filter: ((handled_by IS NULL) AND (organization_id = 1612) AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))

         Rows Removed by Filter: 600140

 Planning time: 0.490 ms

 Execution time: 232.246 ms

(6 rows)

# set enable_seqscan = off;

SET

# explain analyze SELECT  1 AS one FROM "external_sync_messages"  WHERE "external_sync_messages"."organization_id" = 1612 AND ("external_sync_messages"."status" NOT IN ('sent_to_proxy', 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS NULL LIMIT 1;

                                                    QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

 Limit  (cost=0.42..39.88 rows=1 width=0) (actual time=0.030..0.030 rows=1 loops=1)

   ->  Index Scan using index_external_sync_messages_on_organization_id on external_sync_messages  (cost=0.42..251934.05 rows=6385 width=0) (actual time=0.028..0.028 rows=1 loops=1)

         Index Cond: (organization_id = 1612)

         Filter: ((handled_by IS NULL) AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))

 Planning time: 0.103 ms

 Execution time: 0.052 ms

(6 rows)

# SELECT attname, inherited, n_distinct, array_to_string(most_common_vals, E'\n') as most_common_vals FROM pg_stats WHERE tablename = 'external_sync_messages' and attname IN ('status', 'organization_id', 'handled_by');

     attname     | inherited | n_distinct | most_common_vals

-----------------+-----------+------------+------------------

 handled_by      | f         |       3 | 3               +

                 |           |         | 236140          +

                 |           |         | 54413

 organization_id | f         |     22 | 1612            +

                 |           |         | 287             +

                 |           |         | 967             +

                 |           |         | 1223            +

                 |           |         | 1123            +

                 |           |         | 1930            +

                 |           |         | 841             +

                 |           |         | 1814            +

                 |           |         | 711             +

                 |           |         | 1513            +

                 |           |         | 1794            +

                 |           |         | 1246            +

                 |           |         | 1673            +

                 |           |         | 1552            +

                 |           |         | 1747            +

                 |           |         | 2611            +

                 |           |         | 2217            +

                 |           |         | 2448            +

                 |           |         | 2133            +

                 |           |         | 1861            +

                 |           |         | 2616            +

                 |           |         | 2796

 status          | f         |       6 | ok              +

                 |           |         | ignored         +

                 |           |         | channel_error   +

                 |           |         | in_progress     +

                 |           |         | error           +

                 |           |         | sent_to_proxy

(3 rows)

# select count(*) from external_sync_messages;

 count

--------

 992912

(1 row)


Hello, Bertrand!
May be statistics on external_sync_messages is wrong? i.e planner give us rows=6385 but seq scan give us Rows Removed by Filter: 600140
Maybe you should recalc it by VACUUM ANALYZE it?

--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: Query planner wants to use seq scan

От
Alex Ignatov
Дата:
On 27.10.2015 14:10, Bertrand Paquet wrote:
Yes, I have run VACUUM ANALYZE, no effect.

Bertrand

2015-10-27 12:08 GMT+01:00 Alex Ignatov <a.ignatov@postgrespro.ru>:
On 27.10.2015 12:35, Bertrand Paquet wrote:
Hi all,

We have a slow query. After analyzing, the planner decision seems to be discutable : the query is faster when disabling seqscan. See below the two query plan, and an extract from pg_stats.

Any idea about what to change to help the planner ?

An information which can be useful : the number on distinct value on organization_id is very very low, may be the planner does not known that, and take the wrong decision.

Regards,

Bertrand

# explain analyze SELECT  1 AS one FROM "external_sync_messages"  WHERE "external_sync_messages"."organization_id" = 1612 AND ("external_sync_messages"."status" NOT IN ('sent_to_proxy', 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS NULL LIMIT 1;

                              QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------

 Limit  (cost=0.00..12.39 rows=1 width=0) (actual time=232.212..232.213 rows=1 loops=1)

   ->  Seq Scan on external_sync_messages  (cost=0.00..79104.69 rows=6385 width=0) (actual time=232.209..232.209 rows=1 loops=1)

         Filter: ((handled_by IS NULL) AND (organization_id = 1612) AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))

         Rows Removed by Filter: 600140

 Planning time: 0.490 ms

 Execution time: 232.246 ms

(6 rows)

# set enable_seqscan = off;

SET

# explain analyze SELECT  1 AS one FROM "external_sync_messages"  WHERE "external_sync_messages"."organization_id" = 1612 AND ("external_sync_messages"."status" NOT IN ('sent_to_proxy', 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS NULL LIMIT 1;

                                                    QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

 Limit  (cost=0.42..39.88 rows=1 width=0) (actual time=0.030..0.030 rows=1 loops=1)

   ->  Index Scan using index_external_sync_messages_on_organization_id on external_sync_messages  (cost=0.42..251934.05 rows=6385 width=0) (actual time=0.028..0.028 rows=1 loops=1)

         Index Cond: (organization_id = 1612)

         Filter: ((handled_by IS NULL) AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))

 Planning time: 0.103 ms

 Execution time: 0.052 ms

(6 rows)

# SELECT attname, inherited, n_distinct, array_to_string(most_common_vals, E'\n') as most_common_vals FROM pg_stats WHERE tablename = 'external_sync_messages' and attname IN ('status', 'organization_id', 'handled_by');

     attname     | inherited | n_distinct | most_common_vals

-----------------+-----------+------------+------------------

 handled_by      | f         |       3 | 3               +

                 |           |         | 236140          +

                 |           |         | 54413

 organization_id | f         |     22 | 1612            +

                 |           |         | 287             +

                 |           |         | 967             +

                 |           |         | 1223            +

                 |           |         | 1123            +

                 |           |         | 1930            +

                 |           |         | 841             +

                 |           |         | 1814            +

                 |           |         | 711             +

                 |           |         | 1513            +

                 |           |         | 1794            +

                 |           |         | 1246            +

                 |           |         | 1673            +

                 |           |         | 1552            +

                 |           |         | 1747            +

                 |           |         | 2611            +

                 |           |         | 2217            +

                 |           |         | 2448            +

                 |           |         | 2133            +

                 |           |         | 1861            +

                 |           |         | 2616            +

                 |           |         | 2796

 status          | f         |       6 | ok              +

                 |           |         | ignored         +

                 |           |         | channel_error   +

                 |           |         | in_progress     +

                 |           |         | error           +

                 |           |         | sent_to_proxy

(3 rows)

# select count(*) from external_sync_messages;

 count

--------

 992912

(1 row)


Hello, Bertrand!
May be statistics on external_sync_messages is wrong? i.e planner give us rows=6385 but seq scan give us Rows Removed by Filter: 600140
Maybe you should recalc it by VACUUM ANALYZE it?

--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


What is the result of
select relname,n_live_tup,n_dead_tup, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze from pg_stat_user_tables where relname='external_sync_messages' ?

-- 
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: Query planner wants to use seq scan

От
Bertrand Paquet
Дата:

        relname         | n_live_tup | n_dead_tup |          last_vacuum          |        last_autovacuum        |         last_analyze          |       last_autoanalyze        

------------------------+------------+------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------

 external_sync_messages |     998105 |      11750 | 2015-10-26 20:15:17.484771+00 | 2015-10-02 15:04:25.944479+00 | 2015-10-26 20:15:19.465308+00 | 2015-10-22 12:24:26.947616+00

(1 row)


2015-10-27 12:17 GMT+01:00 Alex Ignatov <a.ignatov@postgrespro.ru>:
On 27.10.2015 14:10, Bertrand Paquet wrote:
Yes, I have run VACUUM ANALYZE, no effect.

Bertrand

2015-10-27 12:08 GMT+01:00 Alex Ignatov <a.ignatov@postgrespro.ru>:
On 27.10.2015 12:35, Bertrand Paquet wrote:
Hi all,

We have a slow query. After analyzing, the planner decision seems to be discutable : the query is faster when disabling seqscan. See below the two query plan, and an extract from pg_stats.

Any idea about what to change to help the planner ?

An information which can be useful : the number on distinct value on organization_id is very very low, may be the planner does not known that, and take the wrong decision.

Regards,

Bertrand

# explain analyze SELECT  1 AS one FROM "external_sync_messages"  WHERE "external_sync_messages"."organization_id" = 1612 AND ("external_sync_messages"."status" NOT IN ('sent_to_proxy', 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS NULL LIMIT 1;

                              QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------

 Limit  (cost=0.00..12.39 rows=1 width=0) (actual time=232.212..232.213 rows=1 loops=1)

   ->  Seq Scan on external_sync_messages  (cost=0.00..79104.69 rows=6385 width=0) (actual time=232.209..232.209 rows=1 loops=1)

         Filter: ((handled_by IS NULL) AND (organization_id = 1612) AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))

         Rows Removed by Filter: 600140

 Planning time: 0.490 ms

 Execution time: 232.246 ms

(6 rows)

# set enable_seqscan = off;

SET

# explain analyze SELECT  1 AS one FROM "external_sync_messages"  WHERE "external_sync_messages"."organization_id" = 1612 AND ("external_sync_messages"."status" NOT IN ('sent_to_proxy', 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS NULL LIMIT 1;

                                                    QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

 Limit  (cost=0.42..39.88 rows=1 width=0) (actual time=0.030..0.030 rows=1 loops=1)

   ->  Index Scan using index_external_sync_messages_on_organization_id on external_sync_messages  (cost=0.42..251934.05 rows=6385 width=0) (actual time=0.028..0.028 rows=1 loops=1)

         Index Cond: (organization_id = 1612)

         Filter: ((handled_by IS NULL) AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))

 Planning time: 0.103 ms

 Execution time: 0.052 ms

(6 rows)

# SELECT attname, inherited, n_distinct, array_to_string(most_common_vals, E'\n') as most_common_vals FROM pg_stats WHERE tablename = 'external_sync_messages' and attname IN ('status', 'organization_id', 'handled_by');

     attname     | inherited | n_distinct | most_common_vals

-----------------+-----------+------------+------------------

 handled_by      | f         |       3 | 3               +

                 |           |         | 236140          +

                 |           |         | 54413

 organization_id | f         |     22 | 1612            +

                 |           |         | 287             +

                 |           |         | 967             +

                 |           |         | 1223            +

                 |           |         | 1123            +

                 |           |         | 1930            +

                 |           |         | 841             +

                 |           |         | 1814            +

                 |           |         | 711             +

                 |           |         | 1513            +

                 |           |         | 1794            +

                 |           |         | 1246            +

                 |           |         | 1673            +

                 |           |         | 1552            +

                 |           |         | 1747            +

                 |           |         | 2611            +

                 |           |         | 2217            +

                 |           |         | 2448            +

                 |           |         | 2133            +

                 |           |         | 1861            +

                 |           |         | 2616            +

                 |           |         | 2796

 status          | f         |       6 | ok              +

                 |           |         | ignored         +

                 |           |         | channel_error   +

                 |           |         | in_progress     +

                 |           |         | error           +

                 |           |         | sent_to_proxy

(3 rows)

# select count(*) from external_sync_messages;

 count

--------

 992912

(1 row)


Hello, Bertrand!
May be statistics on external_sync_messages is wrong? i.e planner give us rows=6385 but seq scan give us Rows Removed by Filter: 600140
Maybe you should recalc it by VACUUM ANALYZE it?

--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


What is the result of
select relname,n_live_tup,n_dead_tup, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze from pg_stat_user_tables where relname='external_sync_messages' ?

-- 
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: Query planner wants to use seq scan

От
Alex Ignatov
Дата:


On 27.10.2015 14:19, Bertrand Paquet wrote:

        relname         | n_live_tup | n_dead_tup |          last_vacuum          |        last_autovacuum        |         last_analyze          |       last_autoanalyze        

------------------------+------------+------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------

 external_sync_messages |     998105 |      11750 | 2015-10-26 20:15:17.484771+00 | 2015-10-02 15:04:25.944479+00 | 2015-10-26 20:15:19.465308+00 | 2015-10-22 12:24:26.947616+00

(1 row)


2015-10-27 12:17 GMT+01:00 Alex Ignatov <a.ignatov@postgrespro.ru>:
On 27.10.2015 14:10, Bertrand Paquet wrote:
Yes, I have run VACUUM ANALYZE, no effect.

Bertrand

2015-10-27 12:08 GMT+01:00 Alex Ignatov <a.ignatov@postgrespro.ru>:
On 27.10.2015 12:35, Bertrand Paquet wrote:
Hi all,

We have a slow query. After analyzing, the planner decision seems to be discutable : the query is faster when disabling seqscan. See below the two query plan, and an extract from pg_stats.

Any idea about what to change to help the planner ?

An information which can be useful : the number on distinct value on organization_id is very very low, may be the planner does not known that, and take the wrong decision.

Regards,

Bertrand

# explain analyze SELECT  1 AS one FROM "external_sync_messages"  WHERE "external_sync_messages"."organization_id" = 1612 AND ("external_sync_messages"."status" NOT IN ('sent_to_proxy', 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS NULL LIMIT 1;

                              QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------

 Limit  (cost=0.00..12.39 rows=1 width=0) (actual time=232.212..232.213 rows=1 loops=1)

   ->  Seq Scan on external_sync_messages  (cost=0.00..79104.69 rows=6385 width=0) (actual time=232.209..232.209 rows=1 loops=1)

         Filter: ((handled_by IS NULL) AND (organization_id = 1612) AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))

         Rows Removed by Filter: 600140

 Planning time: 0.490 ms

 Execution time: 232.246 ms

(6 rows)

# set enable_seqscan = off;

SET

# explain analyze SELECT  1 AS one FROM "external_sync_messages"  WHERE "external_sync_messages"."organization_id" = 1612 AND ("external_sync_messages"."status" NOT IN ('sent_to_proxy', 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS NULL LIMIT 1;

                                                    QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

 Limit  (cost=0.42..39.88 rows=1 width=0) (actual time=0.030..0.030 rows=1 loops=1)

   ->  Index Scan using index_external_sync_messages_on_organization_id on external_sync_messages  (cost=0.42..251934.05 rows=6385 width=0) (actual time=0.028..0.028 rows=1 loops=1)

         Index Cond: (organization_id = 1612)

         Filter: ((handled_by IS NULL) AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))

 Planning time: 0.103 ms

 Execution time: 0.052 ms

(6 rows)

# SELECT attname, inherited, n_distinct, array_to_string(most_common_vals, E'\n') as most_common_vals FROM pg_stats WHERE tablename = 'external_sync_messages' and attname IN ('status', 'organization_id', 'handled_by');

     attname     | inherited | n_distinct | most_common_vals

-----------------+-----------+------------+------------------

 handled_by      | f         |       3 | 3               +

                 |           |         | 236140          +

                 |           |         | 54413

 organization_id | f         |     22 | 1612            +

                 |           |         | 287             +

                 |           |         | 967             +

                 |           |         | 1223            +

                 |           |         | 1123            +

                 |           |         | 1930            +

                 |           |         | 841             +

                 |           |         | 1814            +

                 |           |         | 711             +

                 |           |         | 1513            +

                 |           |         | 1794            +

                 |           |         | 1246            +

                 |           |         | 1673            +

                 |           |         | 1552            +

                 |           |         | 1747            +

                 |           |         | 2611            +

                 |           |         | 2217            +

                 |           |         | 2448            +

                 |           |         | 2133            +

                 |           |         | 1861            +

                 |           |         | 2616            +

                 |           |         | 2796

 status          | f         |       6 | ok              +

                 |           |         | ignored         +

                 |           |         | channel_error   +

                 |           |         | in_progress     +

                 |           |         | error           +

                 |           |         | sent_to_proxy

(3 rows)

# select count(*) from external_sync_messages;

 count

--------

 992912

(1 row)


Hello, Bertrand!
May be statistics on external_sync_messages is wrong? i.e planner give us rows=6385 but seq scan give us Rows Removed by Filter: 600140
Maybe you should recalc it by VACUUM ANALYZE it?

--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


What is the result of
select relname,n_live_tup,n_dead_tup, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze from pg_stat_user_tables where relname='external_sync_messages' ?

-- 
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


What is yours random_page_cost  parameter in postgres config?
-- 
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: Query planner wants to use seq scan

От
Tom Lane
Дата:
Bertrand Paquet <bertrand.paquet@doctolib.fr> writes:
> We have a slow query. After analyzing, the planner decision seems to be
> discutable : the query is faster when disabling seqscan. See below the two
> query plan, and an extract from pg_stats.

> Any idea about what to change to help the planner ?

Neither one of those plans is very good: you're just hoping that the
Filter condition will let a tuple through sooner rather than later.

If you care about the performance of this type of query, I'd consider
creating an index on (organization_id, status, handled_by) so that all
the conditions can be checked in the index.

            regards, tom lane


Re: Query planner wants to use seq scan

От
Bertrand Paquet
Дата:

show random_page_cost ;

 random_page_cost 

------------------

 4

(1 row)


2015-10-27 12:30 GMT+01:00 Alex Ignatov <a.ignatov@postgrespro.ru>:


On 27.10.2015 14:19, Bertrand Paquet wrote:

        relname         | n_live_tup | n_dead_tup |          last_vacuum          |        last_autovacuum        |         last_analyze          |       last_autoanalyze        

------------------------+------------+------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------

 external_sync_messages |     998105 |      11750 | 2015-10-26 20:15:17.484771+00 | 2015-10-02 15:04:25.944479+00 | 2015-10-26 20:15:19.465308+00 | 2015-10-22 12:24:26.947616+00

(1 row)


2015-10-27 12:17 GMT+01:00 Alex Ignatov <a.ignatov@postgrespro.ru>:
On 27.10.2015 14:10, Bertrand Paquet wrote:
Yes, I have run VACUUM ANALYZE, no effect.

Bertrand

2015-10-27 12:08 GMT+01:00 Alex Ignatov <a.ignatov@postgrespro.ru>:
On 27.10.2015 12:35, Bertrand Paquet wrote:
Hi all,

We have a slow query. After analyzing, the planner decision seems to be discutable : the query is faster when disabling seqscan. See below the two query plan, and an extract from pg_stats.

Any idea about what to change to help the planner ?

An information which can be useful : the number on distinct value on organization_id is very very low, may be the planner does not known that, and take the wrong decision.

Regards,

Bertrand

# explain analyze SELECT  1 AS one FROM "external_sync_messages"  WHERE "external_sync_messages"."organization_id" = 1612 AND ("external_sync_messages"."status" NOT IN ('sent_to_proxy', 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS NULL LIMIT 1;

                              QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------

 Limit  (cost=0.00..12.39 rows=1 width=0) (actual time=232.212..232.213 rows=1 loops=1)

   ->  Seq Scan on external_sync_messages  (cost=0.00..79104.69 rows=6385 width=0) (actual time=232.209..232.209 rows=1 loops=1)

         Filter: ((handled_by IS NULL) AND (organization_id = 1612) AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))

         Rows Removed by Filter: 600140

 Planning time: 0.490 ms

 Execution time: 232.246 ms

(6 rows)

# set enable_seqscan = off;

SET

# explain analyze SELECT  1 AS one FROM "external_sync_messages"  WHERE "external_sync_messages"."organization_id" = 1612 AND ("external_sync_messages"."status" NOT IN ('sent_to_proxy', 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS NULL LIMIT 1;

                                                    QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

 Limit  (cost=0.42..39.88 rows=1 width=0) (actual time=0.030..0.030 rows=1 loops=1)

   ->  Index Scan using index_external_sync_messages_on_organization_id on external_sync_messages  (cost=0.42..251934.05 rows=6385 width=0) (actual time=0.028..0.028 rows=1 loops=1)

         Index Cond: (organization_id = 1612)

         Filter: ((handled_by IS NULL) AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))

 Planning time: 0.103 ms

 Execution time: 0.052 ms

(6 rows)

# SELECT attname, inherited, n_distinct, array_to_string(most_common_vals, E'\n') as most_common_vals FROM pg_stats WHERE tablename = 'external_sync_messages' and attname IN ('status', 'organization_id', 'handled_by');

     attname     | inherited | n_distinct | most_common_vals

-----------------+-----------+------------+------------------

 handled_by      | f         |       3 | 3               +

                 |           |         | 236140          +

                 |           |         | 54413

 organization_id | f         |     22 | 1612            +

                 |           |         | 287             +

                 |           |         | 967             +

                 |           |         | 1223            +

                 |           |         | 1123            +

                 |           |         | 1930            +

                 |           |         | 841             +

                 |           |         | 1814            +

                 |           |         | 711             +

                 |           |         | 1513            +

                 |           |         | 1794            +

                 |           |         | 1246            +

                 |           |         | 1673            +

                 |           |         | 1552            +

                 |           |         | 1747            +

                 |           |         | 2611            +

                 |           |         | 2217            +

                 |           |         | 2448            +

                 |           |         | 2133            +

                 |           |         | 1861            +

                 |           |         | 2616            +

                 |           |         | 2796

 status          | f         |       6 | ok              +

                 |           |         | ignored         +

                 |           |         | channel_error   +

                 |           |         | in_progress     +

                 |           |         | error           +

                 |           |         | sent_to_proxy

(3 rows)

# select count(*) from external_sync_messages;

 count

--------

 992912

(1 row)


Hello, Bertrand!
May be statistics on external_sync_messages is wrong? i.e planner give us rows=6385 but seq scan give us Rows Removed by Filter: 600140
Maybe you should recalc it by VACUUM ANALYZE it?

--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


What is the result of
select relname,n_live_tup,n_dead_tup, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze from pg_stat_user_tables where relname='external_sync_messages' ?

-- 
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


What is yours random_page_cost  parameter in postgres config?
-- 
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: Query planner wants to use seq scan

От
Bertrand Paquet
Дата:
Hi tom,

I did the test yesterday with an index on the three fields, and with a partial index on organization and status and where is null condition on handled.

Le mardi 27 octobre 2015, Tom Lane <tgl@sss.pgh.pa.us> a écrit :
Bertrand Paquet <bertrand.paquet@doctolib.fr> writes:
> We have a slow query. After analyzing, the planner decision seems to be
> discutable : the query is faster when disabling seqscan. See below the two
> query plan, and an extract from pg_stats.

> Any idea about what to change to help the planner ?

Neither one of those plans is very good: you're just hoping that the
Filter condition will let a tuple through sooner rather than later.

If you care about the performance of this type of query, I'd consider
creating an index on (organization_id, status, handled_by) so that all
the conditions can be checked in the index.

                        regards, tom lane

Re: Query planner wants to use seq scan

От
Bertrand Paquet
Дата:
Hi tom,

I did the test yesterday with an index on the three fields, and with a partial index on organization and status and where is null condition on handled. I saw no modification on query plan.
May be I forgot to analyze vacuum after. I will retry tonight. 

I use a btree index. Is it the good solution, even with the In clause ?

Regards,

Bertrand

Le mardi 27 octobre 2015, Tom Lane <tgl@sss.pgh.pa.us> a écrit :
Bertrand Paquet <bertrand.paquet@doctolib.fr> writes:
> We have a slow query. After analyzing, the planner decision seems to be
> discutable : the query is faster when disabling seqscan. See below the two
> query plan, and an extract from pg_stats.

> Any idea about what to change to help the planner ?

Neither one of those plans is very good: you're just hoping that the
Filter condition will let a tuple through sooner rather than later.

If you care about the performance of this type of query, I'd consider
creating an index on (organization_id, status, handled_by) so that all
the conditions can be checked in the index.

                        regards, tom lane

Re: Query planner wants to use seq scan

От
Bertrand Paquet
Дата:
So,

Tonight, the index on the three field is used, may be my yesterday vacuum updated stats.

Thx you for your help.

Regards,

Bertrand




2015-10-27 18:33 GMT+01:00 Bertrand Paquet <bertrand.paquet@doctolib.fr>:
Hi tom,

I did the test yesterday with an index on the three fields, and with a partial index on organization and status and where is null condition on handled. I saw no modification on query plan.
May be I forgot to analyze vacuum after. I will retry tonight. 

I use a btree index. Is it the good solution, even with the In clause ?

Regards,

Bertrand

Le mardi 27 octobre 2015, Tom Lane <tgl@sss.pgh.pa.us> a écrit :
Bertrand Paquet <bertrand.paquet@doctolib.fr> writes:
> We have a slow query. After analyzing, the planner decision seems to be
> discutable : the query is faster when disabling seqscan. See below the two
> query plan, and an extract from pg_stats.

> Any idea about what to change to help the planner ?

Neither one of those plans is very good: you're just hoping that the
Filter condition will let a tuple through sooner rather than later.

If you care about the performance of this type of query, I'd consider
creating an index on (organization_id, status, handled_by) so that all
the conditions can be checked in the index.

                        regards, tom lane

Re: Query planner wants to use seq scan

От
Jim Nasby
Дата:
On 10/27/15 3:56 PM, Bertrand Paquet wrote:
> Tonight, the index on the three field is used, may be my yesterday
> vacuum updated stats.

BTW, you can run just ANALYZE, which is *far* faster than a VACUUM on a
large table.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


Re: Query planner wants to use seq scan

От
Alex Ignatov
Дата:


On 27.10.2015 23:56, Bertrand Paquet wrote:
So,

Tonight, the index on the three field is used, may be my yesterday vacuum updated stats.

Thx you for your help.

Regards,

Bertrand




2015-10-27 18:33 GMT+01:00 Bertrand Paquet <bertrand.paquet@doctolib.fr>:
Hi tom,

I did the test yesterday with an index on the three fields, and with a partial index on organization and status and where is null condition on handled. I saw no modification on query plan.
May be I forgot to analyze vacuum after. I will retry tonight. 

I use a btree index. Is it the good solution, even with the In clause ?

Regards,

Bertrand

Le mardi 27 octobre 2015, Tom Lane <tgl@sss.pgh.pa.us> a écrit :
Bertrand Paquet <bertrand.paquet@doctolib.fr> writes:
> We have a slow query. After analyzing, the planner decision seems to be
> discutable : the query is faster when disabling seqscan. See below the two
> query plan, and an extract from pg_stats.

> Any idea about what to change to help the planner ?

Neither one of those plans is very good: you're just hoping that the
Filter condition will let a tuple through sooner rather than later.

If you care about the performance of this type of query, I'd consider
creating an index on (organization_id, status, handled_by) so that all
the conditions can be checked in the index.

                        regards, tom lane

Hello Bertrand once again!
What's your status? Does the plan changed after deploying three field index ?
-- 
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: Query planner wants to use seq scan

От
Bertrand Paquet
Дата:
Yes, the three fields index AND vacuum solve the issue.

Regards,

Bertrand

2015-10-29 13:27 GMT+01:00 Alex Ignatov <a.ignatov@postgrespro.ru>:


On 27.10.2015 23:56, Bertrand Paquet wrote:
So,

Tonight, the index on the three field is used, may be my yesterday vacuum updated stats.

Thx you for your help.

Regards,

Bertrand




2015-10-27 18:33 GMT+01:00 Bertrand Paquet <bertrand.paquet@doctolib.fr>:
Hi tom,

I did the test yesterday with an index on the three fields, and with a partial index on organization and status and where is null condition on handled. I saw no modification on query plan.
May be I forgot to analyze vacuum after. I will retry tonight. 

I use a btree index. Is it the good solution, even with the In clause ?

Regards,

Bertrand

Le mardi 27 octobre 2015, Tom Lane <tgl@sss.pgh.pa.us> a écrit :
Bertrand Paquet <bertrand.paquet@doctolib.fr> writes:
> We have a slow query. After analyzing, the planner decision seems to be
> discutable : the query is faster when disabling seqscan. See below the two
> query plan, and an extract from pg_stats.

> Any idea about what to change to help the planner ?

Neither one of those plans is very good: you're just hoping that the
Filter condition will let a tuple through sooner rather than later.

If you care about the performance of this type of query, I'd consider
creating an index on (organization_id, status, handled_by) so that all
the conditions can be checked in the index.

                        regards, tom lane

Hello Bertrand once again!
What's your status? Does the plan changed after deploying three field index ?
-- 
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company