Обсуждение: Problems with ordering (can't force query planner to use an index)

Поиск
Список
Период
Сортировка

Problems with ordering (can't force query planner to use an index)

От
Sebastjan Trepca
Дата:
Hey,

I have a table that links content together and it currently holds
about 17 mio records. Typical query is a join with a content table and
link table:

noovo-new=# explain analyze SELECT "core_accessor"."id",
"core_accessor"."content_type_id",
"core_accessor"."object_id", "core_accessor"."ordering",
"core_accessor"."label", "core_accessor"."date_posted",
"core_accessor"."publish_state", "core_accessor"."nooximity_old",
"core_accessor"."rising", "core_accessor"."nooximity",
"core_accessor"."nooximity_old_date_posted",
"core_accessor"."nooximity_date_posted", "core_accessor"."user_id",
"core_accessor"."slot_id", "core_accessor"."slot_type_id",
"core_accessor"."role", "core_base"."object_id",
"core_base"."content_type_id", "core_base"."abstract",
"core_base"."abstract_title", "core_base"."image",
 "core_base"."date_posted", "core_base"."date_modified",
"core_base"."date_expires", "core_base"."publish_state",
 "core_base"."location", "core_base"."location_x",
"core_base"."location_y", "core_base"."raw", "core_base"."author_id",
 "core_base"."excerpt", "core_base"."state_id",
"core_base"."country_id", "core_base"."language",
"core_base"."_identifier",
  "core_base"."slot_url", "core_base"."source_id",
"core_base"."source_content_type_id", "core_base"."source_type",
 "core_base"."source_value", "core_base"."source_title",
"core_base"."direct_to_source", "core_base"."comment_count",
 "core_base"."public" FROM "core_accessor" INNER JOIN core_base AS
core_base ON core_base.content_type_id =
 core_accessor.content_type_id AND core_base.object_id =
core_accessor.object_id WHERE (("core_accessor"."slot_type_id" = 119
 AND "core_accessor"."slot_id" = 472 AND "core_accessor"."label" = E''
AND "core_accessor"."publish_state" >= 60 AND
 "core_accessor"."role" IN (0) AND "core_accessor"."user_id" = 0))
order by core_accessor.date_posted, core_accessor.nooximity LIMIT 5
;

      QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=31930.65..31930.66 rows=5 width=860) (actual
time=711.924..711.927 rows=5 loops=1)
   ->  Sort  (cost=31930.65..31937.80 rows=2861 width=860) (actual
time=711.923..711.923 rows=5 loops=1)
         Sort Key: core_accessor.date_posted, core_accessor.nooximity
         Sort Method:  top-N heapsort  Memory: 31kB
         ->  Nested Loop  (cost=0.00..31883.13 rows=2861 width=860)
(actual time=0.089..543.497 rows=68505 loops=1)
               ->  Index Scan using core_accessor_fresh_idx on
core_accessor  (cost=0.00..5460.07 rows=2970 width=92) (actual
time=0.068..54.921 rows=69312 loops=1)
                     Index Cond: ((slot_id = 472) AND (slot_type_id =
119) AND (label = ''::text) AND (user_id = 0) AND (role = 0) AND
(publish_state >= 60))
               ->  Index Scan using core_base_pkey on core_base
(cost=0.00..8.88 rows=1 width=768) (actual time=0.004..0.005 rows=1
loops=69312)
                     Index Cond: ((core_base.object_id =
core_accessor.object_id) AND (core_base.content_type_id =
core_accessor.content_type_id))
 Total runtime: 712.031 ms
(10 rows)

noovo-new=# select * from pg_stat_user_tables where relname='core_accessor';
 relid | schemaname |    relname    | seq_scan | seq_tup_read |
idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del |
n_tup_hot_upd | n_live_tup | n_dead_tup |          last_vacuum
 | last_autovacuum |         last_analyze          | last_autoanalyze

-------+------------+---------------+----------+--------------+----------+---------------+-----------+-----------+-----------+---------------+------------+------------+-------------------------------+-----------------+-------------------------------+------------------
 51159 | public     | core_accessor |       58 |    749773516 |
13785608 |     149165183 |      9566 |       548 |       347 |
  206 |   17144303 |        251 | 2009-03-03 07:02:19.733778-06 |
           | 2009-03-03 06:17:47.784268-06 |
(1 row)

noovo-new=# \d+ core_accessor;
                                                  Table "public.core_accessor"
          Column           |           Type           |
         Modifiers                          | Description

---------------------------+--------------------------+------------------------------------------------------------+-------------
 id                        | bigint                   | not null
default nextval('core_accessor_id_seq'::regclass) |
 flavor                    | character varying(32)    |
                                            |
 content_type_id           | integer                  | not null
                                            |
 object_id                 | integer                  | not null
                                            |
 publish_state             | smallint                 | not null
                                            |
 date_posted               | timestamp with time zone | not null
                                            |
 user_id                   | integer                  |
                                            |
 slot_id                   | integer                  |
                                            |
 slot_type_id              | integer                  |
                                            |
 role                      | smallint                 |
                                            |
 ordering                  | integer                  |
                                            |
 author_id                 | integer                  |
                                            |
 nooximity_old             | double precision         | default 0.0
                                            |
 rising                    | double precision         | default 0.0
                                            |
 label                     | text                     |
                                            |
 nooximity                 | double precision         | not null
default 1.0                                       |
 nooximity_old_date_posted | timestamp with time zone |
                                            |
 nooximity_date_posted     | timestamp with time zone |
                                            |
Indexes:
    "portal_metainfo_pkey" PRIMARY KEY, btree (id)
    "portal_metainfo_unique_constr" UNIQUE, btree (content_type_id,
object_id, user_id, slot_id, slot_type_id, role, label) CLUSTER
    "core_accessor_date_idx" btree (date_posted, nooximity)
    "core_accessor_dated_idx" btree (slot_id, slot_type_id, label,
user_id, role, publish_state, date_posted, nooximity)
    "core_accessor_fresh_idx" btree (slot_id, slot_type_id, label,
user_id, role, publish_state)
    "core_accessor_popularity_idx" btree (nooximity, date_posted)
Check constraints:
    "portal_metainfo_object_id_check" CHECK (object_id >= 0)
    "portal_metainfo_owner_id_check" CHECK (slot_id >= 0)
Foreign-key constraints:
    "portal_metainfo_accessor_id_fkey" FOREIGN KEY (user_id)
REFERENCES auth_user(id) DEFERRABLE INITIALLY DEFERRED
    "portal_metainfo_content_type_id_fkey" FOREIGN KEY
(content_type_id) REFERENCES django_content_type(id) DEFERRABLE
INITIALLY DEFERRED
    "portal_metainfo_owner_type_id_fkey" FOREIGN KEY (slot_type_id)
REFERENCES django_content_type(id) DEFERRABLE INITIALLY DEFERRED
Has OIDs: no



As far as I understand the explain, it fetches 68505 rows, matches
them with core_base and then tries to sort them? AFAIK it would
probably be much more effective to just find the records in accessor
via core_accessor_dated_idx and then lookup the core_base table? But
for some reason it doesn't want to?

I ran analyze, vacuum and reindex but nothing helped. Queries just eat
all the I/O and block. There is a huge difference between cached and
non-cached queries, like 50.000 to 50 ms.

Help! :)


Thanks, Sebastjan

Re: Problems with ordering (can't force query planner to use an index)

От
Robert Haas
Дата:
On Tue, Mar 3, 2009 at 12:05 PM, Sebastjan Trepca <trepca@gmail.com> wrote:
> Hey,
>
> I have a table that links content together and it currently holds
> about 17 mio records. Typical query is a join with a content table and
> link table:
>
> noovo-new=# explain analyze SELECT "core_accessor"."id",
> "core_accessor"."content_type_id",
> "core_accessor"."object_id", "core_accessor"."ordering",
> "core_accessor"."label", "core_accessor"."date_posted",
> "core_accessor"."publish_state", "core_accessor"."nooximity_old",
> "core_accessor"."rising", "core_accessor"."nooximity",
> "core_accessor"."nooximity_old_date_posted",
> "core_accessor"."nooximity_date_posted", "core_accessor"."user_id",
> "core_accessor"."slot_id", "core_accessor"."slot_type_id",
> "core_accessor"."role", "core_base"."object_id",
> "core_base"."content_type_id", "core_base"."abstract",
> "core_base"."abstract_title", "core_base"."image",
>  "core_base"."date_posted", "core_base"."date_modified",
> "core_base"."date_expires", "core_base"."publish_state",
>  "core_base"."location", "core_base"."location_x",
> "core_base"."location_y", "core_base"."raw", "core_base"."author_id",
>  "core_base"."excerpt", "core_base"."state_id",
> "core_base"."country_id", "core_base"."language",
> "core_base"."_identifier",
>  "core_base"."slot_url", "core_base"."source_id",
> "core_base"."source_content_type_id", "core_base"."source_type",
>  "core_base"."source_value", "core_base"."source_title",
> "core_base"."direct_to_source", "core_base"."comment_count",
>  "core_base"."public" FROM "core_accessor" INNER JOIN core_base AS
> core_base ON core_base.content_type_id =
>  core_accessor.content_type_id AND core_base.object_id =
> core_accessor.object_id WHERE (("core_accessor"."slot_type_id" = 119
>  AND "core_accessor"."slot_id" = 472 AND "core_accessor"."label" = E''
> AND "core_accessor"."publish_state" >= 60 AND
>  "core_accessor"."role" IN (0) AND "core_accessor"."user_id" = 0))
> order by core_accessor.date_posted, core_accessor.nooximity LIMIT 5
> ;
>
>      QUERY PLAN
>
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=31930.65..31930.66 rows=5 width=860) (actual
> time=711.924..711.927 rows=5 loops=1)
>   ->  Sort  (cost=31930.65..31937.80 rows=2861 width=860) (actual
> time=711.923..711.923 rows=5 loops=1)
>         Sort Key: core_accessor.date_posted, core_accessor.nooximity
>         Sort Method:  top-N heapsort  Memory: 31kB
>         ->  Nested Loop  (cost=0.00..31883.13 rows=2861 width=860)
> (actual time=0.089..543.497 rows=68505 loops=1)
>               ->  Index Scan using core_accessor_fresh_idx on
> core_accessor  (cost=0.00..5460.07 rows=2970 width=92) (actual
> time=0.068..54.921 rows=69312 loops=1)
>                     Index Cond: ((slot_id = 472) AND (slot_type_id =
> 119) AND (label = ''::text) AND (user_id = 0) AND (role = 0) AND
> (publish_state >= 60))
>               ->  Index Scan using core_base_pkey on core_base
> (cost=0.00..8.88 rows=1 width=768) (actual time=0.004..0.005 rows=1
> loops=69312)
>                     Index Cond: ((core_base.object_id =
> core_accessor.object_id) AND (core_base.content_type_id =
> core_accessor.content_type_id))
>  Total runtime: 712.031 ms
> (10 rows)
>
> noovo-new=# select * from pg_stat_user_tables where relname='core_accessor';
>  relid | schemaname |    relname    | seq_scan | seq_tup_read |
> idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del |
> n_tup_hot_upd | n_live_tup | n_dead_tup |          last_vacuum
>  | last_autovacuum |         last_analyze          | last_autoanalyze
>
-------+------------+---------------+----------+--------------+----------+---------------+-----------+-----------+-----------+---------------+------------+------------+-------------------------------+-----------------+-------------------------------+------------------
>  51159 | public     | core_accessor |       58 |    749773516 |
> 13785608 |     149165183 |      9566 |       548 |       347 |
>  206 |   17144303 |        251 | 2009-03-03 07:02:19.733778-06 |
>           | 2009-03-03 06:17:47.784268-06 |
> (1 row)
>
> noovo-new=# \d+ core_accessor;
>                                                  Table "public.core_accessor"
>          Column           |           Type           |
>         Modifiers                          | Description
>
---------------------------+--------------------------+------------------------------------------------------------+-------------
>  id                        | bigint                   | not null
> default nextval('core_accessor_id_seq'::regclass) |
>  flavor                    | character varying(32)    |
>                                            |
>  content_type_id           | integer                  | not null
>                                            |
>  object_id                 | integer                  | not null
>                                            |
>  publish_state             | smallint                 | not null
>                                            |
>  date_posted               | timestamp with time zone | not null
>                                            |
>  user_id                   | integer                  |
>                                            |
>  slot_id                   | integer                  |
>                                            |
>  slot_type_id              | integer                  |
>                                            |
>  role                      | smallint                 |
>                                            |
>  ordering                  | integer                  |
>                                            |
>  author_id                 | integer                  |
>                                            |
>  nooximity_old             | double precision         | default 0.0
>                                            |
>  rising                    | double precision         | default 0.0
>                                            |
>  label                     | text                     |
>                                            |
>  nooximity                 | double precision         | not null
> default 1.0                                       |
>  nooximity_old_date_posted | timestamp with time zone |
>                                            |
>  nooximity_date_posted     | timestamp with time zone |
>                                            |
> Indexes:
>    "portal_metainfo_pkey" PRIMARY KEY, btree (id)
>    "portal_metainfo_unique_constr" UNIQUE, btree (content_type_id,
> object_id, user_id, slot_id, slot_type_id, role, label) CLUSTER
>    "core_accessor_date_idx" btree (date_posted, nooximity)
>    "core_accessor_dated_idx" btree (slot_id, slot_type_id, label,
> user_id, role, publish_state, date_posted, nooximity)
>    "core_accessor_fresh_idx" btree (slot_id, slot_type_id, label,
> user_id, role, publish_state)
>    "core_accessor_popularity_idx" btree (nooximity, date_posted)
> Check constraints:
>    "portal_metainfo_object_id_check" CHECK (object_id >= 0)
>    "portal_metainfo_owner_id_check" CHECK (slot_id >= 0)
> Foreign-key constraints:
>    "portal_metainfo_accessor_id_fkey" FOREIGN KEY (user_id)
> REFERENCES auth_user(id) DEFERRABLE INITIALLY DEFERRED
>    "portal_metainfo_content_type_id_fkey" FOREIGN KEY
> (content_type_id) REFERENCES django_content_type(id) DEFERRABLE
> INITIALLY DEFERRED
>    "portal_metainfo_owner_type_id_fkey" FOREIGN KEY (slot_type_id)
> REFERENCES django_content_type(id) DEFERRABLE INITIALLY DEFERRED
> Has OIDs: no
>
>
>
> As far as I understand the explain, it fetches 68505 rows, matches
> them with core_base and then tries to sort them? AFAIK it would
> probably be much more effective to just find the records in accessor
> via core_accessor_dated_idx and then lookup the core_base table? But
> for some reason it doesn't want to?
>
> I ran analyze, vacuum and reindex but nothing helped. Queries just eat
> all the I/O and block. There is a huge difference between cached and
> non-cached queries, like 50.000 to 50 ms.
>
> Help! :)

Please send the output of EXPLAIN ANALYZE for this query.

...Robert

Re: Problems with ordering (can't force query planner to use an index)

От
Sebastjan Trepca
Дата:
But it's already attached in the first mail or am I missing something?

If you don't see it, check this: http://pastebin.com/d71b996d0

Sebastjan



On Tue, Mar 3, 2009 at 6:12 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Mar 3, 2009 at 12:05 PM, Sebastjan Trepca <trepca@gmail.com> wrote:
>> Hey,
>>
>> I have a table that links content together and it currently holds
>> about 17 mio records. Typical query is a join with a content table and
>> link table:
>>
>> noovo-new=# explain analyze SELECT "core_accessor"."id",
>> "core_accessor"."content_type_id",
>> "core_accessor"."object_id", "core_accessor"."ordering",
>> "core_accessor"."label", "core_accessor"."date_posted",
>> "core_accessor"."publish_state", "core_accessor"."nooximity_old",
>> "core_accessor"."rising", "core_accessor"."nooximity",
>> "core_accessor"."nooximity_old_date_posted",
>> "core_accessor"."nooximity_date_posted", "core_accessor"."user_id",
>> "core_accessor"."slot_id", "core_accessor"."slot_type_id",
>> "core_accessor"."role", "core_base"."object_id",
>> "core_base"."content_type_id", "core_base"."abstract",
>> "core_base"."abstract_title", "core_base"."image",
>>  "core_base"."date_posted", "core_base"."date_modified",
>> "core_base"."date_expires", "core_base"."publish_state",
>>  "core_base"."location", "core_base"."location_x",
>> "core_base"."location_y", "core_base"."raw", "core_base"."author_id",
>>  "core_base"."excerpt", "core_base"."state_id",
>> "core_base"."country_id", "core_base"."language",
>> "core_base"."_identifier",
>>  "core_base"."slot_url", "core_base"."source_id",
>> "core_base"."source_content_type_id", "core_base"."source_type",
>>  "core_base"."source_value", "core_base"."source_title",
>> "core_base"."direct_to_source", "core_base"."comment_count",
>>  "core_base"."public" FROM "core_accessor" INNER JOIN core_base AS
>> core_base ON core_base.content_type_id =
>>  core_accessor.content_type_id AND core_base.object_id =
>> core_accessor.object_id WHERE (("core_accessor"."slot_type_id" = 119
>>  AND "core_accessor"."slot_id" = 472 AND "core_accessor"."label" = E''
>> AND "core_accessor"."publish_state" >= 60 AND
>>  "core_accessor"."role" IN (0) AND "core_accessor"."user_id" = 0))
>> order by core_accessor.date_posted, core_accessor.nooximity LIMIT 5
>> ;
>>
>>      QUERY PLAN
>>
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>  Limit  (cost=31930.65..31930.66 rows=5 width=860) (actual
>> time=711.924..711.927 rows=5 loops=1)
>>   ->  Sort  (cost=31930.65..31937.80 rows=2861 width=860) (actual
>> time=711.923..711.923 rows=5 loops=1)
>>         Sort Key: core_accessor.date_posted, core_accessor.nooximity
>>         Sort Method:  top-N heapsort  Memory: 31kB
>>         ->  Nested Loop  (cost=0.00..31883.13 rows=2861 width=860)
>> (actual time=0.089..543.497 rows=68505 loops=1)
>>               ->  Index Scan using core_accessor_fresh_idx on
>> core_accessor  (cost=0.00..5460.07 rows=2970 width=92) (actual
>> time=0.068..54.921 rows=69312 loops=1)
>>                     Index Cond: ((slot_id = 472) AND (slot_type_id =
>> 119) AND (label = ''::text) AND (user_id = 0) AND (role = 0) AND
>> (publish_state >= 60))
>>               ->  Index Scan using core_base_pkey on core_base
>> (cost=0.00..8.88 rows=1 width=768) (actual time=0.004..0.005 rows=1
>> loops=69312)
>>                     Index Cond: ((core_base.object_id =
>> core_accessor.object_id) AND (core_base.content_type_id =
>> core_accessor.content_type_id))
>>  Total runtime: 712.031 ms
>> (10 rows)
>>
>> noovo-new=# select * from pg_stat_user_tables where relname='core_accessor';
>>  relid | schemaname |    relname    | seq_scan | seq_tup_read |
>> idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del |
>> n_tup_hot_upd | n_live_tup | n_dead_tup |          last_vacuum
>>  | last_autovacuum |         last_analyze          | last_autoanalyze
>>
-------+------------+---------------+----------+--------------+----------+---------------+-----------+-----------+-----------+---------------+------------+------------+-------------------------------+-----------------+-------------------------------+------------------
>>  51159 | public     | core_accessor |       58 |    749773516 |
>> 13785608 |     149165183 |      9566 |       548 |       347 |
>>  206 |   17144303 |        251 | 2009-03-03 07:02:19.733778-06 |
>>           | 2009-03-03 06:17:47.784268-06 |
>> (1 row)
>>
>> noovo-new=# \d+ core_accessor;
>>                                                  Table "public.core_accessor"
>>          Column           |           Type           |
>>         Modifiers                          | Description
>>
---------------------------+--------------------------+------------------------------------------------------------+-------------
>>  id                        | bigint                   | not null
>> default nextval('core_accessor_id_seq'::regclass) |
>>  flavor                    | character varying(32)    |
>>                                            |
>>  content_type_id           | integer                  | not null
>>                                            |
>>  object_id                 | integer                  | not null
>>                                            |
>>  publish_state             | smallint                 | not null
>>                                            |
>>  date_posted               | timestamp with time zone | not null
>>                                            |
>>  user_id                   | integer                  |
>>                                            |
>>  slot_id                   | integer                  |
>>                                            |
>>  slot_type_id              | integer                  |
>>                                            |
>>  role                      | smallint                 |
>>                                            |
>>  ordering                  | integer                  |
>>                                            |
>>  author_id                 | integer                  |
>>                                            |
>>  nooximity_old             | double precision         | default 0.0
>>                                            |
>>  rising                    | double precision         | default 0.0
>>                                            |
>>  label                     | text                     |
>>                                            |
>>  nooximity                 | double precision         | not null
>> default 1.0                                       |
>>  nooximity_old_date_posted | timestamp with time zone |
>>                                            |
>>  nooximity_date_posted     | timestamp with time zone |
>>                                            |
>> Indexes:
>>    "portal_metainfo_pkey" PRIMARY KEY, btree (id)
>>    "portal_metainfo_unique_constr" UNIQUE, btree (content_type_id,
>> object_id, user_id, slot_id, slot_type_id, role, label) CLUSTER
>>    "core_accessor_date_idx" btree (date_posted, nooximity)
>>    "core_accessor_dated_idx" btree (slot_id, slot_type_id, label,
>> user_id, role, publish_state, date_posted, nooximity)
>>    "core_accessor_fresh_idx" btree (slot_id, slot_type_id, label,
>> user_id, role, publish_state)
>>    "core_accessor_popularity_idx" btree (nooximity, date_posted)
>> Check constraints:
>>    "portal_metainfo_object_id_check" CHECK (object_id >= 0)
>>    "portal_metainfo_owner_id_check" CHECK (slot_id >= 0)
>> Foreign-key constraints:
>>    "portal_metainfo_accessor_id_fkey" FOREIGN KEY (user_id)
>> REFERENCES auth_user(id) DEFERRABLE INITIALLY DEFERRED
>>    "portal_metainfo_content_type_id_fkey" FOREIGN KEY
>> (content_type_id) REFERENCES django_content_type(id) DEFERRABLE
>> INITIALLY DEFERRED
>>    "portal_metainfo_owner_type_id_fkey" FOREIGN KEY (slot_type_id)
>> REFERENCES django_content_type(id) DEFERRABLE INITIALLY DEFERRED
>> Has OIDs: no
>>
>>
>>
>> As far as I understand the explain, it fetches 68505 rows, matches
>> them with core_base and then tries to sort them? AFAIK it would
>> probably be much more effective to just find the records in accessor
>> via core_accessor_dated_idx and then lookup the core_base table? But
>> for some reason it doesn't want to?
>>
>> I ran analyze, vacuum and reindex but nothing helped. Queries just eat
>> all the I/O and block. There is a huge difference between cached and
>> non-cached queries, like 50.000 to 50 ms.
>>
>> Help! :)
>
> Please send the output of EXPLAIN ANALYZE for this query.
>
> ...Robert
>

Re: Problems with ordering (can't force query planner to use an index)

От
David Wilson
Дата:
On Tue, Mar 3, 2009 at 12:05 PM, Sebastjan Trepca <trepca@gmail.com> wrote:

>         ->  Nested Loop  (cost=0.00..31883.13 rows=2861 width=860)
> (actual time=0.089..543.497 rows=68505 loops=1)
>               ->  Index Scan using core_accessor_fresh_idx on
> core_accessor  (cost=0.00..5460.07 rows=2970 width=92) (actual
> time=0.068..54.921 rows=69312 loops=1)
>                     Index Cond: ((slot_id = 472) AND (slot_type_id =
> 119) AND (label = ''::text) AND (user_id = 0) AND (role = 0) AND
> (publish_state >= 60))

That index scan on core_accessor_fresh_idx has a pretty big disparity
between what the planer expects to get (2970 rows) and what it
actually gets (69312 rows). You should try increasing the statistics
target if you haven't, then re-analyze and try the query again to see
if the planner picks something better. The default of 10 is pretty
small- try 100, or higher.



--
- David T. Wilson
david.t.wilson@gmail.com

Re: Problems with ordering (can't force query planner to use an index)

От
Sebastjan Trepca
Дата:
Set statistics to 1000, reanalyzed and got exactly same results:


noovo-new=# explain analyze SELECT "core_accessor"."id",
"core_accessor"."content_type_id",
"core_accessor"."object_id", "core_accessor"."ordering",
"core_accessor"."label", "core_accessor"."date_posted",
"core_accessor"."publish_state", "core_accessor"."nooximity_old",
"core_accessor"."rising", "core_accessor"."nooximity",
"core_accessor"."nooximity_old_date_posted",
"core_accessor"."nooximity_date_posted", "core_accessor"."user_id",
"core_accessor"."slot_id", "core_accessor"."slot_type_id",
"core_accessor"."role", "core_base"."object_id",
"core_base"."content_type_id", "core_base"."abstract",
"core_base"."abstract_title", "core_base"."image",
 "core_base"."date_posted", "core_base"."date_modified",
"core_base"."date_expires", "core_base"."publish_state",
 "core_base"."location", "core_base"."location_x",
"core_base"."location_y", "core_base"."raw", "core_base"."author_id",
 "core_base"."excerpt", "core_base"."state_id",
"core_base"."country_id", "core_base"."language",
"core_base"."_identifier",
  "core_base"."slot_url", "core_base"."source_id",
"core_base"."source_content_type_id", "core_base"."source_type",
 "core_base"."source_value", "core_base"."source_title",
"core_base"."direct_to_source", "core_base"."comment_count",
 "core_base"."public" FROM "core_accessor" INNER JOIN core_base AS
core_base ON core_base.content_type_id =
 core_accessor.content_type_id AND core_base.object_id =
core_accessor.object_id WHERE (("core_accessor"."slot_type_id" = 119
 AND "core_accessor"."slot_id" = 472 AND "core_accessor"."label" = E''
AND "core_accessor"."publish_state" >= 60 AND
 "core_accessor"."role" IN (0) AND "core_accessor"."user_id" = 0))
order by core_accessor.date_posted, core_accessor.nooximity LIMIT 5
;

      QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=31716.13..31716.14 rows=5 width=860) (actual
time=711.340..711.343 rows=5 loops=1)
   ->  Sort  (cost=31716.13..31722.19 rows=2424 width=860) (actual
time=711.339..711.339 rows=5 loops=1)
         Sort Key: core_accessor.date_posted, core_accessor.nooximity
         Sort Method:  top-N heapsort  Memory: 31kB
         ->  Nested Loop  (cost=0.00..31675.87 rows=2424 width=860)
(actual time=0.076..544.039 rows=68505 loops=1)
               ->  Index Scan using core_accessor_fresh_idx on
core_accessor  (cost=0.00..9234.77 rows=2511 width=92) (actual
time=0.058..55.225 rows=69312 loops=1)
                     Index Cond: ((slot_id = 472) AND (slot_type_id =
119) AND (label = ''::text) AND (user_id = 0) AND (role = 0) AND
(publish_state >= 60))
               ->  Index Scan using core_base_pkey on core_base
(cost=0.00..8.92 rows=1 width=768) (actual time=0.005..0.005 rows=1
loops=69312)
                     Index Cond: ((core_base.object_id =
core_accessor.object_id) AND (core_base.content_type_id =
core_accessor.content_type_id))
 Total runtime: 711.443 ms
(10 rows)



This is how I did it:

noovo-new=# alter table core_accessor alter column slot_id set statistics 1000;
ALTER TABLE
noovo-new=# alter table core_accessor alter column slot_type_id set
statistics 1000;
ALTER TABLE
noovo-new=# alter table core_accessor alter column label set statistics 1000;
ALTER TABLE
noovo-new=# alter table core_accessor alter column user_id set statistics 1000;
ALTER TABLE
noovo-new=# alter table core_accessor alter column role set statistics 1000;
ALTER TABLE
noovo-new=# alter table core_accessor alter column publish_state set
statistics 1000;
ALTER TABLE
noovo-new=# analyze core_accessor;
ANALYZE



Sebastjan



On Tue, Mar 3, 2009 at 6:34 PM, David Wilson <david.t.wilson@gmail.com> wrote:
> On Tue, Mar 3, 2009 at 12:05 PM, Sebastjan Trepca <trepca@gmail.com> wrote:
>
>>         ->  Nested Loop  (cost=0.00..31883.13 rows=2861 width=860)
>> (actual time=0.089..543.497 rows=68505 loops=1)
>>               ->  Index Scan using core_accessor_fresh_idx on
>> core_accessor  (cost=0.00..5460.07 rows=2970 width=92) (actual
>> time=0.068..54.921 rows=69312 loops=1)
>>                     Index Cond: ((slot_id = 472) AND (slot_type_id =
>> 119) AND (label = ''::text) AND (user_id = 0) AND (role = 0) AND
>> (publish_state >= 60))
>
> That index scan on core_accessor_fresh_idx has a pretty big disparity
> between what the planer expects to get (2970 rows) and what it
> actually gets (69312 rows). You should try increasing the statistics
> target if you haven't, then re-analyze and try the query again to see
> if the planner picks something better. The default of 10 is pretty
> small- try 100, or higher.
>
>
>
> --
> - David T. Wilson
> david.t.wilson@gmail.com
>

Re: Problems with ordering (can't force query planner to use an index)

От
Robert Haas
Дата:
On Tue, Mar 3, 2009 at 12:20 PM, Sebastjan Trepca <trepca@gmail.com> wrote:
> But it's already attached in the first mail or am I missing something?
>
> If you don't see it, check this: http://pastebin.com/d71b996d0

Woops, sorry, I thought you had sent plain EXPLAIN.  I see it now.

The lowest level at which I see a problem is here:

->  Index Scan using core_accessor_fresh_idx on core_accessor
(cost=0.00..5460.07 rows=2970 width=92) (actual time=0.068..54.921
rows=69312 loops=1)
    Index Cond: ((slot_id = 472) AND (slot_type_id = 119) AND (label =
''::text) AND (user_id = 0) AND (role = 0) AND (publish_state >= 60))

For some reason it expect 2970 rows but gets 69312.

A good place to start is to change your default_statistics_target
value to 100 in postgresql.conf, restart postgresql, and re-ANALYZE.

...Robert

Re: Problems with ordering (can't force query planner to use an index)

От
Sebastjan Trepca
Дата:
Still the same :/

I raised the default_statistics_target to 600 (it was already 100). I
then restarted pg, ran analyze through all tables and yet there is not
effect.
This is the output for core_accessor:
INFO:  analyzing "public.core_accessor"
INFO:  "core_accessor": scanned 291230 of 291230 pages, containing
17144315 live rows and 0 dead rows; 300000 rows in sample, 17144315
estimated total rows

It thinks there are even less rows in the set:


---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=30816.49..30816.50 rows=5 width=855) (actual
time=683.907..683.910 rows=5 loops=1)
   ->  Sort  (cost=30816.49..30822.29 rows=2321 width=855) (actual
time=683.906..683.907 rows=5 loops=1)
         Sort Key: core_accessor.date_posted, core_accessor.nooximity
         Sort Method:  top-N heapsort  Memory: 31kB
         ->  Nested Loop  (cost=0.00..30777.94 rows=2321 width=855)
(actual time=0.072..517.970 rows=68505 loops=1)
               ->  Index Scan using core_accessor_fresh_idx on
core_accessor  (cost=0.00..8955.44 rows=2440 width=92) (actual
time=0.056..53.107 rows=69312 loops=1)
                     Index Cond: ((slot_id = 472) AND (slot_type_id =
119) AND (label = ''::text) AND (user_id = 0) AND (role = 0) AND
(publish_state >= 60))
               ->  Index Scan using core_base_pkey on core_base
(cost=0.00..8.93 rows=1 width=763) (actual time=0.004..0.005 rows=1
loops=69312)
                     Index Cond: ((core_base.object_id =
core_accessor.object_id) AND (core_base.content_type_id =
core_accessor.content_type_id))
 Total runtime: 684.015 ms
(10 rows)





Sebastjan



On Tue, Mar 3, 2009 at 6:40 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Mar 3, 2009 at 12:20 PM, Sebastjan Trepca <trepca@gmail.com> wrote:
>> But it's already attached in the first mail or am I missing something?
>>
>> If you don't see it, check this: http://pastebin.com/d71b996d0
>
> Woops, sorry, I thought you had sent plain EXPLAIN.  I see it now.
>
> The lowest level at which I see a problem is here:
>
> ->  Index Scan using core_accessor_fresh_idx on core_accessor
> (cost=0.00..5460.07 rows=2970 width=92) (actual time=0.068..54.921
> rows=69312 loops=1)
>    Index Cond: ((slot_id = 472) AND (slot_type_id = 119) AND (label =
> ''::text) AND (user_id = 0) AND (role = 0) AND (publish_state >= 60))
>
> For some reason it expect 2970 rows but gets 69312.
>
> A good place to start is to change your default_statistics_target
> value to 100 in postgresql.conf, restart postgresql, and re-ANALYZE.
>
> ...Robert
>

Re: Problems with ordering (can't force query planner to use an index)

От
Sebastjan Trepca
Дата:
Maybe this is useful, I removed the JOIN and it uses other
index(core_accessor_date_idx indexes (date_posted, nooximity)), but
its still hardly any better:

noovo-new=# explain analyze SELECT * FROM "core_accessor" WHERE
(("core_accessor"."slot_type_id" = 119
noovo-new(#  AND "core_accessor"."slot_id" = 472 AND
"core_accessor"."label" = E'' AND "core_accessor"."publish_state" >=
60 AND
noovo-new(#  "core_accessor"."role" IN (0) AND
"core_accessor"."user_id" = 0)) ORDER BY "core_accessor"."date_posted"
DESC, "core_accessor"."nooximity" DESC LIMIT 5
noovo-new-# ;

       QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..3709.56 rows=5 width=178) (actual
time=4593.867..4597.587 rows=5 loops=1)
   ->  Index Scan Backward using core_accessor_date_idx on
core_accessor  (cost=0.00..1810265.67 rows=2440 width=178) (actual
time=4593.866..4597.583 rows=5 loops=1)
         Filter: ((publish_state >= 60) AND (slot_type_id = 119) AND
(slot_id = 472) AND (label = ''::text) AND (role = 0) AND (user_id =
0))
 Total runtime: 4597.632 ms
(4 rows)


Sebastjan



On Tue, Mar 3, 2009 at 8:05 PM, Sebastjan Trepca <trepca@gmail.com> wrote:
> Still the same :/
>
> I raised the default_statistics_target to 600 (it was already 100). I
> then restarted pg, ran analyze through all tables and yet there is not
> effect.
> This is the output for core_accessor:
> INFO:  analyzing "public.core_accessor"
> INFO:  "core_accessor": scanned 291230 of 291230 pages, containing
> 17144315 live rows and 0 dead rows; 300000 rows in sample, 17144315
> estimated total rows
>
> It thinks there are even less rows in the set:
>
>
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=30816.49..30816.50 rows=5 width=855) (actual
> time=683.907..683.910 rows=5 loops=1)
>   ->  Sort  (cost=30816.49..30822.29 rows=2321 width=855) (actual
> time=683.906..683.907 rows=5 loops=1)
>         Sort Key: core_accessor.date_posted, core_accessor.nooximity
>         Sort Method:  top-N heapsort  Memory: 31kB
>         ->  Nested Loop  (cost=0.00..30777.94 rows=2321 width=855)
> (actual time=0.072..517.970 rows=68505 loops=1)
>               ->  Index Scan using core_accessor_fresh_idx on
> core_accessor  (cost=0.00..8955.44 rows=2440 width=92) (actual
> time=0.056..53.107 rows=69312 loops=1)
>                     Index Cond: ((slot_id = 472) AND (slot_type_id =
> 119) AND (label = ''::text) AND (user_id = 0) AND (role = 0) AND
> (publish_state >= 60))
>               ->  Index Scan using core_base_pkey on core_base
> (cost=0.00..8.93 rows=1 width=763) (actual time=0.004..0.005 rows=1
> loops=69312)
>                     Index Cond: ((core_base.object_id =
> core_accessor.object_id) AND (core_base.content_type_id =
> core_accessor.content_type_id))
>  Total runtime: 684.015 ms
> (10 rows)
>
>
>
>
>
> Sebastjan
>
>
>
> On Tue, Mar 3, 2009 at 6:40 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Tue, Mar 3, 2009 at 12:20 PM, Sebastjan Trepca <trepca@gmail.com> wrote:
>>> But it's already attached in the first mail or am I missing something?
>>>
>>> If you don't see it, check this: http://pastebin.com/d71b996d0
>>
>> Woops, sorry, I thought you had sent plain EXPLAIN.  I see it now.
>>
>> The lowest level at which I see a problem is here:
>>
>> ->  Index Scan using core_accessor_fresh_idx on core_accessor
>> (cost=0.00..5460.07 rows=2970 width=92) (actual time=0.068..54.921
>> rows=69312 loops=1)
>>    Index Cond: ((slot_id = 472) AND (slot_type_id = 119) AND (label =
>> ''::text) AND (user_id = 0) AND (role = 0) AND (publish_state >= 60))
>>
>> For some reason it expect 2970 rows but gets 69312.
>>
>> A good place to start is to change your default_statistics_target
>> value to 100 in postgresql.conf, restart postgresql, and re-ANALYZE.
>>
>> ...Robert
>>
>

Re: Problems with ordering (can't force query planner to use an index)

От
Robert Haas
Дата:
On Tue, Mar 3, 2009 at 2:16 PM, Sebastjan Trepca <trepca@gmail.com> wrote:
> Maybe this is useful, I removed the JOIN and it uses other
> index(core_accessor_date_idx indexes (date_posted, nooximity)), but
> its still hardly any better:
>
> noovo-new=# explain analyze SELECT * FROM "core_accessor" WHERE
> (("core_accessor"."slot_type_id" = 119
> noovo-new(#  AND "core_accessor"."slot_id" = 472 AND
> "core_accessor"."label" = E'' AND "core_accessor"."publish_state" >=
> 60 AND
> noovo-new(#  "core_accessor"."role" IN (0) AND
> "core_accessor"."user_id" = 0)) ORDER BY "core_accessor"."date_posted"
> DESC, "core_accessor"."nooximity" DESC LIMIT 5
> noovo-new-# ;
>
>       QUERY PLAN
>
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=0.00..3709.56 rows=5 width=178) (actual
> time=4593.867..4597.587 rows=5 loops=1)
>   ->  Index Scan Backward using core_accessor_date_idx on
> core_accessor  (cost=0.00..1810265.67 rows=2440 width=178) (actual
> time=4593.866..4597.583 rows=5 loops=1)
>         Filter: ((publish_state >= 60) AND (slot_type_id = 119) AND
> (slot_id = 472) AND (label = ''::text) AND (role = 0) AND (user_id =
> 0))
>  Total runtime: 4597.632 ms
> (4 rows)
>
>
> Sebastjan

Well, in that case, you are being bitten by the fact that our
multi-column selectivity estimates are not very good.  The planner has
good information on how each column behaves in isolation, but not how
they act together.  I've found this to be a very difficult problem to
fix.

Which of the parameters in this query vary and which ones are
typically always the same?  Sometimes you can improve things by
creating an appropriate partial index.

...Robert

Re: Problems with ordering (can't force query planner to use an index)

От
Tom Lane
Дата:
Sebastjan Trepca <trepca@gmail.com> writes:
> It thinks there are even less rows in the set:

>                ->  Index Scan using core_accessor_fresh_idx on
> core_accessor  (cost=0.00..8955.44 rows=2440 width=92) (actual
> time=0.056..53.107 rows=69312 loops=1)
>                      Index Cond: ((slot_id = 472) AND (slot_type_id =
> 119) AND (label = ''::text) AND (user_id = 0) AND (role = 0) AND
> (publish_state >= 60))

Maybe you should get rid of this six-column index, if you'd rather the
query didn't use it.  It seems a tad overspecialized anyway.

            regards, tom lane