Problems with ordering (can't force query planner to use an index)

Поиск
Список
Период
Сортировка
От Sebastjan Trepca
Тема Problems with ordering (can't force query planner to use an index)
Дата
Msg-id cd329af80903030905y2efa15bdt123779ee8877df4a@mail.gmail.com
обсуждение исходный текст
Ответы Re: Problems with ordering (can't force query planner to use an index)
Re: Problems with ordering (can't force query planner to use an index)
Список pgsql-performance
Hey,

I have a table that links content together and it currently holds
about 17 mio records. Typical query is a join with a content table and
link table:

noovo-new=# explain analyze SELECT "core_accessor"."id",
"core_accessor"."content_type_id",
"core_accessor"."object_id", "core_accessor"."ordering",
"core_accessor"."label", "core_accessor"."date_posted",
"core_accessor"."publish_state", "core_accessor"."nooximity_old",
"core_accessor"."rising", "core_accessor"."nooximity",
"core_accessor"."nooximity_old_date_posted",
"core_accessor"."nooximity_date_posted", "core_accessor"."user_id",
"core_accessor"."slot_id", "core_accessor"."slot_type_id",
"core_accessor"."role", "core_base"."object_id",
"core_base"."content_type_id", "core_base"."abstract",
"core_base"."abstract_title", "core_base"."image",
 "core_base"."date_posted", "core_base"."date_modified",
"core_base"."date_expires", "core_base"."publish_state",
 "core_base"."location", "core_base"."location_x",
"core_base"."location_y", "core_base"."raw", "core_base"."author_id",
 "core_base"."excerpt", "core_base"."state_id",
"core_base"."country_id", "core_base"."language",
"core_base"."_identifier",
  "core_base"."slot_url", "core_base"."source_id",
"core_base"."source_content_type_id", "core_base"."source_type",
 "core_base"."source_value", "core_base"."source_title",
"core_base"."direct_to_source", "core_base"."comment_count",
 "core_base"."public" FROM "core_accessor" INNER JOIN core_base AS
core_base ON core_base.content_type_id =
 core_accessor.content_type_id AND core_base.object_id =
core_accessor.object_id WHERE (("core_accessor"."slot_type_id" = 119
 AND "core_accessor"."slot_id" = 472 AND "core_accessor"."label" = E''
AND "core_accessor"."publish_state" >= 60 AND
 "core_accessor"."role" IN (0) AND "core_accessor"."user_id" = 0))
order by core_accessor.date_posted, core_accessor.nooximity LIMIT 5
;

      QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=31930.65..31930.66 rows=5 width=860) (actual
time=711.924..711.927 rows=5 loops=1)
   ->  Sort  (cost=31930.65..31937.80 rows=2861 width=860) (actual
time=711.923..711.923 rows=5 loops=1)
         Sort Key: core_accessor.date_posted, core_accessor.nooximity
         Sort Method:  top-N heapsort  Memory: 31kB
         ->  Nested Loop  (cost=0.00..31883.13 rows=2861 width=860)
(actual time=0.089..543.497 rows=68505 loops=1)
               ->  Index Scan using core_accessor_fresh_idx on
core_accessor  (cost=0.00..5460.07 rows=2970 width=92) (actual
time=0.068..54.921 rows=69312 loops=1)
                     Index Cond: ((slot_id = 472) AND (slot_type_id =
119) AND (label = ''::text) AND (user_id = 0) AND (role = 0) AND
(publish_state >= 60))
               ->  Index Scan using core_base_pkey on core_base
(cost=0.00..8.88 rows=1 width=768) (actual time=0.004..0.005 rows=1
loops=69312)
                     Index Cond: ((core_base.object_id =
core_accessor.object_id) AND (core_base.content_type_id =
core_accessor.content_type_id))
 Total runtime: 712.031 ms
(10 rows)

noovo-new=# select * from pg_stat_user_tables where relname='core_accessor';
 relid | schemaname |    relname    | seq_scan | seq_tup_read |
idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del |
n_tup_hot_upd | n_live_tup | n_dead_tup |          last_vacuum
 | last_autovacuum |         last_analyze          | last_autoanalyze

-------+------------+---------------+----------+--------------+----------+---------------+-----------+-----------+-----------+---------------+------------+------------+-------------------------------+-----------------+-------------------------------+------------------
 51159 | public     | core_accessor |       58 |    749773516 |
13785608 |     149165183 |      9566 |       548 |       347 |
  206 |   17144303 |        251 | 2009-03-03 07:02:19.733778-06 |
           | 2009-03-03 06:17:47.784268-06 |
(1 row)

noovo-new=# \d+ core_accessor;
                                                  Table "public.core_accessor"
          Column           |           Type           |
         Modifiers                          | Description

---------------------------+--------------------------+------------------------------------------------------------+-------------
 id                        | bigint                   | not null
default nextval('core_accessor_id_seq'::regclass) |
 flavor                    | character varying(32)    |
                                            |
 content_type_id           | integer                  | not null
                                            |
 object_id                 | integer                  | not null
                                            |
 publish_state             | smallint                 | not null
                                            |
 date_posted               | timestamp with time zone | not null
                                            |
 user_id                   | integer                  |
                                            |
 slot_id                   | integer                  |
                                            |
 slot_type_id              | integer                  |
                                            |
 role                      | smallint                 |
                                            |
 ordering                  | integer                  |
                                            |
 author_id                 | integer                  |
                                            |
 nooximity_old             | double precision         | default 0.0
                                            |
 rising                    | double precision         | default 0.0
                                            |
 label                     | text                     |
                                            |
 nooximity                 | double precision         | not null
default 1.0                                       |
 nooximity_old_date_posted | timestamp with time zone |
                                            |
 nooximity_date_posted     | timestamp with time zone |
                                            |
Indexes:
    "portal_metainfo_pkey" PRIMARY KEY, btree (id)
    "portal_metainfo_unique_constr" UNIQUE, btree (content_type_id,
object_id, user_id, slot_id, slot_type_id, role, label) CLUSTER
    "core_accessor_date_idx" btree (date_posted, nooximity)
    "core_accessor_dated_idx" btree (slot_id, slot_type_id, label,
user_id, role, publish_state, date_posted, nooximity)
    "core_accessor_fresh_idx" btree (slot_id, slot_type_id, label,
user_id, role, publish_state)
    "core_accessor_popularity_idx" btree (nooximity, date_posted)
Check constraints:
    "portal_metainfo_object_id_check" CHECK (object_id >= 0)
    "portal_metainfo_owner_id_check" CHECK (slot_id >= 0)
Foreign-key constraints:
    "portal_metainfo_accessor_id_fkey" FOREIGN KEY (user_id)
REFERENCES auth_user(id) DEFERRABLE INITIALLY DEFERRED
    "portal_metainfo_content_type_id_fkey" FOREIGN KEY
(content_type_id) REFERENCES django_content_type(id) DEFERRABLE
INITIALLY DEFERRED
    "portal_metainfo_owner_type_id_fkey" FOREIGN KEY (slot_type_id)
REFERENCES django_content_type(id) DEFERRABLE INITIALLY DEFERRED
Has OIDs: no



As far as I understand the explain, it fetches 68505 rows, matches
them with core_base and then tries to sort them? AFAIK it would
probably be much more effective to just find the records in accessor
via core_accessor_dated_idx and then lookup the core_base table? But
for some reason it doesn't want to?

I ran analyze, vacuum and reindex but nothing helped. Queries just eat
all the I/O and block. There is a huge difference between cached and
non-cached queries, like 50.000 to 50 ms.

Help! :)


Thanks, Sebastjan

В списке pgsql-performance по дате отправления:

Предыдущее
От: Hans Liebenberg
Дата:
Сообщение: Substring search using "exists" with a space in the search term
Следующее
От: Robert Haas
Дата:
Сообщение: Re: Problems with ordering (can't force query planner to use an index)