Re: quickly getting the top N rows

From: Ben
Subject: Re: quickly getting the top N rows
Date:
Msg-id: Pine.LNX.4.64.0710041229080.30864@localhost.localdomain
In response to: Re: quickly getting the top N rows  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: quickly getting the top N rows  (Simon Riggs <simon@2ndquadrant.com>)
           Re: quickly getting the top N rows  ("Scott Marlowe" <scott.marlowe@gmail.com>)
List: pgsql-performance
On Thu, 4 Oct 2007, Tom Lane wrote:

> You're being about as clear as mud here, except that you obviously lied
> about what you were doing in your first message.  If you have a planner
> problem, show us the *exact* query, the *exact* table definition, and
> unfaked EXPLAIN ANALYZE output.

I didn't realize that simplification was viewed as so sinister, but
thanks, I'll remember that in the future.

The table:
                          Table "public.log"
      Column     |            Type             |      Modifiers
----------------+-----------------------------+---------------------
  clientkey      | character(30)               | not null
  premiseskey    | character(30)               | not null
  logkey         | character(30)               | not null
  logicaldel     | character(1)                | default 'N'::bpchar
  lockey         | character(30)               |
  devlockey      | character(30)               |
  eventkey       | character(30)               |
  logshorttext   | character varying(255)      |
  logdesc        | character varying(255)      |
  loguserkey     | character(30)               |
  logassetkey    | character(30)               |
  logresourcekey | character(30)               |
  logtime        | timestamp without time zone |
  logip          | character varying(50)       |
  logarchived    | character(1)                |
  logarchivedate | timestamp without time zone |
  loghasvideo    | character(1)                |
  loghasaudio    | character(1)                |
  resvehiclekey  | character(30)               |
  synccreated    | character(1)                |
  logtypekey     | character(30)               |
Indexes:
     "log_pkey" PRIMARY KEY, btree (clientkey, premiseskey, logkey)
     "eventkey_idx" btree (eventkey)
     "log_ak1" btree (clientkey, premiseskey, logtime, logkey)


The original, slow query:

explain analyze SELECT * FROM log WHERE clientkey in
('000000004500000000010000000001')  AND premiseskey in
('000000004500000000010000000001') and logicaldel = 'N'
ORDER BY logtime desc, logkey desc, clientkey desc, premiseskey desc LIMIT 20 offset 0;

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=356402.58..356402.63 rows=20 width=563) (actual time=215858.481..215858.527 rows=20 loops=1)
    ->  Sort  (cost=356402.58..357598.25 rows=478267 width=563) (actual time=215858.478..215858.498 rows=20 loops=1)
          Sort Key: logtime, logkey, clientkey, premiseskey
          ->  Seq Scan on log  (cost=0.00..52061.67 rows=478267 width=563) (actual time=29.340..100043.313 rows=475669 loops=1)
                Filter: ((clientkey = '000000004500000000010000000001'::bpchar) AND (premiseskey = '000000004500000000010000000001'::bpchar) AND (logicaldel = 'N'::bpchar))
  Total runtime: 262462.582 ms
(6 rows)


Every row in log has identical clientkey and premiseskey values, so if I
just remove those columns from the order by clause, I get this far
superior plan:

explain analyze SELECT * FROM log WHERE clientkey in
('000000004500000000010000000001') AND premiseskey in
('000000004500000000010000000001') and logicaldel = 'N'
ORDER BY logtime desc, logkey desc LIMIT 20 offset 0;
                                                                 QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=0.00..12.33 rows=20 width=563) (actual time=0.047..0.105 rows=20 loops=1)
    ->  Index Scan Backward using log_ak1 on log  (cost=0.00..294735.70 rows=478267 width=563) (actual time=0.044..0.076 rows=20 loops=1)
          Index Cond: ((clientkey = '000000004500000000010000000001'::bpchar) AND (premiseskey = '000000004500000000010000000001'::bpchar))
          Filter: (logicaldel = 'N'::bpchar)
  Total runtime: 0.165 ms
(5 rows)


...which made me think that maybe postgres is not using log_ak1 in the
former case because two of the columns in the order by match every row.

Unfortunately, altering the query is not an option in this case. I'm
just trying to find an explanation.
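
(As a sketch of the idea, untested against this schema: since log_ak1 is keyed
on (clientkey, premiseskey, logtime, logkey), an ORDER BY that lists the
columns in that same index order should let the planner satisfy both the sort
and the LIMIT with a backward index scan, even with the equality-constrained
columns left in the sort clause:

```sql
-- Hypothetical rewrite: ORDER BY columns in log_ak1's key order
-- (clientkey, premiseskey, logtime, logkey), all descending, so a
-- backward scan of the index produces rows already in sorted order.
SELECT *
FROM log
WHERE clientkey  = '000000004500000000010000000001'
  AND premiseskey = '000000004500000000010000000001'
  AND logicaldel = 'N'
ORDER BY clientkey DESC, premiseskey DESC, logtime DESC, logkey DESC
LIMIT 20 OFFSET 0;
```

Whether the planner actually picks the index scan here would need to be
confirmed with EXPLAIN ANALYZE on the real data.)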
