Re: difficulties with time based queries

От: PFC
Тема: Re: difficulties with time based queries
Дата: ,
Msg-id: op.urxt00cbcigqcu@soyouz
(см: обсуждение, исходный текст)
Ответ на: difficulties with time based queries  ("Rainer Mager")
Список: pgsql-performance

Скрыть дерево обсуждения

difficulties with time based queries  ("Rainer Mager", )
 Re: difficulties with time based queries  (David Wilson, )
  Re: difficulties with time based queries  ("Rainer Mager", )
   Re: difficulties with time based queries  (Tom Lane, )
    Re: difficulties with time based queries  ("Rainer Mager", )
     Re: difficulties with time based queries  (Robert Haas, )
   Re: difficulties with time based queries  (Matthew Wakeling, )
 Re: difficulties with time based queries  (PFC, )
 Re: difficulties with time based queries  (Tom Lane, )
 Re: difficulties with time based queries  ("Rainer Mager", )
  Re: difficulties with time based queries  (PFC, )
   Re: difficulties with time based queries  (Nikolas Everett, )
 Re: difficulties with time based queries  ("Rainer Mager", )
  Re: difficulties with time based queries  (Tom Lane, )

> When I try to do queries on this
> table I always find them slower than what I need and what I believe
> should be possible.
>
>    ->  Bitmap Index Scan on ad_log_date_all  (cost=0.00..72750.51
> rows=2488252 width=0) (actual time=49776.332..49776.332 rows=2268490
> loops=1)
>
>          Index Cond: ((date(start_time) < '2009-03-31'::date) AND
> (date(start_time) >= '2009-03-30'::date))
>
>  Total runtime: 65279.352 ms

    Well, it is grabbing 2.268.490 rows, that's a lot of rows, so it is not
going to be very fast like a few milliseconds.
    Your columns are small, ints, dates, not large text strings which would
augment the total amount of data.
    So your timing looks pretty slow, it should be faster than this, maybe a
few seconds.

    With this quantity of rows, you want to try to make the disk accesses as
linear as possible.
    This means your table should be organized on disk by date, at least
roughly.
    If your data comes from an import that was sorted on some other column,
this may not be the case.

    What kind of bytes/s do you get from the drives ?

=> Can you post the result of "vmstat 1" during the entire execution of
the query ?

    2 phases should be visible in the vmstat output, the indexscan, and the
bitmap heapscan.

    You could use CLUSTER on the table (it will take a long time), or simply
create another table and INSERT INTO ... SELECT ORDER BY date. This will
also take a long time, but faster than CLUSTER. Then you could recreate
the indexes.

    Do you UPDATE or DELETE a lot from this table ? Is it vacuum'd enough ?




В списке pgsql-performance по дате сообщения:

От: "Rainer Mager"
Дата:
Сообщение: Re: difficulties with time based queries
От: roopasatish
Дата:
Сообщение: probelm with alter table add constraint......