Re: Optimising "in" queries

From Stephen Davies
Subject Re: Optimising "in" queries
Date
Msg-id 200708222138.56605.scldad@sdc.com.au
Whole thread Raw
In reply to Re: Optimising "in" queries  (Russell Smith <mr-russ@pws.com.au>)
List pgsql-performance
array processing???

There are no arrays. What made you think there might be?

The table definition is:


benparts=# \d reading
                                       Table "public.reading"
  Column   |            Type             |                         Modifiers
-----------+-----------------------------+------------------------------------------------------------
 id        | integer                     | not null default nextval(('reading_seq'::text)::regclass)
 sensor_id | integer                     |
 rdate     | timestamp without time zone |
 rval      | numeric(7,3)                |
Indexes:
    "reading_pkey" PRIMARY KEY, btree (id)
    "unique_sensor_date" UNIQUE, btree (sensor_id, rdate)
    "date" btree (rdate)
    "reading_sensor" btree (sensor_id)
Foreign-key constraints:
    "$1" FOREIGN KEY (sensor_id) REFERENCES sensor(id)

Cheers,
Stephen
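
The slow case in this thread is a multi-value IN list filtering a seven-million-row table. One workaround sketch for 8.2-era planners (an editor's illustration, not something proposed in the thread) is to rewrite the IN list as one equality probe per sensor and combine the branches with UNION ALL, so each branch can use the existing sensor_id index on its own:

```sql
-- Sketch: one indexed probe per sensor instead of a single
-- "sensor_id = ANY(...)" filter. Sensor ids 1137..1140 are the
-- example values from the thread.
SELECT count(rdate), rdate
FROM (
    SELECT rdate FROM reading WHERE sensor_id = 1137
    UNION ALL
    SELECT rdate FROM reading WHERE sensor_id = 1138
    UNION ALL
    SELECT rdate FROM reading WHERE sensor_id = 1139
    UNION ALL
    SELECT rdate FROM reading WHERE sensor_id = 1140
) AS r
GROUP BY rdate
ORDER BY rdate DESC
LIMIT 1;
```

Whether this actually beats the original plan depends on the planner's row estimates for each sensor; comparing EXPLAIN ANALYZE output for both forms would settle it.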

On Wednesday 22 August 2007 20:28, Russell Smith wrote:
> Stephen Davies wrote:
> > I have a PostgreSQL 8.2.4 table with some seven million rows.
> >
> > The psql query:
> >
> > select count(rdate),rdate from reading where sensor_id in
> > (1137,1138,1139) group by rdate order by rdate desc limit 1;
> >
> > takes a few seconds but:
> >
> > select count(rdate),rdate from reading where sensor_id in
> > (1137,1138,1139,1140) group by rdate order by rdate desc limit 1;
>
> It would have been helpful to see the table definition here.  I can
> say up front that array processing in postgres is SLOW.
>
> > (anything with four or more values in the "in" list) takes several
> > minutes.
> >
> > Is there any way to make the "larger" queries more efficient?
> >
> > Both rdate and sensor_id are indexed and the database is vacuumed
> > every night.
> >
> > The values in the "in" list are seldom as "neat" as in the above
> > examples. Actual values can range from 1 to about 2000. The number
> > of values ranges from 2 to about 10.
> >
> > Explain outputs are:
> >
> > benparts=# explain select count(rdate),rdate from reading where
> > sensor_id in (1137,1138,1139,1140) group by rdate order by rdate
> > desc limit 1;
> >                                             QUERY PLAN
> > --------------------------------------------------------------------------------
> >  Limit  (cost=0.00..39890.96 rows=1 width=8)
> >    ->  GroupAggregate  (cost=0.00..7938300.21 rows=199 width=8)
> >          ->  Index Scan Backward using date on reading
> >                (cost=0.00..7937884.59 rows=82625 width=8)
> >                Filter: (sensor_id = ANY ('{1137,1138,1139,1140}'::integer[]))
> > (4 rows)
>
> I'm unsure of how you produced a plan like this without the benefit
> of seeing the table definition.
>
> > benparts=# explain select count(rdate),rdate from reading where
> > sensor_id in (1137,1138,1139) group by rdate order by rdate desc
> > limit 1;
> >                                              QUERY PLAN
> > --------------------------------------------------------------------------------
> >  Limit  (cost=48364.32..48364.32 rows=1 width=8)
> >    ->  Sort  (cost=48364.32..48364.49 rows=69 width=8)
> >          Sort Key: rdate
> >          ->  HashAggregate  (cost=48361.35..48362.21 rows=69 width=8)
> >                ->  Bitmap Heap Scan on reading
> >                      (cost=535.53..48218.10 rows=28650 width=8)
> >                      Recheck Cond: (sensor_id = ANY ('{1137,1138,1139}'::integer[]))
> >                      ->  Bitmap Index Scan on reading_sensor
> >                            (cost=0.00..528.37 rows=28650 width=0)
> >                            Index Cond: (sensor_id = ANY ('{1137,1138,1139}'::integer[]))
> > (8 rows)
>
> As mentioned already, you need explain analyze.
>
> However, I will again say that array processing in postgres is SLOW.
> I would strongly recommend redesigning your schema to use a table
> with sensor_ids that correspond to the primary key in the reading
> table.
>
> Rethinking the way you are going about this will probably be the most
> effective solution, but we will need more information if you are not
> comfortable doing that yourself.
>
> Regards
>
> Russell Smith
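
The EXPLAIN ANALYZE that Russell asks for is the plain EXPLAIN from above with the ANALYZE keyword added; it actually executes the statement and reports real row counts and timings alongside the planner's estimates (a generic sketch reusing the query from the thread, not output from this system):

```sql
-- EXPLAIN ANALYZE runs the query and shows "actual time" and
-- actual rows for each plan node, which is what reveals where
-- the estimates in the plain EXPLAIN output are wrong.
EXPLAIN ANALYZE
SELECT count(rdate), rdate
FROM reading
WHERE sensor_id IN (1137, 1138, 1139, 1140)
GROUP BY rdate
ORDER BY rdate DESC
LIMIT 1;
```

Note that because the statement really executes, this takes as long as the slow query itself (several minutes in the four-value case described above).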

--
========================================================================
This email is for the person(s) identified above, and is confidential to
the sender and the person(s).  No one else is authorised to use or
disseminate this email or its contents.

Stephen Davies Consulting                            Voice: 08-8177 1595
Adelaide, South Australia.                             Fax: 08-8177 0133
Computing & Network solutions.                       Mobile:0403 0405 83

In pgsql-performance by date:

Previous
From: "Sachchida Ojha"
Date:
Message: Re: Autovacuum is running forever
Next
From: Kenneth Marshall
Date:
Message: Re: io storm on checkpoints, postgresql 8.2.4, linux