Re: Index usage on OR queries

Поиск
Список
Период
Сортировка
От Andy Colson
Тема Re: Index usage on OR queries
Дата
Msg-id 4E5E49FF.7030603@squeakycode.net
обсуждение исходный текст
Ответ на Index usage on OR queries  (Tore Halvorsen <tore.halvorsen@gmail.com>)
Ответы Re: Index usage on OR queries  (Tore Halvorsen <tore.halvorsen@gmail.com>)
Re: Index usage on OR queries  ("Tomas Vondra" <tv@fuzzy.cz>)
Список pgsql-general
On 8/31/2011 9:35 AM, Tore Halvorsen wrote:
> Hi,
>
> I'm trying to optimize a query where I have two tables that both have a
> timestamp column. I want the result where either of the timestamps is
> after a specified time. In a reduced form, like this:
>
>
> CREATE TABLE a
> (
>    id serial NOT NULL PRIMARY KEY,
>    time timestamp without time zone NOT NULL DEFAULT now()
> );
>
> CREATE INDEX a_time_idx ON a USING btree (time DESC NULLS LAST);
>
> CREATE TABLE b
> (
>    id serial NOT NULL PRIMARY KEY,
>    time timestamp without time zone NOT NULL DEFAULT now()
> );
>
> CREATE INDEX b_time_idx ON b USING btree (time DESC NULLS LAST);
>
> --- generate some data
> insert into a(time)
> select now() - '10 year'::interval * random() from generate_series(1,
> 1000000, 1);
>
> insert into b(time)
> select now() - '10 year'::interval * random() from generate_series(1,
> 1000000, 1);
>
> -- Using constraint works as expected, and uses the time index.
> select * from a join b using(id)
> where a.time >= '2011-08-15';
>
> -- ... both ways...
> select * from a join b using(id)
> where b.time >= '2011-08-15';
>
> -- However, if I'm trying to do this for both times at once, the time
> index is not used at all
> select * from a join b using(id)
> where a.time >= '2011-08-15' OR b.time >= '2011-08-01'
>
> -- This can be optimized by using CTEs
> with am as (
>    select * from a where time >= '2011-08-15'
> )
> , bm as (
>    select * from b where time >= '2011-08-15'
> )
> select * from am join bm using(id)
>
> -- end
>
> I'm just wondering why the optimizer does things the way it does - and
> if the CTE version is the best way to go...
>
> The actual case is slightly more complex and uses more tables - this is
> mostly a way to find updated data.
>
> --
> Eld på åren og sol på eng gjer mannen fegen og fjåg. [Jøtul]
> <demo> 2011 Tore Halvorsen || +052 0553034554



On PG 9, after I ANALYZED the tables, it used indexes:


QUERY PLAN
---------------------------------------------------------------------------------------------------------------
  Merge Join  (cost=1.59..82778.35 rows=13171 width=20) (actual
time=0.066..1076.616 rows=12966 loops=1)
    Merge Cond: (a.id = b.id)
    Join Filter: ((a."time" >= '2011-08-15 00:00:00'::timestamp without
time zone) OR (b."time" >= '2011-08-01 0
    ->  Index Scan using a_pkey on a  (cost=0.00..31389.36 rows=1000000
width=12) (actual time=0.007..204.856 ro
    ->  Index Scan using b_pkey on b  (cost=0.00..31389.36 rows=1000000
width=12) (actual time=0.006..224.189 ro


ANALYZE is the magic.

-Andy

В списке pgsql-general по дате отправления:

Предыдущее
От: Jeff Ross
Дата:
Сообщение: Re: FATAL: terminating connection due to conflict with recovery
Следующее
От: Tore Halvorsen
Дата:
Сообщение: Re: Index usage on OR queries