Q: will GROUP BY make use of an index to return tuples early?

Поиск
Список
Период
Сортировка
От Gunther Schadow
Тема Q: will GROUP BY make use of an index to return tuples early?
Дата
Msg-id 3D038B57.8000007@aurora.regenstrief.org
обсуждение исходный текст
Ответы Re: Q: will GROUP BY make use of an index to return tuples early?  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-sql
Hi pgsql optimizer gurus,

If I have a table

CREATE TABLE Foo(  id    OID,  time  TIMESTAMP,  value INTEGER
);

CREATE INDEX Foo_id_idx ON Foo(id);

and I have a query for

SELECT id, MIN(foo.time)  FROM Foo foo GROUP BY foo.id;

can one tell the query executor to do an index scan on Foo_id_idx
such as to be sure the Foo tuples are being considered ordered by
foo.id in order to produce tuples before having worked through the
whole table? If we don't use that assumption of the index order and
would instead to a full table scan, the select could not return
anything until the full table scan is completed.

I am asking because if I wanted to stream the tuples of the first
query into anothe system to do a distributed semijoin, I would
like data to flow at all times while queries are still being executed.

If the answer is yes, there is a way, then how about if we do this:

CREATE INDEX Foo_id_time_idx ON Foo(id, time);

now, considering the same query, it could be executed even faster,
because we could do an index scan on Foo_id_time_idx and only need
to consider the first data tuple of every Foo.id group (because
the ordering of Foo_id_time_idx guarantees that the MIN(time) is
in the first tuple.

thank you,
-Gunther


-- 
Gunther Schadow, M.D., Ph.D.                    gschadow@regenstrief.org
Medical Information Scientist      Regenstrief Institute for Health Care
Adjunct Assistant Professor        Indiana University School of Medicine
tel:1(317)630-7960                         http://aurora.regenstrief.org




В списке pgsql-sql по дате отправления:

Предыдущее
От: Stephan Szabo
Дата:
Сообщение: Re: Indexing timestamps
Следующее
От: Gunther Schadow
Дата:
Сообщение: Support for distributed queries with semijoins could be possible, if ...