Re: A counter argument about DISTINCT and GROUP BY in PostgreSQL

From	dterrors@hotmail.com
Subject	Re: A counter argument about DISTINCT and GROUP BY in PostgreSQL
Date	January 16, 2008, 12:21:22
Msg-id	a1510e4a-129b-4048-bcdb-73d3f9fab359@n20g2000hsh.googlegroups.com
In reply to	A counter argument about DISTINCT and GROUP BY in PostgreSQL (dterrors@hotmail.com)
Responses	Re: A counter argument about DISTINCT and GROUP BY in PostgreSQL (Gregory Stark <stark@enterprisedb.com>)
List	pgsql-general

On Jan 4, 11:48 am, st...@enterprisedb.com (Gregory Stark) wrote:
> <dterr...@hotmail.com> writes:
> > I've just spent a few hours searching and reading about the postgres
> > way of selecting distinct records.  I understand the points made about
> > the ORDER BY limitation of DISTINCT ON, and the ambiguity of GROUP BY,
> > but I think there's a (simple, common) case that has been missed in
> > the discussion. Here is my situation:
>
> > table "projects":
> > id  title      more stuff (pretend there are 20 more columns)
> > -----------------------------------------------------------
> > 1   buildrome  moredata      inothercolumns
> > 2   housework  evenmoredata  letssay20columns
>
> > table "todos":
> > id projectid name     duedate
> > -----------------------------------------
> > 1  1         conquer    1pm
> > 2  1         laybricks  10pm
> > 3  2         dolaundry  5pm
>
> > In English, I want to "select projects and order them by the ones that
> > have todos due the soonest."  Does that sound like a reasonable
> > request?
>
> SELECT *
>   FROM (
>         SELECT DISTINCT ON (projects.id) projects.*, todos.duedate
>           FROM projects
>           JOIN todos ON (todos.projectid = projects.id)
>          ORDER BY projects.id, todos.duedate ASC
>         ) AS soonest
>  ORDER BY duedate ASC
> OFFSET 10
>  LIMIT 20
>
> > Option E: I could use a subselect.  But notice my offset, limit.  If I
> > use a subselect, then postgresql would have to build ALL of the
> > results in memory (to create the subselect virtual table), before I
> > apply the offset and limit on the subselect.
>
> Don't assume Postgres has to do things a particular way just because there's a
> subselect involved. In this case however I expect Postgres would have to build
> the results in memory, but not because of the subselect, just because that's
> the only way to do what you're asking.

When you say it would build the results in memory, do you mean the
entire subselected table?  The subselect in your example doesn't apply
any OFFSET or LIMIT. (And do you think what I'm asking for is odd or
unusual? I can think of a hundred examples besides a todo list.)

> You're asking for it to pick out distinct values according to one sort key
> then return the results sorted according to another key. Even if you had an
> index for the first key or Postgres used a hash to perform the distinct, the
> ORDER BY will require a sort.

I'm not trying to avoid doing a sort, actually.

> > Any suggestion would be appreciated.
>
> > BTW for those of you who are curious, in mysql (that other db), this
> > would be:
>
> > select a.* from projects a inner join todos b on b.projectid = a.id
> > group by a.id order by b.duedate limit 10,20;
>
> And what does the plan look like?

It looks great in MySQL! It works perfectly. That query is from my old
MySQL code, from before I switched (or, well, tried to switch) to
Postgres. I get:

id  title      more stuff....
-----------------------------------------------------------
1   buildrome  moredata      inothercolumns
2   housework  evenmoredata  letssay20columns
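
For comparison, here is a minimal sketch of one way the same ordering might be requested in PostgreSQL, using an aggregate subquery rather than DISTINCT ON. Only the table and column names come from the example tables in this thread. The column types, the 24-hour versions of the due times, and the aliases (p, t, next_due) are assumptions made for illustration, and the sketch says nothing about which plan the server would pick.

```sql
-- Hypothetical schema mirroring the example tables; column types are assumed.
CREATE TEMP TABLE projects (
    id    integer PRIMARY KEY,
    title text
);
CREATE TEMP TABLE todos (
    id        integer PRIMARY KEY,
    projectid integer REFERENCES projects (id),
    name      text,
    duedate   time
);

-- Sample rows from the thread (1pm, 10pm, 5pm written as 24-hour times).
INSERT INTO projects VALUES (1, 'buildrome'), (2, 'housework');
INSERT INTO todos VALUES
    (1, 1, 'conquer',   '13:00'),
    (2, 1, 'laybricks', '22:00'),
    (3, 2, 'dolaundry', '17:00');

-- Reduce todos to one row per project (its soonest due date),
-- then join back to projects and sort by that date.
-- Projects with no todos are left out, as with the inner join in the thread.
SELECT p.*, t.next_due
  FROM projects AS p
  JOIN (
        SELECT projectid, min(duedate) AS next_due
          FROM todos
         GROUP BY projectid
       ) AS t ON t.projectid = p.id
 ORDER BY t.next_due ASC, p.id ASC
 LIMIT 20 OFFSET 0;
```

With the three sample todos, this lists buildrome (soonest todo at 13:00) ahead of housework (17:00). The request in the thread used an offset of 10; the sketch uses OFFSET 0 only so that the two sample rows are visible.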
