Re: [HACKERS] No: implied sort with group by

Поиск
Список
Период
Сортировка
От darrenk@insightdist.com (Darren King)
Тема Re: [HACKERS] No: implied sort with group by
Дата
Msg-id 9801271708.AA21938@ceodev
обсуждение исходный текст
Список pgsql-hackers
> > > This is what I think is missing or broken right now.
> > >
> > > > > select * from t1;
> > > >          a b  c
> > > >          1    x
> > > >          2    x
> > > >          3    z
> > > >          2    x
> > > >
> > > > 4 row(s) retrieved.
> > > > > select b,c,sum(a) from t1 group by b,c;
> > > > b  c             (sum)
> > > >
> > > >    x                 5
> > > >    z                 3
> > > >> 2 row(s) retrieved.
> >
> > Sorry, I've lost the thread. What is broken? I get this same result, and
> > (assuming that column "b" is full of nulls) I think this the correct result.
>
> At one point, it was thought that NULLs shouldn't be grouped, but I
> backed out the patch.  There is a problem with GROUP BY on large
> datasets, and Vadim knows the cause, and will work on it later.

Different from the grouping by NULLs issue...

The above results are from Sybase.  If these same four rows are inserted into
postgres, the second query will return three rows.  Something like...

b|c|sum(a)
 |x|3
 |z|3
 |x|2

It does this not because of the null values of column b, but because the data is
not sorted before getting to the group by node if the user does not explicitly put
an order by in the query.  IMHO, postgres should put an arbitrary sort node in the
tree so that the data can be properly grouped as the group node iterates over it.

And even if I put an "order by c" clause in there, I still get three rows, they're
just properly sorted. :)

darrenk


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Brett McCormick
Дата:
Сообщение: Re: [QUESTIONS] select date('now'::datetime+'30 day'::timespan)
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: Frontend/Backend Protocol Patch