On Thu, May 04, 2006 at 01:32:45 -0400, Greg Stark <gsstark@mit.edu> wrote:
> Bruno Wolff III <bruno@wolff.to> writes:
>
> > On Thu, May 04, 2006 at 00:05:16 -0400,
> > Greg Stark <gsstark@mit.edu> wrote:
> > > Bruno Wolff III <bruno@wolff.to> writes:
> > >
> > > > > Whereas it shouldn't be hard to prove that this is equivalent:
> > > > >
> > > > > stark=> explain select col1 from test group by upper(col1),col1 order by upper(col1);
> > > > > >                              QUERY PLAN
> > > > > > ---------------------------------------------------------------------
> > > > > >  Group  (cost=88.50..98.23 rows=200 width=32)
> > > > > >    ->  Sort  (cost=88.50..91.58 rows=1230 width=32)
> > > > > >          Sort Key: upper(col1), col1
> > > > > >          ->  Seq Scan on test  (cost=0.00..25.38 rows=1230 width=32)
> > > > > (4 rows)
> > > >
> > > > I don't think you can assume that that will be true for any locale. If there
> > > > are two different characters that both have the same uppercase version, this
> > > > will break things.
> > >
> > > No it won't.
> >
> > Sure it will, because when you do the group by you will get a different
> > number of groups. When grouping by the original characters you will get
> > separate groups for characters that have the same uppercase character,
> > whereas when grouping by the uppercased characters you won't.
>
> But grouping on *both* will produce the same groups as grouping on the
> original characters alone.
OK, I missed that. My brain only saw upper(col1) and not the immediately
following ,col1.

I agree that grouping by col1 and grouping by upper(col1), col1 will give
you the same groups. And hence the queries should be equivalent.
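
For anyone who wants to check this quickly, here is a minimal sketch. It
assumes SQLite (through Python's sqlite3 module) as a stand-in for
PostgreSQL, so it says nothing about locale-specific case mapping or about
the planner. The sample rows and the group_keys helper are made up for
illustration, and I added col1 as a second ORDER BY key only to make the
output order deterministic.

```python
import sqlite3


def group_keys(rows, query):
    """Load rows into a throwaway in-memory table and run one grouping query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test (col1 TEXT)")
    conn.executemany("INSERT INTO test (col1) VALUES (?)", [(r,) for r in rows])
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result


# Hypothetical sample data: 'a'/'A' and 'b'/'B' share an uppercase form.
SAMPLE = ["a", "A", "b", "B", "b", "a", "c"]

# Grouping on the original column alone.
BY_COL1 = "SELECT col1 FROM test GROUP BY col1 ORDER BY upper(col1), col1"
# Grouping on both upper(col1) and col1, as in the quoted query.
BY_BOTH = "SELECT col1 FROM test GROUP BY upper(col1), col1 ORDER BY upper(col1), col1"
# Grouping on upper(col1) only, which is the case that really does differ.
BY_UPPER = "SELECT upper(col1) FROM test GROUP BY upper(col1) ORDER BY upper(col1)"

groups_col1 = group_keys(SAMPLE, BY_COL1)
groups_both = group_keys(SAMPLE, BY_BOTH)
groups_upper = group_keys(SAMPLE, BY_UPPER)

print(groups_col1)
print(groups_both)
print(groups_upper)
```

The first two lists should come out identical, since adding upper(col1) to
the grouping keys cannot split or merge any group that col1 already
determines. The third list is shorter, which is the case I had mistakenly
been thinking of.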