Re: [HACKERS] Re: [QUESTIONS] trouble grouping rows

Поиск
Список
Период
Сортировка
От Thomas G. Lockhart
Тема Re: [HACKERS] Re: [QUESTIONS] trouble grouping rows
Дата
Msg-id 34BD8866.96E64094@alumni.caltech.edu
обсуждение исходный текст
Ответы Re: [HACKERS] Re: [QUESTIONS] trouble grouping rows  (Bruce Momjian <maillist@candle.pha.pa.us>)
Список pgsql-hackers
> > > > I came across the same grouping problem myself. In my installation it
> > > > only happens with tables containing thousands of rows (for example 10000
> > > > or more).

I've been keeping track of this problem and was preparing to send a message to the
list saying that it had disappeared as of a few weeks ago. Even did some more test
cases to confirm, and (unfortunately) tried one last case:

tgl=> select c1, c2, count(*) from v group by c1, c2;
c1  |c2  |count
----+----+-----
foo1|foo2|    2
foo1|foo2|    2
foo1|foo2|    3
foo1|foo2|    4
foo1|foo2|    2
foo1|foo2|27151
(6 rows)

where other cases like:

tgl=> select c1, count(*) from v group by c1;
c1  |count
----+-----
foo1|27164
(1 row)

seem to work.  I get identical results for both char16 and for text fields in the
two-column table, and the order of the "group by" does not matter.

From what others said earlier, the problem is not reproducible on all systems, but
clearly shows up on at least two (perhaps both Linux?). Bruce, do you have some
suggestions on where to look to track this down? Where in the code does the sorting
and ordering happen during the select?

                                                      - Tom

> OK, thanks for the cookbook (retained below) on how to demonstrate the problem.
> The limit for triggering the problem seems to be system-dependent, but not
> related to the postmaster -B option (I tried with both 256 and 64 with the same
> results).
>
> This is a problem which is _not_ present in v6.1. I suspect it may be related to
> changes in sorting for using "psort", but have nothing on which to base that
> other than the v6.1 success.
>
> My results (following the cookbook):
>
> tgl=> create table test (field1 char16, field2 char8);
> CREATE
> tgl=> copy test from '/home/tgl/postgres/testagg.input';
> COPY
> tgl=> select count(*) from test;
> count
> -----
>  6791
> (1 row)
>
> tgl=> select field1, field2, count(*) from test group by field1, field2;
> NOTICE:copyObject: don't know how to copy 720
> NOTICE:copyObject: don't know how to copy 720
> field1|field2|count
> ------+------+-----
> foo1  |foo2  | 6791
> (1 row)
>
> tgl=> copy test from '/home/tgl/postgres/testagg.input';
> COPY
> tgl=> select count(*) from test;
> count
> -----
> 13582
> (1 row)
>
> tgl=> select field1, field2, count(*) from test group by field1, field2;
> NOTICE:copyObject: don't know how to copy 720
> NOTICE:copyObject: don't know how to copy 720
> field1|field2|count
> ------+------+-----
> foo1  |foo2  |    2
> foo1  |foo2  |    2
> foo1  |foo2  |    3
> foo1  |foo2  |    4
> foo1  |foo2  |    2
> foo1  |foo2  |13569
> (6 rows)
>
> I tried v6.1 with up to 27164 rows and did not see the problem. Any ideas
> hackers??
>
>                                                                               -
> Tom
>
> > > Yes. If possible please shrink the test case to the minimum needed to
> > > exhibit the problem. TIA
> >
> > create table test (field1 char16, field2 char8);
> >
> > Insert 6791 or more rows. An easy way is to:
> >    - create a text file using vi
> >    - insert a line with 2 words like foo1 and foo2 separated
> >      using a tab
> >    - copy it (yy) and paste it 6790 times (6790p)
> >    - save it and exit vi
> > Then, using psql enter the following query:
> >    copy test from 'the_file_you_created';
> >
> > Now to trigger the bug:
> >    select field1, field2, count(*) from test group by field1, field2;
> >
> > At my system I would see many lines like this:
> >    foo1   foo2   1
> >    foo1   foo2   3
> >    foo1   foo2   1
> >    foo1   foo2   3
> >    foo1   foo2   1
> >    foo1   foo2   3
> >    foo1   foo2   1
> >    foo1   foo2   3
> > instead of:
> >    foo1   foo2   6791
> >
> > This also happens when some of the rows are different from the others
> > (they don't have to be the same like in the example above).
> >
> > When the table has 6790 or less rows, everything is ok.
> >
> > When the table contains int's instead of char8/char16's you will probably
> > need more rows in order to exhibit the problem. I will try to find out the
> > exact number of rows needed in that case.
> >
> > Cheers,
> > Ronald




В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Vadim B. Mikheev"
Дата:
Сообщение: Re: Fix for select sum(2+2)...
Следующее
От: "Vadim B. Mikheev"
Дата:
Сообщение: Re: [HACKERS] Just a small thing for 6.3 ...