RE: BUG #16031: Group by returns duplicate groups

Поиск
Список
Период
Сортировка
От David Raymond
Тема RE: BUG #16031: Group by returns duplicate groups
Дата
Msg-id VI1PR07MB6029E7C9B73250E386088047879D0@VI1PR07MB6029.eurprd07.prod.outlook.com
обсуждение исходный текст
Ответ на Re: BUG #16031: Group by returns duplicate groups  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
Ответы RE: BUG #16031: Group by returns duplicate groups  (David Raymond <David.Raymond@tomtom.com>)
Список pgsql-bugs
"But this is all kinds of messed up"
Sounds about right :)

Output below.

As asked earlier it's Windows 10, all the lc_* settings are "en-US", and the server encoding is UTF8.
Throughout the table there are records with pretty much every notation alphabet. Latin, Cyrillic, Greek, Arabic,
Hebrew,Japanese, Mandarin, etc. etc. Which "shouldn't" matter, but I figured I'd mention it.
 



testing=> set enable_indexscan = off;
SET
Time: 0.536 ms
testing=> set enable_bitmapscan = off;
SET
Time: 0.225 ms
testing=> with sd as (select name, row_number() over (order by name) rnum
testing(>               from big_table)
testing-> select rnum, name, encode(convert_to(name, 'SQL_ASCII'),'escape'),
testing->        name < 'DK' as lt, name = 'DK' as eq, name > 'DK' as gt
testing->   from sd
testing->  where rnum >= (select min(rnum) from sd where name='DK')
testing->    and rnum <= (select max(rnum) from sd where name='DK');
    rnum    |               name                |              encode               | lt | eq | gt
------------+-----------------------------------+-----------------------------------+----+----+----
 27,900,023 | DK                                | DK                                | f  | t  | f
 27,900,024 | DK                                | DK                                | f  | t  | f
 27,900,025 | DK                                | DK                                | f  | t  | f
 27,900,026 | DK                                | DK                                | f  | t  | f
 27,900,027 | DK                                | DK                                | f  | t  | f
 27,900,028 | DK                                | DK                                | f  | t  | f
 27,900,029 | DK                                | DK                                | f  | t  | f
 27,900,030 | DK                                | DK                                | f  | t  | f
 27,900,031 | DK                                | DK                                | f  | t  | f
 27,900,032 | Dk'bus Marine                     | Dk'bus Marine                     | f  | f  | t
 27,900,033 | Dk's Auto's                       | Dk's Auto's                       | f  | f  | t
 27,900,034 | Dk's Bar & Grill                  | Dk's Bar & Grill                  | f  | f  | t
 27,900,035 | Dk's Barbers & Stylist            | Dk's Barbers & Stylist            | f  | f  | t
 27,900,036 | Dk's Beach Boutique               | Dk's Beach Boutique               | f  | f  | t
 27,900,037 | Dk's Cabinets & Countertops       | Dk's Cabinets & Countertops       | f  | f  | t
 27,900,038 | Dk's Cleaning Service             | Dk's Cleaning Service             | f  | f  | t
 27,900,039 | Dk's Clothing                     | Dk's Clothing                     | f  | f  | t
 27,900,040 | Dk's Communications               | Dk's Communications               | f  | f  | t
 27,900,041 | Dk's Dancewear & Fitnesswear      | Dk's Dancewear & Fitnesswear      | f  | f  | t
 27,900,042 | Dk's Dancewear Boutique           | Dk's Dancewear Boutique           | f  | f  | t
 27,900,043 | Dk's Discount Dance & Fitnesswear | Dk's Discount Dance & Fitnesswear | f  | f  | t
 27,900,044 | DK's Drywall Service              | DK's Drywall Service              | f  | f  | t
 27,900,045 | DK'S DUSTBUSTERS                  | DK'S DUSTBUSTERS                  | f  | f  | t
 27,900,046 | Dk's Family Five Star Trophies    | Dk's Family Five Star Trophies    | f  | f  | t
 27,900,047 | DK's Family Five Star Trophies    | DK's Family Five Star Trophies    | f  | f  | t
 27,900,048 | Dk's Food Mart                    | Dk's Food Mart                    | f  | f  | t
 27,900,049 | Dk'S Group Pte. Ltd.              | Dk'S Group Pte. Ltd.              | f  | f  | t
 27,900,050 | Dk's Hair Designs                 | Dk's Hair Designs                 | f  | f  | t
 27,900,051 | Dk's Hair Happenings              | Dk's Hair Happenings              | f  | f  | t
 27,900,052 | Dk's Hair Supply                  | Dk's Hair Supply                  | f  | f  | t
 27,900,053 | Dk's Home Decor                   | Dk's Home Decor                   | f  | f  | t
 27,900,054 | DK's Informática                  | DK's Inform\303\241tica           | f  | f  | t
 27,900,055 | Dk's Janitorial                   | Dk's Janitorial                   | f  | f  | t
 27,900,056 | DK's Liquors                      | DK's Liquors                      | f  | f  | t
 27,900,057 | Dk's Market                       | Dk's Market                       | f  | f  | t
 27,900,058 | Dk's Moda Masculina               | Dk's Moda Masculina               | f  | f  | t
 27,900,059 | Dk's Nails And Spa                | Dk's Nails And Spa                | f  | f  | t
 27,900,060 | DK's Pawn Shop                    | DK's Pawn Shop                    | f  | f  | t
 27,900,061 | Dk's Pet Grooming                 | Dk's Pet Grooming                 | f  | f  | t
 27,900,062 | DK's Quality Service              | DK's Quality Service              | f  | f  | t
 27,900,063 | DK's Restoration                  | DK's Restoration                  | f  | f  | t
 27,900,064 | Dk's Sports Center                | Dk's Sports Center                | f  | f  | t
 27,900,065 | Dk's Statuary                     | Dk's Statuary                     | f  | f  | t
 27,900,066 | Dk's Style Hut                    | Dk's Style Hut                    | f  | f  | t
 27,900,067 | Dk's Temiskaming Shore Taxi       | Dk's Temiskaming Shore Taxi       | f  | f  | t
 27,900,068 | Dk's Towing                       | Dk's Towing                       | f  | f  | t
 27,900,069 | DK's Travel                       | DK's Travel                       | f  | f  | t
 27,900,070 | Dk'Style                          | Dk'Style                          | f  | f  | t
 27,900,071 | DK'Z Car Wash                     | DK'Z Car Wash                     | f  | f  | t
 27,900,072 | Dk-                               | Dk-                               | t  | f  | f
 27,900,073 | DK                                | DK                                | f  | t  | f
 27,900,074 | DK                                | DK                                | f  | t  | f
 27,900,075 | DK                                | DK                                | f  | t  | f
 27,900,076 | DK                                | DK                                | f  | t  | f
 27,900,077 | DK                                | DK                                | f  | t  | f
 27,900,078 | DK                                | DK                                | f  | t  | f
 27,900,079 | DK                                | DK                                | f  | t  | f
 27,900,080 | DK                                | DK                                | f  | t  | f
 27,900,081 | DK                                | DK                                | f  | t  | f
 27,900,082 | DK                                | DK                                | f  | t  | f
 27,900,083 | DK                                | DK                                | f  | t  | f
 27,900,084 | DK                                | DK                                | f  | t  | f
 27,900,085 | DK                                | DK                                | f  | t  | f
 27,900,086 | DK                                | DK                                | f  | t  | f
 27,900,087 | DK                                | DK                                | f  | t  | f
 27,900,088 | DK                                | DK                                | f  | t  | f
(66 rows)

Time: 821796.036 ms (13:41.796)

testing=> explain (analyze, verbose, costs, buffers, timing, summary)
testing-> with sd as (select name, row_number() over (order by name) rnum
testing(>               from big_table)
testing-> select rnum, name, encode(convert_to(name, 'SQL_ASCII'),'escape'),
testing->        name < 'DK' as lt, name = 'DK' as eq, name > 'DK' as gt
testing->   from sd
testing->  where rnum >= (select min(rnum) from sd where name='DK')
testing->    and rnum <= (select max(rnum) from sd where name='DK');
                                                                                             QUERY PLAN
           
 

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on sd  (cost=30560557.18..33281469.69 rows=542825 width=559) (actual time=886056.646..898585.621 rows=66
loops=1)
   Output: sd.rnum, sd.name, encode(convert_to((sd.name)::text, 'SQL_ASCII'::name), 'escape'::text), ((sd.name)::text <
'DK'::text),((sd.name)::text = 'DK'::text), ((sd.name)::text > 'DK'::text)
 
   Filter: ((sd.rnum >= $1) AND (sd.rnum <= $2))
   Rows Removed by Filter: 108565020
   Buffers: shared hit=482 read=1516916, temp read=3200588 written=2619846
   CTE sd
     ->  WindowAgg  (cost=23772525.03..25672414.07 rows=108565088 width=30) (actual time=719781.561..832442.189
rows=108565086loops=1)
 
           Output: big_table.name, row_number() OVER (?)
           Buffers: shared hit=482 read=1516916, temp read=2033663 written=2036384
           ->  Sort  (cost=23772525.03..24043937.75 rows=108565088 width=22) (actual time=719781.549..793477.044
rows=108565086loops=1)
 
                 Output: big_table.name
                 Sort Key: big_table.name
                 Sort Method: external merge  Disk: 3453888kB
                 Buffers: shared hit=482 read=1516916, temp read=2033663 written=2036384
                 ->  Seq Scan on name_stuff.big_table  (cost=0.00..2603048.88 rows=108565088 width=22) (actual
time=0.010..105238.261rows=108565086 loops=1)
 
                       Output: big_table.name
                       Buffers: shared hit=482 read=1516916
   InitPlan 2 (returns $1)
     ->  Aggregate  (cost=2444071.54..2444071.55 rows=1 width=8) (actual time=148300.335..148300.336 rows=1 loops=1)
           Output: min(sd_1.rnum)
           Buffers: temp read=431489 written=583461
           ->  CTE Scan on sd sd_1  (cost=0.00..2442714.48 rows=542825 width=8) (actual time=34105.882..148300.327
rows=25loops=1)
 
                 Output: sd_1.name, sd_1.rnum
                 Filter: ((sd_1.name)::text = 'DK'::text)
                 Rows Removed by Filter: 108565061
                 Buffers: temp read=431489 written=583461
   InitPlan 3 (returns $2)
     ->  Aggregate  (cost=2444071.54..2444071.55 rows=1 width=8) (actual time=14707.032..14707.032 rows=1 loops=1)
           Output: max(sd_2.rnum)
           Buffers: temp read=583462
           ->  CTE Scan on sd sd_2  (cost=0.00..2442714.48 rows=542825 width=8) (actual time=3729.712..14707.025
rows=25loops=1)
 
                 Output: sd_2.name, sd_2.rnum
                 Filter: ((sd_2.name)::text = 'DK'::text)
                 Rows Removed by Filter: 108565061
                 Buffers: temp read=583462
 Planning Time: 0.099 ms
 Execution Time: 899881.036 ms
(37 rows)

Time: 899900.240 ms (14:59.900)
testing=>



-----Original Message-----
From: Andrew Gierth <andrew@tao11.riddles.org.uk> 
Sent: Tuesday, October 1, 2019 11:21 AM
To: David Raymond <David.Raymond@tomtom.com>
Cc: Tom Lane <tgl@sss.pgh.pa.us>; pgsql-bugs@lists.postgresql.org
Subject: Re: BUG #16031: Group by returns duplicate groups

But this is all kinds of messed up:

...

Let's see some more data from that. Do this query:

with sd as (select name, row_number() over (order by name) rnum
              from big_table)
select rnum, name, encode(convert_to(name, 'SQL_ASCII'),'escape'),
       name < 'DK' as lt, name = 'DK' as eq, name > 'DK' as gt
  from sd
 where rnum >= (select min(rnum) from sd where name='DK')
   and rnum <= (select max(rnum) from sd where name='DK');

-- 
Andrew (irc:RhodiumToad)

В списке pgsql-bugs по дате отправления:

Предыдущее
От: PG Bug reporting form
Дата:
Сообщение: BUG #16032: pg_basebackup when running on Windows doesn't clean up on failure correctly
Следующее
От: David Raymond
Дата:
Сообщение: RE: BUG #16031: Group by returns duplicate groups