Discussion: BUG #16031: Group by returns duplicate groups
The following bug has been logged on the website:
Bug reference: 16031
Logged by: David Raymond
Email address: david.raymond@tomtom.com
PostgreSQL version: 11.5
Operating system: Windows 10
Description:
I have a large table which I'm running a query on, grouping on a single
field, and returning only that 1 field, filtered with a HAVING clause. The
results that I'm getting back however contain a duplicate entry for one of
the returned values, which I believe should never be able to happen.
I'm working on dropping as many extra fields and records as I can to still
get the bad result from something small enough to send as a test case, but
figured I'd post this while I'm at it to get any advice. I've done a dump of
the table and then restored it to a new table, and still get the same weird
results. On the original version of the table I had clustered it on the
index that starts with the field being grouped on, but it's still giving the
bad groups after a dump and restore.
I'm running 11.5 on Windows 10 (Enterprise DB installer) and unfortunately
don't have the resources to build a new version myself if you come up with a
patch (also why I'm hoping to shrink it down to where others can test it).
Here's output from psql:
testing=> select version();
version
------------------------------------------------------------
PostgreSQL 11.5, compiled by Visual C++ build 1914, 64-bit
(1 row)
Time: 0.272 ms
testing=> \d+ big_table
Table "name_stuff.big_table"
   Column    |          Type          | Collation | Nullable | Default | Storage  | Stats target | Description
-------------+------------------------+-----------+----------+---------+----------+--------------+-------------
 id_1        | uuid                   |           | not null |         | plain    |              |
 field_2     | uuid                   |           |          |         | plain    |              |
 name        | character varying(254) |           | not null |         | extended |              |
 field_4     | character varying(254) |           |          |         | extended |              |
 field_5     | numeric(2,0)           |           | not null |         | main     |              |
 field_6     | character varying(4)   |           |          |         | extended |              |
 field_7     | character varying(3)   |           | not null |         | extended |              |
 field_8     | character varying(3)   |           |          |         | extended |              |
 arr_field_1 | character varying(254) |           |          |         | extended |              |
 arr_field_2 | character varying(254) |           |          |         | extended |              |
 arr_field_3 | character varying(254) |           |          |         | extended |              |
 arr_field_4 | character varying(254) |           |          |         | extended |              |
 arr_field_5 | character varying(254) |           |          |         | extended |              |
 arr_field_6 | character varying(254) |           |          |         | extended |              |
 field_15    | boolean                |           |          |         | plain    |              |
 field_16    | boolean                |           |          |         | plain    |              |
 id_2        | text                   |           | not null |         | extended |              |
Indexes:
"big_table_pkey" PRIMARY KEY, btree (id_1, id_2)
"big_table_name_id_1_id_2_idx" btree (name, id_1, id_2)
testing=> select count(*) from big_table;
count
-------------
108,565,086
(1 row)
Time: 273770.205 ms (04:33.770)
testing=> explain (analyze, verbose, costs, buffers, timing, summary) create
table bad_groups_1 as select name from big_table group by name having
count(distinct array[arr_field_1, arr_field_2, arr_field_3, arr_field_4,
arr_field_5, arr_field_6]) > 1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=29317810.81..30149735.19 rows=487417 width=22)
(actual time=758501.326..895894.082 rows=745024 loops=1)
Output: name
Group Key: big_table.name
Filter: (count(DISTINCT ARRAY[big_table.arr_field_1,
big_table.arr_field_2, big_table.arr_field_3, big_table.arr_field_4,
big_table.arr_field_5, big_table.arr_field_6]) > 1)
Rows Removed by Filter: 80610652
Buffers: shared hit=2325 read=1515081, temp read=2464481
written=2467410
-> Sort (cost=29317810.81..29589026.23 rows=108486168 width=57) (actual
time=758493.476..819035.136 rows=108565086 loops=1)
Output: name, arr_field_1, arr_field_2, arr_field_3, arr_field_4,
arr_field_5, arr_field_6
Sort Key: big_table.name
Sort Method: external merge Disk: 4174488kB
Buffers: shared hit=2317 read=1515081, temp read=2464481
written=2467410
-> Seq Scan on name_stuff.big_table (cost=0.00..2602259.68
rows=108486168 width=57) (actual time=23.216..119113.708 rows=108565086
loops=1)
Output: name, arr_field_1, arr_field_2, arr_field_3,
arr_field_4, arr_field_5, arr_field_6
Buffers: shared hit=2317 read=1515081
Planning Time: 0.196 ms
Execution Time: 897276.109 ms
(16 rows)
Time: 897285.808 ms (14:57.286)
testing=> \d+ bad_groups_1
Table "name_stuff.bad_groups_1"
 Column |          Type          | Collation | Nullable | Default | Storage  | Stats target | Description
--------+------------------------+-----------+----------+---------+----------+--------------+-------------
 name   | character varying(254) |           |          |         | extended |              |
testing=> select count(*), count(distinct name) from bad_groups_1;
count | count
---------+---------
745,024 | 745,023
(1 row)
Time: 899.273 ms
testing=> select name from bad_groups_1 group by name having count(*) > 1;
name
------
DK
(1 row)
Time: 337.663 ms
testing=>
PG Bug reporting form <noreply@postgresql.org> writes:
> I have a large table which I'm running a query on, grouping on a single
> field, and returning only that 1 field, filtered with a HAVING clause. The
> results that I'm getting back however contain a duplicate entry for one of
> the returned values, which I believe should never be able to happen.
That seems quite weird. What LC_COLLATE/LC_CTYPE settings are you using?
regards, tom lane
All the lc_* settings are en-US, and server_encoding is UTF8.
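For reference, those settings can be confirmed from psql with standard commands (the catalog query below is generic; nothing in it is specific to this database):

```sql
-- Session-level locale and encoding settings
SHOW lc_collate;
SHOW lc_ctype;
SHOW server_encoding;

-- The values recorded for the current database in the system catalog
SELECT datname, datcollate, datctype
FROM pg_database
WHERE datname = current_database();
```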
-----Original Message-----
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Monday, September 30, 2019 2:17 PM
To: David Raymond <David.Raymond@tomtom.com>
Cc: pgsql-bugs@lists.postgresql.org
Subject: Re: BUG #16031: Group by returns duplicate groups
Looking possibly like indexing is part of the issue at the moment.

When I added an array_agg(id_1) in the group by, one of the 'DK' groups had 9 records and one had 16. I wrote a script to scan the text dump and counted 25 records with the "name" field value of 'DK'.

After the restore, which includes the indexes, "count(*) from big_table where name = 'DK';" uses the index which starts with "name" and returns 9. Dropping that index and running it again returns 25.

I re-created the index...

create index on big_table (name, id_1, id_2);

...and count(*) goes back to returning 9 again, and group by sees those 9 as one group and the other 16 as a different group.

Will get back to this sometime tomorrow as my brain has now melted.
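The index-vs-heap discrepancy described above can be reproduced explicitly by forcing each access path in turn (a sketch; the enable_* toggles are standard planner settings, and a healthy index would return the same count both ways):

```sql
-- Path 1: let the planner use the suspect (name, id_1, id_2) index
set enable_seqscan = off;
select count(*) from big_table where name = 'DK';  -- returned 9 here

-- Path 2: disable index access so the count comes from the heap
reset enable_seqscan;
set enable_indexscan = off;
set enable_indexonlyscan = off;
set enable_bitmapscan = off;
select count(*) from big_table where name = 'DK';  -- returned 25 here
```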
On Mon, Sep 30, 2019 at 2:49 PM David Raymond <David.Raymond@tomtom.com> wrote:
> I re-created the index...
> create index on big_table (name, id_1, id_2);
>
> ...and count(*) goes back to returning 9 again.
>
> and group by sees those 9 as one group and the other 16 as a different group.
You should try running contrib/amcheck, which should be able to confirm
index corruption, and give you a specific complaint. You may then be
able to inspect the exact index page with the problem using
contrib/pageinspect. Something like this ought to do it on Postgres
11:
CREATE EXTENSION IF NOT EXISTS amcheck;
SELECT bt_index_check('my_index', true);
If that doesn't show any errors, then perhaps try this:
SELECT bt_index_parent_check('my_index', true);
Let us know what you see.
--
Peter Geoghegan
>>>>> "David" == David Raymond <David.Raymond@tomtom.com> writes:
David> Looking possibly like indexing is part of the issue at the
David> moment.
Your original EXPLAIN didn't show any index scans being used...?
I can think of a possible explanation if there's some other value in the
big table which, due to some collation bug, is not consistently being
compared as < 'DK' or > 'DK'.
Unfortunately, we have two code paths for comparison, and one of them
can't easily be exercised directly from SQL, since it is only used for
sorts (and therefore also index builds).
Can you try these queries:
set enable_indexscan=off;
set enable_bitmapscan=off;
select count(*) c0,
count(*) filter (where bttextcmp(name,'DK') < 0) c1_lt,
count(*) filter (where bttextcmp(name,'DK') = 0) c1_eq,
count(*) filter (where bttextcmp(name,'DK') > 0) c1_gt,
count(*) filter (where bttextcmp('DK',name) > 0) c2_lt,
count(*) filter (where bttextcmp('DK',name) = 0) c2_eq,
count(*) filter (where bttextcmp('DK',name) < 0) c2_gt
from big_table;
with sd as (select name, row_number() over (order by name) rnum
from big_table)
select name from sd
where rnum >= (select min(rnum) from sd where name='DK')
and rnum <= (select max(rnum) from sd where name='DK')
and name <> 'DK';
--
Andrew (irc:RhodiumToad)
psql output for those two queries, along with explain output, pasted below.
testing=> set enable_indexscan = off;
SET
Time: 0.265 ms
testing=> set enable_bitmapscan = off;
SET
Time: 0.236 ms
testing=> select count(*) c0,
testing-> count(*) filter (where bttextcmp(name,'DK') < 0) c1_lt,
testing-> count(*) filter (where bttextcmp(name,'DK') = 0) c1_eq,
testing-> count(*) filter (where bttextcmp(name,'DK') > 0) c1_gt,
testing-> count(*) filter (where bttextcmp('DK',name) > 0) c2_lt,
testing-> count(*) filter (where bttextcmp('DK',name) = 0) c2_eq,
testing-> count(*) filter (where bttextcmp('DK',name) < 0) c2_gt
testing-> from big_table;
c0 | c1_lt | c1_eq | c1_gt | c2_lt | c2_eq | c2_gt
-------------+------------+-------+------------+------------+-------+------------
108,565,086 | 27,900,023 | 25 | 80,665,038 | 27,900,023 | 25 | 80,665,038
(1 row)
Time: 311710.895 ms (05:11.711)
testing=> explain (analyze, verbose, costs, buffers, timing, summary) select count(*) c0,
testing-> count(*) filter (where bttextcmp(name,'DK') < 0) c1_lt,
testing-> count(*) filter (where bttextcmp(name,'DK') = 0) c1_eq,
testing-> count(*) filter (where bttextcmp(name,'DK') > 0) c1_gt,
testing-> count(*) filter (where bttextcmp('DK',name) > 0) c2_lt,
testing-> count(*) filter (where bttextcmp('DK',name) = 0) c2_eq,
testing-> count(*) filter (where bttextcmp('DK',name) < 0) c2_gt
testing-> from big_table;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=4119436.80..4119436.81 rows=1 width=56) (actual time=315815.338..315815.338 rows=1 loops=1)
Output: count(*), count(*) FILTER (WHERE (bttextcmp((name)::text, 'DK'::text) < 0)), count(*) FILTER (WHERE
(bttextcmp((name)::text, 'DK'::text) = 0)), count(*) FILTER (WHERE (bttextcmp((name)::text, 'DK'::text) > 0)), count(*)
FILTER (WHERE (bttextcmp('DK'::text, (name)::text) > 0)), count(*) FILTER (WHERE (bttextcmp('DK'::text, (name)::text) =
0)), count(*) FILTER (WHERE (bttextcmp('DK'::text, (name)::text) < 0))
Buffers: shared hit=258 read=1517140
-> Gather (cost=4119436.55..4119436.76 rows=2 width=56) (actual time=315814.594..315834.986 rows=3 loops=1)
Output: (PARTIAL count(*)), (PARTIAL count(*) FILTER (WHERE (bttextcmp((name)::text, 'DK'::text) < 0))),
(PARTIAL count(*) FILTER (WHERE (bttextcmp((name)::text, 'DK'::text) = 0))), (PARTIAL count(*) FILTER (WHERE
(bttextcmp((name)::text, 'DK'::text) > 0))), (PARTIAL count(*) FILTER (WHERE (bttextcmp('DK'::text, (name)::text) >
0))), (PARTIAL count(*) FILTER (WHERE (bttextcmp('DK'::text, (name)::text) = 0))), (PARTIAL count(*) FILTER (WHERE
(bttextcmp('DK'::text, (name)::text) < 0)))
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=258 read=1517140
-> Partial Aggregate (cost=4118436.55..4118436.56 rows=1 width=56) (actual time=315571.326..315571.326
rows=1 loops=3)
Output: PARTIAL count(*), PARTIAL count(*) FILTER (WHERE (bttextcmp((name)::text, 'DK'::text) < 0)),
PARTIAL count(*) FILTER (WHERE (bttextcmp((name)::text, 'DK'::text) = 0)), PARTIAL count(*) FILTER (WHERE
(bttextcmp((name)::text, 'DK'::text) > 0)), PARTIAL count(*) FILTER (WHERE (bttextcmp('DK'::text, (name)::text) > 0)),
PARTIAL count(*) FILTER (WHERE (bttextcmp('DK'::text, (name)::text) = 0)), PARTIAL count(*) FILTER (WHERE
(bttextcmp('DK'::text, (name)::text) < 0))
Buffers: shared hit=258 read=1517140
Worker 0: actual time=315491.284..315491.284 rows=1 loops=1
Buffers: shared hit=80 read=324165
Worker 1: actual time=315411.542..315411.542 rows=1 loops=1
Buffers: shared hit=81 read=592546
-> Parallel Seq Scan on name_stuff.big_table (cost=0.00..1969752.53 rows=45235453 width=22) (actual
time=8.834..245716.269 rows=36188362 loops=3)
Output: id_1, field_2, name, field_4, field_5, field_6, field_7, field_8, arr_field_1,
arr_field_2, arr_field_3, arr_field_4, arr_field_5, arr_field_6, field_15, field_16, id_2
Buffers: shared hit=258 read=1517140
Worker 0: actual time=13.873..213354.668 rows=23076339 loops=1
Buffers: shared hit=80 read=324165
Worker 1: actual time=0.232..260652.945 rows=42461913 loops=1
Buffers: shared hit=81 read=592546
Planning Time: 1.746 ms
Execution Time: 315835.098 ms
(24 rows)
Time: 315860.795 ms (05:15.861)
testing=> with sd as (select name, row_number() over (order by name) rnum
testing(> from big_table)
testing-> select name from sd
testing-> where rnum >= (select min(rnum) from sd where name='DK')
testing-> and rnum <= (select max(rnum) from sd where name='DK')
testing-> and name <> 'DK';
name
-----------------------------------
Dk'bus Marine
Dk's Auto's
Dk's Bar & Grill
Dk's Barbers & Stylist
Dk's Beach Boutique
Dk's Cabinets & Countertops
Dk's Cleaning Service
Dk's Clothing
Dk's Communications
Dk's Dancewear & Fitnesswear
Dk's Dancewear Boutique
Dk's Discount Dance & Fitnesswear
DK's Drywall Service
DK'S DUSTBUSTERS
Dk's Family Five Star Trophies
DK's Family Five Star Trophies
Dk's Food Mart
Dk'S Group Pte. Ltd.
Dk's Hair Designs
Dk's Hair Happenings
Dk's Hair Supply
Dk's Home Decor
DK's Informática
Dk's Janitorial
DK's Liquors
Dk's Market
Dk's Moda Masculina
Dk's Nails And Spa
DK's Pawn Shop
Dk's Pet Grooming
DK's Quality Service
DK's Restoration
Dk's Sports Center
Dk's Statuary
Dk's Style Hut
Dk's Temiskaming Shore Taxi
Dk's Towing
DK's Travel
Dk'Style
DK'Z Car Wash
Dk-
(41 rows)
Time: 848889.096 ms (14:08.889)
testing=> explain (analyze, verbose, costs, buffers, timing, summary) with sd as (select name, row_number() over (order
by name) rnum
testing(> from big_table)
testing-> select name from sd
testing-> where rnum >= (select min(rnum) from sd where name='DK')
testing-> and rnum <= (select max(rnum) from sd where name='DK')
testing-> and name <> 'DK';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
CTE Scan on sd (cost=30560557.18..33546097.10 rows=540111 width=516) (actual time=794958.814..805397.008 rows=41
loops=1)
Output: sd.name
Filter: ((sd.rnum >= $1) AND (sd.rnum <= $2) AND ((sd.name)::text <> 'DK'::text))
Rows Removed by Filter: 108565045
Buffers: shared hit=386 read=1517012, temp read=3202642 written=2621799
CTE sd
-> WindowAgg (cost=23772525.03..25672414.07 rows=108565088 width=30) (actual time=668354.380..753149.449
rows=108565086 loops=1)
Output: big_table.name, row_number() OVER (?)
Buffers: shared hit=386 read=1517012, temp read=2035717 written=2038337
-> Sort (cost=23772525.03..24043937.75 rows=108565088 width=22) (actual time=668354.361..724059.209
rows=108565086 loops=1)
Output: big_table.name
Sort Key: big_table.name
Sort Method: external merge Disk: 3453888kB
Buffers: shared hit=386 read=1517012, temp read=2035717 written=2038337
-> Seq Scan on name_stuff.big_table (cost=0.00..2603048.88 rows=108565088 width=22) (actual
time=18.886..106731.751 rows=108565086 loops=1)
Output: big_table.name
Buffers: shared hit=386 read=1517012
InitPlan 2 (returns $1)
-> Aggregate (cost=2444071.54..2444071.55 rows=1 width=8) (actual time=111513.923..111513.923 rows=1 loops=1)
Output: min(sd_1.rnum)
Buffers: temp read=431506 written=583461
-> CTE Scan on sd sd_1 (cost=0.00..2442714.48 rows=542825 width=8) (actual time=28796.843..111513.916
rows=25 loops=1)
Output: sd_1.name, sd_1.rnum
Filter: ((sd_1.name)::text = 'DK'::text)
Rows Removed by Filter: 108565061
Buffers: temp read=431506 written=583461
InitPlan 3 (returns $2)
-> Aggregate (cost=2444071.54..2444071.55 rows=1 width=8) (actual time=12135.464..12135.464 rows=1 loops=1)
Output: max(sd_2.rnum)
Buffers: temp read=583462
-> CTE Scan on sd sd_2 (cost=0.00..2442714.48 rows=542825 width=8) (actual time=2785.283..12135.457
rows=25 loops=1)
Output: sd_2.name, sd_2.rnum
Filter: ((sd_2.name)::text = 'DK'::text)
Rows Removed by Filter: 108565061
Buffers: temp read=583462
Planning Time: 0.159 ms
Execution Time: 807095.184 ms
(37 rows)
Time: 807103.763 ms (13:27.104)
testing=>
-----Original Message-----
From: Andrew Gierth <andrew@tao11.riddles.org.uk>
Sent: Tuesday, October 1, 2019 1:41 AM
To: David Raymond <David.Raymond@tomtom.com>
Cc: Tom Lane <tgl@sss.pgh.pa.us>; pgsql-bugs@lists.postgresql.org
Subject: Re: BUG #16031: Group by returns duplicate groups
>>>>> "David" == David Raymond <David.Raymond@tomtom.com> writes:
David> c0 | c1_lt | c1_eq | c1_gt | c2_lt | c2_eq | c2_gt
David> -------------+------------+-------+------------+------------+-------+------------
David> 108,565,086 | 27,900,023 | 25 | 80,665,038 | 27,900,023 | 25 | 80,665,038
David> (1 row)
Well those counts look consistent to me.
But this is all kinds of messed up:
David> testing=> with sd as (select name, row_number() over (order by name) rnum
David> testing(> from big_table)
David> testing-> select name from sd
David> testing-> where rnum >= (select min(rnum) from sd where name='DK')
David> testing-> and rnum <= (select max(rnum) from sd where name='DK')
David> testing-> and name <> 'DK';
David> name
David> -----------------------------------
David> [...]
David> (41 rows)
Let's see some more data from that. Do this query:
with sd as (select name, row_number() over (order by name) rnum
from big_table)
select rnum, name, encode(convert_to(name, 'SQL_ASCII'),'escape'),
name < 'DK' as lt, name = 'DK' as eq, name > 'DK' as gt
from sd
where rnum >= (select min(rnum) from sd where name='DK')
and rnum <= (select max(rnum) from sd where name='DK');
--
Andrew (irc:RhodiumToad)
Here's what it's reporting for an error:
testing=# select bt_index_check('big_table_name_id_1_id_2_idx', true);
ERROR: high key invariant violated for index "big_table_name_id_1_id_2_idx"
DETAIL: Index tid=(86990,140) points to heap tid=(139313,57) page lsn=0/0.
Time: 65695.359 ms (01:05.695)
testing=# select bt_index_parent_check('big_table_name_id_1_id_2_idx', true);
ERROR: down-link lower bound invariant violated for index "big_table_name_id_1_id_2_idx"
DETAIL: Parent block=43131 child index tid=(43197,9) parent page lsn=0/0.
Time: 1697.205 ms (00:01.697)
testing=#
-----Original Message-----
From: Peter Geoghegan <pg@bowt.ie>
Sent: Monday, September 30, 2019 7:34 PM
To: David Raymond <David.Raymond@tomtom.com>
Cc: Tom Lane <tgl@sss.pgh.pa.us>; pgsql-bugs@lists.postgresql.org
Subject: Re: BUG #16031: Group by returns duplicate groups
>>>>> "David" == David Raymond <David.Raymond@tomtom.com> writes:
David> Here's what it's reporting for an error:
David> testing=# select bt_index_check('big_table_name_id_1_id_2_idx', true);
David> ERROR: high key invariant violated for index "big_table_name_id_1_id_2_idx"
David> DETAIL: Index tid=(86990,140) points to heap tid=(139313,57) page lsn=0/0.
David> Time: 65695.359 ms (01:05.695)
David> testing=# select bt_index_parent_check('big_table_name_id_1_id_2_idx', true);
David> ERROR: down-link lower bound invariant violated for index "big_table_name_id_1_id_2_idx"
David> DETAIL: Parent block=43131 child index tid=(43197,9) parent page lsn=0/0.
David> Time: 1697.205 ms (00:01.697)
David> testing=#
Based on the other response, it looks like something is fundamentally
broken with regard to sorting this dataset, and since btree index build
works by sorting, the index corruption is probably just another symptom
of the real problem rather than a cause in itself.
--
Andrew (irc:RhodiumToad)
"But this is all kinds of messed up"
Sounds about right :)
Output below.
As asked earlier: it's Windows 10, all the lc_* settings are "en-US", and the server encoding is UTF8.
Throughout the table there are records in pretty much every writing system: Latin, Cyrillic, Greek, Arabic,
Hebrew, Japanese, Mandarin, etc. Which "shouldn't" matter, but I figured I'd mention it.
testing=> set enable_indexscan = off;
SET
Time: 0.536 ms
testing=> set enable_bitmapscan = off;
SET
Time: 0.225 ms
testing=> with sd as (select name, row_number() over (order by name) rnum
testing(> from big_table)
testing-> select rnum, name, encode(convert_to(name, 'SQL_ASCII'),'escape'),
testing-> name < 'DK' as lt, name = 'DK' as eq, name > 'DK' as gt
testing-> from sd
testing-> where rnum >= (select min(rnum) from sd where name='DK')
testing-> and rnum <= (select max(rnum) from sd where name='DK');
rnum | name | encode | lt | eq | gt
------------+-----------------------------------+-----------------------------------+----+----+----
27,900,023 | DK | DK | f | t | f
27,900,024 | DK | DK | f | t | f
27,900,025 | DK | DK | f | t | f
27,900,026 | DK | DK | f | t | f
27,900,027 | DK | DK | f | t | f
27,900,028 | DK | DK | f | t | f
27,900,029 | DK | DK | f | t | f
27,900,030 | DK | DK | f | t | f
27,900,031 | DK | DK | f | t | f
27,900,032 | Dk'bus Marine | Dk'bus Marine | f | f | t
27,900,033 | Dk's Auto's | Dk's Auto's | f | f | t
27,900,034 | Dk's Bar & Grill | Dk's Bar & Grill | f | f | t
27,900,035 | Dk's Barbers & Stylist | Dk's Barbers & Stylist | f | f | t
27,900,036 | Dk's Beach Boutique | Dk's Beach Boutique | f | f | t
27,900,037 | Dk's Cabinets & Countertops | Dk's Cabinets & Countertops | f | f | t
27,900,038 | Dk's Cleaning Service | Dk's Cleaning Service | f | f | t
27,900,039 | Dk's Clothing | Dk's Clothing | f | f | t
27,900,040 | Dk's Communications | Dk's Communications | f | f | t
27,900,041 | Dk's Dancewear & Fitnesswear | Dk's Dancewear & Fitnesswear | f | f | t
27,900,042 | Dk's Dancewear Boutique | Dk's Dancewear Boutique | f | f | t
27,900,043 | Dk's Discount Dance & Fitnesswear | Dk's Discount Dance & Fitnesswear | f | f | t
27,900,044 | DK's Drywall Service | DK's Drywall Service | f | f | t
27,900,045 | DK'S DUSTBUSTERS | DK'S DUSTBUSTERS | f | f | t
27,900,046 | Dk's Family Five Star Trophies | Dk's Family Five Star Trophies | f | f | t
27,900,047 | DK's Family Five Star Trophies | DK's Family Five Star Trophies | f | f | t
27,900,048 | Dk's Food Mart | Dk's Food Mart | f | f | t
27,900,049 | Dk'S Group Pte. Ltd. | Dk'S Group Pte. Ltd. | f | f | t
27,900,050 | Dk's Hair Designs | Dk's Hair Designs | f | f | t
27,900,051 | Dk's Hair Happenings | Dk's Hair Happenings | f | f | t
27,900,052 | Dk's Hair Supply | Dk's Hair Supply | f | f | t
27,900,053 | Dk's Home Decor | Dk's Home Decor | f | f | t
27,900,054 | DK's Informática | DK's Inform\303\241tica | f | f | t
27,900,055 | Dk's Janitorial | Dk's Janitorial | f | f | t
27,900,056 | DK's Liquors | DK's Liquors | f | f | t
27,900,057 | Dk's Market | Dk's Market | f | f | t
27,900,058 | Dk's Moda Masculina | Dk's Moda Masculina | f | f | t
27,900,059 | Dk's Nails And Spa | Dk's Nails And Spa | f | f | t
27,900,060 | DK's Pawn Shop | DK's Pawn Shop | f | f | t
27,900,061 | Dk's Pet Grooming | Dk's Pet Grooming | f | f | t
27,900,062 | DK's Quality Service | DK's Quality Service | f | f | t
27,900,063 | DK's Restoration | DK's Restoration | f | f | t
27,900,064 | Dk's Sports Center | Dk's Sports Center | f | f | t
27,900,065 | Dk's Statuary | Dk's Statuary | f | f | t
27,900,066 | Dk's Style Hut | Dk's Style Hut | f | f | t
27,900,067 | Dk's Temiskaming Shore Taxi | Dk's Temiskaming Shore Taxi | f | f | t
27,900,068 | Dk's Towing | Dk's Towing | f | f | t
27,900,069 | DK's Travel | DK's Travel | f | f | t
27,900,070 | Dk'Style | Dk'Style | f | f | t
27,900,071 | DK'Z Car Wash | DK'Z Car Wash | f | f | t
27,900,072 | Dk- | Dk- | t | f | f
27,900,073 | DK | DK | f | t | f
27,900,074 | DK | DK | f | t | f
27,900,075 | DK | DK | f | t | f
27,900,076 | DK | DK | f | t | f
27,900,077 | DK | DK | f | t | f
27,900,078 | DK | DK | f | t | f
27,900,079 | DK | DK | f | t | f
27,900,080 | DK | DK | f | t | f
27,900,081 | DK | DK | f | t | f
27,900,082 | DK | DK | f | t | f
27,900,083 | DK | DK | f | t | f
27,900,084 | DK | DK | f | t | f
27,900,085 | DK | DK | f | t | f
27,900,086 | DK | DK | f | t | f
27,900,087 | DK | DK | f | t | f
27,900,088 | DK | DK | f | t | f
(66 rows)
Time: 821796.036 ms (13:41.796)
testing=> explain (analyze, verbose, costs, buffers, timing, summary)
testing-> with sd as (select name, row_number() over (order by name) rnum
testing(> from big_table)
testing-> select rnum, name, encode(convert_to(name, 'SQL_ASCII'),'escape'),
testing-> name < 'DK' as lt, name = 'DK' as eq, name > 'DK' as gt
testing-> from sd
testing-> where rnum >= (select min(rnum) from sd where name='DK')
testing-> and rnum <= (select max(rnum) from sd where name='DK');
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
CTE Scan on sd (cost=30560557.18..33281469.69 rows=542825 width=559) (actual time=886056.646..898585.621 rows=66
loops=1)
Output: sd.rnum, sd.name, encode(convert_to((sd.name)::text, 'SQL_ASCII'::name), 'escape'::text), ((sd.name)::text <
'DK'::text), ((sd.name)::text = 'DK'::text), ((sd.name)::text > 'DK'::text)
Filter: ((sd.rnum >= $1) AND (sd.rnum <= $2))
Rows Removed by Filter: 108565020
Buffers: shared hit=482 read=1516916, temp read=3200588 written=2619846
CTE sd
-> WindowAgg (cost=23772525.03..25672414.07 rows=108565088 width=30) (actual time=719781.561..832442.189
rows=108565086 loops=1)
Output: big_table.name, row_number() OVER (?)
Buffers: shared hit=482 read=1516916, temp read=2033663 written=2036384
-> Sort (cost=23772525.03..24043937.75 rows=108565088 width=22) (actual time=719781.549..793477.044
rows=108565086 loops=1)
Output: big_table.name
Sort Key: big_table.name
Sort Method: external merge Disk: 3453888kB
Buffers: shared hit=482 read=1516916, temp read=2033663 written=2036384
-> Seq Scan on name_stuff.big_table (cost=0.00..2603048.88 rows=108565088 width=22) (actual
time=0.010..105238.261 rows=108565086 loops=1)
Output: big_table.name
Buffers: shared hit=482 read=1516916
InitPlan 2 (returns $1)
-> Aggregate (cost=2444071.54..2444071.55 rows=1 width=8) (actual time=148300.335..148300.336 rows=1 loops=1)
Output: min(sd_1.rnum)
Buffers: temp read=431489 written=583461
-> CTE Scan on sd sd_1 (cost=0.00..2442714.48 rows=542825 width=8) (actual time=34105.882..148300.327
rows=25 loops=1)
Output: sd_1.name, sd_1.rnum
Filter: ((sd_1.name)::text = 'DK'::text)
Rows Removed by Filter: 108565061
Buffers: temp read=431489 written=583461
InitPlan 3 (returns $2)
-> Aggregate (cost=2444071.54..2444071.55 rows=1 width=8) (actual time=14707.032..14707.032 rows=1 loops=1)
Output: max(sd_2.rnum)
Buffers: temp read=583462
-> CTE Scan on sd sd_2 (cost=0.00..2442714.48 rows=542825 width=8) (actual time=3729.712..14707.025
rows=25 loops=1)
Output: sd_2.name, sd_2.rnum
Filter: ((sd_2.name)::text = 'DK'::text)
Rows Removed by Filter: 108565061
Buffers: temp read=583462
Planning Time: 0.099 ms
Execution Time: 899881.036 ms
(37 rows)
Time: 899900.240 ms (14:59.900)
testing=>
-----Original Message-----
From: Andrew Gierth <andrew@tao11.riddles.org.uk>
Sent: Tuesday, October 1, 2019 11:21 AM
To: David Raymond <David.Raymond@tomtom.com>
Cc: Tom Lane <tgl@sss.pgh.pa.us>; pgsql-bugs@lists.postgresql.org
Subject: Re: BUG #16031: Group by returns duplicate groups
I checked for other places where there was weirdness going on and there are more.

testing=> create temp table weird as
with foo as (select name from big_table group by name)
select name from foo group by name having count(*) > 1;
SELECT 23
Time: 700304.130 ms (11:40.304)

Hilariously enough, 'DK' isn't in these 23. Did initdb let me pick a locale that doesn't actually exist? I'm beginning to feel like I need an exorcist.
>>>>> "David" == David Raymond <David.Raymond@tomtom.com> writes:
David> I checked for other places where there was weirdness going on
David> and there are more.
You could try something like,
select name, lname
from (select name, lag(name) over (order by name) as lname
from big_table) s
where name < lname;
That should show all cases where the sort order was inconsistent with
the < operator (which obviously should never happen).
However, there's not much more I can do to help with this, since I don't
use Windows myself and have no useful access to any Windows system. You
might try and cut down the data to the smallest set that shows
inconsistencies using the above; particularly relevant is whether the
problem only shows up for external merge sorts or whether it happens for
in-memory sorts too.
--
Andrew (irc:RhodiumToad)
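The lag(name) check can be sketched outside the database too; a minimal Python stand-in (illustrative only, names hypothetical):

```python
# The lag(name) check in miniature: scan a supposedly sorted sequence
# and flag every adjacent pair where the later value compares less than
# the earlier one, i.e. where the sort order disagrees with <.
def find_order_violations(seq):
    violations = []
    for prev, cur in zip(seq, seq[1:]):
        if cur < prev:                 # "name < lname" in Andrew's query
            violations.append((prev, cur))
    return violations

assert find_order_violations(["A", "B", "C"]) == []            # consistent sort
assert find_order_violations(["A", "C", "B"]) == [("C", "B")]  # broken sort
```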
As an update, I've currently got a dump that consistently shows weirdness when loaded. It's just the "name" field, has 1.3 million records, is 15 MB zipped, and has things garbled enough that I don't mind sending it.
How small does it need to be before it's good to send to someone?
Output after re-loading the table:
testing=> \d+ weird_grouping
Table "name_stuff.weird_grouping"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------+------------------------+-----------+----------+---------+----------+--------------+-------------
name | character varying(254) | | not null | | extended | |
testing=> select count(*), count(distinct name) from weird_grouping;
count | count
-----------+-----------
1,297,265 | 1,176,103
(1 row)
Time: 7616.186 ms (00:07.616)
testing=> explain (analyze, verbose, costs, buffers, timing, summary) select count(*), count(distinct name) from
weird_grouping;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=10187.10..10187.11 rows=1 width=16) (actual time=7805.463..7805.463 rows=1 loops=1)
Output: count(*), count(DISTINCT name)
Buffers: shared hit=2080 read=6236, temp read=5484 written=5506
-> Seq Scan on name_stuff.weird_grouping (cost=0.00..9563.40 rows=124740 width=516) (actual time=0.031..128.203 rows=1297265 loops=1)
Output: name
Buffers: shared hit=2080 read=6236
Planning Time: 0.027 ms
Execution Time: 7805.483 ms
(8 rows)
Time: 7805.822 ms (00:07.806)
testing=> with foo as (select name from weird_grouping group by name) select name from foo group by name having count(*) > 1;
name
------
(0 rows)
Time: 1950.401 ms (00:01.950)
testing=> explain (analyze, verbose, costs, buffers, timing, summary) with foo as (select name from weird_grouping group by name) select name from foo group by name having count(*) > 1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=9882.25..9884.75 rows=67 width=516) (actual time=1466.738..1466.738 rows=0 loops=1)
Output: foo.name
Group Key: foo.name
Filter: (count(*) > 1)
Rows Removed by Filter: 1176101
Buffers: shared hit=2144 read=6172, temp written=4533
CTE foo
-> HashAggregate (cost=9875.25..9877.25 rows=200 width=516) (actual time=494.343..734.005 rows=1176101 loops=1)
Output: weird_grouping.name
Group Key: weird_grouping.name
Buffers: shared hit=2144 read=6172
-> Seq Scan on name_stuff.weird_grouping (cost=0.00..9563.40 rows=124740 width=516) (actual time=0.283..120.898 rows=1297265 loops=1)
Output: weird_grouping.name
Buffers: shared hit=2144 read=6172
-> CTE Scan on foo (cost=0.00..4.00 rows=200 width=516) (actual time=494.346..1035.185 rows=1176101 loops=1)
Output: foo.name
Buffers: shared hit=2144 read=6172, temp written=4533
Planning Time: 0.464 ms
Execution Time: 2230.238 ms
(19 rows)
Time: 2231.291 ms (00:02.231)
testing=> analyze verbose weird_grouping;
INFO: analyzing "name_stuff.weird_grouping"
INFO: "weird_grouping": scanned 8316 of 8316 pages, containing 1297265 live rows and 0 dead rows; 30000 rows in sample, 1297265 estimated total rows
ANALYZE
Time: 474.963 ms
testing=> select count(*), count(distinct name) from weird_grouping;
count | count
-----------+-----------
1,297,265 | 1,176,103
(1 row)
Time: 7449.983 ms (00:07.450)
testing=> explain (analyze, verbose, costs, buffers, timing, summary) select count(*), count(distinct name) from
weird_grouping;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=27774.98..27774.99 rows=1 width=16) (actual time=7754.578..7754.578 rows=1 loops=1)
Output: count(*), count(DISTINCT name)
Buffers: shared hit=2240 read=6076, temp read=5484 written=5506
-> Seq Scan on name_stuff.weird_grouping (cost=0.00..21288.65 rows=1297265 width=20) (actual time=0.036..102.144 rows=1297265 loops=1)
Output: name
Buffers: shared hit=2240 read=6076
Planning Time: 0.031 ms
Execution Time: 7754.598 ms
(8 rows)
Time: 7754.902 ms (00:07.755)
testing=> with foo as (select name from weird_grouping group by name) select name from foo group by name having count(*) > 1;
name
-------
DCT
DELTA
(2 rows)
Time: 9561.382 ms (00:09.561)
testing=> explain (analyze, verbose, costs, buffers, timing, summary) with foo as (select name from weird_grouping group by name) select name from foo group by name having count(*) > 1;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=209783.95..209786.45 rows=67 width=516) (actual time=8591.210..8686.132 rows=2 loops=1)
Output: foo.name
Group Key: foo.name
Filter: (count(*) > 1)
Rows Removed by Filter: 1176099
Buffers: shared hit=2304 read=6012, temp read=7800 written=12363
CTE foo
-> Group (cost=179613.72..186100.05 rows=947356 width=20) (actual time=6416.900..7842.634 rows=1176103 loops=1)
Output: weird_grouping.name
Group Key: weird_grouping.name
Buffers: shared hit=2304 read=6012, temp read=7800 written=7830
-> Sort (cost=179613.72..182856.89 rows=1297265 width=20) (actual time=6416.899..7665.158 rows=1297265
loops=1)
Output: weird_grouping.name
Sort Key: weird_grouping.name
Sort Method: external merge Disk: 39048kB
Buffers: shared hit=2304 read=6012, temp read=7800 written=7830
-> Seq Scan on name_stuff.weird_grouping (cost=0.00..21288.65 rows=1297265 width=20) (actual time=0.294..137.486 rows=1297265 loops=1)
Output: weird_grouping.name
Buffers: shared hit=2304 read=6012
-> CTE Scan on foo (cost=0.00..18947.12 rows=947356 width=516) (actual time=6416.902..8105.771 rows=1176103
loops=1)
Output: foo.name
Buffers: shared hit=2304 read=6012, temp read=7800 written=12363
Planning Time: 0.258 ms
Execution Time: 10305.891 ms
(24 rows)
Time: 10306.990 ms (00:10.307)
testing=>
-----Original Message-----
From: Andrew Gierth <andrew@tao11.riddles.org.uk>
Sent: Tuesday, October 1, 2019 3:27 PM
To: David Raymond <David.Raymond@tomtom.com>
Cc: pgsql-bugs@lists.postgresql.org
Subject: Re: BUG #16031: Group by returns duplicate groups
>>>>> "David" == David Raymond <David.Raymond@tomtom.com> writes:
David> As an update, I've currently got a dump that consistently shows
David> weirdness when loaded. It's just the "name" field, has 1.3
David> million records, is 15 MB zipped, and has things garbled enough
David> that I don't mind sending it.
David> How small does it need to be before it's good to send to
David> someone?
That's small enough for me, though since I don't use Windows all I'll be
able to do is check if you're exposing some general PG bug. If not I'll
see if I can find someone to test on Windows.
--
Andrew (irc:RhodiumToad)
Downloaded and installed 12.0, created a nice shiny new cluster, and confirmed that it's still doing it. Now in 12 you have to force it to materialize the CTE, which was why I had used a CTE in 11 in the first place.
testing=> select version();
version
------------------------------------------------------------
PostgreSQL 12.0, compiled by Visual C++ build 1914, 64-bit
(1 row)
Time: 0.148 ms
testing=> \d+ weird_grouping
Table "name_stuff.weird_grouping"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------+------------------------+-----------+----------+---------+----------+--------------+-------------
name | character varying(254) | | not null | | extended | |
Access method: heap
testing=> analyze verbose weird_grouping;
INFO: analyzing "name_stuff.weird_grouping"
INFO: "weird_grouping": scanned 8316 of 8316 pages, containing 1297265 live rows and 0 dead rows; 30000 rows in sample, 1297265 estimated total rows
ANALYZE
Time: 206.577 ms
testing=> select count(*), count(distinct name) from weird_grouping;
count | count
-----------+-----------
1,297,265 | 1,176,103
(1 row)
Time: 6729.011 ms (00:06.729)
testing=> with foo as (select name from weird_grouping group by name) select name from foo group by name having count(*) > 1;
name
------
(0 rows)
Time: 7289.128 ms (00:07.289)
testing=> explain (analyze, verbose, costs, settings, buffers, timing, summary) with foo as (select name from weird_grouping group by name) select name from foo group by name having count(*) > 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=179613.72..200268.04 rows=67 width=20) (actual time=6203.868..6203.868 rows=0 loops=1)
Output: weird_grouping.name
Group Key: weird_grouping.name
Filter: (count(*) > 1)
Rows Removed by Filter: 1176103
Buffers: shared hit=2464 read=5852, temp read=7800 written=7830
-> Group (cost=179613.72..186100.05 rows=944366 width=20) (actual time=4769.781..5985.111 rows=1176103 loops=1)
Output: weird_grouping.name
Group Key: weird_grouping.name
Buffers: shared hit=2464 read=5852, temp read=7800 written=7830
-> Sort (cost=179613.72..182856.89 rows=1297265 width=20) (actual time=4769.779..5844.350 rows=1297265
loops=1)
Output: weird_grouping.name
Sort Key: weird_grouping.name
Sort Method: external merge Disk: 39048kB
Buffers: shared hit=2464 read=5852, temp read=7800 written=7830
-> Seq Scan on name_stuff.weird_grouping (cost=0.00..21288.65 rows=1297265 width=20) (actual time=0.059..102.772 rows=1297265 loops=1)
Output: weird_grouping.name
Buffers: shared hit=2464 read=5852
Settings: search_path = 'name_stuff'
Planning Time: 0.048 ms
Execution Time: 7115.761 ms
(21 rows)
Time: 7116.170 ms (00:07.116)
testing=> with foo as materialized (select name from weird_grouping group by name) select name from foo group by name having count(*) > 1;
name
-------
DCT
DELTA
(2 rows)
Time: 8850.833 ms (00:08.851)
testing=> explain (analyze, verbose, costs, settings, buffers, timing, summary) with foo as materialized (select name from weird_grouping group by name) select name from foo group by name having count(*) > 1;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=209709.20..209711.70 rows=67 width=516) (actual time=6676.811..6768.094 rows=2 loops=1)
Output: foo.name
Group Key: foo.name
Filter: (count(*) > 1)
Rows Removed by Filter: 1176099
Buffers: shared hit=2528 read=5788, temp read=7800 written=12363
CTE foo
-> Group (cost=179613.72..186100.05 rows=944366 width=20) (actual time=4774.681..6004.725 rows=1176103 loops=1)
Output: weird_grouping.name
Group Key: weird_grouping.name
Buffers: shared hit=2528 read=5788, temp read=7800 written=7830
-> Sort (cost=179613.72..182856.89 rows=1297265 width=20) (actual time=4774.678..5860.270 rows=1297265
loops=1)
Output: weird_grouping.name
Sort Key: weird_grouping.name
Sort Method: external merge Disk: 39048kB
Buffers: shared hit=2528 read=5788, temp read=7800 written=7830
-> Seq Scan on name_stuff.weird_grouping (cost=0.00..21288.65 rows=1297265 width=20) (actual time=0.065..101.141 rows=1297265 loops=1)
Output: weird_grouping.name
Buffers: shared hit=2528 read=5788
-> CTE Scan on foo (cost=0.00..18887.32 rows=944366 width=516) (actual time=4774.683..6228.002 rows=1176103
loops=1)
Output: foo.name
Buffers: shared hit=2528 read=5788, temp read=7800 written=12363
Settings: search_path = 'name_stuff'
Planning Time: 0.054 ms
Execution Time: 8786.597 ms
(25 rows)
Time: 8787.011 ms (00:08.787)
testing=>
-----Original Message-----
From: Andrew Gierth <andrew@tao11.riddles.org.uk>
Sent: Sunday, October 6, 2019 10:29 AM
To: David Raymond <David.Raymond@tomtom.com>
Cc: pgsql-bugs@lists.postgresql.org
Subject: Re: BUG #16031: Group by returns duplicate groups
>>>>> "David" == David Raymond <David.Raymond@tomtom.com> writes:
David> As an update, I've currently got a dump that consistently shows
David> weirdness when loaded. It's just the "name" field, has 1.3
David> million records, is 15 MB zipped, and has things garbled enough
David> that I don't mind sending it. How small does it need to be
David> before it's good to send to someone?
I got your file, and I put it up here if others want to test this:
http://www.rhodiumtoad.org.uk/junk/Bug_16031-BadGrouping.zip
Unfortunately, I can't reproduce any issue (as expected), and the
Windows guy who tested it for me also wasn't able to reproduce it. :-(
--
Andrew (irc:RhodiumToad)
Hi,
On Mon, Oct 07, 2019 at 06:29:04PM +0000, David Raymond wrote:
>Downloaded and installed 12.0, created a nice shiny new cluster, and
>confirmed that it's still doing it. Now in 12 you have to force it to
>materialize the CTE, which was why I had used a CTE in 11 in the first
>place.
I've tried running this on a 12.0 cluster too (on Linux though), using
the data set linked by Andrew, and I haven't observed any duplicate rows
either. I do have some observations, though ...
>testing=> select count(*), count(distinct name) from weird_grouping;
>   count   |   count
>-----------+-----------
> 1,297,265 | 1,176,103
>(1 row)
I do get this:
  count  |  count
---------+---------
 1297265 | 1176101
(1 row)
>testing=> with foo as (select name from weird_grouping group by name) select name from foo group by name having count(*) > 1;
> name
>------
>(0 rows)
I'd like to point out this may not be actually correct either. It does
not produce any rows, i.e. it does not see any "duplicate groups", but
it does this:
  ->  Group  (cost=179613.72..186100.05 rows=944366 width=20) (actual time=4769.781..5985.111 rows=1176103 loops=1)
Notice that it has 1176103 groups, but is that actually correct? On my
machine I only see 1176101 groups, and the difference (2) would match
the DCT and DELTA duplicates with the next query.
>testing=> with foo as materialized (select name from weird_grouping group by name) select name from foo group by name having count(*) > 1;
> name
>-------
> DCT
> DELTA
>(2 rows)
Can you try running this with enable_hashagg = off? That should give you
another Sort on the CTE Scan, and a GroupAggregate at the top. I wonder
if that makes the issue go away ...
regards
--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
" Can you try running this with enable_hashagg = off? That should give you another Sort on the CTE Scan, and a GroupAggregate at the top. I wonder if that makes the issue go away ..."
Here's what I get for that. And extra weirdness below:
testing=> set enable_hashagg = off;
SET
Time: 0.241 ms
testing=> with foo as materialized (select name from weird_grouping group by name) select name from foo group by name having count(*) > 1;
name
------
(0 rows)
Time: 10423.486 ms (00:10.423)
testing=> explain (analyze, verbose, costs, settings, buffers, timing, summary) with foo as materialized (select name from weird_grouping group by name) select name from foo group by name having count(*) > 1;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=737694.91..744780.15 rows=67 width=516) (actual time=7650.411..7650.411 rows=0 loops=1)
Output: foo.name
Group Key: foo.name
Filter: (count(*) > 1)
Rows Removed by Filter: 1176103
Buffers: shared hit=160 read=8156, temp read=14645 written=19235
CTE foo
-> Group (cost=179613.72..186100.05 rows=944366 width=20) (actual time=4811.449..6027.355 rows=1176103 loops=1)
Output: weird_grouping.name
Group Key: weird_grouping.name
Buffers: shared hit=160 read=8156, temp read=7800 written=7830
-> Sort (cost=179613.72..182856.89 rows=1297265 width=20) (actual time=4811.447..5884.667 rows=1297265
loops=1)
Output: weird_grouping.name
Sort Key: weird_grouping.name
Sort Method: external merge Disk: 39048kB
Buffers: shared hit=160 read=8156, temp read=7800 written=7830
-> Seq Scan on name_stuff.weird_grouping (cost=0.00..21288.65 rows=1297265 width=20) (actual time=0.058..117.833 rows=1297265 loops=1)
Output: weird_grouping.name
Buffers: shared hit=160 read=8156
-> Sort (cost=551594.86..553955.78 rows=944366 width=516) (actual time=6915.562..7418.978 rows=1176103 loops=1)
Output: foo.name
Sort Key: foo.name
Sort Method: external merge Disk: 36368kB
Buffers: shared hit=160 read=8156, temp read=14645 written=19235
-> CTE Scan on foo (cost=0.00..18887.32 rows=944366 width=516) (actual time=4811.451..6243.160 rows=1176103
loops=1)
Output: foo.name
Buffers: shared hit=160 read=8156, temp read=7800 written=12363
Settings: enable_hashagg = 'off', search_path = 'name_stuff'
Planning Time: 0.064 ms
Execution Time: 10175.478 ms
(30 rows)
Time: 10175.906 ms (00:10.176)
testing=>
But now here's another weird bit:
testing=> select count(*), count(distinct name) from weird_grouping;
count | count
-----------+-----------
1,297,265 | 1,176,103
(1 row)
Time: 6866.369 ms (00:06.866)
testing=> explain (analyze, verbose, costs, settings, buffers, timing, summary) select count(*), count(distinct name) from weird_grouping;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=27774.98..27774.99 rows=1 width=16) (actual time=6642.856..6642.856 rows=1 loops=1)
Output: count(*), count(DISTINCT name)
Buffers: shared hit=928 read=7388, temp read=5484 written=5506
-> Seq Scan on name_stuff.weird_grouping (cost=0.00..21288.65 rows=1297265 width=20) (actual time=0.072..110.798 rows=1297265 loops=1)
Output: name
Buffers: shared hit=928 read=7388
Settings: search_path = 'name_stuff'
Planning Time: 0.030 ms
Execution Time: 6642.875 ms
(9 rows)
Time: 6643.181 ms (00:06.643)
testing=> select count(*), count(distinct name collate "C") from weird_grouping;
count | count
-----------+-----------
1,297,265 | 1,176,101
(1 row)
Time: 1655.202 ms (00:01.655)
testing=> explain (analyze, verbose, costs, settings, buffers, timing, summary) select count(*), count(distinct name collate "C") from weird_grouping;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=27774.98..27774.99 rows=1 width=16) (actual time=1788.276..1788.277 rows=1 loops=1)
Output: count(*), count(DISTINCT (name)::character varying(254))
Buffers: shared hit=992 read=7324, temp read=5484 written=5506
-> Seq Scan on name_stuff.weird_grouping (cost=0.00..21288.65 rows=1297265 width=20) (actual time=0.059..112.815 rows=1297265 loops=1)
Output: name
Buffers: shared hit=992 read=7324
Settings: search_path = 'name_stuff'
Planning Time: 0.030 ms
Execution Time: 1788.295 ms
(9 rows)
Time: 1788.596 ms (00:01.789)
testing=> show lc_collate;
lc_collate
------------
en-US
(1 row)
Time: 0.122 ms
testing=> show server_encoding;
server_encoding
-----------------
UTF8
(1 row)
Time: 0.082 ms
testing=> select count(*), count(distinct name collate "en-US") from weird_grouping;
ERROR: collation "en-US" for encoding "UTF8" does not exist
LINE 1: select count(*), count(distinct name collate "en-US") from w...
^
Time: 5.759 ms
testing=>
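The diverging count(distinct ...) results under the database locale versus collate "C" make sense once you note that distinctness depends on the comparison rule in force. A toy Python sketch of that principle (not the actual strcoll bug), with case folding standing in for a locale-aware collation and raw byte comparison standing in for "C":

```python
# Two "collations" that disagree on which strings are equal will also
# disagree on the number of distinct values, just as count(distinct name)
# and count(distinct name collate "C") did above.
names = ["DCT", "dct", "DELTA"]

distinct_folded = len({n.casefold() for n in names})  # locale-like, case-insensitive
distinct_bytewise = len(set(names))                   # "C"-like, byte-by-byte

assert distinct_folded == 2    # "DCT" and "dct" compare equal
assert distinct_bytewise == 3  # all three are distinct byte strings
```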