Обсуждение: Multi-level hierarchy with parallel append can lead to an extra subplan.
Multi-level hierarchy with parallel append can lead to an extra subplan.
От
Rajkumar Raghuwanshi
Дата:
Hi,
I am getting extra subplan when using parallel append with multi-level hierarchy, leading to data corruption.
Please see below test case.
-- set below parameters to encourage use of parallel plans
SET parallel_setup_cost=0;
SET parallel_tuple_cost=0;
SET min_parallel_table_scan_size=0;
SET max_parallel_workers_per_gather=4;
--create below data set
CREATE TABLE RM38941_inherit (c1 INTEGER PRIMARY KEY,c2 INTEGER,c3 CHAR(10));
INSERT INTO RM38941_inherit SELECT i, i % 125, to_char(i % 4, 'FM0000') FROM generate_series(0, 499,2) i;
CREATE TABLE RM38941_inherit_t1 () INHERITS (RM38941_inherit);
INSERT INTO RM38941_inherit_t1 SELECT i, i % 125, to_char(i % 4, 'FM0000') FROM generate_series(0, 499,3) i;
CREATE TABLE RM38941_inherit_t2 () INHERITS (RM38941_inherit);
INSERT INTO RM38941_inherit_t2 SELECT i, i % 125, to_char(i % 4, 'FM0000') FROM generate_series(0, 499,5) i;
CREATE TABLE RM38941_union_t1 (c1 INTEGER PRIMARY KEY,c2 INTEGER,c3 CHAR(10));
INSERT INTO RM38941_union_t1 SELECT i, i % 125, to_char(i % 4, 'FM0000') FROM generate_series(0, 499,2) i;
CREATE TABLE RM38941_union_t2 (c1 INTEGER PRIMARY KEY,c2 INTEGER,c3 CHAR(10));
INSERT INTO RM38941_union_t2 SELECT i, i % 125, to_char(i % 4, 'FM0000') FROM generate_series(0, 499,4) i;
ALTER TABLE RM38941_union_t1 SET (parallel_workers = 0);
ALTER TABLE RM38941_inherit_t1 SET (parallel_workers = 0);
--with parallel_append
SET enable_parallel_append=on;
postgres=# EXPLAIN (COSTS OFF)
postgres-# SELECT AVG(c1),SUM(c2) FROM (SELECT c1,c2 FROM RM38941_union_t2 UNION ALL SELECT c1,c2 FROM RM38941_inherit UNION ALL SELECT c1,c2 FROM RM38941_union_t1)UA;
QUERY PLAN
-----------------------------------------------------------------------
Finalize Aggregate
-> Gather
Workers Planned: 3
-> Partial Aggregate
-> Parallel Append
-> Seq Scan on rm38941_inherit_t1
-> Seq Scan on rm38941_union_t1
-> Parallel Seq Scan on rm38941_union_t2
-> Parallel Seq Scan on rm38941_inherit
-> Parallel Seq Scan on rm38941_inherit_t2
-> Parallel Append
-> Seq Scan on rm38941_inherit_t1
-> Parallel Seq Scan on rm38941_inherit
-> Parallel Seq Scan on rm38941_inherit_t2
(14 rows)
postgres=# SELECT AVG(c1),SUM(c2) FROM (SELECT c1,c2 FROM RM38941_union_t2 UNION ALL SELECT c1,c2 FROM RM38941_inherit UNION ALL SELECT c1,c2 FROM RM38941_union_t1)UA;
avg | sum
----------------------+-------
248.6983676366217175 | 86916
(1 row)
--without parallel_append
SET enable_parallel_append=off;
postgres=# EXPLAIN (COSTS OFF)
postgres-# SELECT AVG(c1),SUM(c2) FROM (SELECT c1,c2 FROM RM38941_union_t2 UNION ALL SELECT c1,c2 FROM RM38941_inherit UNION ALL SELECT c1,c2 FROM RM38941_union_t1)UA;
QUERY PLAN
--------------------------------------------
Aggregate
-> Append
-> Seq Scan on rm38941_union_t2
-> Seq Scan on rm38941_inherit
-> Seq Scan on rm38941_inherit_t1
-> Seq Scan on rm38941_inherit_t2
-> Seq Scan on rm38941_union_t1
(7 rows)
postgres=# SELECT AVG(c1),SUM(c2) FROM (SELECT c1,c2 FROM RM38941_union_t2 UNION ALL SELECT c1,c2 FROM RM38941_inherit UNION ALL SELECT c1,c2 FROM RM38941_union_t1)UA;
avg | sum
----------------------+-------
248.6917040358744395 | 55083
(1 row)
I am getting extra subplan when using parallel append with multi-level hierarchy, leading to data corruption.
Please see below test case.
-- set below parameters to encourage use of parallel plans
SET parallel_setup_cost=0;
SET parallel_tuple_cost=0;
SET min_parallel_table_scan_size=0;
SET max_parallel_workers_per_gather=4;
--create below data set
CREATE TABLE RM38941_inherit (c1 INTEGER PRIMARY KEY,c2 INTEGER,c3 CHAR(10));
INSERT INTO RM38941_inherit SELECT i, i % 125, to_char(i % 4, 'FM0000') FROM generate_series(0, 499,2) i;
CREATE TABLE RM38941_inherit_t1 () INHERITS (RM38941_inherit);
INSERT INTO RM38941_inherit_t1 SELECT i, i % 125, to_char(i % 4, 'FM0000') FROM generate_series(0, 499,3) i;
CREATE TABLE RM38941_inherit_t2 () INHERITS (RM38941_inherit);
INSERT INTO RM38941_inherit_t2 SELECT i, i % 125, to_char(i % 4, 'FM0000') FROM generate_series(0, 499,5) i;
CREATE TABLE RM38941_union_t1 (c1 INTEGER PRIMARY KEY,c2 INTEGER,c3 CHAR(10));
INSERT INTO RM38941_union_t1 SELECT i, i % 125, to_char(i % 4, 'FM0000') FROM generate_series(0, 499,2) i;
CREATE TABLE RM38941_union_t2 (c1 INTEGER PRIMARY KEY,c2 INTEGER,c3 CHAR(10));
INSERT INTO RM38941_union_t2 SELECT i, i % 125, to_char(i % 4, 'FM0000') FROM generate_series(0, 499,4) i;
ALTER TABLE RM38941_union_t1 SET (parallel_workers = 0);
ALTER TABLE RM38941_inherit_t1 SET (parallel_workers = 0);
--with parallel_append
SET enable_parallel_append=on;
postgres=# EXPLAIN (COSTS OFF)
postgres-# SELECT AVG(c1),SUM(c2) FROM (SELECT c1,c2 FROM RM38941_union_t2 UNION ALL SELECT c1,c2 FROM RM38941_inherit UNION ALL SELECT c1,c2 FROM RM38941_union_t1)UA;
QUERY PLAN
-----------------------------------------------------------------------
Finalize Aggregate
-> Gather
Workers Planned: 3
-> Partial Aggregate
-> Parallel Append
-> Seq Scan on rm38941_inherit_t1
-> Seq Scan on rm38941_union_t1
-> Parallel Seq Scan on rm38941_union_t2
-> Parallel Seq Scan on rm38941_inherit
-> Parallel Seq Scan on rm38941_inherit_t2
-> Parallel Append
-> Seq Scan on rm38941_inherit_t1
-> Parallel Seq Scan on rm38941_inherit
-> Parallel Seq Scan on rm38941_inherit_t2
(14 rows)
postgres=# SELECT AVG(c1),SUM(c2) FROM (SELECT c1,c2 FROM RM38941_union_t2 UNION ALL SELECT c1,c2 FROM RM38941_inherit UNION ALL SELECT c1,c2 FROM RM38941_union_t1)UA;
avg | sum
----------------------+-------
248.6983676366217175 | 86916
(1 row)
--without parallel_append
SET enable_parallel_append=off;
postgres=# EXPLAIN (COSTS OFF)
postgres-# SELECT AVG(c1),SUM(c2) FROM (SELECT c1,c2 FROM RM38941_union_t2 UNION ALL SELECT c1,c2 FROM RM38941_inherit UNION ALL SELECT c1,c2 FROM RM38941_union_t1)UA;
QUERY PLAN
--------------------------------------------
Aggregate
-> Append
-> Seq Scan on rm38941_union_t2
-> Seq Scan on rm38941_inherit
-> Seq Scan on rm38941_inherit_t1
-> Seq Scan on rm38941_inherit_t2
-> Seq Scan on rm38941_union_t1
(7 rows)
postgres=# SELECT AVG(c1),SUM(c2) FROM (SELECT c1,c2 FROM RM38941_union_t2 UNION ALL SELECT c1,c2 FROM RM38941_inherit UNION ALL SELECT c1,c2 FROM RM38941_union_t1)UA;
avg | sum
----------------------+-------
248.6917040358744395 | 55083
(1 row)
Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation
Re: Multi-level hierarchy with parallel append can lead to an extra subplan.
От
Amit Khandekar
Дата:
Thanks Rajkumar for catching this.
There is indeed is an extra child subplan showing up in the Parallel Append :
-> Parallel Append
-> Seq Scan on rm38941_inherit_t1
-> Parallel Seq Scan on rm38941_inherit
-> Parallel Seq Scan on rm38941_inherit_t2
This subplan should not be there. It looks like that's because when
we add up subpaths while preparing Parallel Append path containing mix of
partial and non-partial child paths, in accumulate_append_subpath(),
the subpath being added is an Append path because it's a multi-level
partition. So in the condition "else if (special_subpaths != NULL)",
both *subpaths and *special_paths get updated. But then it is not
returning from there when it should. Instead, the control passes to
the end of function where *subpaths is again populated :
*subpaths = lappend(*subpaths, path);
which leads to an extra child with partial subpath getting added.
I think the fix should be that we should return from the above if
condition block, just like we return from the other if blocks. I
quickly tried to do this and it seems to work :
@@ -1926,6 +1926,7 @@ accumulate_append_subpath(Path *path, List
**subpaths, List **special_subpaths)
apath->first_partial_path);
*special_subpaths = list_concat(*special_subpaths,
new_special_subpaths);
+ return;
}
}
else if (IsA(path, MergeAppendPath))
But yet to test it sufficiently.
Actually I had not included UNION ALL testcase since the code flow for
Parallel Append is exactly the same for partitions and UNION ALL case.
But with the patch for this issue, I will also include a UNION ALL
testcase in the regression test. But, I think there must also be some
way to reproduce this same issue with an aggregate query with
multi-level partitioned tables without having to use UNION ALL.
Thanks
-Amit
On 9 January 2018 at 15:39, Rajkumar Raghuwanshi
<rajkumar.raghuwanshi@enterprisedb.com> wrote:
> Hi,
>
> I am getting extra subplan when using parallel append with multi-level
> hierarchy, leading to data corruption.
> Please see below test case.
>
> -- set below parameters to encourage use of parallel plans
> SET parallel_setup_cost=0;
> SET parallel_tuple_cost=0;
> SET min_parallel_table_scan_size=0;
> SET max_parallel_workers_per_gather=4;
>
> --create below data set
> CREATE TABLE RM38941_inherit (c1 INTEGER PRIMARY KEY,c2 INTEGER,c3
> CHAR(10));
> INSERT INTO RM38941_inherit SELECT i, i % 125, to_char(i % 4, 'FM0000') FROM
> generate_series(0, 499,2) i;
>
> CREATE TABLE RM38941_inherit_t1 () INHERITS (RM38941_inherit);
> INSERT INTO RM38941_inherit_t1 SELECT i, i % 125, to_char(i % 4, 'FM0000')
> FROM generate_series(0, 499,3) i;
>
> CREATE TABLE RM38941_inherit_t2 () INHERITS (RM38941_inherit);
> INSERT INTO RM38941_inherit_t2 SELECT i, i % 125, to_char(i % 4, 'FM0000')
> FROM generate_series(0, 499,5) i;
>
> CREATE TABLE RM38941_union_t1 (c1 INTEGER PRIMARY KEY,c2 INTEGER,c3
> CHAR(10));
> INSERT INTO RM38941_union_t1 SELECT i, i % 125, to_char(i % 4, 'FM0000')
> FROM generate_series(0, 499,2) i;
>
> CREATE TABLE RM38941_union_t2 (c1 INTEGER PRIMARY KEY,c2 INTEGER,c3
> CHAR(10));
> INSERT INTO RM38941_union_t2 SELECT i, i % 125, to_char(i % 4, 'FM0000')
> FROM generate_series(0, 499,4) i;
>
> ALTER TABLE RM38941_union_t1 SET (parallel_workers = 0);
> ALTER TABLE RM38941_inherit_t1 SET (parallel_workers = 0);
>
> --with parallel_append
> SET enable_parallel_append=on;
> postgres=# EXPLAIN (COSTS OFF)
> postgres-# SELECT AVG(c1),SUM(c2) FROM (SELECT c1,c2 FROM RM38941_union_t2
> UNION ALL SELECT c1,c2 FROM RM38941_inherit UNION ALL SELECT c1,c2 FROM
> RM38941_union_t1)UA;
> QUERY PLAN
> -----------------------------------------------------------------------
> Finalize Aggregate
> -> Gather
> Workers Planned: 3
> -> Partial Aggregate
> -> Parallel Append
> -> Seq Scan on rm38941_inherit_t1
> -> Seq Scan on rm38941_union_t1
> -> Parallel Seq Scan on rm38941_union_t2
> -> Parallel Seq Scan on rm38941_inherit
> -> Parallel Seq Scan on rm38941_inherit_t2
> -> Parallel Append
> -> Seq Scan on rm38941_inherit_t1
> -> Parallel Seq Scan on rm38941_inherit
> -> Parallel Seq Scan on rm38941_inherit_t2
> (14 rows)
>
> postgres=# SELECT AVG(c1),SUM(c2) FROM (SELECT c1,c2 FROM RM38941_union_t2
> UNION ALL SELECT c1,c2 FROM RM38941_inherit UNION ALL SELECT c1,c2 FROM
> RM38941_union_t1)UA;
> avg | sum
> ----------------------+-------
> 248.6983676366217175 | 86916
> (1 row)
>
>
> --without parallel_append
> SET enable_parallel_append=off;
> postgres=# EXPLAIN (COSTS OFF)
> postgres-# SELECT AVG(c1),SUM(c2) FROM (SELECT c1,c2 FROM RM38941_union_t2
> UNION ALL SELECT c1,c2 FROM RM38941_inherit UNION ALL SELECT c1,c2 FROM
> RM38941_union_t1)UA;
> QUERY PLAN
> --------------------------------------------
> Aggregate
> -> Append
> -> Seq Scan on rm38941_union_t2
> -> Seq Scan on rm38941_inherit
> -> Seq Scan on rm38941_inherit_t1
> -> Seq Scan on rm38941_inherit_t2
> -> Seq Scan on rm38941_union_t1
> (7 rows)
>
> postgres=# SELECT AVG(c1),SUM(c2) FROM (SELECT c1,c2 FROM RM38941_union_t2
> UNION ALL SELECT c1,c2 FROM RM38941_inherit UNION ALL SELECT c1,c2 FROM
> RM38941_union_t1)UA;
> avg | sum
> ----------------------+-------
> 248.6917040358744395 | 55083
> (1 row)
>
> Thanks & Regards,
> Rajkumar Raghuwanshi
> QMG, EnterpriseDB Corporation
--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company
On Tue, Jan 9, 2018 at 5:41 AM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote: > This subplan should not be there. It looks like that's because when > we add up subpaths while preparing Parallel Append path containing mix of > partial and non-partial child paths, in accumulate_append_subpath(), > the subpath being added is an Append path because it's a multi-level > partition. So in the condition "else if (special_subpaths != NULL)", > both *subpaths and *special_paths get updated. But then it is not > returning from there when it should. Instead, the control passes to > the end of function where *subpaths is again populated : > *subpaths = lappend(*subpaths, path); > which leads to an extra child with partial subpath getting added. Boy, what a dumb mistake on my part. Thanks for finding the bug, Rajkumar. Thanks for analyzing it, Amit. I have committed the fix. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company