Incorrect explain output for updates/delete operations with returning-list on partitioned tables

Поиск
Список
Период
Сортировка
От SAIKIRAN AVULA
Тема Incorrect explain output for updates/delete operations with returning-list on partitioned tables
Дата
Msg-id CABg8Gm41MMwVpBmJ-jDDtfrT6QN+OJjToboH7jCETL4gAnV1=Q@mail.gmail.com
обсуждение исходный текст
Ответы Re: Incorrect explain output for updates/delete operations with returning-list on partitioned tables
Список pgsql-hackers
Hi PostgreSQL Community,

I have been working on partitioned tables recently, and I have noticed something that doesn't seem correct with the EXPLAIN output of an update/delete query with a returning list.

For example, consider two partitioned tables, "t1" and "t2," with partitions "t11," "t12," and "t21," "t22," respectively. The table definitions are as follows:

```sql
postgres=# \d+ t1
                                     Partitioned table "public.t1"
 Column |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 a      | integer |           |          |         | plain   |             |              |
 b      | integer |           |          |         | plain   |             |              |
 c      | integer |           |          |         | plain   |             |              |
Partition key: RANGE (a)
Partitions: t11 FOR VALUES FROM (0) TO (1000),
            t12 FOR VALUES FROM (1000) TO (10000)

postgres=# \d+ t2
                                     Partitioned table "public.t2"
 Column |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 a      | integer |           |          |         | plain   |             |              |
 b      | integer |           |          |         | plain   |             |              |
 c      | integer |           |          |         | plain   |             |              |
Partition key: RANGE (a)
Partitions: t21 FOR VALUES FROM (0) TO (1000),
            t22 FOR VALUES FROM (1000) TO (10000)
```

The EXPLAIN output for an update query with a returning list doesn't seem correct to me. Here are the examples (the part that doesn't seem right is highlighted in bold):

Query1:
```
postgres=# explain verbose update t1 set b = 10 from t2 where t1.a = t2.a  returning t1.c;
                                        QUERY PLAN                                        
-------------------------------------------------------------------------------------------
 Update on public.t1  (cost=0.00..125187.88 rows=41616 width=14)
   Output: t1_1.c     -----> something not right??
   Update on public.t11 t1_1
   Update on public.t12 t1_2
   ->  Append  (cost=0.00..125187.88 rows=41616 width=14)
         ->  Nested Loop  (cost=0.00..62489.90 rows=20808 width=14)
               Output: 10, t1_1.tableoid, t1_1.ctid
               Join Filter: (t1_1.a = t2_1.a)
               ->  Seq Scan on public.t11 t1_1  (cost=0.00..30.40 rows=2040 width=14)
                     Output: t1_1.a, t1_1.tableoid, t1_1.ctid
               ->  Materialize  (cost=0.00..40.60 rows=2040 width=4)
                     Output: t2_1.a
                     ->  Seq Scan on public.t21 t2_1  (cost=0.00..30.40 rows=2040 width=4)
                           Output: t2_1.a
         ->  Nested Loop  (cost=0.00..62489.90 rows=20808 width=14)
               Output: 10, t1_2.tableoid, t1_2.ctid
               Join Filter: (t1_2.a = t2_2.a)
               ->  Seq Scan on public.t12 t1_2  (cost=0.00..30.40 rows=2040 width=14)
                     Output: t1_2.a, t1_2.tableoid, t1_2.ctid
               ->  Materialize  (cost=0.00..40.60 rows=2040 width=4)
                     Output: t2_2.a
                     ->  Seq Scan on public.t22 t2_2  (cost=0.00..30.40 rows=2040 width=4)
                           Output: t2_2.a
(23 rows)
```

Query2:
```
postgres=# explain verbose update t1 set b = 10 from t2 where t1.a = t2.a  returning t2.c;
                                        QUERY PLAN                                        
-------------------------------------------------------------------------------------------
 Update on public.t1  (cost=0.00..125187.88 rows=41616 width=18)
   Output: t2.c
   Update on public.t11 t1_1
   Update on public.t12 t1_2
   ->  Append  (cost=0.00..125187.88 rows=41616 width=18)
         ->  Nested Loop  (cost=0.00..62489.90 rows=20808 width=18)
               Output: 10, t2_1.c, t1_1.tableoid, t1_1.ctid
               Join Filter: (t1_1.a = t2_1.a)
               ->  Seq Scan on public.t11 t1_1  (cost=0.00..30.40 rows=2040 width=14)
                     Output: t1_1.a, t1_1.tableoid, t1_1.ctid
               ->  Materialize  (cost=0.00..40.60 rows=2040 width=8)
                     Output: t2_1.c, t2_1.a
                     ->  Seq Scan on public.t21 t2_1  (cost=0.00..30.40 rows=2040 width=8)
                           Output: t2_1.c, t2_1.a
         ->  Nested Loop  (cost=0.00..62489.90 rows=20808 width=18)
               Output: 10, t2_2.c, t1_2.tableoid, t1_2.ctid
               Join Filter: (t1_2.a = t2_2.a)
               ->  Seq Scan on public.t12 t1_2  (cost=0.00..30.40 rows=2040 width=14)
                     Output: t1_2.a, t1_2.tableoid, t1_2.ctid
               ->  Materialize  (cost=0.00..40.60 rows=2040 width=8)
                     Output: t2_2.c, t2_2.a
                     ->  Seq Scan on public.t22 t2_2  (cost=0.00..30.40 rows=2040 width=8)
                           Output: t2_2.c, t2_2.a
(23 rows)
```

After further investigation into the code, I noticed following:

1. In the 'grouping_planner()' function, while generating paths for the final relation (https://github.com/postgres/postgres/blob/master/src/backend/optimizer/plan/planner.c#L1857), we only take care of adjusting the append_rel_attributes in returningList for resultRelation. Shouldn't we do that for other relations as well in query? Example for Query2 above, adjust_appendrel_attrs_multilevel is a no-op.
2. After plan creation (https://github.com/postgres/postgres/blob/master/src/backend/optimizer/plan/createplan.c#L351), shouldn't we perform tlist labeling for the `returningList` as well? I suspect this is resulting in incorrect output in Query1.

I suspect that similar issues might also be present for `withCheckOptions`, `mergeActionList`, and `mergeJoinCondition`.

I would appreciate it if the community could provide insights or clarifications regarding this observation.

Thank you for your time and consideration.


Regards
Saikiran Avula,
SDE, Amazon Web Services.


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alexander Korotkov
Дата:
Сообщение: Re: Removing unneeded self joins
Следующее
От: Robert Haas
Дата:
Сообщение: Re: On disable_cost