Обсуждение: BUG #16293: postgres segfaults and returns SQLSTATE 08006

Поиск
Список
Период
Сортировка

BUG #16293: postgres segfaults and returns SQLSTATE 08006

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      16293
Logged by:          Daniel WM
Email address:      dwilches@gmail.com
PostgreSQL version: 11.7
Operating system:   Ubuntu 16.04
Description:

I was getting this error each time I ran a series of autoamted tests in a CI
server:
```
2020-03-06 23:32:57,051 WARN  main c.z.h.p.ProxyConnection - HikariPool-2 -
Connection org.postgresql.jdbc.PgConnection@42e3ede4 marked as broken
because of SQLSTATE(08006), ErrorCode(0) {}
 org.postgresql.util.PSQLException: An I/O error occurred while sending to
the backend.
    at
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:335)
    at
org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:441)
    at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:365)
    at
org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:143)
    at
org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:106)
```

I looked at `dmesg` and I saw this:

    [1383242.997083] postgres[7998]: segfault at 100000048 ip
000055c587913e4b sp 00007fffa492e6f0 error 4 in
postgres[55c587424000+72d000]

Using `gdb` I got this backtrace:
```
Core was generated by `postgres: REDACTED REDACTED 127.0.0.1(49990) INSERT
                       '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000055c587913e4b in pfree ()
(gdb) bt
#0  0x000055c587913e4b in pfree ()
#1  0x000055c587687475 in ExecSetSlotDescriptor ()
#2  0x000055c58767eb61 in ExecConstraints ()
#3  0x000055c5876a0efc in ?? ()
#4  0x000055c5876a2085 in ?? ()
#5  0x000055c58767cd1b in standard_ExecutorRun ()
#6  0x000055c5877d22e5 in ?? ()
#7  0x000055c5877d2538 in ?? ()
#8  0x000055c5877d2855 in ?? ()
#9  0x000055c5877d3427 in PortalRun ()
#10 0x000055c5877cfeec in PostgresMain ()
#11 0x000055c5874ddd37 in ?? ()
#12 0x000055c58775a882 in PostmasterMain ()
#13 0x000055c5874df0e5 in main ()
```

My full version of Postgres is: `PostgreSQL 11.7 (Ubuntu 11.7-1.pgdg16.04+1)
on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12)
5.4.0 20160609, 64-bit`

I finally found the issue was that one of the tests was trying to insert a
row on a partitioned table, but there was no partition where to put that
row, so it had to go to the default partition. But, the default partition
had a constraint that disallowed that row, so the row couldn't be inserted
and Postgres ended with a segfault.

I enabled query logging and managed to find the offending "insert":

    insert into "myschema"."mytable" ("custcode", "custcar", "custdob",
"closed") values ('a33113f2-930c-47de-95a6-b9e07650468a', 'hellow world',
'2020-02-02 01:00:00+00:00', 'f')

That is a partitioned table on the "custdob" column, with these
partitions:

```
\d+ mytable
                                                           Table
"myschema.mytable"
   Column   |           Type           | Collation | Nullable |
  Default                 | Storage  | Stats target | Description 

------------+--------------------------+-----------+----------+----------------------------------------+----------+--------------+-------------
 id         | bigint                   |           | not null |
nextval('mytable_id_seq'::regclass)    | plain    |              | 
 custcode   | uuid                     |           | not null |
                          | plain    |              | 
 custcar    | character varying        |           | not null |
                          | extended |              | 
 custdob    | timestamp with time zone |           | not null |
                          | plain    |              | 
 closed     | boolean                  |           | not null | false
                          | plain    |              | 
Partition key: RANGE (custdob)
Partitions: mytable_201902_partition FOR VALUES FROM ('2019-02-01
00:00:00+00') TO ('2019-03-01 00:00:00+00'),
            mytable_201903_partition FOR VALUES FROM ('2019-03-01
00:00:00+00') TO ('2019-04-01 00:00:00+00'),
            mytable_201908_partition FOR VALUES FROM ('2019-08-02
00:00:00+00') TO ('2019-09-01 00:00:00+00'),
            mytable_202003_partition FOR VALUES FROM ('2020-03-01
00:00:00+00') TO ('2020-04-01 00:00:00+00'),
            mytable_202004_partition FOR VALUES FROM ('2020-04-01
00:00:00+00') TO ('2020-05-01 00:00:00+00'),
            mytable_000000_partition DEFAULT
```

Notice the INSERT wants to insert in February's partition, but that
partition is missing in my CI server, so it should insert the row in the
DEFAULT partition. The issue is, the DEFAULT partition has this
constraint:

```
"mytable_partition_check" CHECK (custdob < '2019-08-02
00:00:00+00'::timestamp with time zone)
```

So Postgres seems to be getting into a bug because it can't insert a record
for February while that constraint is in there. If I drop this constraint
and re-issue the offending INSERT, it works this time.

I first reported this issue in SO:
https://stackoverflow.com/questions/60573071/postgres-segfault-and-returns-sqlstate-08006


Re: BUG #16293: postgres segfaults and returns SQLSTATE 08006

От
Tom Lane
Дата:
PG Bug reporting form <noreply@postgresql.org> writes:
> I finally found the issue was that one of the tests was trying to insert a
> row on a partitioned table, but there was no partition where to put that
> row, so it had to go to the default partition. But, the default partition
> had a constraint that disallowed that row, so the row couldn't be inserted
> and Postgres ended with a segfault.

Hm, works for me:

regression=# create table tt (f1 int, f2 text) partition by list(f1);
CREATE TABLE
regression=# create table t1 partition of tt for values in (1);
CREATE TABLE
regression=# create table tother partition of tt default;
CREATE TABLE
regression=# alter table tother add check (length(f2) > 1);
ALTER TABLE
regression=# insert into tt values (1, 'f');
INSERT 0 1
regression=# insert into tt values (3, 'f');
ERROR:  new row for relation "tother" violates check constraint "tother_f2_check"
DETAIL:  Failing row contains (3, f).
regression=#

Admittedly this is v11 branch tip not exactly 11.7, but I don't see
anything related-looking in the commit log.  So I think there is some
contributing factor you didn't mention.  Could you provide a
self-contained reproduction script?

            regards, tom lane



Re: BUG #16293: postgres segfaults and returns SQLSTATE 08006

От
Daniel WM
Дата:
I tried your steps and it didn't crash here either. I'll try to find a reproducible set of steps.

I isolated the test that was failing and it didn't fail by itself, so there is some interaction with my other tests. I tried running all tests in my laptop and yeap, it segfaults here too. So we can add "Ubuntu 18.04.4" with this Postgres:

```
# select version();
                                                              version                                                              
-----------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 11.7 (Ubuntu 11.7-2.pgdg18.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0, 64-bit
```

--
Daniel Wilches



On Tue, Mar 10, 2020 at 3:19 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
> I finally found the issue was that one of the tests was trying to insert a
> row on a partitioned table, but there was no partition where to put that
> row, so it had to go to the default partition. But, the default partition
> had a constraint that disallowed that row, so the row couldn't be inserted
> and Postgres ended with a segfault.

Hm, works for me:

regression=# create table tt (f1 int, f2 text) partition by list(f1);
CREATE TABLE
regression=# create table t1 partition of tt for values in (1);
CREATE TABLE
regression=# create table tother partition of tt default;
CREATE TABLE
regression=# alter table tother add check (length(f2) > 1);
ALTER TABLE
regression=# insert into tt values (1, 'f');
INSERT 0 1
regression=# insert into tt values (3, 'f');
ERROR:  new row for relation "tother" violates check constraint "tother_f2_check"
DETAIL:  Failing row contains (3, f).
regression=#

Admittedly this is v11 branch tip not exactly 11.7, but I don't see
anything related-looking in the commit log.  So I think there is some
contributing factor you didn't mention.  Could you provide a
self-contained reproduction script?

                        regards, tom lane

Re: BUG #16293: postgres segfaults and returns SQLSTATE 08006

От
Daniel WM
Дата:
Hello,

I have finally isolated the issue and I have a set of steps that reliably cause the segfault:

CREATE TABLE parent_table (
    custdob timestamp with time zone not null,
    closed BOOLEAN NOT NULL DEFAULT FALSE
) PARTITION BY RANGE (custdob);

CREATE TABLE default_partition (
    custdob timestamp with time zone not null,
    CONSTRAINT dummy_check CHECK (custdob  < '2019-08-02T00:00Z')
);

-- This is needed for the crash, if I add this column when creating the table "default_partition" then I don't get the crash, but when it's added with an "ALTER TABLE" then I get the crash.
ALTER TABLE default_partition ADD COLUMN closed BOOLEAN NOT NULL DEFAULT FALSE;

ALTER TABLE parent_table ATTACH PARTITION default_partition DEFAULT;

INSERT INTO parent_table VALUES ('2020-02-02 01:00:00+00:00', 'f');


--
Daniel Wilches

Re: BUG #16293: postgres segfaults and returns SQLSTATE 08006

От
Tom Lane
Дата:
Daniel WM <dwilches@gmail.com> writes:
> I have finally isolated the issue and I have a set of steps that reliably
> cause the segfault:

Interesting.  I do *not* see a crash in v12 or HEAD, but v11 gives
me an assertion failure:

TRAP: FailedAssertion("!(!slot->tts_fixedTupleDescriptor)", File: "execTuples.c", Line: 284)
2020-03-14 17:16:06.626 EDT [1101] LOG:  server process (PID 20822) was terminated by signal 6: Aborted
2020-03-14 17:16:06.626 EDT [1101] DETAIL:  Failed process was running: INSERT INTO parent_table VALUES ('2020-02-02
01:00:00+00:00','f'); 

here:

#3  0x000000000063ef45 in ExecSetSlotDescriptor (slot=0x1cf4060,
    tupdesc=0x7f903a2dd0f8) at execTuples.c:284
#4  0x00000000006360eb in ExecConstraints (resultRelInfo=0x1cf47b8,
    slot=0x1cf4060, estate=0x1cf3778) at execMain.c:2062
#5  0x000000000065a44d in ExecInsert (mtstate=0x1cf3ae0, slot=0x1cf4060,
    planSlot=0x1cf4060, estate=0x1cf3778, canSetTag=true)
    at nodeModifyTable.c:398
#6  0x000000000065b80b in ExecModifyTable (pstate=<value optimized out>)
    at nodeModifyTable.c:2159
#7  0x0000000000636c27 in ExecProcNode (queryDesc=0x1ce51e8,
    direction=<value optimized out>, count=0, execute_once=224)
    at ../../../src/include/executor/executor.h:247
#8  ExecutePlan (queryDesc=0x1ce51e8, direction=<value optimized out>,
    count=0, execute_once=224) at execMain.c:1723
#9  standard_ExecutorRun (queryDesc=0x1ce51e8,
    direction=<value optimized out>, count=0, execute_once=224)

So this looks like something to do with Andres' tupletableslot work.

            regards, tom lane



Re: BUG #16293: postgres segfaults and returns SQLSTATE 08006

От
Andres Freund
Дата:
Hi,

On 2020-03-14 17:21:04 -0400, Tom Lane wrote:
> Daniel WM <dwilches@gmail.com> writes:
> > I have finally isolated the issue and I have a set of steps that reliably
> > cause the segfault:
> 
> Interesting.  I do *not* see a crash in v12 or HEAD, but v11 gives
> me an assertion failure:
> 
> TRAP: FailedAssertion("!(!slot->tts_fixedTupleDescriptor)", File: "execTuples.c", Line: 284)
> 2020-03-14 17:16:06.626 EDT [1101] LOG:  server process (PID 20822) was terminated by signal 6: Aborted
> 2020-03-14 17:16:06.626 EDT [1101] DETAIL:  Failed process was running: INSERT INTO parent_table VALUES ('2020-02-02
01:00:00+00:00','f');
 
> 
> here:
> 
> #3  0x000000000063ef45 in ExecSetSlotDescriptor (slot=0x1cf4060, 
>     tupdesc=0x7f903a2dd0f8) at execTuples.c:284
> #4  0x00000000006360eb in ExecConstraints (resultRelInfo=0x1cf47b8, 
>     slot=0x1cf4060, estate=0x1cf3778) at execMain.c:2062
> #5  0x000000000065a44d in ExecInsert (mtstate=0x1cf3ae0, slot=0x1cf4060, 
>     planSlot=0x1cf4060, estate=0x1cf3778, canSetTag=true)
>     at nodeModifyTable.c:398
> #6  0x000000000065b80b in ExecModifyTable (pstate=<value optimized out>)
>     at nodeModifyTable.c:2159
> #7  0x0000000000636c27 in ExecProcNode (queryDesc=0x1ce51e8, 
>     direction=<value optimized out>, count=0, execute_once=224)
>     at ../../../src/include/executor/executor.h:247
> #8  ExecutePlan (queryDesc=0x1ce51e8, direction=<value optimized out>, 
>     count=0, execute_once=224) at execMain.c:1723
> #9  standard_ExecutorRun (queryDesc=0x1ce51e8, 
>     direction=<value optimized out>, count=0, execute_once=224)
> 
> So this looks like something to do with Andres' tupletableslot work.

Interesting. There wasn't that much in v11. It's not the general rework,
but just the allocation of the descriptor together slot when the
descriptor is known at creation time - or a conflict with concurrent
partitioning work.  It's plausible that enough changed in master for
that to not be a problem anymore.

Looking.

Greetings,

Andres Freund



Re: BUG #16293: postgres segfaults and returns SQLSTATE 08006

От
Andres Freund
Дата:
Hi,

On 2020-03-14 14:26:57 -0700, Andres Freund wrote:
> On 2020-03-14 17:21:04 -0400, Tom Lane wrote:
> > Interesting.  I do *not* see a crash in v12 or HEAD, but v11 gives
> > me an assertion failure:
> > 
> > TRAP: FailedAssertion("!(!slot->tts_fixedTupleDescriptor)", File: "execTuples.c", Line: 284)
> > 2020-03-14 17:16:06.626 EDT [1101] LOG:  server process (PID 20822) was terminated by signal 6: Aborted
> > 2020-03-14 17:16:06.626 EDT [1101] DETAIL:  Failed process was running: INSERT INTO parent_table VALUES
('2020-02-0201:00:00+00:00', 'f');
 
> > 
> > here:
> > 
> > #3  0x000000000063ef45 in ExecSetSlotDescriptor (slot=0x1cf4060, 
> >     tupdesc=0x7f903a2dd0f8) at execTuples.c:284
> > #4  0x00000000006360eb in ExecConstraints (resultRelInfo=0x1cf47b8, 
> >     slot=0x1cf4060, estate=0x1cf3778) at execMain.c:2062
> > #5  0x000000000065a44d in ExecInsert (mtstate=0x1cf3ae0, slot=0x1cf4060, 
> >     planSlot=0x1cf4060, estate=0x1cf3778, canSetTag=true)
> >     at nodeModifyTable.c:398
> > #6  0x000000000065b80b in ExecModifyTable (pstate=<value optimized out>)
> >     at nodeModifyTable.c:2159
> > #7  0x0000000000636c27 in ExecProcNode (queryDesc=0x1ce51e8, 
> >     direction=<value optimized out>, count=0, execute_once=224)
> >     at ../../../src/include/executor/executor.h:247
> > #8  ExecutePlan (queryDesc=0x1ce51e8, direction=<value optimized out>, 
> >     count=0, execute_once=224) at execMain.c:1723
> > #9  standard_ExecutorRun (queryDesc=0x1ce51e8, 
> >     direction=<value optimized out>, count=0, execute_once=224)
> > 
> > So this looks like something to do with Andres' tupletableslot work.
> 
> Interesting. There wasn't that much in v11. It's not the general rework,
> but just the allocation of the descriptor together slot when the
> descriptor is known at creation time - or a conflict with concurrent
> partitioning work.  It's plausible that enough changed in master for
> that to not be a problem anymore.

Hm. So the relevant slot is created at (huray for rr making it trivial
to determine that):

#0  0x000055d6c1b127c6 in MakeTupleTableSlot (tupleDesc=0x55d6c3dd64f8) at
/home/andres/src/postgresql-11/src/backend/executor/execTuples.c:135
#1  0x000055d6c1b12895 in ExecAllocTableSlot (tupleTable=0x55d6c3dd5dd8, desc=0x55d6c3dd64f8)
    at /home/andres/src/postgresql-11/src/backend/executor/execTuples.c:169
#2  0x000055d6c1b1389a in ExecInitResultTupleSlotTL (estate=0x55d6c3dd5d28, planstate=0x55d6c3dd62e8)
    at /home/andres/src/postgresql-11/src/backend/executor/execTuples.c:907
#3  0x000055d6c1b3e98b in ExecInitResult (node=0x55d6c3de2638, estate=0x55d6c3dd5d28, eflags=0)
    at /home/andres/src/postgresql-11/src/backend/executor/nodeResult.c:220
#4  0x000055d6c1b0e690 in ExecInitNode (node=0x55d6c3de2638, estate=0x55d6c3dd5d28, eflags=0)
    at /home/andres/src/postgresql-11/src/backend/executor/execProcnode.c:164
#5  0x000055d6c1b3beec in ExecInitModifyTable (node=0x55d6c3d0cdb0, estate=0x55d6c3dd5d28, eflags=0)
    at /home/andres/src/postgresql-11/src/backend/executor/nodeModifyTable.c:2304
#6  0x000055d6c1b0e6ce in ExecInitNode (node=0x55d6c3d0cdb0, estate=0x55d6c3dd5d28, eflags=0)
    at /home/andres/src/postgresql-11/src/backend/executor/execProcnode.c:174
#7  0x000055d6c1b04c21 in InitPlan (queryDesc=0x55d6c3dec888, eflags=0) at
/home/andres/src/postgresql-11/src/backend/executor/execMain.c:1046

I don't think it's ok for ExecConstraint() to overwrite the tuple
descriptor of a slot "owned" by nodeResult.c. Am I missing something, or
is that broken?

In v12+ that problem doesn't exist, because we simply allocate a new
slot (this is just for printing an error message).

There's other places using ExecSetSlotDescriptor(), e.g.
    else if (resultRelInfo->ri_FdwRoutine)
    {
        HeapTuple   tuple;

        /*
         * delete from foreign table: let the FDW do it
         *
         * We offer the trigger tuple slot as a place to store RETURNING data,
         * although the FDW can return some other slot if it wants.  Set up
         * the slot's tupdesc so the FDW doesn't need to do that for itself.
         */
        slot = estate->es_trig_tuple_slot;
        if (slot->tts_tupleDescriptor != RelationGetDescr(resultRelationDesc))
            ExecSetSlotDescriptor(slot, RelationGetDescr(resultRelationDesc));

        slot = resultRelInfo->ri_FdwRoutine->ExecForeignDelete(estate,
                                                               resultRelInfo,
                                                               slot,
                                                               planSlot);

but that's different, because it's using a slot (kind of) owned by the
modify node, rather just a random subsidiary node's slot.


I think the only reason this issue doesn't have actually bad
consequences is that we throw an error afterwards - which'll result in
the executor tree not being used any further.

I'm inclined to simply copy v12's approach of just ad-hoc creating a
slot in those routines?


It seems fairly insane how many places have this approximate code,
btw. In 11 we have a copy of nearly the same ~40 lines each in at least
ExecPartitionCheckEmitError(), ExecConstraints (twice) and
ExecWitchCheckOption().

Greetings,

Andres Freund



Re: BUG #16293: postgres segfaults and returns SQLSTATE 08006

От
Amit Langote
Дата:
Hi,

On Sun, Mar 15, 2020 at 9:36 AM Andres Freund <andres@anarazel.de> wrote:
> On 2020-03-14 14:26:57 -0700, Andres Freund wrote:
> > On 2020-03-14 17:21:04 -0400, Tom Lane wrote:
> > > Interesting.  I do *not* see a crash in v12 or HEAD, but v11 gives
> > > me an assertion failure:
> > >
> > > TRAP: FailedAssertion("!(!slot->tts_fixedTupleDescriptor)", File: "execTuples.c", Line: 284)
> > > 2020-03-14 17:16:06.626 EDT [1101] LOG:  server process (PID 20822) was terminated by signal 6: Aborted
> > > 2020-03-14 17:16:06.626 EDT [1101] DETAIL:  Failed process was running: INSERT INTO parent_table VALUES
('2020-02-0201:00:00+00:00', 'f');
 
> > >
> > > here:
> > >
> > > #3  0x000000000063ef45 in ExecSetSlotDescriptor (slot=0x1cf4060,
> > >     tupdesc=0x7f903a2dd0f8) at execTuples.c:284
> > > #4  0x00000000006360eb in ExecConstraints (resultRelInfo=0x1cf47b8,
> > >     slot=0x1cf4060, estate=0x1cf3778) at execMain.c:2062
> > > #5  0x000000000065a44d in ExecInsert (mtstate=0x1cf3ae0, slot=0x1cf4060,
> > >     planSlot=0x1cf4060, estate=0x1cf3778, canSetTag=true)
> > >     at nodeModifyTable.c:398
> > > #6  0x000000000065b80b in ExecModifyTable (pstate=<value optimized out>)
> > >     at nodeModifyTable.c:2159
> > > #7  0x0000000000636c27 in ExecProcNode (queryDesc=0x1ce51e8,
> > >     direction=<value optimized out>, count=0, execute_once=224)
> > >     at ../../../src/include/executor/executor.h:247
> > > #8  ExecutePlan (queryDesc=0x1ce51e8, direction=<value optimized out>,
> > >     count=0, execute_once=224) at execMain.c:1723
> > > #9  standard_ExecutorRun (queryDesc=0x1ce51e8,
> > >     direction=<value optimized out>, count=0, execute_once=224)
> > >
> > > So this looks like something to do with Andres' tupletableslot work.
> >
> > Interesting. There wasn't that much in v11. It's not the general rework,
> > but just the allocation of the descriptor together slot when the
> > descriptor is known at creation time - or a conflict with concurrent
> > partitioning work.  It's plausible that enough changed in master for
> > that to not be a problem anymore.
>
> Hm. So the relevant slot is created at (huray for rr making it trivial
> to determine that):
>
> #0  0x000055d6c1b127c6 in MakeTupleTableSlot (tupleDesc=0x55d6c3dd64f8) at
/home/andres/src/postgresql-11/src/backend/executor/execTuples.c:135
> #1  0x000055d6c1b12895 in ExecAllocTableSlot (tupleTable=0x55d6c3dd5dd8, desc=0x55d6c3dd64f8)
>     at /home/andres/src/postgresql-11/src/backend/executor/execTuples.c:169
> #2  0x000055d6c1b1389a in ExecInitResultTupleSlotTL (estate=0x55d6c3dd5d28, planstate=0x55d6c3dd62e8)
>     at /home/andres/src/postgresql-11/src/backend/executor/execTuples.c:907
> #3  0x000055d6c1b3e98b in ExecInitResult (node=0x55d6c3de2638, estate=0x55d6c3dd5d28, eflags=0)
>     at /home/andres/src/postgresql-11/src/backend/executor/nodeResult.c:220
> #4  0x000055d6c1b0e690 in ExecInitNode (node=0x55d6c3de2638, estate=0x55d6c3dd5d28, eflags=0)
>     at /home/andres/src/postgresql-11/src/backend/executor/execProcnode.c:164
> #5  0x000055d6c1b3beec in ExecInitModifyTable (node=0x55d6c3d0cdb0, estate=0x55d6c3dd5d28, eflags=0)
>     at /home/andres/src/postgresql-11/src/backend/executor/nodeModifyTable.c:2304
> #6  0x000055d6c1b0e6ce in ExecInitNode (node=0x55d6c3d0cdb0, estate=0x55d6c3dd5d28, eflags=0)
>     at /home/andres/src/postgresql-11/src/backend/executor/execProcnode.c:174
> #7  0x000055d6c1b04c21 in InitPlan (queryDesc=0x55d6c3dec888, eflags=0) at
/home/andres/src/postgresql-11/src/backend/executor/execMain.c:1046
>
> I don't think it's ok for ExecConstraint() to overwrite the tuple
> descriptor of a slot "owned" by nodeResult.c. Am I missing something, or
> is that broken?

Looking into it, that partitioning code in ExecConstraint() would not
"normally" overwrite a slot that is not managed by partitioning.
Normally, there would be a dedicated "partition" slot that would be
used, but only if tuple conversion from root parent to leaf partition
is necessary before routing the tuple.

What is not "normal" in this case is that tuple conversion is deemed
necessary when converting from leaf partition to root parent whereas
not in the other direction.  It's because one of the partition's
attributes has atthasmissing set to true, which triggers this code in
convert_tuples_by_name():

            /*
             * If the input column has a missing attribute, we need a
             * conversion.
             */
            if (inatt->atthasmissing)
            {
                same = false;
                break;
            }

This asymmetry causes the code in ExecConstraints() to try to convert
the tuple whereas no conversion would have been performed before,
meaning to dedicated partition slot would have been passed to
ExecConstraints().

> In v12+ that problem doesn't exist, because we simply allocate a new
> slot (this is just for printing an error message).

Right.

> I'm inclined to simply copy v12's approach of just ad-hoc creating a
> slot in those routines?

Agreed.

> It seems fairly insane how many places have this approximate code,
> btw. In 11 we have a copy of nearly the same ~40 lines each in at least
> ExecPartitionCheckEmitError(), ExecConstraints (twice) and
> ExecWitchCheckOption().

That is certainly bad.  I think we should get that code in one place
and inside ExecBuildSlotValueDescription() seems like a good place,
because that's what needs a converted tuple to build a string from it.
I have tried that in the attached -- need different patches for
different branches as there have been a lot of small changes in this
code over releases.

--
Thank you,
Amit

Вложения

Re: BUG #16293: postgres segfaults and returns SQLSTATE 08006

От
Andres Freund
Дата:
Hi,

On 2020-03-17 19:49:43 +0900, Amit Langote wrote:
> On Sun, Mar 15, 2020 at 9:36 AM Andres Freund <andres@anarazel.de> wrote:
> > I don't think it's ok for ExecConstraint() to overwrite the tuple
> > descriptor of a slot "owned" by nodeResult.c. Am I missing something, or
> > is that broken?
> 
> Looking into it, that partitioning code in ExecConstraint() would not
> "normally" overwrite a slot that is not managed by partitioning.
> Normally, there would be a dedicated "partition" slot that would be
> used, but only if tuple conversion from root parent to leaf partition
> is necessary before routing the tuple.

> What is not "normal" in this case is that tuple conversion is deemed
> necessary when converting from leaf partition to root parent whereas
> not in the other direction.  It's because one of the partition's
> attributes has atthasmissing set to true, which triggers this code in
> convert_tuples_by_name():

Hm. I don't think we generally reach this path only with a "partition"
slot? I'm looking at 11, for the purpose of this bug. The INSERT case
"normally" is only reached with the slot returned by
ExecPrepareTupleRouting(), true. But ExecConstraint() is also
e.g. called from ExecUpdate() - and as far as I can tell that will be
either the slot from the plan, or the slot from the junkfilter.


> > It seems fairly insane how many places have this approximate code,
> > btw. In 11 we have a copy of nearly the same ~40 lines each in at least
> > ExecPartitionCheckEmitError(), ExecConstraints (twice) and
> > ExecWitchCheckOption().
> 
> That is certainly bad.  I think we should get that code in one place
> and inside ExecBuildSlotValueDescription() seems like a good place,
> because that's what needs a converted tuple to build a string from it.
> I have tried that in the attached -- need different patches for
> different branches as there have been a lot of small changes in this
> code over releases.

That certainly looks better.

Greetings,

Andres Freund



Re: BUG #16293: postgres segfaults and returns SQLSTATE 08006

От
Andres Freund
Дата:
Hi,

On 2020-03-21 23:52:17 -0700, Andres Freund wrote:
> On 2020-03-17 19:49:43 +0900, Amit Langote wrote:
> > On Sun, Mar 15, 2020 at 9:36 AM Andres Freund <andres@anarazel.de> wrote:
> > > I don't think it's ok for ExecConstraint() to overwrite the tuple
> > > descriptor of a slot "owned" by nodeResult.c. Am I missing something, or
> > > is that broken?
> > 
> > Looking into it, that partitioning code in ExecConstraint() would not
> > "normally" overwrite a slot that is not managed by partitioning.
> > Normally, there would be a dedicated "partition" slot that would be
> > used, but only if tuple conversion from root parent to leaf partition
> > is necessary before routing the tuple.
> 
> > What is not "normal" in this case is that tuple conversion is deemed
> > necessary when converting from leaf partition to root parent whereas
> > not in the other direction.  It's because one of the partition's
> > attributes has atthasmissing set to true, which triggers this code in
> > convert_tuples_by_name():
> 
> Hm. I don't think we generally reach this path only with a "partition"
> slot? I'm looking at 11, for the purpose of this bug. The INSERT case
> "normally" is only reached with the slot returned by
> ExecPrepareTupleRouting(), true. But ExecConstraint() is also
> e.g. called from ExecUpdate() - and as far as I can tell that will be
> either the slot from the plan, or the slot from the junkfilter.

Ah - it looks like we'll usually not (never?) have to do the conversion
for ExecUpdate(), because it'll execute ExecConstraints() with the leaf
partition's ResultRelInfo.

I'm a bit confused as to what this whole thing is supposed to be doing,
tbh. The explanator comment says:
                /*
                 * If the tuple has been routed, it's been converted to the
                 * partition's rowtype, which might differ from the root
                 * table's.  We must convert it back to the root table's
                 * rowtype so that val_desc shown error message matches the
                 * input tuple.
                 */

but it's not clear at all why this is a good thing to be doing. For one,
the violation very well might be dependent on the child table's
definition. For another, we actually display the child table in the
error message:
SCHEMA NAME:  public
TABLE NAME:  other_partition
CONSTRAINT NAME:  dummy_check
so it's not clear why we'd want to show the tuple in the "root" format.

And lastly, given that we're not actually showing the input tuple
(i.e. planSlot), but rather the tuple already modified by default
values, triggers, etc, I fail to understand why this is something we
should do at all.


> > > It seems fairly insane how many places have this approximate code,
> > > btw. In 11 we have a copy of nearly the same ~40 lines each in at least
> > > ExecPartitionCheckEmitError(), ExecConstraints (twice) and
> > > ExecWitchCheckOption().
> > 
> > That is certainly bad.  I think we should get that code in one place
> > and inside ExecBuildSlotValueDescription() seems like a good place,
> > because that's what needs a converted tuple to build a string from it.
> > I have tried that in the attached -- need different patches for
> > different branches as there have been a lot of small changes in this
> > code over releases.
> 
> That certainly looks better.

One thing that concerns me with this change is that now
ExecBuildSlotValueDescription() creates a slot each time its
called. That's fine and dandy for routines that immediately throw an
error, but it's not too hard to imagine ExecBuildSlotValueDescription()
being used for other things too.


Btw, I don't think this is something we can apply to the back branches
without very good reasons - changing ExecBuildSlotValueDescription()
could break extensions. I'm only mentioning this because your cleanup
patch doesn't seem to be based on HEAD.


If we actually want to keep this conversion, I wonder if the right
answer would be to change those routines to pass in
mtstate->mt_root_tuple_slot instead (and ensure it's created when
needed). One way to achieve that would be to move
ModifyTableState->mt_root_tuple_slot to ResultRelInfo - and potentially
create it on demand?

Greetings,

Andres Freund



Re: BUG #16293: postgres segfaults and returns SQLSTATE 08006

От
Andres Freund
Дата:
Hi,

On 2020-03-14 17:35:41 -0700, Andres Freund wrote:
> I think the only reason this issue doesn't have actually bad
> consequences is that we throw an error afterwards - which'll result in
> the executor tree not being used any further.
> 
> I'm inclined to simply copy v12's approach of just ad-hoc creating a
> slot in those routines?

Pushed a commit doing so in 10, 11 and in a seperate commit added tests
to 10-master.

Greetings,

Andres Freund



Re: BUG #16293: postgres segfaults and returns SQLSTATE 08006

От
Amit Langote
Дата:
Hi Andres,

Thanks for the commit fixing the crash.  Replying here to your
comments; sorry about the delay.

On Tue, Mar 24, 2020 at 3:53 AM Andres Freund <andres@anarazel.de> wrote:
> On 2020-03-21 23:52:17 -0700, Andres Freund wrote:
> > On 2020-03-17 19:49:43 +0900, Amit Langote wrote:
> > > On Sun, Mar 15, 2020 at 9:36 AM Andres Freund <andres@anarazel.de> wrote:
> > > > I don't think it's ok for ExecConstraint() to overwrite the tuple
> > > > descriptor of a slot "owned" by nodeResult.c. Am I missing something, or
> > > > is that broken?
> > >
> > > Looking into it, that partitioning code in ExecConstraint() would not
> > > "normally" overwrite a slot that is not managed by partitioning.
> > > Normally, there would be a dedicated "partition" slot that would be
> > > used, but only if tuple conversion from root parent to leaf partition
> > > is necessary before routing the tuple.
> >
> > > What is not "normal" in this case is that tuple conversion is deemed
> > > necessary when converting from leaf partition to root parent whereas
> > > not in the other direction.  It's because one of the partition's
> > > attributes has atthasmissing set to true, which triggers this code in
> > > convert_tuples_by_name():
> >
> > Hm. I don't think we generally reach this path only with a "partition"
> > slot? I'm looking at 11, for the purpose of this bug. The INSERT case
> > "normally" is only reached with the slot returned by
> > ExecPrepareTupleRouting(), true. But ExecConstraint() is also
> > e.g. called from ExecUpdate() - and as far as I can tell that will be
> > either the slot from the plan, or the slot from the junkfilter.
>
> Ah - it looks like we'll usually not (never?) have to do the conversion
> for ExecUpdate(), because it'll execute ExecConstraints() with the leaf
> partition's ResultRelInfo.

ExecConstraints() is called with leaf partition's ResultRelInfo in any
case.  Also, the tuple and the slot that it receives are always in the
leaf partition format.

In the INSERT case and the case in which UPDATE might need to use
tuple routing, we set ResultRelInfo's ri_PartitionRoot, mainly to use
for for showing root format tuples in the error messages.  However, if
UPDATE doesn't need tuple routing, because the partition key is not
changed, ri_PartitionRoot is not set, so the tuples shown in the error
messages are in the leaf partition format in that case. Admittedly,
that does seem a bit inconsistemt.

> I'm a bit confused as to what this whole thing is supposed to be doing,
> tbh. The explanator comment says:
>                                 /*
>                                  * If the tuple has been routed, it's been converted to the
>                                  * partition's rowtype, which might differ from the root
>                                  * table's.  We must convert it back to the root table's
>                                  * rowtype so that val_desc shown error message matches the
>                                  * input tuple.
>                                  */
>
> but it's not clear at all why this is a good thing to be doing. For one,
> the violation very well might be dependent on the child table's
> definition. For another, we actually display the child table in the
> error message:
> SCHEMA NAME:  public
> TABLE NAME:  other_partition
> CONSTRAINT NAME:  dummy_check
> so it's not clear why we'd want to show the tuple in the "root" format.
>
> And lastly, given that we're not actually showing the input tuple
> (i.e. planSlot), but rather the tuple already modified by default
> values, triggers, etc, I fail to understand why this is something we
> should do at all.

I thought at the time this code was first written that showing the
tuple in the format of the table mentioned in the query is more
user-firendly.  I am fine though if we would like to change this to
show the tuple in the leaf partition format.  As noted above, there
does exist a case where we show the tuple in the leaf partition format
even if it's not the table mentioned in the query.

> > > > It seems fairly insane how many places have this approximate code,
> > > > btw. In 11 we have a copy of nearly the same ~40 lines each in at least
> > > > ExecPartitionCheckEmitError(), ExecConstraints (twice) and
> > > > ExecWitchCheckOption().
> > >
> > > That is certainly bad.  I think we should get that code in one place
> > > and inside ExecBuildSlotValueDescription() seems like a good place,
> > > because that's what needs a converted tuple to build a string from it.
> > > I have tried that in the attached -- need different patches for
> > > different branches as there have been a lot of small changes in this
> > > code over releases.
> >
> > That certainly looks better.
>
> One thing that concerns me with this change is that now
> ExecBuildSlotValueDescription() creates a slot each time its
> called. That's fine and dandy for routines that immediately throw an
> error, but it's not too hard to imagine ExecBuildSlotValueDescription()
> being used for other things too.

Fair point.  Maybe using a fixed root slot, such as
mt_root_tuple_slot, would take care of that.

> Btw, I don't think this is something we can apply to the back branches
> without very good reasons - changing ExecBuildSlotValueDescription()
> could break extensions. I'm only mentioning this because your cleanup
> patch doesn't seem to be based on HEAD.

Hmm, ExecBuildSlotValueDescription isn't exported. Given that, I
thought it would be okay to back-patch, at least to make the code look
similar in all branches.

> If we actually want to keep this conversion, I wonder if the right
> answer would be to change those routines to pass in
> mtstate->mt_root_tuple_slot instead (and ensure it's created when
> needed). One way to achieve that would be to move
> ModifyTableState->mt_root_tuple_slot to ResultRelInfo - and potentially
> create it on demand?

Something like ri_PartitionRootSlot sounds like a good idea, but we
should make that a reference to mt_root_tuple_slot instead of its
replacement.  Would you be interested in seeing a patch?

--
Thank you,

Amit Langote
EnterpriseDB: http://www.enterprisedb.com



Re: BUG #16293: postgres segfaults and returns SQLSTATE 08006

От
Amit Langote
Дата:
On Tue, Mar 31, 2020 at 3:30 PM Amit Langote <amitlangote09@gmail.com> wrote:
> On Tue, Mar 24, 2020 at 3:53 AM Andres Freund <andres@anarazel.de> wrote:
> > If we actually want to keep this conversion, I wonder if the right
> > answer would be to change those routines to pass in
> > mtstate->mt_root_tuple_slot instead (and ensure it's created when
> > needed). One way to achieve that would be to move
> > ModifyTableState->mt_root_tuple_slot to ResultRelInfo - and potentially
> > create it on demand?
>
> Something like ri_PartitionRootSlot sounds like a good idea, but we
> should make that a reference to mt_root_tuple_slot instead of its
> replacement.  Would you be interested in seeing a patch?

I have updated the patch for HEAD to be this way.

-- 
Thank you,

Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Вложения