BUG #17337: Segmentation fault on updating row with ltree GIST index

From PG Bug reporting form
Subject BUG #17337: Segmentation fault on updating row with ltree GIST index
Date
Msg-id 17337-6264099dfbeb4705@postgresql.org
Responses Re: BUG #17337: Segmentation fault on updating row with ltree GIST index  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      17337
Logged by:          Ken Barber
Email address:      kbarber@salesforce.com
PostgreSQL version: 13.5
Operating system:   Ubuntu Focal 20.04
Description:

Greetings friends,

I have a customer who was seeing segfaults when performing an update
operation on a table after an upgrade from Postgres 10 to 13.5. I can't
share all the data because it is private, but here are the details I have.

The pertinent shape of the table:

# \d+ public.groups
                                                         Table "public.groups"
    Column    |            Type             | Collation | Nullable |                  Default                  | Storage  | Stats target | Description
--------------+-----------------------------+-----------+----------+-------------------------------------------+----------+--------------+-------------
 id           | integer                     |           | not null | nextval('public.groups_id_seq'::regclass) | plain    |              |
 ltree_path   | public.ltree                |           |          |                                           | extended |              |
 timestamp_at | timestamp without time zone |           |          |                                           | plain    |              |
Indexes:
    "groups_pkey" PRIMARY KEY, btree (id)
    "index_groups_on_timestamp" btree (timestamp_at)
    "index_groups_on_ltree_path" gist (ltree_path)

After the upgrade to 13.5, the following DML started causing segfaults
intermittently (roughly 1 in every 5-10 updates):

    UPDATE public.groups SET "timestamp_at" = now() WHERE groups.id IN (7777, 8888);

For example:

# UPDATE public.groups SET "timestamp_at" = now() WHERE groups.id IN (7777, 8888);
SSL SYSCALL error: EOF detected
The connection to the server was lost. Attempting reset: Succeeded.

So this didn't always occur, but by using now() and doing it repetitively we
could reproduce it.
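
One way to repeat the statement is psql's \watch meta-command, which re-executes the current query buffer at a fixed interval until the query fails or is interrupted. This is a minimal sketch, not the exact procedure used in this report. Each execution runs as its own transaction, so now() yields a new value every time. The one-second interval is an arbitrary assumption.

```sql
-- Hypothetical repro loop for psql; not the exact commands used in this report.
-- \watch re-runs the statement every second until it fails or is interrupted.
UPDATE public.groups SET "timestamp_at" = now() WHERE groups.id IN (7777, 8888) \watch 1
```

Wrapping the UPDATE in a single-transaction loop (for example a DO block) would probably be less useful here, because now() returns the same value for the whole transaction.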

We managed to retrieve a backtrace of the segfault from the core dump it
dropped:

Reading symbols from /usr/lib/postgresql/13/bin/postgres...
Reading symbols from /usr/lib/debug/.build-id/0d/1c4bdf6025eb99411985137ffcce796a3cb404.debug...
[New LWP 28]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: postgres da1lj7q5v62ffr 85.222.134.37(4447) UPDATE                  '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:240
240     ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: No such file or directory.
(gdb) bt
#0  __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:240
#1  0x00007fbd4c8cd816 in memcpy (__len=<optimized out>, __src=0x556d91a2e488, __dest=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/string_fortified.h:34
#2  ltree_gist_alloc (isalltrue=<optimized out>, sign=sign@entry=0x556d91a2e6a8 "\375\373\377\377\356\377\377\377\200", siglen=siglen@entry=28, left=left@entry=0x7fbd4ed908c4, right=right@entry=0x556d91a2e488) at ./build/../contrib/ltree/ltree_gist.c:66
#3  0x00007fbd4c8cde80 in ltree_union (fcinfo=<optimized out>) at ./build/../contrib/ltree/ltree_gist.c:250
#4  0x0000556d909e9f59 in FunctionCall2Coll (flinfo=flinfo@entry=0x556d91a2aa10, collation=<optimized out>, arg1=arg1@entry=140729949558624, arg2=arg2@entry=140729949558620) at ./build/../src/backend/utils/fmgr/fmgr.c:1164
#5  0x0000556d90604687 in gistMakeUnionKey (giststate=giststate@entry=0x556d91a2a3e8, attno=attno@entry=0, entry1=entry1@entry=0x7ffe3ea719a0, isnull1=<optimized out>, entry2=entry2@entry=0x7ffe3ea71da0, isnull2=<optimized out>, dst=0x7ffe3ea718a0, dstisnull=0x7ffe3ea71880) at ./build/../src/backend/access/gist/gistutil.c:272
#6  0x0000556d90604d64 in gistgetadjusted (r=0x7fbd4ca886c8, oldtup=0x7fbd4ed90898, addtup=addtup@entry=0x556d91a2e478, giststate=giststate@entry=0x556d91a2a3e8) at ./build/../src/backend/access/gist/gistutil.c:335
#7  0x0000556d905fc1a8 in gistdoinsert (r=0x7fbd4ca886c8, itup=0x556d91a2e478, freespace=<optimized out>, giststate=0x556d91a2a3e8, heapRel=<optimized out>, is_build=<optimized out>) at ./build/../src/backend/access/gist/gist.c:765
#8  0x0000556d905fc7d1 in gistinsert (r=0x7fbd4ca886c8, values=<optimized out>, isnull=<optimized out>, ht_ctid=0x556d91a1ac48, heapRel=0x7fbd4ca80350, checkUnique=<optimized out>, indexInfo=0x556d91a16390) at ./build/../src/backend/access/gist/gist.c:180
#9  0x0000556d9076a2a7 in ExecInsertIndexTuples (slot=slot@entry=0x556d91a1ac18, estate=estate@entry=0x556d91a15778, noDupErr=noDupErr@entry=false, specConflict=specConflict@entry=0x0, arbiterIndexes=arbiterIndexes@entry=0x0) at ./build/../src/backend/executor/execIndexing.c:393
#10 0x0000556d907938df in ExecUpdate (mtstate=0x556d91a15d50, tupleid=0x7ffe3ea7264a, oldtuple=0x0, slot=0x556d91a1ac18, planSlot=0x556d91a1aad8, epqstate=0x556d91a15e48, estate=0x556d91a15778, canSetTag=true) at ./build/../src/backend/executor/nodeModifyTable.c:1550
#11 0x0000556d907948a5 in ExecModifyTable (pstate=0x556d91a15d50) at ./build/../src/backend/executor/nodeModifyTable.c:2329
#12 0x0000556d9076b244 in ExecProcNode (node=0x556d91a15d50) at ./build/../src/include/executor/executor.h:248
#13 ExecutePlan (execute_once=<optimized out>, dest=0x556d91a0d440, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_UPDATE, use_parallel_mode=<optimized out>, planstate=0x556d91a15d50, estate=0x556d91a15778) at ./build/../src/backend/executor/execMain.c:1632
#14 standard_ExecutorRun (queryDesc=0x556d91a15368, direction=<optimized out>, count=0, execute_once=<optimized out>) at ./build/../src/backend/executor/execMain.c:350
#15 0x00007fbdb38ed8b5 in pgss_ExecutorRun (queryDesc=0x556d91a15368, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at ./build/../contrib/pg_stat_statements/pg_stat_statements.c:1045
#16 0x0000556d908d24c7 in ProcessQuery (plan=<optimized out>, sourceText=0x556d918effd8 "UPDATE public.\"groups\" SET \"updated_at\" = '2021-12-02 08:27:59.157022', \"timestamp_at\" = now() WHERE \"groups\".\"id\" IN (7777, 8888)\n;", params=0x0, queryEnv=0x0, dest=0x556d91a0d440, qc=0x7ffe3ea72a90) at ./build/../src/backend/tcop/pquery.c:160
#17 0x0000556d908d2ec9 in PortalRunMulti (portal=portal@entry=0x556d919affe8, isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x556d91a0d440, altdest=altdest@entry=0x556d91a0d440, qc=qc@entry=0x7ffe3ea72a90) at ./build/../src/backend/tcop/pquery.c:1271
#18 0x0000556d908d3257 in PortalRun (portal=0x556d919affe8, count=9223372036854775807, isTopLevel=<optimized out>, run_once=<optimized out>, dest=0x556d91a0d440, altdest=0x556d91a0d440, qc=0x7ffe3ea72a90) at ./build/../src/backend/tcop/pquery.c:788
#19 0x0000556d908cee97 in exec_simple_query (query_string=0x556d918effd8 "UPDATE public.\"groups\" SET \"updated_at\" = '2021-12-02 08:27:59.157022', \"timestamp_at\" = now() WHERE \"groups\".\"id\" IN (7777, 8888)\n;") at ./build/../src/backend/tcop/postgres.c:1239
#20 0x0000556d908d0b0d in PostgresMain (argc=<optimized out>, argv=argv@entry=0x556d91948420, dbname=<optimized out>, username=<optimized out>) at ./build/../src/backend/tcop/postgres.c:4337
#21 0x0000556d908588fc in BackendRun (port=0x556d91942310, port=0x556d91942310) at ./build/../src/backend/postmaster/postmaster.c:4550
#22 BackendStartup (port=0x556d91942310) at ./build/../src/backend/postmaster/postmaster.c:4234
#23 ServerLoop () at ./build/../src/backend/postmaster/postmaster.c:1739
#24 0x0000556d9085985f in PostmasterMain (argc=5, argv=<optimized out>) at ./build/../src/backend/postmaster/postmaster.c:1412
#25 0x0000556d905d8d0a in main (argc=5, argv=0x556d918eafb0) at ./build/../src/backend/main/main.c:210

The workaround is simple enough; we drop and recreate the ltree GiST
index:

DROP INDEX index_groups_on_ltree_path;
CREATE INDEX index_groups_on_ltree_path ON groups USING GIST (ltree_path);

And the update command works fine afterwards.
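
To apply the same workaround elsewhere, it may help to list every GiST index that covers an ltree or ltree[] column. The catalog query below is a sketch built on standard system catalogs. Whether other such indexes are affected in the same way is an assumption, not something established in this report.

```sql
-- Sketch: list GiST indexes that include an ltree or ltree[] column.
-- Assumption: these are the rebuild candidates; only
-- index_groups_on_ltree_path is confirmed in this report.
SELECT DISTINCT
       n.nspname AS schema_name,
       t.relname AS table_name,
       i.relname AS index_name
FROM pg_index x
JOIN pg_class i     ON i.oid = x.indexrelid      -- the index itself
JOIN pg_class t     ON t.oid = x.indrelid        -- the indexed table
JOIN pg_namespace n ON n.oid = t.relnamespace
JOIN pg_am am       ON am.oid = i.relam          -- index access method
JOIN pg_attribute a ON a.attrelid = t.oid
                   AND a.attnum = ANY (x.indkey) -- columns used by the index
JOIN pg_type ty     ON ty.oid = a.atttypid
WHERE am.amname = 'gist'
  AND ty.typname IN ('ltree', '_ltree')          -- '_ltree' is ltree[]
ORDER BY schema_name, table_name, index_name;
```

Expression indexes are not matched by this query, since their indkey entries are zero.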

We believe we've seen this on multiple upgrade attempts from Postgres 10
with the same data. If you require anything else, please let me know.

I hope this helps.

Regards

Ken Barber
Salesforce, Elastic Data (aka Heroku Data)

