Re: BUG #16129: Segfault in tts_virtual_materialize in logicalreplication worker

Поиск
Список
Период
Сортировка
От Ondřej Jirman
Тема Re: BUG #16129: Segfault in tts_virtual_materialize in logicalreplication worker
Дата
Msg-id 20191121135523.aybyo7o6yzxxungd@core.my.home
обсуждение исходный текст
Ответ на Re: BUG #16129: Segfault in tts_virtual_materialize in logicalreplication worker  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Ответы Re: BUG #16129: Segfault in tts_virtual_materialize in logicalreplication worker  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список pgsql-bugs
On Thu, Nov 21, 2019 at 02:32:37PM +0100, Tomas Vondra wrote:
> On Thu, Nov 21, 2019 at 01:59:13PM +0100, Ondřej Jirman wrote:
> > On Thu, Nov 21, 2019 at 12:53:26PM +0100, postgresql wrote:
> > > Hello,
> > > 
> > > On Thu, Nov 21, 2019 at 11:39:40AM +0100, Tomas Vondra wrote:
> > > > On Thu, Nov 21, 2019 at 01:14:18AM +0000, PG Bug reporting form wrote:
> > > > > Replication of one of my databases (running on ARMv7 machine) started
> > > > > segfaulting on the subscriber side (x86_64) like this:
> > > > >
> > > > > #0  0x00007fc259739917 in __memmove_sse2_unaligned_erms () from
> > > > > /usr/lib/libc.so.6
> > > > > #1  0x000055d033e93d44 in memcpy (__len=620701425, __src=<optimized out>,
> > > > > __dest=0x55d0356da804) at /usr/include/bits/string_fortified.h:34
> > > > > #2  tts_virtual_materialize (slot=0x55d0356da3b8) at execTuples.c:235
> > > > > #3  0x000055d033e94d32 in ExecFetchSlotHeapTuple
> > > > > (slot=slot@entry=0x55d0356da3b8, materialize=materialize@entry=true,
> > > > > shouldFree=shouldFree@entry=0x7fff0e7cf387) at execTuples.c:1624
> > > 
> > > I forgot to add that publisher is still PostgreSQL 11.5.
> > > 
> > 
> > I can also add that I have data checksumming enabled on both ends, and
> > it did not detect any corruption:
> > 
> > # pg_verify_checksums -D /var/lib/postgres/data
> > Checksum scan completed
> > Data checksum version: 1
> > Files scanned:  1751
> > Blocks scanned: 86592
> > Bad checksums:  0
> > 
> > # pg_checksums /var/lib/postgres/data
> > Checksum operation completed
> > Files scanned:  22777
> > Blocks scanned: 3601527
> > Bad checksums:  0
> > Data checksum version: 1
> > 
> > WAL log on the publisher is also dumpable to a state hours after the issues
> > started:
> > 
> > I've put the dump here, if it's of any use: https://megous.com/dl/tmp/wal_dump.txt
> > 
> > Dump ends with:
> > 
> > pg_waldump: FATAL:  error in WAL record at 2/BBE0E538: invalid record length at 2/BBE0E5A8: wanted 24, got 0
> > 
> > But that seems normal. I get that error on my other database clusters, too.
> > 
> > I managed to extract the failing logical decoding data from the publisher, if
> > that helps:
> > 
> > 
> > SELECT * FROM pg_logical_slot_peek_binary_changes('l5_hometv', NULL, NULL, 'proto_version', '1',
'publication_names','pub');
 
> > 
> > 2/BBD86EA0 | 56395 | \x4200000002bbd880b800023acd790ce5510000dc4b
> > 2/BBD87E90 | 56395 |
\x5200004a687075626c696300766964656f73006400080169640000000017ffffffff007469746c650000000019ffffffff00636f7665725f696d6167650000000011ffffffff006d657461646174610000000edafffffff
> >
f0063617465676f72790000000017ffffffff007075626c6973686564000000043affffffff006164646564000000045affffffff00706c617965640000000010ffffffff
> > 2/BBD87E90 | 56395 |
\x5500004a684e0008740000000438333933740000005650617a6465726b613a204f206dc3a964696120736520706f7665646520626f6a2e204b64796279206ec3a17320706f6c6974696369206d696c6f76616c692c20627
> >
96c6f206279206ec49b636f20c5a17061746ec49b7574000001397b226964223a20226430313064343430303965323131656162323539616331663662323230656538222c202264617465223a2022323031392d31312d3138222c20226e616d65223a20
> >
2250617a6465726b613a204f206dc3a964696120736520706f7665646520626f6a2e204b64796279206ec3a17320706f6c6974696369206d696c6f76616c692c2062796c6f206279206ec49b636f20c5a17061746ec49b222c2022696d616765223a202
> >
268747470733a2f2f63646e2e7873642e637a2f726573697a652f63353535656239633131353333313632386164666539396237343534353731655f657874726163743d302c302c313931392c313038305f726573697a653d3732302c3430355f2e6a70
> >
673f686173683d6362316362623836336230353361613561333761346666616439303865303431227d7400000003323432740000000a323031392d31312d3138740000001a323031392d31312d31382031323a35303a30312e383136333736740000000
> > 174
> > 2/BBD880E8 | 56395 | \x430000000002bbd880b800000002bbd880e800023acd790ce551
> > 
> 
> Can you show the whole transaction? From the WAL dump it seems it just
> did a single UPDATE:

I'm not sure what you mean. The above is the whole output I got from that
SELECT.

> rmgr: Heap        len (rec/tot):     59/  4075, tx:      56395, lsn: 2/BBD86EA0, prev 2/BBD86E68, desc: LOCK off 2:
xid56395: flags 0 LOCK_ONLY EXCL_LOCK
 
>     blkref #0: rel 1663/19030/19048 fork main blk 415 (FPW); hole: offset: 56, length: 4176
> rmgr: Heap        len (rec/tot):    523/   523, tx:      56395, lsn: 2/BBD87E90, prev 2/BBD86EA0, desc: HOT_UPDATE
off2 xmax 56395 ; new off 9 xmax 0
 
>     blkref #0: rel 1663/19030/19048 fork main blk 415
> rmgr: Transaction len (rec/tot):     46/    46, tx:      56395, lsn: 2/BBD880B8, prev 2/BBD87E90, desc: COMMIT
2019-11-2023:28:47.016273 CET
 
> 
> Which fields does it update?

The only update that my system does in this database is:

UPDATE videos SET played = TRUE WHERE id = ?

The rest is just inserts:

INSERT INTO categories (name, metadata, cover_image, provider) VALUES (?, ?, ?, (SELECT id FROM providers WHERE
system_name= ?))
 
INSERT INTO videos (title, metadata, cover_image, category, published) VALUES (?, ?, ?, ?, ?)

There are no other data modification statements executed.

There may be updates in other databases in the same cluster, but replication
worker is segfaulting only on the "hometv" database.

regards,
    o.

> regards
> 
> -- 
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



В списке pgsql-bugs по дате отправления:

Предыдущее
От: Ondřej Jirman
Дата:
Сообщение: Re: BUG #16129: Segfault in tts_virtual_materialize in logicalreplication worker
Следующее
От: Tomas Vondra
Дата:
Сообщение: Re: BUG #16129: Segfault in tts_virtual_materialize in logicalreplication worker