Discussion: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

From: PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      16129
Logged by:          Ondrej Jirman
Email address:      ienieghapheoghaiwida@xff.cz
PostgreSQL version: 12.1
Operating system:   Arch Linux
Description:

Hello,

I've upgraded my main PostgreSQL cluster from 11.5 to 12.1 via the pg_dumpall
method, and after a while I started getting segfaults in a logical replication
worker.

My setup is fairly vanilla; the non-default options are:

shared_buffers = 256MB
work_mem = 512MB
temp_buffers = 64MB
maintenance_work_mem = 4GB
effective_cache_size = 16GB
max_logical_replication_workers = 30
max_replication_slots = 30
max_worker_processes = 30
wal_level = logical

I have several databases that I subscribe to from this database cluster
using logical replication. 

Replication of one of my databases (running on ARMv7 machine) started
segfaulting on the subscriber side (x86_64) like this:

#0  0x00007fc259739917 in __memmove_sse2_unaligned_erms () from
/usr/lib/libc.so.6
#1  0x000055d033e93d44 in memcpy (__len=620701425, __src=<optimized out>,
__dest=0x55d0356da804) at /usr/include/bits/string_fortified.h:34
#2  tts_virtual_materialize (slot=0x55d0356da3b8) at execTuples.c:235
#3  0x000055d033e94d32 in ExecFetchSlotHeapTuple
(slot=slot@entry=0x55d0356da3b8, materialize=materialize@entry=true,
shouldFree=shouldFree@entry=0x7fff0e7cf387) at execTuples.c:1624
#4  0x000055d033d272a1 in heapam_tuple_update (relation=0x7fc24573a228,
otid=0x55d0356dac38, slot=0x55d0356da3b8, cid=0, snapshot=<optimized out>,
crosscheck=0x0, wait=true, tmfd=0x7fff0e7cf400, 
    lockmode=0x7fff0e7cf3fc, update_indexes=0x7fff0e7cf457) at
heapam_handler.c:325
#5  0x000055d033d537cd in table_tuple_update (update_indexes=0x7fff0e7cf457,
lockmode=0x7fff0e7cf3fc, tmfd=0x7fff0e7cf400, wait=true, crosscheck=0x0,
snapshot=0x0, cid=<optimized out>, 
    slot=0x55d0356da3b8, otid=0x55d0356dac38, rel=0x7fc24573a228) at
../../../../src/include/access/tableam.h:1261
#6  simple_table_tuple_update (rel=rel@entry=0x7fc24573a228,
otid=otid@entry=0x55d0356dac38, slot=slot@entry=0x55d0356da3b8,
snapshot=0x0, update_indexes=update_indexes@entry=0x7fff0e7cf457)
    at tableam.c:325
#7  0x000055d033e914c4 in ExecSimpleRelationUpdate
(estate=estate@entry=0x55d03572cff0, epqstate=epqstate@entry=0x7fff0e7cf4b0,
searchslot=searchslot@entry=0x55d0356dac08, 
    slot=slot@entry=0x55d0356da3b8) at execReplication.c:493
#8  0x000055d033f9158e in apply_handle_update (s=s@entry=0x7fff0e7d6a80) at
worker.c:750
#9  0x000055d033f91ab0 in apply_dispatch (s=s@entry=0x7fff0e7d6a80) at
worker.c:968
#10 0x000055d033f9206e in LogicalRepApplyLoop (last_received=11741462160) at
worker.c:1176
#11 0x000055d033f927c4 in ApplyWorkerMain (main_arg=<optimized out>) at
worker.c:1734
#12 0x000055d033f601d5 in StartBackgroundWorker () at bgworker.c:834
#13 0x000055d033f6e818 in do_start_bgworker (rw=rw@entry=0x55d035620780) at
postmaster.c:5770
#14 0x000055d033f6e9a4 in maybe_start_bgworkers () at postmaster.c:5996
#15 0x000055d033f6f216 in sigusr1_handler (postgres_signal_arg=<optimized
out>) at postmaster.c:5167
#16 <signal handler called>
#17 0x00007fc259787e2a in select () from /usr/lib/libc.so.6
#18 0x000055d033f6fb37 in ServerLoop () at postmaster.c:1668
#19 0x000055d033f70c5a in PostmasterMain (argc=3, argv=<optimized out>) at
postmaster.c:1377
#20 0x000055d033ed9731 in main (argc=3, argv=0x55d0355f3260) at main.c:228


Subscriber:

2019-11-20 23:22:50.315 CET [1480050] LOG:  data stream from publisher has
ended
2019-11-20 23:22:50.315 CET [1480050] ERROR:  could not send
end-of-streaming message to primary: no COPY in progress
2019-11-20 23:22:50.318 CET [178527] LOG:  background worker "logical
replication worker" (PID 1480050) exited with exit code 1
2019-11-20 23:22:50.320 CET [1480051] LOG:  data stream from publisher has
ended
2019-11-20 23:22:50.320 CET [1480051] ERROR:  could not send
end-of-streaming message to primary: no COPY in progress
2019-11-20 23:22:50.321 CET [178527] LOG:  background worker "logical
replication worker" (PID 1480051) exited with exit code 1
2019-11-20 23:22:50.321 CET [2195905] LOG:  logical replication apply worker
for subscription "l5_hometv" has started
2019-11-20 23:22:50.325 CET [1480052] LOG:  data stream from publisher has
ended
2019-11-20 23:22:50.325 CET [1480052] ERROR:  could not send
end-of-streaming message to primary: no COPY in progress
2019-11-20 23:22:50.326 CET [178527] LOG:  background worker "logical
replication worker" (PID 1480052) exited with exit code 1
2019-11-20 23:22:50.395 CET [2195905] ERROR:  could not connect to the
publisher: FATAL:  the database system is shutting down
        FATAL:  the database system is shutting down
2019-11-20 23:22:50.396 CET [178527] LOG:  background worker "logical
replication worker" (PID 2195905) exited with exit code 1
2019-11-20 23:22:55.401 CET [2195919] LOG:  logical replication apply worker
for subscription "l5_hometv" has started
2019-11-20 23:22:55.411 CET [2195920] LOG:  logical replication apply worker
for subscription "l5_ledger" has started
2019-11-20 23:22:55.421 CET [2195921] LOG:  logical replication apply worker
for subscription "l5_news" has started
2019-11-20 23:28:52.488 CET [178527] LOG:  background worker "logical
replication worker" (PID 2195919) was terminated by signal 11: Neoprávněný
přístup do pam>
2019-11-20 23:28:52.488 CET [178527] LOG:  terminating any other active
server processes
2019-11-20 23:28:52.488 CET [2195470] WARNING:  terminating connection
because of crash of another server process
2019-11-20 23:28:52.488 CET [2195470] DETAIL:  The postmaster has commanded
this server process to roll back the current transaction and exit, because
another >
2019-11-20 23:28:52.488 CET [2195470] HINT:  In a moment you should be able
to reconnect to the database and repeat your command.

Subscriber and publisher logs are included above and below. I think the initial
segfault on the subscriber was triggered by a restart of the publisher machine.
From that point forward, the subscriber segfaults on each connection to the
publisher. It may or may not be related to the upgrade, but the code in
execTuples.c was changed quite a bit in 12.x, so it may be.

Publisher:

lis 20 23:29:02 l5 postgres[355]: 2019-11-20 23:29:02.983 CET [466] LOG:
unexpected EOF on standby connection
lis 20 23:29:02 l5 postgres[355]: 2019-11-20 23:29:02.983 CET [467] LOG:
unexpected EOF on standby connection
lis 20 23:29:02 l5 postgres[355]: 2019-11-20 23:29:02.982 CET [466] LOG:
could not receive data from client: Connection reset by peer
lis 20 23:29:02 l5 postgres[355]: 2019-11-20 23:29:02.982 CET [467] LOG:
could not receive data from client: Connection reset by peer
lis 20 23:29:02 l5 postgres[355]: 2019-11-20 23:29:02.980 CET [465] LOG:
unexpected EOF on standby connection
lis 20 23:29:02 l5 postgres[355]: 2019-11-20 23:29:02.980 CET [465] LOG:
could not receive data from client: Connection reset by peer
lis 20 23:28:54 l5 postgres[355]: 2019-11-20 23:28:54.588 CET [467] DETAIL:
There are no running transactions.
lis 20 23:28:54 l5 postgres[355]: 2019-11-20 23:28:54.588 CET [467] LOG:
logical decoding found consistent point at 2/BBD880E8
lis 20 23:28:54 l5 postgres[355]: 2019-11-20 23:28:54.587 CET [467] DETAIL:
Streaming transactions committing after 2/BBD88120, reading WAL from
2/BBD880E8.
lis 20 23:28:54 l5 postgres[355]: 2019-11-20 23:28:54.587 CET [467] LOG:
starting logical decoding for slot "l5_news"
lis 20 23:28:54 l5 postgres[355]: 2019-11-20 23:28:54.579 CET [466] DETAIL:
There are no running transactions.
lis 20 23:28:54 l5 postgres[355]: 2019-11-20 23:28:54.579 CET [466] LOG:
logical decoding found consistent point at 2/BBD880E8
lis 20 23:28:54 l5 postgres[355]: 2019-11-20 23:28:54.578 CET [466] DETAIL:
Streaming transactions committing after 2/BBD88120, reading WAL from
2/BBD880E8.
lis 20 23:28:54 l5 postgres[355]: 2019-11-20 23:28:54.578 CET [466] LOG:
starting logical decoding for slot "l5_ledger"
lis 20 23:28:54 l5 postgres[355]: 2019-11-20 23:28:54.572 CET [465] DETAIL:
There are no running transactions.
lis 20 23:28:54 l5 postgres[355]: 2019-11-20 23:28:54.572 CET [465] LOG:
logical decoding found consistent point at 2/BBD86E68
lis 20 23:28:54 l5 postgres[355]: 2019-11-20 23:28:54.570 CET [465] DETAIL:
Streaming transactions committing after 2/BBD86EA0, reading WAL from
2/BBD86E68.
lis 20 23:28:54 l5 postgres[355]: 2019-11-20 23:28:54.570 CET [465] LOG:
starting logical decoding for slot "l5_hometv"
lis 20 23:28:52 l5 postgres[355]: 2019-11-20 23:28:52.495 CET [385] LOG:
unexpected EOF on standby connection
lis 20 23:28:52 l5 postgres[355]: 2019-11-20 23:28:52.495 CET [385] LOG:
could not receive data from client: Connection reset by peer
lis 20 23:28:52 l5 postgres[355]: 2019-11-20 23:28:52.491 CET [384] LOG:
unexpected EOF on standby connection
lis 20 23:28:52 l5 postgres[355]: 2019-11-20 23:28:52.491 CET [384] LOG:
could not receive data from client: Connection reset by peer
lis 20 23:28:52 l5 postgres[355]: 2019-11-20 23:28:52.489 CET [386] LOG:
unexpected EOF on standby connection
lis 20 23:28:52 l5 postgres[355]: 2019-11-20 23:28:52.489 CET [386] LOG:
could not receive data from client: Connection reset by peer


lis 20 23:22:58 l5 postgres[355]: 2019-11-20 23:22:58.107 CET [386] DETAIL:
There are no running transactions.
lis 20 23:22:58 l5 postgres[355]: 2019-11-20 23:22:58.107 CET [386] LOG:
logical decoding found consistent point at 2/BBD86CE0
lis 20 23:22:58 l5 postgres[355]: 2019-11-20 23:22:58.107 CET [384] DETAIL:
There are no running transactions.
lis 20 23:22:58 l5 postgres[355]: 2019-11-20 23:22:58.107 CET [384] LOG:
logical decoding found consistent point at 2/BBD86CE0
lis 20 23:22:58 l5 postgres[355]: 2019-11-20 23:22:58.107 CET [385] DETAIL:
There are no running transactions.
lis 20 23:22:58 l5 postgres[355]: 2019-11-20 23:22:58.107 CET [385] LOG:
logical decoding found consistent point at 2/BBD86CE0
lis 20 23:22:58 l5 postgres[355]: 2019-11-20 23:22:58.106 CET [386] DETAIL:
Streaming transactions committing after 2/BBD86CE0, reading WAL from
2/BBD86CE0.
lis 20 23:22:58 l5 postgres[355]: 2019-11-20 23:22:58.106 CET [386] LOG:
starting logical decoding for slot "l5_hometv"
lis 20 23:22:58 l5 postgres[355]: 2019-11-20 23:22:58.106 CET [384] DETAIL:
Streaming transactions committing after 2/BBD86CE0, reading WAL from
2/BBD86CE0.
lis 20 23:22:58 l5 postgres[355]: 2019-11-20 23:22:58.106 CET [384] LOG:
starting logical decoding for slot "l5_ledger"
lis 20 23:22:58 l5 postgres[355]: 2019-11-20 23:22:58.106 CET [385] DETAIL:
Streaming transactions committing after 2/BBD86CE0, reading WAL from
2/BBD86CE0.
lis 20 23:22:58 l5 postgres[355]: 2019-11-20 23:22:58.106 CET [385] LOG:
starting logical decoding for slot "l5_news"
lis 20 23:22:54 l5 systemd[1]: Started PostgreSQL database server.
lis 20 23:22:54 l5 postgres[355]: 2019-11-20 23:22:54.295 CET [355] LOG:
database system is ready to accept connections
lis 20 23:22:54 l5 postgres[355]: 2019-11-20 23:22:54.146 CET [356] LOG:
database system was shut down at 2019-11-20 23:22:50 CET
lis 20 23:22:54 l5 systemd-networkd[326]: eth0: Gained IPv6LL
lis 20 23:22:53 l5 postgres[355]: 2019-11-20 23:22:53.544 CET [355] LOG:
listening on Unix socket "/run/postgresql/.s.PGSQL.5432"
lis 20 23:22:53 l5 postgres[355]: 2019-11-20 23:22:53.539 CET [355] LOG:
listening on IPv6 address "::", port 5432
lis 20 23:22:53 l5 postgres[355]: 2019-11-20 23:22:53.539 CET [355] LOG:
listening on IPv4 address "0.0.0.0", port 5432


thank you and regards,
   Ondrej


Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

From: Tomas Vondra
Date:
On Thu, Nov 21, 2019 at 01:14:18AM +0000, PG Bug reporting form wrote:
>The following bug has been logged on the website:
>
>Bug reference:      16129
>Logged by:          Ondrej Jirman
>Email address:      ienieghapheoghaiwida@xff.cz
>PostgreSQL version: 12.1
>Operating system:   Arch Linux
>Description:
>
>Hello,
>
>I've upgraded my main PostgreSQL cluster from 11.5 to 12.1 via pg_dumpall
>method and after a while I started getting segfault in logical replication
>worker.
>
>My setup is fairly vanilla, non-default options:
>
>shared_buffers = 256MB
>work_mem = 512MB
>temp_buffers = 64MB
>maintenance_work_mem = 4GB
>effective_cache_size = 16GB
>max_logical_replication_workers = 30
>max_replication_slots = 30
>max_worker_processes = 30
>wal_level = logical
>
>I have several databases that I subscribe to from this database cluster
>using logical replication.
>
>Replication of one of my databases (running on ARMv7 machine) started
>segfaulting on the subscriber side (x86_64) like this:
>
>#0  0x00007fc259739917 in __memmove_sse2_unaligned_erms () from
>/usr/lib/libc.so.6
>#1  0x000055d033e93d44 in memcpy (__len=620701425, __src=<optimized out>,
>__dest=0x55d0356da804) at /usr/include/bits/string_fortified.h:34
>#2  tts_virtual_materialize (slot=0x55d0356da3b8) at execTuples.c:235
>#3  0x000055d033e94d32 in ExecFetchSlotHeapTuple
>(slot=slot@entry=0x55d0356da3b8, materialize=materialize@entry=true,
>shouldFree=shouldFree@entry=0x7fff0e7cf387) at execTuples.c:1624

Hmmm, so it's failing on this memcpy() in tts_virtual_materialize:

     else
     {
         Size        data_length = 0;

         data = (char *) att_align_nominal(data, att->attalign);
         data_length = att_addlength_datum(data_length, att->attlen, val);

         memcpy(data, DatumGetPointer(val), data_length);

         slot->tts_values[natt] = PointerGetDatum(data);
         data += data_length;
     }
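
For a varlena column such as bytea (attlen == -1), att_addlength_datum() takes
data_length from the header of the datum itself, so a garbage val pointer also
yields a garbage copy length. A rough standalone sketch of that size
computation (my simplification, little-endian only, external TOAST pointers
left out; the helper name varlena_size_any is just mine):

    #include <stddef.h>
    #include <stdint.h>

    /*
     * Sketch of the length computation behind att_addlength_datum() /
     * VARSIZE_ANY() for a varlena attribute (attlen == -1): the length is
     * read from the first bytes of the datum, so a bogus pointer also
     * produces a bogus data_length for the memcpy() above.
     */
    static size_t
    varlena_size_any(const unsigned char *p)
    {
        if (p[0] & 0x01)                  /* 1-byte header (short varlena) */
            return (p[0] >> 1) & 0x7F;

        /* 4-byte header: total length lives in the upper 30 bits */
        uint32_t header = (uint32_t) p[0] | ((uint32_t) p[1] << 8) |
                          ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
        return (header >> 2) & 0x3FFFFFFF;
    }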

The question is, which of the pointers is bogus. You seem to already
have a core file, so can you inspect the variables in frame #2? I think
especially

  p *slot
  p natt
  p val
  p *att

would be interesting to see.

Also, what does the replicated schema look like? Can we see the table
definitions?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

From: Ondřej Jirman
Date:
Hello,

On Thu, Nov 21, 2019 at 11:39:40AM +0100, Tomas Vondra wrote:
> On Thu, Nov 21, 2019 at 01:14:18AM +0000, PG Bug reporting form wrote:
> > The following bug has been logged on the website:
> > 
> > Bug reference:      16129
> > Logged by:          Ondrej Jirman
> > Email address:      ienieghapheoghaiwida@xff.cz
> > PostgreSQL version: 12.1
> > Operating system:   Arch Linux
> > Description:
> > 
> > Hello,
> > 
> > I've upgraded my main PostgreSQL cluster from 11.5 to 12.1 via pg_dumpall
> > method and after a while I started getting segfault in logical replication
> > worker.
> > 
> > My setup is fairly vanilla, non-default options:
> > 
> > shared_buffers = 256MB
> > work_mem = 512MB
> > temp_buffers = 64MB
> > maintenance_work_mem = 4GB
> > effective_cache_size = 16GB
> > max_logical_replication_workers = 30
> > max_replication_slots = 30
> > max_worker_processes = 30
> > wal_level = logical
> > 
> > I have several databases that I subscribe to from this database cluster
> > using logical replication.
> > 
> > Replication of one of my databases (running on ARMv7 machine) started
> > segfaulting on the subscriber side (x86_64) like this:
> > 
> > #0  0x00007fc259739917 in __memmove_sse2_unaligned_erms () from
> > /usr/lib/libc.so.6
> > #1  0x000055d033e93d44 in memcpy (__len=620701425, __src=<optimized out>,
> > __dest=0x55d0356da804) at /usr/include/bits/string_fortified.h:34
> > #2  tts_virtual_materialize (slot=0x55d0356da3b8) at execTuples.c:235
> > #3  0x000055d033e94d32 in ExecFetchSlotHeapTuple
> > (slot=slot@entry=0x55d0356da3b8, materialize=materialize@entry=true,
> > shouldFree=shouldFree@entry=0x7fff0e7cf387) at execTuples.c:1624

I forgot to add that publisher is still PostgreSQL 11.5.

> Hmmm, so it's failing on this memcpy() in tts_virtual_materialize:
> 
>     else
>     {
>         Size        data_length = 0;
> 
>         data = (char *) att_align_nominal(data, att->attalign);
>         data_length = att_addlength_datum(data_length, att->attlen, val);
> 
>         memcpy(data, DatumGetPointer(val), data_length);

This is a bytea column in one of the tables on the publisher.

So this looks like it's segfaulting while trying to copy too many bytes
(data_length determined as 620,701,425 bytes):

#1  0x000055d033e93d44 in memcpy (__len=620701425, __src=<optimized out>, __dest=0x55d0356da804) at
/usr/include/bits/string_fortified.h:34

But the maximum length of any bytea value in the publisher database is <200 kB.

> 
>         slot->tts_values[natt] = PointerGetDatum(data);
>         data += data_length;
>     }
> 
> The question is, which of the pointers is bogus. You seem to already
> have a core file, so can you inspect the variables in frame #2? I think
> especially
> 
>  p *slot
>  p natt
>  p val
>  p *att
> 
> would be interesting to see.

(gdb) p *slot
$1 = {type = T_TupleTableSlot, tts_flags = 20, tts_nvalid = 8, tts_ops = 0x558149ff4da0 <TTSOpsVirtual>,
  tts_tupleDescriptor = 0x7fcca2ea7548, tts_values = 0x55814adfbc10, tts_isnull = 0x55814adfbc50,
  tts_mcxt = 0x55814adfb6e0, tts_tid = {ip_blkid = {bi_hi = 65535, bi_lo = 65535}, ip_posid = 0}, tts_tableOid = 0}
(gdb) p natt
$2 = 2
(gdb) p val
$3 = <optimized out>
(gdb) p slot->tts_values[nat]
No symbol "nat" in current context.
(gdb) p slot->tts_values[natt]
$4 = 94013795319824
(gdb) p *slot->tts_values[natt]
$5 = -1812161596
(gdb) p *att
$6 = {attrelid = 55240, attname = {data = "cover_image", '\000' <repeats 52 times>}, atttypid = 17, attstattarget = -1,
  attlen = -1, attnum = 3, attndims = 0, attcacheoff = -1, atttypmod = -1, attbyval = false, attstorage = 120 'x',
  attalign = 105 'i', attnotnull = true, atthasdef = false, atthasmissing = false, attidentity = 0 '\000',
  attgenerated = 0 '\000', attisdropped = false, attislocal = true, attinhcount = 0, attcollation = 0}

> Also, what does the replicated schema look like? Can we see the table
> definitions?

SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SELECT pg_catalog.set_config('search_path', '', false);
SET check_function_bodies = false;
SET xmloption = content;
SET client_min_messages = warning;
SET row_security = off;
SET default_tablespace = '';
SET default_table_access_method = heap;
CREATE TABLE public.categories (
    id integer NOT NULL,
    name text NOT NULL,
    description text,
    metadata jsonb NOT NULL,
    provider integer NOT NULL,
    subscribed boolean DEFAULT false NOT NULL,
    cover_image bytea
);
CREATE SEQUENCE public.categories_id_seq
    START WITH 1
    INCREMENT BY 1
    NO MINVALUE
    NO MAXVALUE
    CACHE 1;
ALTER SEQUENCE public.categories_id_seq OWNED BY public.categories.id;
CREATE TABLE public.providers (
    id integer NOT NULL,
    system_name text NOT NULL,
    name text NOT NULL
);
CREATE SEQUENCE public.providers_id_seq
    START WITH 1
    INCREMENT BY 1
    NO MINVALUE
    NO MAXVALUE
    CACHE 1;
ALTER SEQUENCE public.providers_id_seq OWNED BY public.providers.id;
CREATE TABLE public.videos (
    id integer NOT NULL,
    title text NOT NULL,
    cover_image bytea NOT NULL,
    metadata jsonb NOT NULL,
    category integer NOT NULL,
    published date NOT NULL,
    added timestamp without time zone DEFAULT now() NOT NULL,
    played boolean DEFAULT false NOT NULL
);
CREATE SEQUENCE public.videos_id_seq
    START WITH 1
    INCREMENT BY 1
    NO MINVALUE
    NO MAXVALUE
    CACHE 1;
ALTER SEQUENCE public.videos_id_seq OWNED BY public.videos.id;
ALTER TABLE ONLY public.categories ALTER COLUMN id SET DEFAULT nextval('public.categories_id_seq'::regclass);
ALTER TABLE ONLY public.providers ALTER COLUMN id SET DEFAULT nextval('public.providers_id_seq'::regclass);
ALTER TABLE ONLY public.videos ALTER COLUMN id SET DEFAULT nextval('public.videos_id_seq'::regclass);
ALTER TABLE ONLY public.categories
    ADD CONSTRAINT categories_pkey PRIMARY KEY (id);
ALTER TABLE ONLY public.providers
    ADD CONSTRAINT providers_pkey PRIMARY KEY (id);
ALTER TABLE ONLY public.providers
    ADD CONSTRAINT providers_system_name_key UNIQUE (system_name);
ALTER TABLE ONLY public.videos
    ADD CONSTRAINT videos_pkey PRIMARY KEY (id);
ALTER TABLE ONLY public.categories
    ADD CONSTRAINT categories_provider_fkey FOREIGN KEY (provider) REFERENCES public.providers(id);
ALTER TABLE ONLY public.videos
    ADD CONSTRAINT videos_category_fkey FOREIGN KEY (category) REFERENCES public.categories(id);
CREATE SUBSCRIPTION l5_hometv CONNECTION 'host=redacted port=5432 user=redacted password=redacted dbname=hometv'
    PUBLICATION pub WITH (connect = false, slot_name = 'l5_hometv');


Publisher:


SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SELECT pg_catalog.set_config('search_path', '', false);
SET check_function_bodies = false;
SET xmloption = content;
SET client_min_messages = warning;
SET row_security = off;
SET default_tablespace = '';
SET default_with_oids = false;
CREATE TABLE public.categories (
    id integer NOT NULL,
    name text NOT NULL,
    description text,
    metadata jsonb NOT NULL,
    provider integer NOT NULL,
    subscribed boolean DEFAULT false NOT NULL,
    cover_image bytea
);
CREATE SEQUENCE public.categories_id_seq
    START WITH 1
    INCREMENT BY 1
    NO MINVALUE
    NO MAXVALUE
    CACHE 1;
ALTER SEQUENCE public.categories_id_seq OWNED BY public.categories.id;
CREATE TABLE public.providers (
    id integer NOT NULL,
    system_name text NOT NULL,
    name text NOT NULL
);
CREATE SEQUENCE public.providers_id_seq
    START WITH 1
    INCREMENT BY 1
    NO MINVALUE
    NO MAXVALUE
    CACHE 1;
ALTER SEQUENCE public.providers_id_seq OWNED BY public.providers.id;
CREATE TABLE public.videos (
    id integer NOT NULL,
    title text NOT NULL,
    cover_image bytea NOT NULL,
    metadata jsonb NOT NULL,
    category integer NOT NULL,
    published date NOT NULL,
    added timestamp without time zone DEFAULT now() NOT NULL,
    played boolean DEFAULT false NOT NULL
);
CREATE SEQUENCE public.videos_id_seq
    START WITH 1
    INCREMENT BY 1
    NO MINVALUE
    NO MAXVALUE
    CACHE 1;
ALTER SEQUENCE public.videos_id_seq OWNED BY public.videos.id;
ALTER TABLE ONLY public.categories ALTER COLUMN id SET DEFAULT nextval('public.categories_id_seq'::regclass);
ALTER TABLE ONLY public.providers ALTER COLUMN id SET DEFAULT nextval('public.providers_id_seq'::regclass);
ALTER TABLE ONLY public.videos ALTER COLUMN id SET DEFAULT nextval('public.videos_id_seq'::regclass);
ALTER TABLE ONLY public.categories
    ADD CONSTRAINT categories_pkey PRIMARY KEY (id);
ALTER TABLE ONLY public.providers
    ADD CONSTRAINT providers_pkey PRIMARY KEY (id);
ALTER TABLE ONLY public.providers
    ADD CONSTRAINT providers_system_name_key UNIQUE (system_name);
ALTER TABLE ONLY public.videos
    ADD CONSTRAINT videos_pkey PRIMARY KEY (id);
ALTER TABLE ONLY public.categories
    ADD CONSTRAINT categories_provider_fkey FOREIGN KEY (provider) REFERENCES public.providers(id);
ALTER TABLE ONLY public.videos
    ADD CONSTRAINT videos_category_fkey FOREIGN KEY (category) REFERENCES public.categories(id);
CREATE PUBLICATION pub FOR ALL TABLES WITH (publish = 'insert, update, delete, truncate');


thank you and regards,
     Ondrej

> regards
> 
> -- 
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

From: Ondřej Jirman
Date:
On Thu, Nov 21, 2019 at 12:53:26PM +0100, postgresql wrote:
> Hello,
> 
> On Thu, Nov 21, 2019 at 11:39:40AM +0100, Tomas Vondra wrote:
> > On Thu, Nov 21, 2019 at 01:14:18AM +0000, PG Bug reporting form wrote:
> > > Replication of one of my databases (running on ARMv7 machine) started
> > > segfaulting on the subscriber side (x86_64) like this:
> > > 
> > > #0  0x00007fc259739917 in __memmove_sse2_unaligned_erms () from
> > > /usr/lib/libc.so.6
> > > #1  0x000055d033e93d44 in memcpy (__len=620701425, __src=<optimized out>,
> > > __dest=0x55d0356da804) at /usr/include/bits/string_fortified.h:34
> > > #2  tts_virtual_materialize (slot=0x55d0356da3b8) at execTuples.c:235
> > > #3  0x000055d033e94d32 in ExecFetchSlotHeapTuple
> > > (slot=slot@entry=0x55d0356da3b8, materialize=materialize@entry=true,
> > > shouldFree=shouldFree@entry=0x7fff0e7cf387) at execTuples.c:1624
> 
> I forgot to add that publisher is still PostgreSQL 11.5.
> 

I can also add that I have data checksumming enabled on both ends, and 
it did not detect any corruption:

# pg_verify_checksums -D /var/lib/postgres/data
Checksum scan completed
Data checksum version: 1
Files scanned:  1751
Blocks scanned: 86592
Bad checksums:  0

# pg_checksums /var/lib/postgres/data
Checksum operation completed
Files scanned:  22777
Blocks scanned: 3601527
Bad checksums:  0
Data checksum version: 1

WAL log on the publisher is also dumpable to a state hours after the issues
started:

I've put the dump here, if it's of any use: https://megous.com/dl/tmp/wal_dump.txt

Dump ends with:

pg_waldump: FATAL:  error in WAL record at 2/BBE0E538: invalid record length at 2/BBE0E5A8: wanted 24, got 0

But that seems normal. I get that error on my other database clusters, too.

I managed to extract the failing logical decoding data from the publisher, if
that helps:


SELECT * FROM pg_logical_slot_peek_binary_changes('l5_hometv', NULL, NULL, 'proto_version', '1', 'publication_names',
'pub');

                                                                              
 
 2/BBD86EA0 | 56395 | \x4200000002bbd880b800023acd790ce5510000dc4b
 2/BBD87E90 | 56395 |
\x5200004a687075626c696300766964656f73006400080169640000000017ffffffff007469746c650000000019ffffffff00636f7665725f696d6167650000000011ffffffff006d657461646174610000000edafffffff

f0063617465676f72790000000017ffffffff007075626c6973686564000000043affffffff006164646564000000045affffffff00706c617965640000000010ffffffff
 2/BBD87E90 | 56395 |
\x5500004a684e0008740000000438333933740000005650617a6465726b613a204f206dc3a964696120736520706f7665646520626f6a2e204b64796279206ec3a17320706f6c6974696369206d696c6f76616c692c20627

96c6f206279206ec49b636f20c5a17061746ec49b7574000001397b226964223a20226430313064343430303965323131656162323539616331663662323230656538222c202264617465223a2022323031392d31312d3138222c20226e616d65223a20

2250617a6465726b613a204f206dc3a964696120736520706f7665646520626f6a2e204b64796279206ec3a17320706f6c6974696369206d696c6f76616c692c2062796c6f206279206ec49b636f20c5a17061746ec49b222c2022696d616765223a202

268747470733a2f2f63646e2e7873642e637a2f726573697a652f63353535656239633131353333313632386164666539396237343534353731655f657874726163743d302c302c313931392c313038305f726573697a653d3732302c3430355f2e6a70

673f686173683d6362316362623836336230353361613561333761346666616439303865303431227d7400000003323432740000000a323031392d31312d3138740000001a323031392d31312d31382031323a35303a30312e383136333736740000000
174
 2/BBD880E8 | 56395 | \x430000000002bbd880b800000002bbd880e800023acd790ce551


thank you and regards,
    o.



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

From: Tomas Vondra
Date:
On Thu, Nov 21, 2019 at 12:53:26PM +0100, Ondřej Jirman wrote:
>Hello,
>
>On Thu, Nov 21, 2019 at 11:39:40AM +0100, Tomas Vondra wrote:
>> On Thu, Nov 21, 2019 at 01:14:18AM +0000, PG Bug reporting form wrote:
>> > The following bug has been logged on the website:
>> >
>> > Bug reference:      16129
>> > Logged by:          Ondrej Jirman
>> > Email address:      ienieghapheoghaiwida@xff.cz
>> > PostgreSQL version: 12.1
>> > Operating system:   Arch Linux
>> > Description:
>> >
>> > Hello,
>> >
>> > I've upgraded my main PostgreSQL cluster from 11.5 to 12.1 via pg_dumpall
>> > method and after a while I started getting segfault in logical replication
>> > worker.
>> >
>> > My setup is fairly vanilla, non-default options:
>> >
>> > shared_buffers = 256MB
>> > work_mem = 512MB
>> > temp_buffers = 64MB
>> > maintenance_work_mem = 4GB
>> > effective_cache_size = 16GB
>> > max_logical_replication_workers = 30
>> > max_replication_slots = 30
>> > max_worker_processes = 30
>> > wal_level = logical
>> >
>> > I have several databases that I subscribe to from this database cluster
>> > using logical replication.
>> >
>> > Replication of one of my databases (running on ARMv7 machine) started
>> > segfaulting on the subscriber side (x86_64) like this:
>> >
>> > #0  0x00007fc259739917 in __memmove_sse2_unaligned_erms () from
>> > /usr/lib/libc.so.6
>> > #1  0x000055d033e93d44 in memcpy (__len=620701425, __src=<optimized out>,
>> > __dest=0x55d0356da804) at /usr/include/bits/string_fortified.h:34
>> > #2  tts_virtual_materialize (slot=0x55d0356da3b8) at execTuples.c:235
>> > #3  0x000055d033e94d32 in ExecFetchSlotHeapTuple
>> > (slot=slot@entry=0x55d0356da3b8, materialize=materialize@entry=true,
>> > shouldFree=shouldFree@entry=0x7fff0e7cf387) at execTuples.c:1624
>
>I forgot to add that publisher is still PostgreSQL 11.5.
>
>> Hmmm, so it's failing on this memcpy() in tts_virtual_materialize:
>>
>>     else
>>     {
>>         Size        data_length = 0;
>>
>>         data = (char *) att_align_nominal(data, att->attalign);
>>         data_length = att_addlength_datum(data_length, att->attlen, val);
>>
>>         memcpy(data, DatumGetPointer(val), data_length);
>
>This is a bytea column in one of the tables on the publisher.
>
>So this looks like it's segfaulting while trying to copy too many bytes
>(data_length determined as 620,701,425 bytes):
>
>#1  0x000055d033e93d44 in memcpy (__len=620701425, __src=<optimized out>, __dest=0x55d0356da804) at
/usr/include/bits/string_fortified.h:34
>
>But the maximum length of any bytea value in the publisher database is <200 kB.
>

OK. So the public.videos.cover_image column, I suppose?

Anyway, this likely means the pointer is bogus in some way. Maybe it's
pointing to already freed memory, or something like that.

>>
>>         slot->tts_values[natt] = PointerGetDatum(data);
>>         data += data_length;
>>     }
>>
>> The question is, which of the pointers is bogus. You seem to already
>> have a core file, so can you inspect the variables in frame #2? I think
>> especially
>>
>>  p *slot
>>  p natt
>>  p val
>>  p *att
>>
>> would be interesting to see.
>
>(gdb) p *slot
>$1 = {type = T_TupleTableSlot, tts_flags = 20, tts_nvalid = 8, tts_ops = 0x558149ff4da0 <TTSOpsVirtual>,
>  tts_tupleDescriptor = 0x7fcca2ea7548, tts_values = 0x55814adfbc10, tts_isnull = 0x55814adfbc50,
>  tts_mcxt = 0x55814adfb6e0, tts_tid = {ip_blkid = {bi_hi = 65535, bi_lo = 65535}, ip_posid = 0}, tts_tableOid = 0}
>(gdb) p natt
>$2 = 2
>(gdb) p val
>$3 = <optimized out>
>(gdb) p slot->tts_values[nat]
>No symbol "nat" in current context.
>(gdb) p slot->tts_values[natt]
>$4 = 94013795319824
>(gdb) p *slot->tts_values[natt]
>$5 = -1812161596
>(gdb) p *att
>$6 = {attrelid = 55240, attname = {data = "cover_image", '\000' <repeats 52 times>}, atttypid = 17, attstattarget = -1,
>  attlen = -1, attnum = 3, attndims = 0, attcacheoff = -1, atttypmod = -1, attbyval = false, attstorage = 120 'x',
>  attalign = 105 'i', attnotnull = true, atthasdef = false, atthasmissing = false, attidentity = 0 '\000',
>  attgenerated = 0 '\000', attisdropped = false, attislocal = true, attinhcount = 0, attcollation = 0}
>

OK, nothing obviously bogus here, I think. It's not entirely clear to me
why this works

(gdb) p *slot->tts_values[natt]
$5 = -1812161596

because tts_values[natt] is not a pointer, I think. So I'm not sure how
to interpret the dereference. It would be interesting to see the first
couple of bytes? I think something like

(gdb) p ((char*)slot->tts_values[natt])[$n]

for $n between 0 and 3 should do the trick.

>> Also, what does the replicated schema look like? Can we see the table
>> definitions?
>
> ...
>CREATE TABLE public.videos (
>    id integer NOT NULL,
>    title text NOT NULL,
>    cover_image bytea NOT NULL,
>    metadata jsonb NOT NULL,
>    category integer NOT NULL,
>    published date NOT NULL,
>    added timestamp without time zone DEFAULT now() NOT NULL,
>    played boolean DEFAULT false NOT NULL
>);

OK, so this is the problematic table. Any idea what operation triggers
the issue? Having a small reproducer (say, a single-row insert or so)
would be very helpful.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

From: Ondřej Jirman
Date:
On Thu, Nov 21, 2019 at 01:59:30PM +0100, Tomas Vondra wrote:
> On Thu, Nov 21, 2019 at 12:53:26PM +0100, Ondřej Jirman wrote:
> > Hello,
> > 
> > On Thu, Nov 21, 2019 at 11:39:40AM +0100, Tomas Vondra wrote:
> > > On Thu, Nov 21, 2019 at 01:14:18AM +0000, PG Bug reporting form wrote:
> > > > The following bug has been logged on the website:
> > > >
> > > > Bug reference:      16129
> > > > Logged by:          Ondrej Jirman
> > > > Email address:      ienieghapheoghaiwida@xff.cz
> > > > PostgreSQL version: 12.1
> > > > Operating system:   Arch Linux
> > > > Description:
> > > >
> > > > Hello,
> > > >
> > > > I've upgraded my main PostgreSQL cluster from 11.5 to 12.1 via pg_dumpall
> > > > method and after a while I started getting segfault in logical replication
> > > > worker.
> > > >
> > > > My setup is fairly vanilla, non-default options:
> > > >
> > > > shared_buffers = 256MB
> > > > work_mem = 512MB
> > > > temp_buffers = 64MB
> > > > maintenance_work_mem = 4GB
> > > > effective_cache_size = 16GB
> > > > max_logical_replication_workers = 30
> > > > max_replication_slots = 30
> > > > max_worker_processes = 30
> > > > wal_level = logical
> > > >
> > > > I have several databases that I subscribe to from this database cluster
> > > > using logical replication.
> > > >
> > > > Replication of one of my databases (running on ARMv7 machine) started
> > > > segfaulting on the subscriber side (x86_64) like this:
> > > >
> > > > #0  0x00007fc259739917 in __memmove_sse2_unaligned_erms () from
> > > > /usr/lib/libc.so.6
> > > > #1  0x000055d033e93d44 in memcpy (__len=620701425, __src=<optimized out>,
> > > > __dest=0x55d0356da804) at /usr/include/bits/string_fortified.h:34
> > > > #2  tts_virtual_materialize (slot=0x55d0356da3b8) at execTuples.c:235
> > > > #3  0x000055d033e94d32 in ExecFetchSlotHeapTuple
> > > > (slot=slot@entry=0x55d0356da3b8, materialize=materialize@entry=true,
> > > > shouldFree=shouldFree@entry=0x7fff0e7cf387) at execTuples.c:1624
> > 
> > I forgot to add that publisher is still PostgreSQL 11.5.
> > 
> > > Hmmm, so it's failing on this memcpy() in tts_virtual_materialize:
> > > 
> > >     else
> > >     {
> > >         Size        data_length = 0;
> > > 
> > >         data = (char *) att_align_nominal(data, att->attalign);
> > >         data_length = att_addlength_datum(data_length, att->attlen, val);
> > > 
> > >         memcpy(data, DatumGetPointer(val), data_length);
> > 
> > This is a bytea column in one of the tables on the publisher.
> > 
> > So this looks like it's segfaulting while trying to copy too many bytes
> > (data_length determined as 620,701,425 bytes):
> > 
> > #1  0x000055d033e93d44 in memcpy (__len=620701425, __src=<optimized out>, __dest=0x55d0356da804) at
/usr/include/bits/string_fortified.h:34
> > 
> > But the maximum length of any bytea value in the publisher database is <200 kB.
> > 
> 
> OK. So the public.videos.cover_image column, I suppose?
> 
> Anyway, this likely means the pointer is bogus in some way. Maybe it's
> pointing to already freed memory, or something like that.
> 
> > > 
> > >         slot->tts_values[natt] = PointerGetDatum(data);
> > >         data += data_length;
> > >     }
> > > 
> > > The question is, which of the pointers is bogus. You seem to already
> > > have a core file, so can you inspect the variables in frame #2? I think
> > > especially
> > > 
> > >  p *slot
> > >  p natt
> > >  p val
> > >  p *att
> > > 
> > > would be interesting to see.
> > 
> > (gdb) p *slot
> > $1 = {type = T_TupleTableSlot, tts_flags = 20, tts_nvalid = 8, tts_ops = 0x558149ff4da0 <TTSOpsVirtual>,
tts_tupleDescriptor= 0x7fcca2ea7548, tts_values = 0x55814adfbc10,
 
> >  tts_isnull = 0x55814adfbc50, tts_mcxt = 0x55814adfb6e0, tts_tid = {ip_blkid = {bi_hi = 65535, bi_lo = 65535},
ip_posid= 0}, tts_tableOid = 0}
 
> > (gdb) p natt
> > $2 = 2
> > (gdb) p val
> > $3 = <optimized out>
> > (gdb) p slot->tts_values[nat]
> > No symbol "nat" in current context.
> > (gdb) p slot->tts_values[natt]
> > $4 = 94013795319824
> > (gdb) p *slot->tts_values[natt]
> > $5 = -1812161596
> > (gdb) p *att
> > $6 = {attrelid = 55240, attname = {data = "cover_image", '\000' <repeats 52 times>}, atttypid = 17, attstattarget =
-1,attlen = -1, attnum = 3, attndims = 0, attcacheoff = -1, atttypmod = -1,
 
> >  attbyval = false, attstorage = 120 'x', attalign = 105 'i', attnotnull = true, atthasdef = false, atthasmissing =
false,attidentity = 0 '\000', attgenerated = 0 '\000', attisdropped = false,
 
> >  attislocal = true, attinhcount = 0, attcollation = 0}
> > 
> 
> OK, nothing obviously bogus here, I think. It's not entirely clear to me
> why this works
> 
> (gdb) p *slot->tts_values[natt]
> $5 = -1812161596
> 
> because tts_values[natt] is not a pointer, I think. So I'm not sure how
> to interpret the dereference. It would be interesting to see the first
> couple of bytes? I think something like

It's probably nothing; the value there just looked big enough to be a pointer, so
I dereferenced it in gdb.

> (gdb) p ((char*)slot->tts_values[natt])[$n]
> 
> for $n between 0 and 3 should do the trick.

(gdb)  p ((char*)slot->tts_values[natt])[0]
$7 = -60 '\304'
(gdb)  p ((char*)slot->tts_values[natt])[1]
$8 = -101 '\233'
(gdb)  p ((char*)slot->tts_values[natt])[2]
$9 = -4 '\374'
(gdb)  p ((char*)slot->tts_values[natt])[3]
$10 = -109 '\223'
(gdb)  p ((char*)slot->tts_values[natt])[4]
$11 = 0 '\000'
(gdb)  p ((char*)slot->tts_values[natt])[5]
$12 = 0 '\000'
(gdb)  p ((char*)slot->tts_values[natt])[6]
$13 = -8 '\370'
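
As a sanity check (just a quick throwaway program of mine, not anything from
the PostgreSQL sources), reading those first four bytes as a little-endian
4-byte varlena header gives exactly the bogus length from frame #1:

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        /* the first four bytes printed by gdb above, in memory order */
        const uint8_t b[4] = {0xc4, 0x9b, 0xfc, 0x93};

        uint32_t header = (uint32_t) b[0] | ((uint32_t) b[1] << 8) |
                          ((uint32_t) b[2] << 16) | ((uint32_t) b[3] << 24);

        /* VARSIZE_4B() on little-endian: length is the upper 30 bits */
        printf("%u\n", (unsigned) ((header >> 2) & 0x3FFFFFFF));  /* 620701425 */
        return 0;
    }

So the memcpy length is read from whatever those four bytes happen to be,
which fits the idea that the datum pointer itself points at garbage.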

> > > Also, what does the replicated schema look like? Can we see the table
> > > definitions?
> > 
> > ...
> > CREATE TABLE public.videos (
> >    id integer NOT NULL,
> >    title text NOT NULL,
> >    cover_image bytea NOT NULL,
> >    metadata jsonb NOT NULL,
> >    category integer NOT NULL,
> >    published date NOT NULL,
> >    added timestamp without time zone DEFAULT now() NOT NULL,
> >    played boolean DEFAULT false NOT NULL
> > );
> 
> OK, so this is the problematic table. Any idea what operation triggers
> the issue? Having a small reproducer (say, a single-row insert or so)
> would be very helpful.

Can something be extracted from the logical decoding data I sent in the other
e-mail? That seems to contain the failing operation, but I don't know how to
decode it off the top of my head from that binary representation.

regards,
    o.

> regards
> 
> -- 
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

From: Tomas Vondra
Date:
On Thu, Nov 21, 2019 at 01:59:13PM +0100, Ondřej Jirman wrote:
>On Thu, Nov 21, 2019 at 12:53:26PM +0100, postgresql wrote:
>> Hello,
>>
>> On Thu, Nov 21, 2019 at 11:39:40AM +0100, Tomas Vondra wrote:
>> > On Thu, Nov 21, 2019 at 01:14:18AM +0000, PG Bug reporting form wrote:
>> > > Replication of one of my databases (running on ARMv7 machine) started
>> > > segfaulting on the subscriber side (x86_64) like this:
>> > >
>> > > #0  0x00007fc259739917 in __memmove_sse2_unaligned_erms () from
>> > > /usr/lib/libc.so.6
>> > > #1  0x000055d033e93d44 in memcpy (__len=620701425, __src=<optimized out>,
>> > > __dest=0x55d0356da804) at /usr/include/bits/string_fortified.h:34
>> > > #2  tts_virtual_materialize (slot=0x55d0356da3b8) at execTuples.c:235
>> > > #3  0x000055d033e94d32 in ExecFetchSlotHeapTuple
>> > > (slot=slot@entry=0x55d0356da3b8, materialize=materialize@entry=true,
>> > > shouldFree=shouldFree@entry=0x7fff0e7cf387) at execTuples.c:1624
>>
>> I forgot to add that publisher is still PostgreSQL 11.5.
>>
>
>I can also add that I have data checksumming enabled on both ends, and
>it did not detect any corruption:
>
># pg_verify_checksums -D /var/lib/postgres/data
>Checksum scan completed
>Data checksum version: 1
>Files scanned:  1751
>Blocks scanned: 86592
>Bad checksums:  0
>
># pg_checksums /var/lib/postgres/data
>Checksum operation completed
>Files scanned:  22777
>Blocks scanned: 3601527
>Bad checksums:  0
>Data checksum version: 1
>
>WAL log on the publisher is also dumpable to a state hours after the issues
>started:
>
>I've put the dump here, if it's of any use: https://megous.com/dl/tmp/wal_dump.txt
>
>Dump ends with:
>
>pg_waldump: FATAL:  error in WAL record at 2/BBE0E538: invalid record length at 2/BBE0E5A8: wanted 24, got 0
>
>But that seems normal. I get that error on my other database clusters, too.
>
>I managed to extract the failing logical decoding data from the publisher, if
>that helps:
>
>
>SELECT * FROM pg_logical_slot_peek_binary_changes('l5_hometv', NULL, NULL, 'proto_version', '1', 'publication_names',
'pub');
>
> 2/BBD86EA0 | 56395 | \x4200000002bbd880b800023acd790ce5510000dc4b
> 2/BBD87E90 | 56395 |
\x5200004a687075626c696300766964656f73006400080169640000000017ffffffff007469746c650000000019ffffffff00636f7665725f696d6167650000000011ffffffff006d657461646174610000000edafffffff

>f0063617465676f72790000000017ffffffff007075626c6973686564000000043affffffff006164646564000000045affffffff00706c617965640000000010ffffffff
> 2/BBD87E90 | 56395 |
\x5500004a684e0008740000000438333933740000005650617a6465726b613a204f206dc3a964696120736520706f7665646520626f6a2e204b64796279206ec3a17320706f6c6974696369206d696c6f76616c692c20627

>96c6f206279206ec49b636f20c5a17061746ec49b7574000001397b226964223a20226430313064343430303965323131656162323539616331663662323230656538222c202264617465223a2022323031392d31312d3138222c20226e616d65223a20

>2250617a6465726b613a204f206dc3a964696120736520706f7665646520626f6a2e204b64796279206ec3a17320706f6c6974696369206d696c6f76616c692c2062796c6f206279206ec49b636f20c5a17061746ec49b222c2022696d616765223a202

>268747470733a2f2f63646e2e7873642e637a2f726573697a652f63353535656239633131353333313632386164666539396237343534353731655f657874726163743d302c302c313931392c313038305f726573697a653d3732302c3430355f2e6a70

>673f686173683d6362316362623836336230353361613561333761346666616439303865303431227d7400000003323432740000000a323031392d31312d3138740000001a323031392d31312d31382031323a35303a30312e383136333736740000000
>174
> 2/BBD880E8 | 56395 | \x430000000002bbd880b800000002bbd880e800023acd790ce551
>

Can you show the whole transaction? From the WAL dump it seems it just
did a single UPDATE:

rmgr: Heap        len (rec/tot):     59/  4075, tx:      56395, lsn: 2/BBD86EA0, prev 2/BBD86E68, desc: LOCK off 2: xid 56395: flags 0 LOCK_ONLY EXCL_LOCK
    blkref #0: rel 1663/19030/19048 fork main blk 415 (FPW); hole: offset: 56, length: 4176
rmgr: Heap        len (rec/tot):    523/   523, tx:      56395, lsn: 2/BBD87E90, prev 2/BBD86EA0, desc: HOT_UPDATE off 2 xmax 56395 ; new off 9 xmax 0
    blkref #0: rel 1663/19030/19048 fork main blk 415
rmgr: Transaction len (rec/tot):     46/    46, tx:      56395, lsn: 2/BBD880B8, prev 2/BBD87E90, desc: COMMIT 2019-11-20 23:28:47.016273 CET

Which fields does it update?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

From: Tomas Vondra
Date:
On Thu, Nov 21, 2019 at 02:12:02PM +0100, Ondřej Jirman wrote:
>On Thu, Nov 21, 2019 at 01:59:30PM +0100, Tomas Vondra wrote:
>> On Thu, Nov 21, 2019 at 12:53:26PM +0100, Ondřej Jirman wrote:
>> >
>> > ...
>> >
>> OK, so this is the problematic table. Any idea what operation triggers
>> the issue? Having a small reproducer (say, a single-row insert or so)
>> would be very helpful.
>
>Can something be extracted from the logical decoding data I sent in the other
>e-mail? That seems to contain the failing operation, but I don't know how to
>decode it off the top of my head from that binary representation.
>

Well, why not use pg_logical_slot_peek_changes()? That should be human
readable, I think.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

From: Ondřej Jirman
Date:
On Thu, Nov 21, 2019 at 02:37:14PM +0100, Tomas Vondra wrote:
> On Thu, Nov 21, 2019 at 02:12:02PM +0100, Ondřej Jirman wrote:
> > On Thu, Nov 21, 2019 at 01:59:30PM +0100, Tomas Vondra wrote:
> > > On Thu, Nov 21, 2019 at 12:53:26PM +0100, Ondřej Jirman wrote:
> > > >
> > > > ...
> > > >
> > > OK, so this is the problematic table. Any idea what operation triggers
> > > the issue? Having a small reproducer (say, a single-row insert or so)
> > > would be very helpful.
> > 
> > Can something be extracted from the logical decoding data I sent in the other
> > e-mail? That seems to contain the failing operation, but I don't know how to
> > decode it off the top of my head from that binary representation.
> > 
> 
> Well, why not use pg_logical_slot_peek_changes()? That should be human
> readable, I think.

I would, but how?

SELECT * FROM pg_logical_slot_peek_changes('l5_hometv', NULL, NULL,
        'proto_version', '1', 'publication_names', 'pub');
ERROR:  logical decoding output plugin "pgoutput" produces binary output, but
function "pg_logical_slot_peek_changes(name,pg_lsn,integer,text[])" expects
textual data

regards,
    o.

> regards
> 
> -- 
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

From: Ondřej Jirman
Date:
On Thu, Nov 21, 2019 at 02:32:37PM +0100, Tomas Vondra wrote:
> On Thu, Nov 21, 2019 at 01:59:13PM +0100, Ondřej Jirman wrote:
> > On Thu, Nov 21, 2019 at 12:53:26PM +0100, postgresql wrote:
> > > Hello,
> > > 
> > > On Thu, Nov 21, 2019 at 11:39:40AM +0100, Tomas Vondra wrote:
> > > > On Thu, Nov 21, 2019 at 01:14:18AM +0000, PG Bug reporting form wrote:
> > > > > Replication of one of my databases (running on ARMv7 machine) started
> > > > > segfaulting on the subscriber side (x86_64) like this:
> > > > >
> > > > > #0  0x00007fc259739917 in __memmove_sse2_unaligned_erms () from
> > > > > /usr/lib/libc.so.6
> > > > > #1  0x000055d033e93d44 in memcpy (__len=620701425, __src=<optimized out>,
> > > > > __dest=0x55d0356da804) at /usr/include/bits/string_fortified.h:34
> > > > > #2  tts_virtual_materialize (slot=0x55d0356da3b8) at execTuples.c:235
> > > > > #3  0x000055d033e94d32 in ExecFetchSlotHeapTuple
> > > > > (slot=slot@entry=0x55d0356da3b8, materialize=materialize@entry=true,
> > > > > shouldFree=shouldFree@entry=0x7fff0e7cf387) at execTuples.c:1624
> > > 
> > > I forgot to add that publisher is still PostgreSQL 11.5.
> > > 
> > 
> > I can also add that I have data checksumming enabled on both ends, and
> > it did not detect any corruption:
> > 
> > # pg_verify_checksums -D /var/lib/postgres/data
> > Checksum scan completed
> > Data checksum version: 1
> > Files scanned:  1751
> > Blocks scanned: 86592
> > Bad checksums:  0
> > 
> > # pg_checksums /var/lib/postgres/data
> > Checksum operation completed
> > Files scanned:  22777
> > Blocks scanned: 3601527
> > Bad checksums:  0
> > Data checksum version: 1
> > 
> > WAL log on the publisher is also dumpable to a state hours after the issues
> > started:
> > 
> > I've put the dump here, if it's of any use: https://megous.com/dl/tmp/wal_dump.txt
> > 
> > Dump ends with:
> > 
> > pg_waldump: FATAL:  error in WAL record at 2/BBE0E538: invalid record length at 2/BBE0E5A8: wanted 24, got 0
> > 
> > But that seems normal. I get that error on my other database clusters, too.
> > 
> > I managed to extract the failing logical decoding data from the publisher, if
> > that helps:
> > 
> > 
> > SELECT * FROM pg_logical_slot_peek_binary_changes('l5_hometv', NULL, NULL, 'proto_version', '1',
'publication_names','pub');
 
> > 
> > 2/BBD86EA0 | 56395 | \x4200000002bbd880b800023acd790ce5510000dc4b
> > 2/BBD87E90 | 56395 |
\x5200004a687075626c696300766964656f73006400080169640000000017ffffffff007469746c650000000019ffffffff00636f7665725f696d6167650000000011ffffffff006d657461646174610000000edafffffff
> >
f0063617465676f72790000000017ffffffff007075626c6973686564000000043affffffff006164646564000000045affffffff00706c617965640000000010ffffffff
> > 2/BBD87E90 | 56395 |
\x5500004a684e0008740000000438333933740000005650617a6465726b613a204f206dc3a964696120736520706f7665646520626f6a2e204b64796279206ec3a17320706f6c6974696369206d696c6f76616c692c20627
> >
96c6f206279206ec49b636f20c5a17061746ec49b7574000001397b226964223a20226430313064343430303965323131656162323539616331663662323230656538222c202264617465223a2022323031392d31312d3138222c20226e616d65223a20
> >
2250617a6465726b613a204f206dc3a964696120736520706f7665646520626f6a2e204b64796279206ec3a17320706f6c6974696369206d696c6f76616c692c2062796c6f206279206ec49b636f20c5a17061746ec49b222c2022696d616765223a202
> >
268747470733a2f2f63646e2e7873642e637a2f726573697a652f63353535656239633131353333313632386164666539396237343534353731655f657874726163743d302c302c313931392c313038305f726573697a653d3732302c3430355f2e6a70
> >
673f686173683d6362316362623836336230353361613561333761346666616439303865303431227d7400000003323432740000000a323031392d31312d3138740000001a323031392d31312d31382031323a35303a30312e383136333736740000000
> > 174
> > 2/BBD880E8 | 56395 | \x430000000002bbd880b800000002bbd880e800023acd790ce551
> > 
> 
> Can you show the whole transaction? From the WAL dump it seems it just
> did a single UPDATE:

I'm not sure what you mean. The above is the whole output I got from that
SELECT.

> rmgr: Heap        len (rec/tot):     59/  4075, tx:      56395, lsn: 2/BBD86EA0, prev 2/BBD86E68, desc: LOCK off 2:
xid56395: flags 0 LOCK_ONLY EXCL_LOCK
 
>     blkref #0: rel 1663/19030/19048 fork main blk 415 (FPW); hole: offset: 56, length: 4176
> rmgr: Heap        len (rec/tot):    523/   523, tx:      56395, lsn: 2/BBD87E90, prev 2/BBD86EA0, desc: HOT_UPDATE
off2 xmax 56395 ; new off 9 xmax 0
 
>     blkref #0: rel 1663/19030/19048 fork main blk 415
> rmgr: Transaction len (rec/tot):     46/    46, tx:      56395, lsn: 2/BBD880B8, prev 2/BBD87E90, desc: COMMIT
2019-11-2023:28:47.016273 CET
 
> 
> Which fields does it update?

The only update that my system does in this database is:

UPDATE videos SET played = TRUE WHERE id = ?

The rest is just inserts:

INSERT INTO categories (name, metadata, cover_image, provider) VALUES (?, ?, ?, (SELECT id FROM providers WHERE
    system_name = ?))

INSERT INTO videos (title, metadata, cover_image, category, published) VALUES (?, ?, ?, ?, ?)

There are no other data modification statements executed.

There may be updates in other databases in the same cluster, but the replication
worker is segfaulting only on the "hometv" database.

regards,
    o.

> regards
> 
> -- 
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

From: Tomas Vondra
Date:
On Thu, Nov 21, 2019 at 02:46:45PM +0100, Ondřej Jirman wrote:
>On Thu, Nov 21, 2019 at 02:37:14PM +0100, Tomas Vondra wrote:
>> On Thu, Nov 21, 2019 at 02:12:02PM +0100, Ondřej Jirman wrote:
>> > On Thu, Nov 21, 2019 at 01:59:30PM +0100, Tomas Vondra wrote:
>> > > On Thu, Nov 21, 2019 at 12:53:26PM +0100, Ondřej Jirman wrote:
>> > > >
>> > > > ...
>> > > >
>> > > OK, so this is the problematic table. Any idea what operation triggers
>> > > the issue? Having a small reproducer (say, a single-row insert or so)
>> > > would be very helpful.
>> >
>> > Can something be extracted from the logical decoding data I sent in the other
>> > e-mail? That seems to contain the failing operation, but I don't know how to
>> > decode it off the top of my head from that binary representation.
>> >
>>
>> Well, why not use pg_logical_slot_peek_changes()? That should be human
>> readable, I think.
>
>I would, but how?
>
>SELECT * FROM pg_logical_slot_peek_changes('l5_hometv', NULL, NULL,
>        'proto_version', '1', 'publication_names', 'pub');
>ERROR:  logical decoding output plugin "pgoutput" produces binary output, but
>function "pg_logical_slot_peek_changes(name,pg_lsn,integer,text[])" expects
>textual data
>

Ah, right. Sorry, I forgot this is pgoutput.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

From: Tomas Vondra
Date:
On Thu, Nov 21, 2019 at 02:55:23PM +0100, Ondřej Jirman wrote:
>On Thu, Nov 21, 2019 at 02:32:37PM +0100, Tomas Vondra wrote:
>> On Thu, Nov 21, 2019 at 01:59:13PM +0100, Ondřej Jirman wrote:
>> > On Thu, Nov 21, 2019 at 12:53:26PM +0100, postgresql wrote:
>> > > Hello,
>> > >
>> > > On Thu, Nov 21, 2019 at 11:39:40AM +0100, Tomas Vondra wrote:
>> > > > On Thu, Nov 21, 2019 at 01:14:18AM +0000, PG Bug reporting form wrote:
>> > > > > Replication of one of my databases (running on ARMv7 machine) started
>> > > > > segfaulting on the subscriber side (x86_64) like this:
>> > > > >
>> > > > > #0  0x00007fc259739917 in __memmove_sse2_unaligned_erms () from
>> > > > > /usr/lib/libc.so.6
>> > > > > #1  0x000055d033e93d44 in memcpy (__len=620701425, __src=<optimized out>,
>> > > > > __dest=0x55d0356da804) at /usr/include/bits/string_fortified.h:34
>> > > > > #2  tts_virtual_materialize (slot=0x55d0356da3b8) at execTuples.c:235
>> > > > > #3  0x000055d033e94d32 in ExecFetchSlotHeapTuple
>> > > > > (slot=slot@entry=0x55d0356da3b8, materialize=materialize@entry=true,
>> > > > > shouldFree=shouldFree@entry=0x7fff0e7cf387) at execTuples.c:1624
>> > >
>> > > I forgot to add that publisher is still PostgreSQL 11.5.
>> > >
>> >
>> > I can also add that I have data checksumming enabled on both ends, and
>> > it did not detect any corruption:
>> >
>> > # pg_verify_checksums -D /var/lib/postgres/data
>> > Checksum scan completed
>> > Data checksum version: 1
>> > Files scanned:  1751
>> > Blocks scanned: 86592
>> > Bad checksums:  0
>> >
>> > # pg_checksums /var/lib/postgres/data
>> > Checksum operation completed
>> > Files scanned:  22777
>> > Blocks scanned: 3601527
>> > Bad checksums:  0
>> > Data checksum version: 1
>> >
>> > WAL log on the publisher is also dumpable to a state hours after the issues
>> > started:
>> >
>> > I've put the dump here, if it's of any use: https://megous.com/dl/tmp/wal_dump.txt
>> >
>> > Dump ends with:
>> >
>> > pg_waldump: FATAL:  error in WAL record at 2/BBE0E538: invalid record length at 2/BBE0E5A8: wanted 24, got 0
>> >
>> > But that seems normal. I get that error on my other database clusters, too.
>> >
>> > I managed to extract the failing logical decoding data from the publisher, if
>> > that helps:
>> >
>> >
>> > SELECT * FROM pg_logical_slot_peek_binary_changes('l5_hometv', NULL, NULL, 'proto_version', '1',
'publication_names','pub');
 
>> >
>> > 2/BBD86EA0 | 56395 | \x4200000002bbd880b800023acd790ce5510000dc4b
>> > 2/BBD87E90 | 56395 |
\x5200004a687075626c696300766964656f73006400080169640000000017ffffffff007469746c650000000019ffffffff00636f7665725f696d6167650000000011ffffffff006d657461646174610000000edafffffff
>> >
f0063617465676f72790000000017ffffffff007075626c6973686564000000043affffffff006164646564000000045affffffff00706c617965640000000010ffffffff
>> > 2/BBD87E90 | 56395 |
\x5500004a684e0008740000000438333933740000005650617a6465726b613a204f206dc3a964696120736520706f7665646520626f6a2e204b64796279206ec3a17320706f6c6974696369206d696c6f76616c692c20627
>> >
96c6f206279206ec49b636f20c5a17061746ec49b7574000001397b226964223a20226430313064343430303965323131656162323539616331663662323230656538222c202264617465223a2022323031392d31312d3138222c20226e616d65223a20
>> >
2250617a6465726b613a204f206dc3a964696120736520706f7665646520626f6a2e204b64796279206ec3a17320706f6c6974696369206d696c6f76616c692c2062796c6f206279206ec49b636f20c5a17061746ec49b222c2022696d616765223a202
>> >
268747470733a2f2f63646e2e7873642e637a2f726573697a652f63353535656239633131353333313632386164666539396237343534353731655f657874726163743d302c302c313931392c313038305f726573697a653d3732302c3430355f2e6a70
>> >
673f686173683d6362316362623836336230353361613561333761346666616439303865303431227d7400000003323432740000000a323031392d31312d3138740000001a323031392d31312d31382031323a35303a30312e383136333736740000000
>> > 174
>> > 2/BBD880E8 | 56395 | \x430000000002bbd880b800000002bbd880e800023acd790ce551
>> >
>>
>> Can you show the whole transaction? From the WAL dump it seems it just
>> did a single UPDATE:
>
>I'm not sure what you mean. The above is the whole output I got from that
>SELECT.
>
>> rmgr: Heap        len (rec/tot):     59/  4075, tx:      56395, lsn: 2/BBD86EA0, prev 2/BBD86E68, desc: LOCK off 2: xid 56395: flags 0 LOCK_ONLY EXCL_LOCK
>>     blkref #0: rel 1663/19030/19048 fork main blk 415 (FPW); hole: offset: 56, length: 4176
>> rmgr: Heap        len (rec/tot):    523/   523, tx:      56395, lsn: 2/BBD87E90, prev 2/BBD86EA0, desc: HOT_UPDATE off 2 xmax 56395 ; new off 9 xmax 0
>>     blkref #0: rel 1663/19030/19048 fork main blk 415
>> rmgr: Transaction len (rec/tot):     46/    46, tx:      56395, lsn: 2/BBD880B8, prev 2/BBD87E90, desc: COMMIT 2019-11-20 23:28:47.016273 CET
>>
>> Which fields does it update?
>
>The only update that my system does in this database is:
>
>UPDATE videos SET played = TRUE WHERE id = ?
>
>The rest is just inserts:
>
>INSERT INTO categories (name, metadata, cover_image, provider) VALUES (?, ?, ?, (SELECT id FROM providers WHERE system_name = ?))
>INSERT INTO videos (title, metadata, cover_image, category, published) VALUES (?, ?, ?, ?, ?)
>
>There are no other data modification statements executed.
>

Well, this is definitely the update - both judging by the WAL dump and
also because the backtrace contains apply_handle_update.

Can you show us \d+ on the videos table? What's the replica identity?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Ondřej Jirman
Дата:
On Thu, Nov 21, 2019 at 03:35:03PM +0100, Tomas Vondra wrote:
> On Thu, Nov 21, 2019 at 02:55:23PM +0100, Ondřej Jirman wrote:
> > On Thu, Nov 21, 2019 at 02:32:37PM +0100, Tomas Vondra wrote:
> > > On Thu, Nov 21, 2019 at 01:59:13PM +0100, Ondřej Jirman wrote:
> > > > On Thu, Nov 21, 2019 at 12:53:26PM +0100, postgresql wrote:
> > > > > Hello,
> > > > >
> > > > > On Thu, Nov 21, 2019 at 11:39:40AM +0100, Tomas Vondra wrote:
> > > > > > On Thu, Nov 21, 2019 at 01:14:18AM +0000, PG Bug reporting form wrote:
> > > > > > > Replication of one of my databases (running on ARMv7 machine) started
> > > > > > > segfaulting on the subscriber side (x86_64) like this:
> > > > > > >
> > > > > > > #0  0x00007fc259739917 in __memmove_sse2_unaligned_erms () from
> > > > > > > /usr/lib/libc.so.6
> > > > > > > #1  0x000055d033e93d44 in memcpy (__len=620701425, __src=<optimized out>,
> > > > > > > __dest=0x55d0356da804) at /usr/include/bits/string_fortified.h:34
> > > > > > > #2  tts_virtual_materialize (slot=0x55d0356da3b8) at execTuples.c:235
> > > > > > > #3  0x000055d033e94d32 in ExecFetchSlotHeapTuple
> > > > > > > (slot=slot@entry=0x55d0356da3b8, materialize=materialize@entry=true,
> > > > > > > shouldFree=shouldFree@entry=0x7fff0e7cf387) at execTuples.c:1624
> > > > >
> > > > > I forgot to add that publisher is still PostgreSQL 11.5.
> > > > >
> > > >
> > > > I can also add that I have data checksumming enabled on both ends, and
> > > > it did not detect any corruption:
> > > >
> > > > # pg_verify_checksums -D /var/lib/postgres/data
> > > > Checksum scan completed
> > > > Data checksum version: 1
> > > > Files scanned:  1751
> > > > Blocks scanned: 86592
> > > > Bad checksums:  0
> > > >
> > > > # pg_checksums /var/lib/postgres/data
> > > > Checksum operation completed
> > > > Files scanned:  22777
> > > > Blocks scanned: 3601527
> > > > Bad checksums:  0
> > > > Data checksum version: 1
> > > >
> > > > WAL log on the publisher is also dumpable to a state hours after the issues
> > > > started:
> > > >
> > > > I've put the dump here, if it's of any use: https://megous.com/dl/tmp/wal_dump.txt
> > > >
> > > > Dump ends with:
> > > >
> > > > pg_waldump: FATAL:  error in WAL record at 2/BBE0E538: invalid record length at 2/BBE0E5A8: wanted 24, got 0
> > > >
> > > > But that seems normal. I get that error on my other database clusters, too.
> > > >
> > > > I managed to extract the failing logical decoding data from the publisher, if
> > > > that helps:
> > > >
> > > >
> > > > SELECT * FROM pg_logical_slot_peek_binary_changes('l5_hometv', NULL, NULL, 'proto_version', '1', 'publication_names', 'pub');
> > > >
> > > > 2/BBD86EA0 | 56395 | \x4200000002bbd880b800023acd790ce5510000dc4b
> > > > 2/BBD87E90 | 56395 |
\x5200004a687075626c696300766964656f73006400080169640000000017ffffffff007469746c650000000019ffffffff00636f7665725f696d6167650000000011ffffffff006d657461646174610000000edafffffff
> > > >
f0063617465676f72790000000017ffffffff007075626c6973686564000000043affffffff006164646564000000045affffffff00706c617965640000000010ffffffff
> > > > 2/BBD87E90 | 56395 |
\x5500004a684e0008740000000438333933740000005650617a6465726b613a204f206dc3a964696120736520706f7665646520626f6a2e204b64796279206ec3a17320706f6c6974696369206d696c6f76616c692c20627
> > > >
96c6f206279206ec49b636f20c5a17061746ec49b7574000001397b226964223a20226430313064343430303965323131656162323539616331663662323230656538222c202264617465223a2022323031392d31312d3138222c20226e616d65223a20
> > > >
2250617a6465726b613a204f206dc3a964696120736520706f7665646520626f6a2e204b64796279206ec3a17320706f6c6974696369206d696c6f76616c692c2062796c6f206279206ec49b636f20c5a17061746ec49b222c2022696d616765223a202
> > > >
268747470733a2f2f63646e2e7873642e637a2f726573697a652f63353535656239633131353333313632386164666539396237343534353731655f657874726163743d302c302c313931392c313038305f726573697a653d3732302c3430355f2e6a70
> > > >
673f686173683d6362316362623836336230353361613561333761346666616439303865303431227d7400000003323432740000000a323031392d31312d3138740000001a323031392d31312d31382031323a35303a30312e383136333736740000000
> > > > 174
> > > > 2/BBD880E8 | 56395 | \x430000000002bbd880b800000002bbd880e800023acd790ce551
> > > >
> > > 
> > > Can you show the whole transaction? From the WAL dump it seems it just
> > > did a single UPDATE:
> > 
> > I'm not sure what you mean. The above is the whole output I got from that
> > SELECT.
> > 
> > > rmgr: Heap        len (rec/tot):     59/  4075, tx:      56395, lsn: 2/BBD86EA0, prev 2/BBD86E68, desc: LOCK off 2: xid 56395: flags 0 LOCK_ONLY EXCL_LOCK
> > >     blkref #0: rel 1663/19030/19048 fork main blk 415 (FPW); hole: offset: 56, length: 4176
> > > rmgr: Heap        len (rec/tot):    523/   523, tx:      56395, lsn: 2/BBD87E90, prev 2/BBD86EA0, desc: HOT_UPDATE off 2 xmax 56395 ; new off 9 xmax 0
> > >     blkref #0: rel 1663/19030/19048 fork main blk 415
> > > rmgr: Transaction len (rec/tot):     46/    46, tx:      56395, lsn: 2/BBD880B8, prev 2/BBD87E90, desc: COMMIT 2019-11-20 23:28:47.016273 CET
> > > 
> > > Which fields does it update?
> > 
> > The only update that my system does in this database is:
> > 
> > UPDATE videos SET played = TRUE WHERE id = ?
> > 
> > The rest is just inserts:
> > 
> > INSERT INTO categories (name, metadata, cover_image, provider) VALUES (?, ?, ?, (SELECT id FROM providers WHERE system_name = ?))
> > INSERT INTO videos (title, metadata, cover_image, category, published) VALUES (?, ?, ?, ?, ?)
> > 
> > There are no other data modification statements executed.
> > 
> 
> Well, this is definitely the update - both judging by the WAL dump and
> also because the backtrace contains apply_handle_update.

I've also managed to get valgrind output on the segfaulting process:

https://megous.com/dl/tmp/valgrind-105524-109020.log

Though it probably doesn't show much new.

> Can you show us \d+ on the videos table? What's the replica identity?

                                                             Table "public.videos"
   Column    |            Type             | Collation | Nullable |              Default               | Storage  | Stats target | Description
-------------+-----------------------------+-----------+----------+------------------------------------+----------+--------------+-------------
 id          | integer                     |           | not null | nextval('videos_id_seq'::regclass) | plain    |              |
 title       | text                        |           | not null |                                    | extended |              |
 cover_image | bytea                       |           | not null |                                    | extended |              |
 metadata    | jsonb                       |           | not null |                                    | extended |              |
 category    | integer                     |           | not null |                                    | plain    |              |
 published   | date                        |           | not null |                                    | plain    |              |
 added       | timestamp without time zone |           | not null | now()                              | plain    |              |
 played      | boolean                     |           | not null | false                              | plain    |              |
Indexes:
    "videos_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
    "videos_category_fkey" FOREIGN KEY (category) REFERENCES categories(id)
Publications:
    "pub"

                                                             Table "public.videos"
   Column    |            Type             | Collation | Nullable |              Default               | Storage  | Stats target | Description
-------------+-----------------------------+-----------+----------+------------------------------------+----------+--------------+-------------
 id          | integer                     |           | not null | nextval('videos_id_seq'::regclass) | plain    |              |
 title       | text                        |           | not null |                                    | extended |              |
 cover_image | bytea                       |           | not null |                                    | extended |              |
 metadata    | jsonb                       |           | not null |                                    | extended |              |
 category    | integer                     |           | not null |                                    | plain    |              |
 published   | date                        |           | not null |                                    | plain    |              |
 added       | timestamp without time zone |           | not null | now()                              | plain    |              |
 played      | boolean                     |           | not null | false                              | plain    |              |
Indexes:
    "videos_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
    "videos_category_fkey" FOREIGN KEY (category) REFERENCES categories(id)
Access method: heap


Googling for replica identity, it looks like it should be shown in the \d+
output, but it isn't, for me. Maybe because it has the default value?

regards,
    o.


> regards
> 
> -- 
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Tomas Vondra
Дата:
On Thu, Nov 21, 2019 at 04:07:16PM +0100, Ondřej Jirman wrote:
>On Thu, Nov 21, 2019 at 03:35:03PM +0100, Tomas Vondra wrote:
>> On Thu, Nov 21, 2019 at 02:55:23PM +0100, Ondřej Jirman wrote:
>> > On Thu, Nov 21, 2019 at 02:32:37PM +0100, Tomas Vondra wrote:
>> > > On Thu, Nov 21, 2019 at 01:59:13PM +0100, Ondřej Jirman wrote:
>> > > > On Thu, Nov 21, 2019 at 12:53:26PM +0100, postgresql wrote:
>> > > > > Hello,
>> > > > >
>> > > > > On Thu, Nov 21, 2019 at 11:39:40AM +0100, Tomas Vondra wrote:
>> > > > > > On Thu, Nov 21, 2019 at 01:14:18AM +0000, PG Bug reporting form wrote:
>> > > > > > > Replication of one of my databases (running on ARMv7 machine) started
>> > > > > > > segfaulting on the subscriber side (x86_64) like this:
>> > > > > > >
>> > > > > > > #0  0x00007fc259739917 in __memmove_sse2_unaligned_erms () from
>> > > > > > > /usr/lib/libc.so.6
>> > > > > > > #1  0x000055d033e93d44 in memcpy (__len=620701425, __src=<optimized out>,
>> > > > > > > __dest=0x55d0356da804) at /usr/include/bits/string_fortified.h:34
>> > > > > > > #2  tts_virtual_materialize (slot=0x55d0356da3b8) at execTuples.c:235
>> > > > > > > #3  0x000055d033e94d32 in ExecFetchSlotHeapTuple
>> > > > > > > (slot=slot@entry=0x55d0356da3b8, materialize=materialize@entry=true,
>> > > > > > > shouldFree=shouldFree@entry=0x7fff0e7cf387) at execTuples.c:1624
>> > > > >
>> > > > > I forgot to add that publisher is still PostgreSQL 11.5.
>> > > > >
>> > > >
>> > > > I can also add that I have data checksumming enabled on both ends, and
>> > > > it did not detect any corruption:
>> > > >
>> > > > # pg_verify_checksums -D /var/lib/postgres/data
>> > > > Checksum scan completed
>> > > > Data checksum version: 1
>> > > > Files scanned:  1751
>> > > > Blocks scanned: 86592
>> > > > Bad checksums:  0
>> > > >
>> > > > # pg_checksums /var/lib/postgres/data
>> > > > Checksum operation completed
>> > > > Files scanned:  22777
>> > > > Blocks scanned: 3601527
>> > > > Bad checksums:  0
>> > > > Data checksum version: 1
>> > > >
>> > > > WAL log on the publisher is also dumpable to a state hours after the issues
>> > > > started:
>> > > >
>> > > > I've put the dump here, if it's of any use: https://megous.com/dl/tmp/wal_dump.txt
>> > > >
>> > > > Dump ends with:
>> > > >
>> > > > pg_waldump: FATAL:  error in WAL record at 2/BBE0E538: invalid record length at 2/BBE0E5A8: wanted 24, got 0
>> > > >
>> > > > But that seems normal. I get that error on my other database clusters, too.
>> > > >
>> > > > I managed to extract the failing logical decoding data from the publisher, if
>> > > > that helps:
>> > > >
>> > > >
>> > > > SELECT * FROM pg_logical_slot_peek_binary_changes('l5_hometv', NULL, NULL, 'proto_version', '1', 'publication_names', 'pub');
>> > > >
>> > > > 2/BBD86EA0 | 56395 | \x4200000002bbd880b800023acd790ce5510000dc4b
>> > > > 2/BBD87E90 | 56395 |
\x5200004a687075626c696300766964656f73006400080169640000000017ffffffff007469746c650000000019ffffffff00636f7665725f696d6167650000000011ffffffff006d657461646174610000000edafffffff
>> > > >
f0063617465676f72790000000017ffffffff007075626c6973686564000000043affffffff006164646564000000045affffffff00706c617965640000000010ffffffff
>> > > > 2/BBD87E90 | 56395 |
\x5500004a684e0008740000000438333933740000005650617a6465726b613a204f206dc3a964696120736520706f7665646520626f6a2e204b64796279206ec3a17320706f6c6974696369206d696c6f76616c692c20627
>> > > >
96c6f206279206ec49b636f20c5a17061746ec49b7574000001397b226964223a20226430313064343430303965323131656162323539616331663662323230656538222c202264617465223a2022323031392d31312d3138222c20226e616d65223a20
>> > > >
2250617a6465726b613a204f206dc3a964696120736520706f7665646520626f6a2e204b64796279206ec3a17320706f6c6974696369206d696c6f76616c692c2062796c6f206279206ec49b636f20c5a17061746ec49b222c2022696d616765223a202
>> > > >
268747470733a2f2f63646e2e7873642e637a2f726573697a652f63353535656239633131353333313632386164666539396237343534353731655f657874726163743d302c302c313931392c313038305f726573697a653d3732302c3430355f2e6a70
>> > > >
673f686173683d6362316362623836336230353361613561333761346666616439303865303431227d7400000003323432740000000a323031392d31312d3138740000001a323031392d31312d31382031323a35303a30312e383136333736740000000
>> > > > 174
>> > > > 2/BBD880E8 | 56395 | \x430000000002bbd880b800000002bbd880e800023acd790ce551
>> > > >
>> > >
>> > > Can you show the whole transaction? From the WAL dump it seems it just
>> > > did a single UPDATE:
>> >
>> > I'm not sure what you mean. The above is the whole output I got from that
>> > SELECT.
>> >
>> > > rmgr: Heap        len (rec/tot):     59/  4075, tx:      56395, lsn: 2/BBD86EA0, prev 2/BBD86E68, desc: LOCK off 2: xid 56395: flags 0 LOCK_ONLY EXCL_LOCK
>> > >     blkref #0: rel 1663/19030/19048 fork main blk 415 (FPW); hole: offset: 56, length: 4176
>> > > rmgr: Heap        len (rec/tot):    523/   523, tx:      56395, lsn: 2/BBD87E90, prev 2/BBD86EA0, desc: HOT_UPDATE off 2 xmax 56395 ; new off 9 xmax 0
>> > >     blkref #0: rel 1663/19030/19048 fork main blk 415
>> > > rmgr: Transaction len (rec/tot):     46/    46, tx:      56395, lsn: 2/BBD880B8, prev 2/BBD87E90, desc: COMMIT 2019-11-20 23:28:47.016273 CET
>> > >
>> > > Which fields does it update?
>> >
>> > The only update that my system does in this database is:
>> >
>> > UPDATE videos SET played = TRUE WHERE id = ?
>> >
>> > The rest is just inserts:
>> >
>> > INSERT INTO categories (name, metadata, cover_image, provider) VALUES (?, ?, ?, (SELECT id FROM providers WHERE system_name = ?))
>> > INSERT INTO videos (title, metadata, cover_image, category, published) VALUES (?, ?, ?, ?, ?)
>> >
>> > There are no other data modification statements executed.
>> >
>>
>> Well, this is definitely the update - both judging by the WAL dump and
>> also because the backtrace contains apply_handle_update.
>
>I've also managed to get valgrind output on the segfaulting process:
>
>https://megous.com/dl/tmp/valgrind-105524-109020.log
>
>Though it probably doesn't show much new.
>

Is postgres actually compiled with valgrind support? What does pg_config
say?

If you're building the server locally (I don't think there are packages
with valgrind enabled), maybe try adding --enable-cassert option and

    CFLAGS="-DRANDOMIZE_ALLOCATED_MEMORY"

which will do more extensive checks at runtime.

>> Can you show us \d+ on the videos table? What's the replica identity?
>
>                                                             Table "public.videos"
>   Column    |            Type             | Collation | Nullable |              Default               | Storage  |
Statstarget | Description
 

>-------------+-----------------------------+-----------+----------+------------------------------------+----------+--------------+-------------
> id          | integer                     |           | not null | nextval('videos_id_seq'::regclass) | plain    |
         |
 
> title       | text                        |           | not null |                                    | extended |
         |
 
> cover_image | bytea                       |           | not null |                                    | extended |
         |
 
> metadata    | jsonb                       |           | not null |                                    | extended |
         |
 
> category    | integer                     |           | not null |                                    | plain    |
         |
 
> published   | date                        |           | not null |                                    | plain    |
         |
 
> added       | timestamp without time zone |           | not null | now()                              | plain    |
         |
 
> played      | boolean                     |           | not null | false                              | plain    |
         |
 
>Indexes:
>    "videos_pkey" PRIMARY KEY, btree (id)
>Foreign-key constraints:
>    "videos_category_fkey" FOREIGN KEY (category) REFERENCES categories(id)
>Publications:
>    "pub"
>
>                                                             Table "public.videos"
>   Column    |            Type             | Collation | Nullable |              Default               | Storage  |
Statstarget | Description
 

>-------------+-----------------------------+-----------+----------+------------------------------------+----------+--------------+-------------
> id          | integer                     |           | not null | nextval('videos_id_seq'::regclass) | plain    |
         |
 
> title       | text                        |           | not null |                                    | extended |
         |
 
> cover_image | bytea                       |           | not null |                                    | extended |
         |
 
> metadata    | jsonb                       |           | not null |                                    | extended |
         |
 
> category    | integer                     |           | not null |                                    | plain    |
         |
 
> published   | date                        |           | not null |                                    | plain    |
         |
 
> added       | timestamp without time zone |           | not null | now()                              | plain    |
         |
 
> played      | boolean                     |           | not null | false                              | plain    |
         |
 
>Indexes:
>    "videos_pkey" PRIMARY KEY, btree (id)
>Foreign-key constraints:
>    "videos_category_fkey" FOREIGN KEY (category) REFERENCES categories(id)
>Access method: heap
>
>
>Googling for replica identity, it looks like it should be shown in the \d+
>output, but it isn't, for me. Maybe because it has the default value?
>

Yes, I think it should be listed either for the index, or there should
be an extra section with replica identity. But that depends on psql
version. Aren't you accidentally using an older psql version? That
should be mentioned right after connecting to the server, I think.
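
You can also check it straight from the catalog, e.g.

   SELECT relreplident FROM pg_class WHERE oid = 'public.videos'::regclass;

where 'd' means the default (i.e. the primary key), 'n' nothing, 'f' full and
'i' a specific index.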

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Ondřej Jirman
Дата:
On Thu, Nov 21, 2019 at 04:27:32PM +0100, Tomas Vondra wrote:
> On Thu, Nov 21, 2019 at 04:07:16PM +0100, Ondřej Jirman wrote:
> > I've also managed to get valgrind output on the segfaulting process:
> > 
> > https://megous.com/dl/tmp/valgrind-105524-109020.log
> > 
> > Though it probably doesn't show much new.
> > 
> 
> Is postgres actually compiled with valgrind support? What does pg_config
> say?

I don't see any valgrind related options in configure --help, I repackaged it
just with CFLAGS set to -ggdb -Og.

CONFIGURE = '--prefix=/usr' '--mandir=/usr/share/man' '--datadir=/usr/share/postgresql' '--sysconfdir=/etc' '--with-gssapi' '--with-libxml' '--with-openssl' '--with-perl' '--with-python' '--with-tcl' '--with-pam' '--with-system-tzdata=/usr/share/zoneinfo' '--with-uuid=e2fs' '--with-icu' '--with-systemd' '--with-ldap' '--enable-nls' '--enable-thread-safety' '--disable-rpath' 'PYTHON=/usr/bin/python2' 'CFLAGS=-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -Og -ggdb -fdebug-prefix-map=/workspace/external/pgsql/postgresql-pkg/src=/usr/src/debug' 'LDFLAGS=-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now' 'CPPFLAGS=-D_FORTIFY_SOURCE=2' 'CXXFLAGS=-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -Og -ggdb -fdebug-prefix-map=/workspace/external/pgsql/postgresql-pkg/src=/usr/src/debug'
CC = gcc
CPPFLAGS = -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -I/usr/include/libxml2
CFLAGS = -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -Wno-stringop-truncation -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -Og -ggdb -fdebug-prefix-map=/workspace/external/pgsql/postgresql-pkg/src=/usr/src/debug
CFLAGS_SL = -fPIC
LDFLAGS = -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -Wl,--as-needed
LDFLAGS_EX =
LDFLAGS_SL =
LIBS = -lpgcommon -lpgport -lpthread -lxml2 -lpam -lssl -lcrypto -lgssapi_krb5 -lz -lreadline -lrt -lcrypt -ldl -lm
VERSION = PostgreSQL 12.1

> If you're building the server locally (I don't think there are packages
> with valgrind enabled), maybe try adding --enable-cassert option and
> 
>    CFLAGS="-DRANDOMIZE_ALLOCATED_MEMORY"
> 
> which will do more extensive checks at runtime.

Done. 

CONFIGURE = '--prefix=/usr' '--mandir=/usr/share/man' '--datadir=/usr/share/postgresql' '--sysconfdir=/etc' '--with-gssapi' '--with-libxml' '--with-openssl' '--with-perl' '--with-python' '--with-tcl' '--with-pam' '--with-system-tzdata=/usr/share/zoneinfo' '--with-uuid=e2fs' '--with-icu' '--with-systemd' '--with-ldap' '--enable-nls' '--enable-thread-safety' '--disable-rpath' '--enable-cassert' 'PYTHON=/usr/bin/python2' 'CFLAGS=-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -Og -ggdb -fdebug-prefix-map=/workspace/external/pgsql/postgresql-pkg/src=/usr/src/debug -DRANDOMIZE_ALLOCATED_MEMORY' 'LDFLAGS=-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now' 'CPPFLAGS=-D_FORTIFY_SOURCE=2' 'CXXFLAGS=-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -Og -ggdb -fdebug-prefix-map=/workspace/external/pgsql/postgresql-pkg/src=/usr/src/debug'
CC = gcc
CPPFLAGS = -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -I/usr/include/libxml2
CFLAGS = -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -Wno-stringop-truncation -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -Og -ggdb -fdebug-prefix-map=/workspace/external/pgsql/postgresql-pkg/src=/usr/src/debug -DRANDOMIZE_ALLOCATED_MEMORY
CFLAGS_SL = -fPIC
LDFLAGS = -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -Wl,--as-needed
LDFLAGS_EX =
LDFLAGS_SL =
LIBS = -lpgcommon -lpgport -lpthread -lxml2 -lpam -lssl -lcrypto -lgssapi_krb5 -lz -lreadline -lrt -lcrypt -ldl -lm
VERSION = PostgreSQL 12.1

The difference now is that the __len argument value is not stable across crashes.

one crash...

#0  0x00007fb90c336917 in __memmove_sse2_unaligned_erms () from /usr/lib/libc.so.6
#1  0x000055ff2a7d0e64 in memcpy (__len=972662513, __src=<optimized out>, __dest=0x55ff2cc7dce4) at
/usr/include/bits/string_fortified.h:34
#2  tts_virtual_materialize (slot=0x55ff2cc7d868) at execTuples.c:235
#3  0x000055ff2a7d2a63 in ExecFetchSlotHeapTuple (slot=slot@entry=0x55ff2cc7d868, materialize=materialize@entry=true,
shouldFree=shouldFree@entry=0x7ffc80cfc607)at execTuples.c:1624
 
#4  0x000055ff2a634809 in heapam_tuple_update (relation=0x7fb8f8338100, otid=0x55ff2cc7e138, slot=0x55ff2cc7d868,
cid=0,snapshot=<optimized out>, crosscheck=0x0, wait=true, tmfd=0x7ffc80cfc680,
 
    lockmode=0x7ffc80cfc67c, update_indexes=0x7ffc80cfc6d7) at heapam_handler.c:325
#5  0x000055ff2a675d55 in table_tuple_update (update_indexes=0x7ffc80cfc6d7, lockmode=0x7ffc80cfc67c,
tmfd=0x7ffc80cfc680,wait=true, crosscheck=0x0, snapshot=0x0, cid=<optimized out>,
 
    slot=0x55ff2cc7d868, otid=0x55ff2cc7e138, rel=0x7fb8f8338100) at ../../../../src/include/access/tableam.h:1261
#6  simple_table_tuple_update (rel=rel@entry=0x7fb8f8338100, otid=otid@entry=0x55ff2cc7e138,
slot=slot@entry=0x55ff2cc7d868,snapshot=0x0, update_indexes=update_indexes@entry=0x7ffc80cfc6d7)
 
    at tableam.c:325
#7  0x000055ff2a7cdf08 in ExecSimpleRelationUpdate (estate=estate@entry=0x55ff2ccd02b8,
epqstate=epqstate@entry=0x7ffc80cfc730,searchslot=searchslot@entry=0x55ff2cc7e108,
 
    slot=slot@entry=0x55ff2cc7d868) at execReplication.c:493
#8  0x000055ff2a8eb8b2 in apply_handle_update (s=s@entry=0x7ffc80d03d00) at worker.c:750
#9  0x000055ff2a8ebe62 in apply_dispatch (s=s@entry=0x7ffc80d03d00) at worker.c:968
#10 0x000055ff2a8ec5ba in LogicalRepApplyLoop (last_received=11741462160) at worker.c:1176
#11 0x000055ff2a8ecd1a in ApplyWorkerMain (main_arg=<optimized out>) at worker.c:1734
#12 0x000055ff2a8b3927 in StartBackgroundWorker () at bgworker.c:834
#13 0x000055ff2a8c2b6b in do_start_bgworker (rw=rw@entry=0x55ff2cbbf430) at postmaster.c:5770
#14 0x000055ff2a8c2cf7 in maybe_start_bgworkers () at postmaster.c:5996
#15 0x000055ff2a8c3678 in sigusr1_handler (postgres_signal_arg=<optimized out>) at postmaster.c:5167
#16 <signal handler called>
#17 0x00007fb90c384e2a in select () from /usr/lib/libc.so.6
#18 0x000055ff2a8c40f8 in ServerLoop () at postmaster.c:1668
#19 0x000055ff2a8c54fc in PostmasterMain (argc=3, argv=<optimized out>) at postmaster.c:1377
#20 0x000055ff2a81f88a in main (argc=3, argv=0x55ff2cb94260) at main.c:228

and another ...

#0  0x00007f7a1befc917 in __memmove_sse2_unaligned_erms () from /usr/lib/libc.so.6
#1  0x0000559a5439ae64 in memcpy (__len=766338801, __src=<optimized out>, __dest=0x559a55626934) at
/usr/include/bits/string_fortified.h:34
#2  tts_virtual_materialize (slot=0x559a556264b8) at execTuples.c:235
#3  0x0000559a5439ca63 in ExecFetchSlotHeapTuple (slot=slot@entry=0x559a556264b8, materialize=materialize@entry=true,
shouldFree=shouldFree@entry=0x7fff7f2d5787)at execTuples.c:1624
 
#4  0x0000559a541fe809 in heapam_tuple_update (relation=0x7f7a07efe100, otid=0x559a55626d88, slot=0x559a556264b8,
cid=0,snapshot=<optimized out>, crosscheck=0x0, wait=true, tmfd=0x7fff7f2d5800,
 
    lockmode=0x7fff7f2d57fc, update_indexes=0x7fff7f2d5857) at heapam_handler.c:325
#5  0x0000559a5423fd55 in table_tuple_update (update_indexes=0x7fff7f2d5857, lockmode=0x7fff7f2d57fc,
tmfd=0x7fff7f2d5800,wait=true, crosscheck=0x0, snapshot=0x0, cid=<optimized out>,
 
    slot=0x559a556264b8, otid=0x559a55626d88, rel=0x7f7a07efe100) at ../../../../src/include/access/tableam.h:1261
#6  simple_table_tuple_update (rel=rel@entry=0x7f7a07efe100, otid=otid@entry=0x559a55626d88,
slot=slot@entry=0x559a556264b8,snapshot=0x0, update_indexes=update_indexes@entry=0x7fff7f2d5857)
 
    at tableam.c:325
#7  0x0000559a54397f08 in ExecSimpleRelationUpdate (estate=estate@entry=0x559a55678dc8,
epqstate=epqstate@entry=0x7fff7f2d58b0,searchslot=searchslot@entry=0x559a55626d58,
 
    slot=slot@entry=0x559a556264b8) at execReplication.c:493
#8  0x0000559a544b58b2 in apply_handle_update (s=s@entry=0x7fff7f2dce80) at worker.c:750
#9  0x0000559a544b5e62 in apply_dispatch (s=s@entry=0x7fff7f2dce80) at worker.c:968
#10 0x0000559a544b65ba in LogicalRepApplyLoop (last_received=11741462160) at worker.c:1176
#11 0x0000559a544b6d1a in ApplyWorkerMain (main_arg=<optimized out>) at worker.c:1734
#12 0x0000559a5447d927 in StartBackgroundWorker () at bgworker.c:834
#13 0x0000559a5448cb6b in do_start_bgworker (rw=rw@entry=0x559a5558c010) at postmaster.c:5770
#14 0x0000559a5448ccf7 in maybe_start_bgworkers () at postmaster.c:5996
#15 0x0000559a5448d678 in sigusr1_handler (postgres_signal_arg=<optimized out>) at postmaster.c:5167
#16 <signal handler called>
#17 0x00007f7a1bf4ae2a in select () from /usr/lib/libc.so.6
#18 0x0000559a5448e0f8 in ServerLoop () at postmaster.c:1668
#19 0x0000559a5448f4fc in PostmasterMain (argc=3, argv=<optimized out>) at postmaster.c:1377
#20 0x0000559a543e988a in main (argc=3, argv=0x559a55534260) at main.c:228

So it perhaps looks like some memory management issue in the subscribing
postmaster, rather than wal corruption on the publisher.

> > > Can you show us \d+ on the videos table? What's the replica identity?
> > 
> >                                                             Table "public.videos"
> >   Column    |            Type             | Collation | Nullable |              Default               | Storage  |
Statstarget | Description
 
> >
-------------+-----------------------------+-----------+----------+------------------------------------+----------+--------------+-------------
> > id          | integer                     |           | not null | nextval('videos_id_seq'::regclass) | plain    |
           |
 
> > title       | text                        |           | not null |                                    | extended |
           |
 
> > cover_image | bytea                       |           | not null |                                    | extended |
           |
 
> > metadata    | jsonb                       |           | not null |                                    | extended |
           |
 
> > category    | integer                     |           | not null |                                    | plain    |
           |
 
> > published   | date                        |           | not null |                                    | plain    |
           |
 
> > added       | timestamp without time zone |           | not null | now()                              | plain    |
           |
 
> > played      | boolean                     |           | not null | false                              | plain    |
           |
 
> > Indexes:
> >    "videos_pkey" PRIMARY KEY, btree (id)
> > Foreign-key constraints:
> >    "videos_category_fkey" FOREIGN KEY (category) REFERENCES categories(id)
> > Publications:
> >    "pub"
> > 
> >                                                             Table "public.videos"
> >   Column    |            Type             | Collation | Nullable |              Default               | Storage  |
Statstarget | Description
 
> >
-------------+-----------------------------+-----------+----------+------------------------------------+----------+--------------+-------------
> > id          | integer                     |           | not null | nextval('videos_id_seq'::regclass) | plain    |
           |
 
> > title       | text                        |           | not null |                                    | extended |
           |
 
> > cover_image | bytea                       |           | not null |                                    | extended |
           |
 
> > metadata    | jsonb                       |           | not null |                                    | extended |
           |
 
> > category    | integer                     |           | not null |                                    | plain    |
           |
 
> > published   | date                        |           | not null |                                    | plain    |
           |
 
> > added       | timestamp without time zone |           | not null | now()                              | plain    |
           |
 
> > played      | boolean                     |           | not null | false                              | plain    |
           |
 
> > Indexes:
> >    "videos_pkey" PRIMARY KEY, btree (id)
> > Foreign-key constraints:
> >    "videos_category_fkey" FOREIGN KEY (category) REFERENCES categories(id)
> > Access method: heap
> > 
> > 
> > Googling for replica identity, it looks like it should be shown in the \d+
> > output, but it isn't, for me. Maybe because it has the default value?
> > 
> 
> Yes, I think it should be listed either for the index, or there should
> be an extra section with replica identity. But that depends on psql
> version. Aren't you accidentally using an older psql version? That
> should be mentioned right after connecting to the server, I think.

Maybe it has something to do with my upgrade method. I dumped/restored
the replica with pg_dumpall, and then just proceeded to enable subscription
and refresh publication with (copy_data=false) for all my subscriptions.
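Roughly like this for each of them, with l5_hometv standing in for the
individual subscription names:

   ALTER SUBSCRIPTION l5_hometv ENABLE;
   ALTER SUBSCRIPTION l5_hometv REFRESH PUBLICATION WITH (copy_data = false);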

Now, I've tried to start fresh and replicate from the primary "hometv" database
into a new l5_hometv2 on my crashing 12.1 server and it replicates fine, no
crashes. Anyway, even in this freshly replicated database, there's no REPLICA
IDENTITY in the \d+ output.

I don't have multiple postgresql installations on the host. There's only
a difference between primary version 11.5 (ARMv7) and replica 12.1 (x86_64).

regards,
    o.

> regards
> 
> -- 
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Tomas Vondra
Дата:
On Thu, Nov 21, 2019 at 04:57:07PM +0100, Ondřej Jirman wrote:
>On Thu, Nov 21, 2019 at 04:27:32PM +0100, Tomas Vondra wrote:
>> On Thu, Nov 21, 2019 at 04:07:16PM +0100, Ondřej Jirman wrote:
>> > I've also managed to get valgrind output on the segfaulting process:
>> >
>> > https://megous.com/dl/tmp/valgrind-105524-109020.log
>> >
>> > Though it probably doesn't show much new.
>> >
>>
>> Is postgres actually compiled with valgrind support? What does pg_config
>> say?
>
>I don't see any valgrind related options in configure --help, I repackaged it
>just with CFLAGS set to -ggdb -Og.
>

You need to use -DUSE_VALGRIND or edit pg_config_manual.h. I usually
just use

   CFLAGS="-DRANDOMIZE_ALLOCATED_MEMORY -DUSE_VALGRIND -O0 -ggdb"

which does the trick. Without USE_VALGRIND defined, we don't include
the valgrind instrumentation, so I suppose the previous results are bogus.

>CONFIGURE = '--prefix=/usr' '--mandir=/usr/share/man' '--datadir=/usr/share/postgresql' '--sysconfdir=/etc'
'--with-gssapi''--with-libxml' '--with-openssl' '--with-perl' '--with-python' '--with-tcl' '--with-pam'
'--with-system-tzdata=/usr/share/zoneinfo''--with-uuid=e2fs' '--with-icu' '--with-systemd' '--with-ldap' '--enable-nls'
'--enable-thread-safety''--disable-rpath' 'PYTHON=/usr/bin/python2' 'CFLAGS=-march=x86-64 -mtune=generic -O2 -pipe
-fstack-protector-strong-fno-plt -Og -ggdb
-fdebug-prefix-map=/workspace/external/pgsql/postgresql-pkg/src=/usr/src/debug'
'LDFLAGS=-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now''CPPFLAGS=-D_FORTIFY_SOURCE=2' 'CXXFLAGS=-march=x86-64
-mtune=generic-O2 -pipe -fstack-protector-strong -fno-plt -Og -ggdb
-fdebug-prefix-map=/workspace/external/pgsql/postgresql-pkg/src=/usr/src/debug'
>CC = gcc
>CPPFLAGS = -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -I/usr/include/libxml2
>CFLAGS = -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels
-Wmissing-format-attribute-Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard
-Wno-format-truncation-Wno-stringop-truncation -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt
-Og-ggdb -fdebug-prefix-map=/workspace/external/pgsql/postgresql-pkg/src=/usr/src/debug
 
>CFLAGS_SL = -fPIC
>LDFLAGS = -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -Wl,--as-needed
>LDFLAGS_EX =
>LDFLAGS_SL =
>LIBS = -lpgcommon -lpgport -lpthread -lxml2 -lpam -lssl -lcrypto -lgssapi_krb5 -lz -lreadline -lrt -lcrypt -ldl -lm
>VERSION = PostgreSQL 12.1
>
>> If you're building the server locally (I don't think there are packages
>> with valgrind enabled), maybe try adding --enable-cassert option and
>>
>>    CFLAGS="-DRANDOMIZE_ALLOCATED_MEMORY"
>>
>> which will do more extensive checks at runtime.
>
>Done.
>
>CONFIGURE = '--prefix=/usr' '--mandir=/usr/share/man' '--datadir=/usr/share/postgresql' '--sysconfdir=/etc'
'--with-gssapi''--with-libxml' '--with-openssl' '--with-perl' '--with-python' '--with-tcl' '--with-pam'
'--with-system-tzdata=/usr/share/zoneinfo''--with-uuid=e2fs' '--with-icu' '--with-systemd' '--with-ldap' '--enable-nls'
'--enable-thread-safety''--disable-rpath' '--enable-cassert' 'PYTHON=/usr/bin/python2' 'CFLAGS=-march=x86-64
-mtune=generic-O2 -pipe -fstack-protector-strong -fno-plt -Og -ggdb
-fdebug-prefix-map=/workspace/external/pgsql/postgresql-pkg/src=/usr/src/debug-DRANDOMIZE_ALLOCATED_MEMORY'
'LDFLAGS=-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now''CPPFLAGS=-D_FORTIFY_SOURCE=2' 'CXXFLAGS=-march=x86-64
-mtune=generic-O2 -pipe -fstack-protector-strong -fno-plt -Og -ggdb
-fdebug-prefix-map=/workspace/external/pgsql/postgresql-pkg/src=/usr/src/debug'
>CC = gcc
>CPPFLAGS = -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -I/usr/include/libxml2
>CFLAGS = -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels
-Wmissing-format-attribute-Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard
-Wno-format-truncation-Wno-stringop-truncation -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt
-Og-ggdb -fdebug-prefix-map=/workspace/external/pgsql/postgresql-pkg/src=/usr/src/debug -DRANDOMIZE_ALLOCATED_MEMORY
 
>CFLAGS_SL = -fPIC
>LDFLAGS = -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -Wl,--as-needed
>LDFLAGS_EX =
>LDFLAGS_SL =
>LIBS = -lpgcommon -lpgport -lpthread -lxml2 -lpam -lssl -lcrypto -lgssapi_krb5 -lz -lreadline -lrt -lcrypt -ldl -lm
>VERSION = PostgreSQL 12.1
>
>The difference now is that the __len argument value is not stable across crashes.
>
>one crash...
>
>#0  0x00007fb90c336917 in __memmove_sse2_unaligned_erms () from /usr/lib/libc.so.6
>#1  0x000055ff2a7d0e64 in memcpy (__len=972662513, __src=<optimized out>, __dest=0x55ff2cc7dce4) at
/usr/include/bits/string_fortified.h:34
>#2  tts_virtual_materialize (slot=0x55ff2cc7d868) at execTuples.c:235
>#3  0x000055ff2a7d2a63 in ExecFetchSlotHeapTuple (slot=slot@entry=0x55ff2cc7d868, materialize=materialize@entry=true,
shouldFree=shouldFree@entry=0x7ffc80cfc607)at execTuples.c:1624
 
>#4  0x000055ff2a634809 in heapam_tuple_update (relation=0x7fb8f8338100, otid=0x55ff2cc7e138, slot=0x55ff2cc7d868,
cid=0,snapshot=<optimized out>, crosscheck=0x0, wait=true, tmfd=0x7ffc80cfc680,
 
>    lockmode=0x7ffc80cfc67c, update_indexes=0x7ffc80cfc6d7) at heapam_handler.c:325
>#5  0x000055ff2a675d55 in table_tuple_update (update_indexes=0x7ffc80cfc6d7, lockmode=0x7ffc80cfc67c,
tmfd=0x7ffc80cfc680,wait=true, crosscheck=0x0, snapshot=0x0, cid=<optimized out>,
 
>    slot=0x55ff2cc7d868, otid=0x55ff2cc7e138, rel=0x7fb8f8338100) at ../../../../src/include/access/tableam.h:1261
>#6  simple_table_tuple_update (rel=rel@entry=0x7fb8f8338100, otid=otid@entry=0x55ff2cc7e138,
slot=slot@entry=0x55ff2cc7d868,snapshot=0x0, update_indexes=update_indexes@entry=0x7ffc80cfc6d7)
 
>    at tableam.c:325
>#7  0x000055ff2a7cdf08 in ExecSimpleRelationUpdate (estate=estate@entry=0x55ff2ccd02b8,
epqstate=epqstate@entry=0x7ffc80cfc730,searchslot=searchslot@entry=0x55ff2cc7e108,
 
>    slot=slot@entry=0x55ff2cc7d868) at execReplication.c:493
>#8  0x000055ff2a8eb8b2 in apply_handle_update (s=s@entry=0x7ffc80d03d00) at worker.c:750
>#9  0x000055ff2a8ebe62 in apply_dispatch (s=s@entry=0x7ffc80d03d00) at worker.c:968
>#10 0x000055ff2a8ec5ba in LogicalRepApplyLoop (last_received=11741462160) at worker.c:1176
>#11 0x000055ff2a8ecd1a in ApplyWorkerMain (main_arg=<optimized out>) at worker.c:1734
>#12 0x000055ff2a8b3927 in StartBackgroundWorker () at bgworker.c:834
>#13 0x000055ff2a8c2b6b in do_start_bgworker (rw=rw@entry=0x55ff2cbbf430) at postmaster.c:5770
>#14 0x000055ff2a8c2cf7 in maybe_start_bgworkers () at postmaster.c:5996
>#15 0x000055ff2a8c3678 in sigusr1_handler (postgres_signal_arg=<optimized out>) at postmaster.c:5167
>#16 <signal handler called>
>#17 0x00007fb90c384e2a in select () from /usr/lib/libc.so.6
>#18 0x000055ff2a8c40f8 in ServerLoop () at postmaster.c:1668
>#19 0x000055ff2a8c54fc in PostmasterMain (argc=3, argv=<optimized out>) at postmaster.c:1377
>#20 0x000055ff2a81f88a in main (argc=3, argv=0x55ff2cb94260) at main.c:228
>
>and another ...
>
>#0  0x00007f7a1befc917 in __memmove_sse2_unaligned_erms () from /usr/lib/libc.so.6
>#1  0x0000559a5439ae64 in memcpy (__len=766338801, __src=<optimized out>, __dest=0x559a55626934) at
/usr/include/bits/string_fortified.h:34
>#2  tts_virtual_materialize (slot=0x559a556264b8) at execTuples.c:235
>#3  0x0000559a5439ca63 in ExecFetchSlotHeapTuple (slot=slot@entry=0x559a556264b8, materialize=materialize@entry=true,
shouldFree=shouldFree@entry=0x7fff7f2d5787)at execTuples.c:1624
 
>#4  0x0000559a541fe809 in heapam_tuple_update (relation=0x7f7a07efe100, otid=0x559a55626d88, slot=0x559a556264b8,
cid=0,snapshot=<optimized out>, crosscheck=0x0, wait=true, tmfd=0x7fff7f2d5800,
 
>    lockmode=0x7fff7f2d57fc, update_indexes=0x7fff7f2d5857) at heapam_handler.c:325
>#5  0x0000559a5423fd55 in table_tuple_update (update_indexes=0x7fff7f2d5857, lockmode=0x7fff7f2d57fc,
tmfd=0x7fff7f2d5800,wait=true, crosscheck=0x0, snapshot=0x0, cid=<optimized out>,
 
>    slot=0x559a556264b8, otid=0x559a55626d88, rel=0x7f7a07efe100) at ../../../../src/include/access/tableam.h:1261
>#6  simple_table_tuple_update (rel=rel@entry=0x7f7a07efe100, otid=otid@entry=0x559a55626d88,
slot=slot@entry=0x559a556264b8,snapshot=0x0, update_indexes=update_indexes@entry=0x7fff7f2d5857)
 
>    at tableam.c:325
>#7  0x0000559a54397f08 in ExecSimpleRelationUpdate (estate=estate@entry=0x559a55678dc8,
epqstate=epqstate@entry=0x7fff7f2d58b0,searchslot=searchslot@entry=0x559a55626d58,
 
>    slot=slot@entry=0x559a556264b8) at execReplication.c:493
>#8  0x0000559a544b58b2 in apply_handle_update (s=s@entry=0x7fff7f2dce80) at worker.c:750
>#9  0x0000559a544b5e62 in apply_dispatch (s=s@entry=0x7fff7f2dce80) at worker.c:968
>#10 0x0000559a544b65ba in LogicalRepApplyLoop (last_received=11741462160) at worker.c:1176
>#11 0x0000559a544b6d1a in ApplyWorkerMain (main_arg=<optimized out>) at worker.c:1734
>#12 0x0000559a5447d927 in StartBackgroundWorker () at bgworker.c:834
>#13 0x0000559a5448cb6b in do_start_bgworker (rw=rw@entry=0x559a5558c010) at postmaster.c:5770
>#14 0x0000559a5448ccf7 in maybe_start_bgworkers () at postmaster.c:5996
>#15 0x0000559a5448d678 in sigusr1_handler (postgres_signal_arg=<optimized out>) at postmaster.c:5167
>#16 <signal handler called>
>#17 0x00007f7a1bf4ae2a in select () from /usr/lib/libc.so.6
>#18 0x0000559a5448e0f8 in ServerLoop () at postmaster.c:1668
>#19 0x0000559a5448f4fc in PostmasterMain (argc=3, argv=<optimized out>) at postmaster.c:1377
>#20 0x0000559a543e988a in main (argc=3, argv=0x559a55534260) at main.c:228
>
>So it perhaps looks like some memory management issue in the subscribing
>postmaster, rather than wal corruption on the publisher.
>

Right, maybe using freed memory, or something like that.

>> > > Can you show us \d+ on the videos table? What's the replica identity?
>> >
>> >                                                             Table "public.videos"
>> >   Column    |            Type             | Collation | Nullable |              Default               | Storage  |
Statstarget | Description
 
>> >
-------------+-----------------------------+-----------+----------+------------------------------------+----------+--------------+-------------
>> > id          | integer                     |           | not null | nextval('videos_id_seq'::regclass) | plain    |
            |
 
>> > title       | text                        |           | not null |                                    | extended |
            |
 
>> > cover_image | bytea                       |           | not null |                                    | extended |
            |
 
>> > metadata    | jsonb                       |           | not null |                                    | extended |
            |
 
>> > category    | integer                     |           | not null |                                    | plain    |
            |
 
>> > published   | date                        |           | not null |                                    | plain    |
            |
 
>> > added       | timestamp without time zone |           | not null | now()                              | plain    |
            |
 
>> > played      | boolean                     |           | not null | false                              | plain    |
            |
 
>> > Indexes:
>> >    "videos_pkey" PRIMARY KEY, btree (id)
>> > Foreign-key constraints:
>> >    "videos_category_fkey" FOREIGN KEY (category) REFERENCES categories(id)
>> > Publications:
>> >    "pub"
>> >
>> >                                                             Table "public.videos"
>> >   Column    |            Type             | Collation | Nullable |              Default               | Storage  |
Statstarget | Description
 
>> >
-------------+-----------------------------+-----------+----------+------------------------------------+----------+--------------+-------------
>> > id          | integer                     |           | not null | nextval('videos_id_seq'::regclass) | plain    |
            |
 
>> > title       | text                        |           | not null |                                    | extended |
            |
 
>> > cover_image | bytea                       |           | not null |                                    | extended |
            |
 
>> > metadata    | jsonb                       |           | not null |                                    | extended |
            |
 
>> > category    | integer                     |           | not null |                                    | plain    |
            |
 
>> > published   | date                        |           | not null |                                    | plain    |
            |
 
>> > added       | timestamp without time zone |           | not null | now()                              | plain    |
            |
 
>> > played      | boolean                     |           | not null | false                              | plain    |
            |
 
>> > Indexes:
>> >    "videos_pkey" PRIMARY KEY, btree (id)
>> > Foreign-key constraints:
>> >    "videos_category_fkey" FOREIGN KEY (category) REFERENCES categories(id)
>> > Access method: heap
>> >
>> >
>> > Googling for replica identity, it looks like it should be shown in the \d+
>> > output, but it isn't, for me. Maybe because it has the default value?
>> >
>>
>> Yes, I think it should be listed either for the index, or there should
>> be an extra section with replica identity. But that depends on psql
>> version. Aren't you accidentally using an older psql version? That
>> should be mentioned right after connecting to the server, I think.
>
>Maybe it has something to do with my upgrade method. I dumped/restored
>the replica with pg_dumpall, and then just proceeded to enable subscription
>and refresh publication with (copy_data=false) for all my subscriptions.
>

Interesting.

>Now, I've tried to start fresh and replicate from the primary "hometv" database
>into a new l5_hometv2 on my crashing 12.1 server and it replicates fine, no
>crashes. Anyway even in this fresh replicated database, there's no REPLICA
>IDENTITY in the \d+ output.
>
>I don't have multiple postgresql installations on the host. There's only
>a difference between primary version 11.5 (ARMv7) and replica 12.1 (x86_64).
>

Hmmm, not sure. Have to check the code.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Ondřej Jirman
Дата:
On Thu, Nov 21, 2019 at 04:57:07PM +0100, Ondřej Jirman wrote:
> 
> Maybe it has something to do with my upgrade method. I dumped/restored
> the replica with pg_dumpall, and then just proceeded to enable subscription
> and refresh publication with (copy_data=false) for all my subscriptions.

OTOH, it may not. There are 2 more databases replicated the same way from
the same database cluster, and they don't crash the replica server, and
continue replicating. One of the other databases also has bytea columns in
some of the tables.

It really just seems related to the machine restart (a regular one) that I did
on the primary; minutes later the replica crashed, and it has kept crashing ever
since whenever it connects to the primary for the hometv database.

So maybe something's wrong with the replica database (maybe because the
connection got killed by the walsender at an unfortunate time), rather than the
original database, because I can replicate the original DB afresh into a new copy
just fine, and the other databases continue replicating if I disable the
crashing subscription.
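Disabling it is just

   ALTER SUBSCRIPTION l5_hometv DISABLE;

and the other apply workers keep going.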

regards,
    o.

> Now, I've tried to start fresh and replicate from the primary "hometv" database
> into a new l5_hometv2 on my crashing 12.1 server and it replicates fine, no
> crashes. Anyway even in this fresh replicated database, there's no REPLICA
> IDENTITY in the \d+ output.
> 
> I don't have multiple postgresql installations on the host. There's only
> a difference between primary version 11.5 (ARMv7) and replica 12.1 (x86_64).
> 
> regards,
>     o.
> 
> > regards
> > 
> > -- 
> > Tomas Vondra                  http://www.2ndQuadrant.com
> > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Ondřej Jirman
Дата:
On Thu, Nov 21, 2019 at 05:09:06PM +0100, Tomas Vondra wrote:

> On Thu, Nov 21, 2019 at 04:57:07PM +0100, Ondřej Jirman wrote:
> > On Thu, Nov 21, 2019 at 04:27:32PM +0100, Tomas Vondra wrote:
> > > On Thu, Nov 21, 2019 at 04:07:16PM +0100, Ondřej Jirman wrote:
> > > > I've also managed to get valgrind output on the segfaulting process:
> > > >
> > > > https://megous.com/dl/tmp/valgrind-105524-109020.log
> > > >
> > > > Though it probably doesn't show much new.
> > > >
> > > 
> > > Is postgres actually compiled with valgrind support? What does pg_config
> > > say?
> > 
> > I don't see any valgrind related options in configure --help, I repackaged it
> > just with CFLAGS set to -ggdb -Og.
> > 
> 
> You need to use -DUSE_VALGRIND or edit pg_config_manual.h. I usually
> just use
> 
>   CFLAGS="-DRANDOMIZE_ALLOCATED_MEMORY -DUSE_VALGRIND -O0 -ggdb"
> 
> which does the trick. Without the USE_VALGRIND defined, we don't include
> the valgrind instrumentation, so I suppose the previous results are bogus.

Two new valgrind logs, now with -DUSE_VALGRIND:

https://megous.com/dl/tmp/valgrind-172090.log
https://megous.com/dl/tmp/valgrind-172143.log

A lot of uninitialized 1-byte accesses around VARATT_IS_EXTERNAL_EXPANDED.

So slot->tts_values[natt] is probably uninitialized.

Which is why the backtrace had a weird value there, I guess.

regards,
    o.




Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Tomas Vondra
Дата:
On Thu, Nov 21, 2019 at 05:15:02PM +0100, Ondřej Jirman wrote:
>On Thu, Nov 21, 2019 at 04:57:07PM +0100, Ondřej Jirman wrote:
>>
>> Maybe it has something to do with my upgrade method. I
>> dumped/restored the replica with pg_dumpall, and then just proceded
>> to enable subscription and refresh publication with (copy_data=false)
>> for all my subscriptions.
>
>OTOH, it may not. There are 2 more databases replicated the same way
>from the same database cluster, and they don't crash the replica
>server, and continue replicating. One of the other databases also
>has bytea columns in some of the tables.
>
>It really just seems related to the machine restart (a regular one)
>that I did on the primary, minutes later replica crashed, and kept
>crashing ever since whenever connecting to the primary for the hometv
>database.
>

Hmmm. A restart of the primary certainly should not cause any such
damage, that'd be a bug too. And it'd be a bit strange that it correctly
sends the data and it crashes the replica. How exactly did you restart
the primary? What mode - smart/fast/immediate?

>So maybe something's wrong with the replica database (maybe because the
>connection got killed by the walsender at unfortunate time), rather
>than the original database, because I can replicate the original DB
>afresh into a new copy just fine and other databases continue
>replicating just fine if I disable the crashing subscription.
>

Possibly, but what would be the damaged bit? The only thing I can think
of is the replication slot info (i.e. snapshot), and I know there were
some timing issues in the serialization.

How far is the change from the restart point of the slot (visible in
pg_replication_slots)? If there are many changes since then, that'd mean
the corrupted snapshot is unlikely.
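
Something like this on the publisher should show it (assuming the slot is
still called l5_hometv):

   SELECT slot_name, restart_lsn, confirmed_flush_lsn
     FROM pg_replication_slots
    WHERE slot_name = 'l5_hometv';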

There are a lot of moving parts in this - you're replicating between major
versions, and from ARM to x86. All of that should work, of course, but
maybe there's a bug somewhere. So it might take time to investigate and
fix. Thanks for your patience ;-)

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Tomas Vondra
Дата:
On Thu, Nov 21, 2019 at 05:35:06PM +0100, Ondřej Jirman wrote:
>On Thu, Nov 21, 2019 at 05:09:06PM +0100, Tomas Vondra wrote:
>
>> On Thu, Nov 21, 2019 at 04:57:07PM +0100, Ondřej Jirman wrote:
>> > On Thu, Nov 21, 2019 at 04:27:32PM +0100, Tomas Vondra wrote:
>> > > On Thu, Nov 21, 2019 at 04:07:16PM +0100, Ondřej Jirman wrote:
>> > > > I've also managed to get valgrind output on the segfaulting process:
>> > > >
>> > > > https://megous.com/dl/tmp/valgrind-105524-109020.log
>> > > >
>> > > > Though it probably doesn't show much new.
>> > > >
>> > >
>> > > Is postgres actually compiled with valgrind support? What does pg_config
>> > > say?
>> >
>> > I don't see any valgrind related options in configure --help, I repackaged it
>> > just with CFLAGS set to -ggdb -Og.
>> >
>>
>> You need to use -DUSE_VALGRIND or edit pg_config_manual.h. I usually
>> just use
>>
>>   CFLAGS="-DRANDOMIZE_ALLOCATED_MEMORY -DUSE_VALGRIND -O0 -ggdb"
>>
>> which does the trick. Without the USE_VALGRIND defined, we don't include
>> the valgrind instrumentation, so I suppose the previous results are bogus.
>
>Two new valgrind logs, now with -DUSE_VALGRIND:
>
>https://megous.com/dl/tmp/valgrind-172090.log
>https://megous.com/dl/tmp/valgrind-172143.log
>

Unfortunately, I only get HTTP 403 :-(

>A lots of uninitialized 1b accesses around VARATT_IS_EXTERNAL_EXPANDED.
>
>So slot->tts_values[natt] is probably uninitialized.
>
>Which is why the backtrace had a weird value there, I guess.
>

Possibly. Something is clearly wrong.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Ondřej Jirman
Дата:
On Thu, Nov 21, 2019 at 05:57:52PM +0100, Tomas Vondra wrote:
> On Thu, Nov 21, 2019 at 05:15:02PM +0100, Ondřej Jirman wrote:
> > On Thu, Nov 21, 2019 at 04:57:07PM +0100, Ondřej Jirman wrote:
> > > 
> > > Maybe it has something to do with my upgrade method. I
> > > dumped/restored the replica with pg_dumpall, and then just proceded
> > > to enable subscription and refresh publication with (copy_data=false)
> > > for all my subscriptions.
> > 
> > OTOH, it may not. There are 2 more databases replicated the same way
> > from the same database cluster, and they don't crash the replica
> > server, and continue replicating. The one of the other databases also
> > has bytea columns in some of the tables.
> > 
> > It really just seems related to the machine restart (a regular one)
> > that I did on the primary, minutes later replica crashed, and kept
> > crashing ever since whenever connecting to the primary for the hometv
> > database.
> > 
> 
> Hmmm. A restart of the primary certainly should not cause any such
> damage, that'd be a bug too. And it'd be a bit strange that it correctly
> sends the data and it crashes the replica. How exactly did you restart
> the primary? What mode - smart/fast/immediate?

Looks like Arch Linux uses fast mode. KillSignal=SIGINT

Primary:
-------

23:22:31 - primary machine booted so postgresql shutdown was ~15s prior
23:22:53 - primary cluster started (saying
           LOG:  database system was shut down at 2019-11-20 23:22:50 CET)

(there was some RTC inaccuracy, corrected by NTP earlier during boot)
lis 20 23:22:36 l5 systemd-timesyncd[312]: System clock time unset or jumped backwards, restoring from recorded timestamp: Wed 2019-11-20 23:22:50 CET
 

postgres[355]: 2019-11-20 23:22:58.106 CET [386] LOG:  starting logical decoding for slot "l5_hometv"

And this is when the replica crashed:

postgres[355]: 2019-11-20 23:28:52.489 CET [386] LOG:  could not receive data from client: Spojení zrušeno druhou
stranou
postgres[355]: 2019-11-20 23:28:52.489 CET [386] LOG:  unexpected EOF on standby connection

Replica:
-------

2019-11-20 15:35:32.910 CET [1480049] LOG:  logical replication apply worker for subscription "l3_weatherlog" has started
2019-11-20 15:35:32.921 CET [1480050] LOG:  logical replication apply worker for subscription "l5_hometv" has started
2019-11-20 15:35:32.931 CET [1480051] LOG:  logical replication apply worker for subscription "l5_ledger_dana" has started
2019-11-20 15:35:32.942 CET [1480052] LOG:  logical replication apply worker for subscription "l5_news" has started

2019-11-20 23:22:50.315 CET [1480050] LOG:  data stream from publisher has ended
2019-11-20 23:22:50.315 CET [1480050] ERROR:  could not send end-of-streaming message to primary: COPY se neprovádí
2019-11-20 23:22:50.318 CET [178527] LOG:  background worker "logical replication worker" (PID 1480050) exited with exit code 1
 
2019-11-20 23:22:50.320 CET [1480051] LOG:  data stream from publisher has ended
2019-11-20 23:22:50.320 CET [1480051] ERROR:  could not send end-of-streaming message to primary: COPY se neprovádí
2019-11-20 23:22:50.321 CET [178527] LOG:  background worker "logical replication worker" (PID 1480051) exited with exit code 1
 
2019-11-20 23:22:50.321 CET [2195905] LOG:  logical replication apply worker for subscription "l5_hometv" has started
2019-11-20 23:22:50.325 CET [1480052] LOG:  data stream from publisher has ended
2019-11-20 23:22:50.325 CET [1480052] ERROR:  could not send end-of-streaming message to primary: COPY se neprovádí
2019-11-20 23:22:50.326 CET [178527] LOG:  background worker "logical replication worker" (PID 1480052) exited with exit code 1
 
2019-11-20 23:22:50.395 CET [2195905] ERROR:  could not connect to the publisher: FATAL:  the database system is shutting down
 
        FATAL:  the database system is shutting down
2019-11-20 23:22:50.396 CET [178527] LOG:  background worker "logical replication worker" (PID 2195905) exited with exit code 1
 
2019-11-20 23:22:55.401 CET [2195919] LOG:  logical replication apply worker for subscription "l5_hometv" has started

nothing in the log, and then the crash:

Nov 20 23:28:47 core kernel: postgres[2195919]: segfault at 562d3bc56291 ip 00007faa0a760917 sp 00007ffd1f93f0c8 error 4 in libc-2.30.so (deleted)[7faa0a6dd000+14d000]
 
2019-11-20 23:28:52.488 CET [178527] LOG:  background worker "logical replication worker" (PID 2195919) was terminated by signal 11: Neoprávněný přístup do pam>
 
2019-11-20 23:28:52.488 CET [178527] LOG:  terminating any other active server processes
2019-11-20 23:28:52.488 CET [2195470] WARNING:  terminating connection because of crash of another server process
2019-11-20 23:28:52.488 CET [2195470] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another >
 

....

then it restarts:

2019-11-20 23:28:52.505 CET [178527] LOG:  all server processes terminated; reinitializing
2019-11-20 23:28:52.720 CET [2279954] LOG:  database system was interrupted; last known up at 2019-11-20 23:27:33 CET
2019-11-20 23:28:54.457 CET [2279954] LOG:  recovered replication state of node 1 to 0/3BBF52C8
2019-11-20 23:28:54.457 CET [2279954] LOG:  recovered replication state of node 2 to 2/BBCE38C0
2019-11-20 23:28:54.457 CET [2279954] LOG:  recovered replication state of node 3 to 0/0
2019-11-20 23:28:54.457 CET [2279954] LOG:  recovered replication state of node 4 to 2/BBD75448
2019-11-20 23:28:54.457 CET [2279954] LOG:  recovered replication state of node 5 to 0/55503B08
2019-11-20 23:28:54.457 CET [2279954] LOG:  recovered replication state of node 6 to 0/0
2019-11-20 23:28:54.457 CET [2279954] LOG:  recovered replication state of node 7 to 0/5561CE80
2019-11-20 23:28:54.457 CET [2279954] LOG:  recovered replication state of node 8 to 0/0
2019-11-20 23:28:54.457 CET [2279954] LOG:  recovered replication state of node 9 to 0/0
2019-11-20 23:28:54.457 CET [2279954] LOG:  recovered replication state of node 10 to 0/0
2019-11-20 23:28:54.457 CET [2279954] LOG:  database system was not properly shut down; automatic recovery in progress

did a redo once:

2019-11-20 23:28:54.463 CET [2279954] LOG:  redo starts at 8/6D794050
2019-11-20 23:28:54.464 CET [2279954] LOG:  invalid record length at 8/6D795370: wanted 24, got 0
2019-11-20 23:28:54.464 CET [2279954] LOG:  redo done at 8/6D795338

and went into an endless crash restart loop

> > So maybe something's wrong with the replica database (maybe because the
> > connection got killed by the walsender at unfortunate time), rather
> > than the original database, because I can replicate the original DB
> > afresh into a new copy just fine and other databases continue
> > replicating just fine if I disable the crashing subscription.
> > 
> 
> Possibly, but what would be the damaged bit? The only thing I can think
> of is the replication slot info (i.e. snapshot), and I know there were
> some timing issues in the serialization.
> 
> How far is the change from the restart point of the slot (visible in
> pg_replication_slots)? If there are many changes since then, that'd mean
> the corrupted snapshot is unlikely.

It's the same.

postgres=# select * from pg_replication_slots;

   slot_name   |  plugin  | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
---------------+----------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------
 l5_news       | pgoutput | logical   |  16386 | news     | f         | f      |            |      |        56463 | 2/BBF679B8  | 2/BBF67B78
 l5_hometv     | pgoutput | logical   |  19030 | hometv   | f         | f      |            |      |        56395 | 2/BBD86E68  | 2/BBD86E68
 l5_hometv2    | pgoutput | logical   |  19030 | hometv   | f         | f      |            |      |        56463 | 2/BBF67B40  | 2/BBF67B78

> There's a lot of moving parts in this - you're replicating between major
> versions, and from ARM to x86. All of that should work, of course, but
> maybe there's a bug somewhere. So it might take time to investigate and
> fix. Thanks for you patience ;-)

Yeah, I realize that. :) Thanks for your help.

One missing piece is what exactly the contents of the outstanding output from
the pgoutput plugin are - the output that the replica crashes on and doesn't
apply. Are there any tools for inspecting the binary output from the pgoutput
plugin? Maybe that can provide a clue.
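
One thing I might try is peeking at the raw stream on the publisher with
pg_logical_slot_peek_binary_changes while the subscription is disabled;
pgoutput needs the proto_version and publication_names options, and the
publication name below is just a placeholder:

   SELECT lsn, xid, octet_length(data) AS len,
          substr(encode(data, 'hex'), 1, 60) AS head
     FROM pg_logical_slot_peek_binary_changes(
            'l5_hometv', NULL, NULL,
            'proto_version', '1', 'publication_names', 'mypub');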

regards,
    o.

> regards
> 
> -- 
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Ondřej Jirman
Дата:
On Thu, Nov 21, 2019 at 06:01:52PM +0100, Tomas Vondra wrote:
> On Thu, Nov 21, 2019 at 05:35:06PM +0100, Ondřej Jirman wrote:
> > On Thu, Nov 21, 2019 at 05:09:06PM +0100, Tomas Vondra wrote:
> > 
> > > On Thu, Nov 21, 2019 at 04:57:07PM +0100, Ondřej Jirman wrote:
> > > > On Thu, Nov 21, 2019 at 04:27:32PM +0100, Tomas Vondra wrote:
> > > > > On Thu, Nov 21, 2019 at 04:07:16PM +0100, Ondřej Jirman wrote:
> > > > > > I've also managed to get valgrind output on the segfaulting process:
> > > > > >
> > > > > > https://megous.com/dl/tmp/valgrind-105524-109020.log
> > > > > >
> > > > > > Though it probably doesn't show much new.
> > > > > >
> > > > >
> > > > > Is postgres actually compiled with valgrind support? What does pg_config
> > > > > say?
> > > >
> > > > I don't see any valgrind related options in configure --help, I repackaged it
> > > > just with CFLAGS set to -ggdb -Og.
> > > >
> > > 
> > > You need to use -DUSE_VALGRIND or edit pg_config_manual.h. I usually
> > > just use
> > > 
> > >   CFLAGS="-DRANDOMIZE_ALLOCATED_MEMORY -DUSE_VALGRIND -O0 -ggdb"
> > > 
> > > which does the trick. Without the USE_VALGRIND defined, we don't include
> > > the valgrind instrumentation, so I suppose the previous results are bogus.
> > 
> > Two new valgrind logs, now with -DUSE_VALGRIND:
> > 
> > https://megous.com/dl/tmp/valgrind-172090.log
> > https://megous.com/dl/tmp/valgrind-172143.log
> > 
> 
> Unfortunately, I only get HTTP 403 :-(

Sorry for that, it should be accessible now.

> > A lots of uninitialized 1b accesses around VARATT_IS_EXTERNAL_EXPANDED.
> > 
> > So slot->tts_values[natt] is probably uninitialized.
> > 
> > Which is why the backtrace had a weird value there, I guess.
> > 
> 
> Possibly. Something is clearly wrong.
> 
> 
> regards
> 
> -- 
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Ondřej Jirman
Дата:
On Thu, Nov 21, 2019 at 06:35:55PM +0100, Ondřej Jirman wrote:
> One missing piece is what exactly is the contents of the outstanding output from
> pgoutput plugin, that the replica crashes on and doesn't apply. Are there any
> tools for inspecting the binary output from the pgoutput plugin? Maybe that can
> provide a clue.

So I've looked at it manually, and the segfaulting transaction doesn't make
much sense to me. On the primary a row with a 37880-byte cover_image was
inserted, but the pgoutput plugin sends these records:

B
R - public.videos  title cover_image metadata,...
U - new row where cover_image tuple doesn't have any data, just a flag ('u')
C

Which means 'unchanged toast column' according to logical/proto.c

Yet cover_image is defined as not null.

I guess it's some special handling for VARATT_IS_EXTERNAL_ONDISK.

To me this looks like the transaction is assuming the data for the bytea
column cover_image has already been replicated earlier?

Maybe it is not, though? It's certainly breaking some assumption on the
replica, because this is the precise point where the segfault happens.
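
One thing that might help confirm the TOAST angle is checking the column's
storage mode and toast relation on the primary, e.g. (just a sketch):

   SELECT a.attname, a.attstorage, c.reltoastrelid::regclass AS toast_table
     FROM pg_attribute a
     JOIN pg_class c ON c.oid = a.attrelid
    WHERE a.attrelid = 'public.videos'::regclass
      AND a.attname = 'cover_image';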

regards,
    o.

> regards,
>     o.
> 
> > regards
> > 
> > -- 
> > Tomas Vondra                  http://www.2ndQuadrant.com
> > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Tomas Vondra
Дата:
On Thu, Nov 21, 2019 at 07:58:44PM +0100, Ondřej Jirman wrote:
>On Thu, Nov 21, 2019 at 06:35:55PM +0100, Ondřej Jirman wrote:
>> One missing piece is what exactly is the contents of the outstanding output from
>> pgoutput plugin, that the replica crashes on and doesn't apply. Are there any
>> tools for inspecting the binary output from the pgoutput plugin? Maybe that can
>> provide a clue.
>
>So I've looked at it manually, and the segfaulting transaction doesn't make
>much sense to me. On primary a row with 37880 byte cover_image was inserted,
>but pgoutput plugin sends these records:
>
>B
>R - public.videos  title cover_image metadata,...
>U - new row where cover_image tuple doesn't have any data, just a flag ('u')
>C
>
>Which means 'unchanged toast column' according to logical/proto.c
>
>Yet cover_image is defined as not null.
>
>I guess it's some special handling for VARATT_IS_EXTERNAL_ONDISK.

Actually, no. The second message is 'U', so an update, not an insert. And we
don't send TOAST values (oversized values, stored aside) that were not
updated. Which is the case here, because the cover_image was not updated (as
implied by the 'u' flag).

For the record, the decoded data looks like this:

1) relation structure (8 columns)

R, 19048, public, videos, d, 8,
   (true, id, 23 => int4, 0xffffffff)
   (false, title, 25 => text, 0xffffffff)
   (false, cover_image, 17 => bytea, 0xffffffff)
   (false, metadata, 3802 => jsonb, 0xffffffff)
   (false, category, 23 => int4, 0xffffffff)
   (false, published, 1082 => date, 0xffffffff)
   (false, added, 1114 => timestamp, 0xffffffff)
   (false, played, 16 => bool, 0xffffffff)

2) update, with the TOAST value undefined/unchanged

U, 19048, N, 8,
   (4, 8393),
   (86, Pazderka: O média se povede boj. Kdyby nás politici milovali, bylo by něco špatně),
   u, /* unchanged bytea */
   (313, {.. json ..}),
   (3, 242),
   (10, 2019-11-18),
   (26, 2019-11-18 12:50:01.816376),
   (1, t)

So it's perfectly well formatted, I think.

>
>To me this looks like this transaction is assuming data for bytea column
>cover_image are already replicated from earlier?
>
>Maybe it is not though? It's certainly breaking some assumption on the
>replica, because this is the precise point where segfault happens.
>

I think this more and more seems like some sort of bug in the apply
code, not resetting a NULL flag and interpreting the bogus value as
a valid pointer. Wouldn't be the first such bug.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Tomas Vondra
Дата:
On Thu, Nov 21, 2019 at 10:41:15PM +0100, Tomas Vondra wrote:
>On Thu, Nov 21, 2019 at 07:58:44PM +0100, Ondřej Jirman wrote:
>>On Thu, Nov 21, 2019 at 06:35:55PM +0100, Ondřej Jirman wrote:
>>>One missing piece is what exactly is the contents of the outstanding output from
>>>pgoutput plugin, that the replica crashes on and doesn't apply. Are there any
>>>tools for inspecting the binary output from the pgoutput plugin? Maybe that can
>>>provide a clue.
>>
>>So I've looked at it manually, and the segfaulting transaction doesn't make
>>much sense to me. On primary a row with 37880 byte cover_image was inserted,
>>but pgoutput plugin sends these records:
>>
>>B
>>R - public.videos  title cover_image metadata,...
>>U - new row where cover_image tuple doesn't have any data, just a flag ('u')
>>C
>>
>>Which means 'unchanged toast column' according to logical/proto.c
>>
>>Yet cover_image is defined as not null.
>>
>>I guess it's some special handling for VARATT_IS_EXTERNAL_ONDISK.
>
>Actually, no. The second message is 'U' so update, not insert. And we
>don't send TOAST values (oversized values, stored aside) that were not
>updated. Which is this case, because the cover_image was not updated (as
>implied by the 'u' flag).
>
>For the record, the decoded data looks like this:
>
>1) relation structure (8 columns)
>
>R, 19048, public, videos, d, 8,
>  (true, id, 23 => int4, 0xffffffff)
>  (false, title, 25 => text, 0xffffffff)
>  (false, cover_image, 17 => bytea, 0xffffffff)
>  (false, metadata, 3802 => jsonb, 0xffffffff)
>  (false, category, 23 => int4, 0xffffffff)
>  (false, published, 1082 => date, 0xffffffff)
>  (false, added, 1114 => timestamp, 0xffffffff)
>  (false, played, 16 => uuid, 0xffffffff)
>
>2) update, with the TOAST value undefined/unchanged
>
>U, 19048, N, 8,
>  (4, 8393),
>  (86, Pazderka: O média se povede boj. Kdyby nás politici milovali, bylo by něco špatně),
>  u, /* unchanged bytea */
>  (313, {.. json ..}),
>  (3, 242),
>  (10, 2019-11-18),
>  (26, 2019-11-18 12:50:01.816376),
>  (1, t)
>
>So it's perfectly well formatted, I think.
>
>>
>>To me this looks like this transaction is assuming data for bytea column
>>cover_image are already replicated from earlier?
>>
>>Maybe it is not though? It's certainly breaking some assumption on the
>>replica, because this is the precise point where segfault happens.
>>
>
>I think this more and more seems like some sort of bug in the apply
>code, not resetting a NULL flag and interpreting the bogus value as
>a valid pointer. Wouldn't the be the first such bug.
>

FWIW my hunch is the bug is somewhere in this chunk of code from
apply_handle_update:

     if (found)
     {
         /* Process and store remote tuple in the slot */
         oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
         ExecCopySlot(remoteslot, localslot);
         slot_modify_cstrings(remoteslot, rel, newtup.values, newtup.changed);
         MemoryContextSwitchTo(oldctx);

         EvalPlanQualSetSlot(&epqstate, remoteslot);

         /* Do the actual update. */
         ExecSimpleRelationUpdate(estate, &epqstate, localslot, remoteslot);
     }

Can you please switch to this frame in the backtrace and:

1) Print newtup.values and newtup.changed, so that we can check they match
the tuple I shared before? There should be 8 elements in each, with 'values'
being the values formatted as text, and 'changed' being flags for which
values were updated.

2) Print remoteslot->tts_tupleDescriptor.

3) Print rel->attrmap (should have 8 elements)

4) Print remoteslot->tts_values and remoteslot->tts_isnull (8 elements).


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Tom Lane
Дата:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> FWIW my hunch is the bug is somewhere in this chunk of code from
> apply_heap_update:

>          oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
>          ExecCopySlot(remoteslot, localslot);
>          slot_modify_cstrings(remoteslot, rel, newtup.values, newtup.changed);
>          MemoryContextSwitchTo(oldctx);

My first reaction to looking at this was that ExecCopySlot() could
be dropped entirely, because the first thing that slot_modify_cstrings
does is

    slot_getallattrs(slot);
    ExecClearTuple(slot);

That slot_getallattrs call seems 100% useless as well, because
surely ExecClearTuple is just going to drop all the data.

On closer inspection, though, it looks like the author of this
code is relying on ExecClearTuple to *not* destroy the data,
because the subsequent loop might overwrite only some of the
slot's columns, before it does

    ExecStoreVirtualTuple(slot);

which supposes that all the columns are valid.

But of course this is 100% broken, because whatever ExecClearTuple
may have done or not done to the slot's datum/isnull arrays, it
certainly pfree'd the slot's physical tuple.  So any pass-by-ref
datums are now dangling pointers.

I imagine the only reason this code has gotten past the valgrind
animals is that we're not testing cases where non-replaced columns
in the subscriber table are of pass-by-ref types.

I draw a direct line between the existence of this bug and the lack
of commentary in slot_modify_cstrings about what it's doing.  If the
author had troubled to write a comment explaining these machinations,
he might've noticed the bug, or at least reviewers might have.

            regards, tom lane



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Tom Lane
Дата:
I wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> FWIW my hunch is the bug is somewhere in this chunk of code from
>> apply_heap_update:
>> oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
>> ExecCopySlot(remoteslot, localslot);
>> slot_modify_cstrings(remoteslot, rel, newtup.values, newtup.changed);
>> MemoryContextSwitchTo(oldctx);

> I imagine the only reason this code has gotten past the valgrind
> animals is that we're not testing cases where non-replaced columns
> in the subscriber table are of pass-by-ref types.

Actually, it doesn't appear to me that we're testing this with
any non-replaced columns at all.  The test modifications in the
attached proposed patch add that.  For me, the unpatched code
doesn't crash with this test, but the non-replaced column reads
back as empty which is certainly wrong.  Valgrind would likely
complain too, but I didn't try it.

            regards, tom lane

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index ff62303..9c06b67 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -363,13 +363,19 @@ slot_store_cstrings(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
 }

 /*
- * Modify slot with user data provided as C strings.
+ * Replace selected columns with user data provided as C strings.
  * This is somewhat similar to heap_modify_tuple but also calls the type
- * input function on the user data as the input is the text representation
- * of the types.
+ * input functions on the user data.
+ * "slot" is filled with a copy of the tuple in "srcslot", with
+ * columns selected by the "replaces" array replaced with data values
+ * from "values".
+ * Caution: unreplaced pass-by-ref columns in "slot" will point into the
+ * storage for "srcslot".  This is OK for current usage, but someday we may
+ * need to materialize "slot" at the end to make it independent of "srcslot".
  */
 static void
-slot_modify_cstrings(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
+slot_modify_cstrings(TupleTableSlot *slot, TupleTableSlot *srcslot,
+                     LogicalRepRelMapEntry *rel,
                      char **values, bool *replaces)
 {
     int            natts = slot->tts_tupleDescriptor->natts;
@@ -377,9 +383,18 @@ slot_modify_cstrings(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
     SlotErrCallbackArg errarg;
     ErrorContextCallback errcallback;

-    slot_getallattrs(slot);
+    /* We'll fill "slot" with a virtual tuple, so we must start with ... */
     ExecClearTuple(slot);

+    /*
+     * Transfer all the column data from srcslot, so that we have valid values
+     * for unreplaced columns.
+     */
+    Assert(natts == srcslot->tts_tupleDescriptor->natts);
+    slot_getallattrs(srcslot);
+    memcpy(slot->tts_values, srcslot->tts_values, natts * sizeof(Datum));
+    memcpy(slot->tts_isnull, srcslot->tts_isnull, natts * sizeof(bool));
+
     /* Push callback + info on the error context stack */
     errarg.rel = rel;
     errarg.local_attnum = -1;
@@ -428,6 +443,7 @@ slot_modify_cstrings(TupleTableSlot *slot, LogicalRepRelMapEntry *rel,
     /* Pop the error context stack */
     error_context_stack = errcallback.previous;

+    /* And finally, declare that "slot" contains a valid virtual tuple */
     ExecStoreVirtualTuple(slot);
 }

@@ -740,8 +756,8 @@ apply_handle_update(StringInfo s)
     {
         /* Process and store remote tuple in the slot */
         oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
-        ExecCopySlot(remoteslot, localslot);
-        slot_modify_cstrings(remoteslot, rel, newtup.values, newtup.changed);
+        slot_modify_cstrings(remoteslot, localslot, rel,
+                             newtup.values, newtup.changed);
         MemoryContextSwitchTo(oldctx);

         EvalPlanQualSetSlot(&epqstate, remoteslot);
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 40e306a..116e487 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 17;
+use Test::More tests => 18;

 # Initialize publisher node
 my $node_publisher = get_new_node('publisher');
@@ -45,7 +45,7 @@ $node_subscriber->safe_psql('postgres',

 # different column count and order than on publisher
 $node_subscriber->safe_psql('postgres',
-    "CREATE TABLE tab_mixed (c text, b text, a int primary key)");
+    "CREATE TABLE tab_mixed (c text default 'local', b text, a int primary key)");

 # replication of the table with included index
 $node_subscriber->safe_psql('postgres',
@@ -114,8 +114,8 @@ is($result, qq(20|-20|-1), 'check replicated changes on subscriber');

 $result =
   $node_subscriber->safe_psql('postgres', "SELECT c, b, a FROM tab_mixed");
-is( $result, qq(|foo|1
-|bar|2), 'check replicated changes with different column order');
+is( $result, qq(local|foo|1
+local|bar|2), 'check replicated changes with different column order');

 $result = $node_subscriber->safe_psql('postgres',
     "SELECT count(*), min(a), max(a) FROM tab_include");
@@ -139,11 +139,14 @@ $node_publisher->safe_psql('postgres',
     "ALTER TABLE tab_ins REPLICA IDENTITY FULL");
 $node_subscriber->safe_psql('postgres',
     "ALTER TABLE tab_ins REPLICA IDENTITY FULL");
+# tab_mixed can use DEFAULT, since it has a primary key

 # and do the updates
 $node_publisher->safe_psql('postgres', "UPDATE tab_full SET a = a * a");
 $node_publisher->safe_psql('postgres',
     "UPDATE tab_full2 SET x = 'bb' WHERE x = 'b'");
+$node_publisher->safe_psql('postgres',
+    "UPDATE tab_mixed SET b = 'baz' WHERE a = 1");

 $node_publisher->wait_for_catchup('tap_sub');

@@ -159,6 +162,12 @@ bb
 bb),
     'update works with REPLICA IDENTITY FULL and text datums');

+$result = $node_subscriber->safe_psql('postgres',
+    "SELECT * FROM tab_mixed ORDER BY a");
+is( $result, qq(local|baz|1
+local|bar|2),
+    'update works with different column order and subscriber local values');
+
 # check that change of connection string and/or publication list causes
 # restart of subscription workers. Not all of these are registered as tests
 # as we need to poll for a change but the test suite will fail none the less

Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Tomas Vondra
Дата:
On Thu, Nov 21, 2019 at 05:37:25PM -0500, Tom Lane wrote:
>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> FWIW my hunch is the bug is somewhere in this chunk of code from
>> apply_heap_update:
>
>>          oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
>>          ExecCopySlot(remoteslot, localslot);
>>          slot_modify_cstrings(remoteslot, rel, newtup.values, newtup.changed);
>>          MemoryContextSwitchTo(oldctx);
>
>My first reaction to looking at this was that ExecCopySlot() could
>be dropped entirely, because the first thing that slot_modify_cstrings
>does is
>
>    slot_getallattrs(slot);
>    ExecClearTuple(slot);
>
>That slot_getallattrs call seems 100% useless as well, because
>surely ExecClearTuple is just going to drop all the data.
>

I don't think that's quite true. After the ExecCopySlot call, the
pass-by-ref Datums in remoteslot will point to a tuple attached to
localslot. But it does not pass the tuple 'ownership' to the remoteslot,
i.e. the flag TTS_FLAG_SHOULDFREE won't be set, i.e. the tuple won't be
freed.

>On closer inspection, though, it looks like the author of this
>code is relying on ExecClearTuple to *not* destroy the data,
>because the subsequent loop might overwrite only some of the
>slot's columns, before it does
>
>    ExecStoreVirtualTuple(slot);
>
>which supposes that all the columns are valid.
>
>But of course this is 100% broken, because whatever ExecClearTuple
>may have done or not done to the slot's datum/isnull arrays, it
>certainly pfree'd the slot's physical tuple.  So any pass-by-ref
>datums are now dangling pointers.
>
>I imagine the only reason this code has gotten past the valgrind
>animals is that we're not testing cases where non-replaced columns
>in the subscriber table are of pass-by-ref types.
>

I haven't checked, but I'd imagine we actually do such tests. I've
however tried to reproduce this, unsuccessfully.

>I draw a direct line between the existence of this bug and the lack
>of commentary in slot_modify_cstrings about what it's doing.  If the
>author had troubled to write a comment explaining these machinations,
>he might've noticed the bug, or at least reviewers might have.
>

Not sure. More comments would not hurt, but I don't see any bug yet.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Tom Lane
Дата:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On Thu, Nov 21, 2019 at 05:37:25PM -0500, Tom Lane wrote:
> I don't think that's quite true. After the ExecCopySlot call, the
> pass-by-ref Datums in remoteslot will point to a tuple attached to
> localslot. But it does not pass the tuple 'ownership' to the remoteslot,
> i.e. the flag TTS_FLAG_SHOULDFREE won't be set, i.e. the tuple won't be
> freed.

Nope:

static void
tts_heap_copyslot(TupleTableSlot *dstslot, TupleTableSlot *srcslot)
{
    HeapTuple    tuple;
    MemoryContext oldcontext;

    oldcontext = MemoryContextSwitchTo(dstslot->tts_mcxt);
    tuple = ExecCopySlotHeapTuple(srcslot);
    MemoryContextSwitchTo(oldcontext);

    ExecStoreHeapTuple(tuple, dstslot, true);
}

"remoteslot" will contain its very own copy of the data, which
is then summarily freed by ExecClearSlot.

>> I imagine the only reason this code has gotten past the valgrind
>> animals is that we're not testing cases where non-replaced columns
>> in the subscriber table are of pass-by-ref types.

> I haven't checked, but I'd imagine we actually do such tests. I've
> however tried to reproduce this, unsuccessfully.

I did check, and we don't.  See patch.

It's possible that the OP is seeing some different problem,
but I can definitely demonstrate that there is a problem
that this change fixes.

            regards, tom lane



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Tomas Vondra
Дата:
On Thu, Nov 21, 2019 at 06:56:47PM -0500, Tom Lane wrote:
>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> On Thu, Nov 21, 2019 at 05:37:25PM -0500, Tom Lane wrote:
>> I don't think that's quite true. After the ExecCopySlot call, the
>> pass-by-ref Datums in remoteslot will point to a tuple attached to
>> localslot. But it does not pass the tuple 'ownership' to the remoteslot,
>> i.e. the flag TTS_FLAG_SHOULDFREE won't be set, i.e. the tuple won't be
>> freed.
>
>Nope:
>
>static void
>tts_heap_copyslot(TupleTableSlot *dstslot, TupleTableSlot *srcslot)
>{
>    HeapTuple    tuple;
>    MemoryContext oldcontext;
>
>    oldcontext = MemoryContextSwitchTo(dstslot->tts_mcxt);
>    tuple = ExecCopySlotHeapTuple(srcslot);
>    MemoryContextSwitchTo(oldcontext);
>
>    ExecStoreHeapTuple(tuple, dstslot, true);
>}
>
>"remoteslot" will contain its very own copy of the data, which
>is then summarily freed by ExecClearSlot.
>

But remoteslot is virtual, so it calls tts_virtual_copyslot, not the
heap one. And AFAIK tts_virtual_copyslot only copies contents of the
tts_values/tts_isnull arrays.

>>> I imagine the only reason this code has gotten past the valgrind
>>> animals is that we're not testing cases where non-replaced columns
>>> in the subscriber table are of pass-by-ref types.
>
>> I haven't checked, but I'd imagine we actually do such tests. I've
>> however tried to reproduce this, unsuccessfully.
>
>I did check, and we don't.  See patch.
>
>It's possible that the OP is seeing some different problem,
>but I can definitely demonstrate that there is a problem
>that this change fixes.
>

Hmm. Will check.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Tom Lane
Дата:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On Thu, Nov 21, 2019 at 06:56:47PM -0500, Tom Lane wrote:
>> "remoteslot" will contain its very own copy of the data, which
>> is then summarily freed by ExecClearSlot.

> But remoteslot is virtual, so it calls tts_virtual_copyslot, not the
> heap one. And AFAIK tts_virtual_copyslot only copies contents of the
> tts_values/tts_isnull arrays.

Really?

static void
tts_virtual_copyslot(TupleTableSlot *dstslot, TupleTableSlot *srcslot)
{
    ...
    /* make sure storage doesn't depend on external memory */
    tts_virtual_materialize(dstslot);
}

In any case, I sure hope that Andres hasn't made the different
versions of ExecCopySlot have different semantics --- if he has,
he's got some explaining to do.  But at least on this point,
it looks to me like all three versions still satisfy the
semantics that were clearly defined previously (v11 and before):

/* --------------------------------
 *        ExecCopySlot
 *            Copy the source slot's contents into the destination slot.
 *
 *        The destination acquires a private copy that will not go away
 *        if the source is cleared.
 *
 *        The caller must ensure the slots have compatible tupdescs.
 * --------------------------------
 */

I'm sad that we seem to have lost this specification comment, though.

            regards, tom lane



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Ondřej Jirman
Дата:
On Thu, Nov 21, 2019 at 11:08:54PM +0100, Tomas Vondra wrote:
> > I think this more and more seems like some sort of bug in the apply
> > code, not resetting a NULL flag and interpreting the bogus value as
> > a valid pointer. Wouldn't the be the first such bug.
> > 
> 
> FWIW my hunch is the bug is somewhere in this chunk of code from
> apply_heap_update:
> 
>     if (found)
>     {
>         /* Process and store remote tuple in the slot */
>         oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
>         ExecCopySlot(remoteslot, localslot);
>         slot_modify_cstrings(remoteslot, rel, newtup.values, newtup.changed);
>         MemoryContextSwitchTo(oldctx);
> 
>         EvalPlanQualSetSlot(&epqstate, remoteslot);
> 
>         /* Do the actual update. */
>         ExecSimpleRelationUpdate(estate, &epqstate, localslot, remoteslot);
>     }
> 
> Can you please switch to this fram in the backtrace and:
> 
> 1) Print newtup.values and newtup.changed matches, so that we can check
> it contains the tuple I shared before? There should be 8 elements in
> each, with 'values' being values formatted as text, and 'changed' being
> flags which values were updated.

newtup.values

{0x55ac6506b488 "8393", 0x55ac6506b4a8 "Pazderka: O média se povede boj. Kdyby
nás politici milovali, bylo by něco špatně", 0x0, 0x55ac6506b540 "{\"id\":
\"d010d44009e211eab259ac1f6b220ee8\", \"date\": \"2019-11-18\", \"name\":
\"Pazderka: O média se povede boj. Kdyby nás politici milovali, bylo by něco
špatně\", \"image\": \"https://cdn.xsd.cz/resize/c"..., 0x55ac6506b758 "242",
0x55ac6506b778 "2019-11-18", 0x55ac6506b7a0 "2019-11-18 12:50:01.816376",
0x55ac6506b7d8 "t", ...garbage}

newtup.changed

{true, true, false, true, true, true, true, true, false <repeats 1656 times>}

Looks fine here.

> 2) Print remoteslot->tts_tupleDescriptor.

{natts = 8, tdtypeid = 55242, tdtypmod = -1, tdrefcount = 3, constr = 0x7f2103867738, attrs = 0x7f2103868448}

> 3) Print rel->attrmap (should have 8 elements)

Has numbers 0 to 7.

> 4) Print remoteslot->tts_values and remoteslot->tts_isnull (8 elements).

(gdb) p remoteslot->tts_values[0] 
$26 = 8393
(gdb) p remoteslot->tts_values[1] 
$27 = 94198917676232
(gdb) p remoteslot->tts_values[2] 
$28 = 94198917676320
(gdb) p remoteslot->tts_values[3] 
$29 = 94198918028352
(gdb) p remoteslot->tts_values[4] 
$30 = 242
(gdb) p remoteslot->tts_values[5] 
$31 = 7261
(gdb) p remoteslot->tts_values[6] 
$32 = 627396601816376
(gdb) p remoteslot->tts_values[7] 
$33 = 1

(gdb) p remoteslot->tts_isnull[0]
$36 = false
(gdb) p remoteslot->tts_isnull[1]
$37 = false
(gdb) p remoteslot->tts_isnull[2]
$38 = false
(gdb) p remoteslot->tts_isnull[3]
$39 = false
(gdb) p remoteslot->tts_isnull[4]
$40 = false
(gdb) p remoteslot->tts_isnull[5]
$41 = false
(gdb) p remoteslot->tts_isnull[6]
$42 = false
(gdb) p remoteslot->tts_isnull[7]
$43 = false

Nothing surprising to me here.

regards,
    o.

> 
> regards
> 
> -- 
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Tom Lane
Дата:
=?utf-8?Q?Ond=C5=99ej?= Jirman <ienieghapheoghaiwida@xff.cz> writes:
> On Thu, Nov 21, 2019 at 11:08:54PM +0100, Tomas Vondra wrote:
>> 1) Print newtup.values and newtup.changed matches, so that we can check
>> it contains the tuple I shared before? There should be 8 elements in
>> each, with 'values' being values formatted as text, and 'changed' being
>> flags which values were updated.

> newtup.changed

> {true, true, false, true, true, true, true, true, false <repeats 1656 times>}

So column 3 is not getting replaced.  That seems to jibe with my
theory that slot_modify_cstrings is failing to handle that case
correctly.  Can you try applying the patch I posted and see if
it fixes the issue for you?

            regards, tom lane



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Ondřej Jirman
Дата:
On Thu, Nov 21, 2019 at 07:45:34PM -0500, Tom Lane wrote:
> =?utf-8?Q?Ond=C5=99ej?= Jirman <ienieghapheoghaiwida@xff.cz> writes:
> > On Thu, Nov 21, 2019 at 11:08:54PM +0100, Tomas Vondra wrote:
> >> 1) Print newtup.values and newtup.changed matches, so that we can check
> >> it contains the tuple I shared before? There should be 8 elements in
> >> each, with 'values' being values formatted as text, and 'changed' being
> >> flags which values were updated.
> 
> > newtup.changed
> 
> > {true, true, false, true, true, true, true, true, false <repeats 1656 times>}
> 
> So column 3 is not getting replaced.  That seems to jibe with my
> theory that slot_modify_cstrings is failing to handle that case
> correctly.  Can you try applying the patch I posted and see if
> it fixes the issue for you?

With the patch the replication resumes, without a crash.

confirmed_flush_lsn is now the same for all my replication slots.

thank you and regards,
    o.

>             regards, tom lane



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Tomas Vondra
Дата:
On Thu, Nov 21, 2019 at 07:33:35PM -0500, Tom Lane wrote:
>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> On Thu, Nov 21, 2019 at 06:56:47PM -0500, Tom Lane wrote:
>>> "remoteslot" will contain its very own copy of the data, which
>>> is then summarily freed by ExecClearSlot.
>
>> But remoteslot is virtual, so it calls tts_virtual_copyslot, not the
>> heap one. And AFAIK tts_virtual_copyslot only copies contents of the
>> tts_values/tts_isnull arrays.
>
>Really?
>
>static void
>tts_virtual_copyslot(TupleTableSlot *dstslot, TupleTableSlot *srcslot)
>{
>    ...
>    /* make sure storage doesn't depend on external memory */
>    tts_virtual_materialize(dstslot);
>}
>

D'oh! I entirely missed that last call, for some reason. And yes, the
new test does show we're using freed memory, the \x7F\x7F\x7F\x7F...
pattern is pretty clear.

It's interesting that it happens only with mismatching descriptors, though.
(Which the OP does not have, AFAICS.) I wonder if this might be due to
upstream being 11.6 ...

>In any case, I sure hope that Andres hasn't made the different
>versions of ExecCopySlot have different semantics --- if he has,
>he's got some explaining to do.  But at least on this point,
>it looks to me like all three versions still satisfy the
>semantics that were clearly defined previously (v11 and before):
>
>/* --------------------------------
> *        ExecCopySlot
> *            Copy the source slot's contents into the destination slot.
> *
> *        The destination acquires a private copy that will not go away
> *        if the source is cleared.
> *
> *        The caller must ensure the slots have compatible tupdescs.
> * --------------------------------
> */
>
>I'm sad that we seem to have lost this specification comment, though.
>

True.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Tom Lane
Дата:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> It's a interesting it happens only with mismatching descriptors, though.
> (Which the OP does not have, AFAICS.) I wonder if this might be due to
> upstream being 11.6 ...

I was wondering about dropped columns on one or both sides, myself.
A quick look through src/test/subscription/t/ finds no evidence that
we test such cases.

            regards, tom lane



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Tomas Vondra
Дата:
On Fri, Nov 22, 2019 at 01:54:01AM +0100, Ondřej Jirman wrote:
>On Thu, Nov 21, 2019 at 07:45:34PM -0500, Tom Lane wrote:
>> =?utf-8?Q?Ond=C5=99ej?= Jirman <ienieghapheoghaiwida@xff.cz> writes:
>> > On Thu, Nov 21, 2019 at 11:08:54PM +0100, Tomas Vondra wrote:
>> >> 1) Print newtup.values and newtup.changed matches, so that we can check
>> >> it contains the tuple I shared before? There should be 8 elements in
>> >> each, with 'values' being values formatted as text, and 'changed' being
>> >> flags which values were updated.
>>
>> > newtup.changed
>>
>> > {true, true, false, true, true, true, true, true, false <repeats 1656 times>}
>>
>> So column 3 is not getting replaced.  That seems to jibe with my
>> theory that slot_modify_cstrings is failing to handle that case
>> correctly.  Can you try applying the patch I posted and see if
>> it fixes the issue for you?
>
>With the patch the replication resumes, without a crash.
>
>confirmed_flush_lsn is now same for all my replication slots.
>
>thank you and regards,

Can you show us the attribute list as defined in the system, including
e.g. dropped columns? That is, something like

   SELECT attnum, attname, atttypid FROM pg_attribute
    WHERE attrelid = 'public.videos'::regclass;

both from the publisher and the subscriber.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Tom Lane
Дата:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On Fri, Nov 22, 2019 at 01:54:01AM +0100, Ondřej Jirman wrote:
>> On Thu, Nov 21, 2019 at 07:45:34PM -0500, Tom Lane wrote:
>>> =?utf-8?Q?Ond=C5=99ej?= Jirman <ienieghapheoghaiwida@xff.cz> writes:
>>>> newtup.changed
>>>> {true, true, false, true, true, true, true, true, false <repeats 1656 times>}

>>> So column 3 is not getting replaced.

> Can you show us the attribute list as defined in the system, including
> e.g. dropped columns? That is, something like
>    SELECT attnum, attname, atttypid FROM pg_attribute
>     WHERE attrelid = 'public.videos'::regclass;
> both from the published and subscriber.

After digging around a bit more in the logrep code, it seems like we'd
only have a situation with newtup.changed[i] = false if the publisher
had hit this bit in logicalrep_write_tuple:

        else if (att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(values[i]))
        {
            pq_sendbyte(out, 'u');    /* unchanged toast column */
            continue;
        }

So this seems to indicate that Ondřej's case involves an update on a row
that contained a toasted-out-of-line column that did not get updated.
That's different from the trigger conditions that I postulated, but the
end result is the same: slot_modify_cstrings would have a case where
it's supposed to retain the old data for a pass-by-ref column, and it'd
fail to do that correctly.

I also discovered that before v12, the calling code looked like this:

        ExecStoreTuple(localslot->tts_tuple, remoteslot, InvalidBuffer, false);
        slot_modify_cstrings(remoteslot, rel, newtup.values, newtup.changed);

so that at that time, "remoteslot" indeed did not have its own local
copy of the tuple, and thus the code failed to fail.  Perhaps this was
intentional on the part of whoever wrote this, but it's still an
unacceptable abuse of the API if you ask me.

I added some tests involving dropped columns to what I had last night
and pushed it.   However, looking at this now, I see that we really
still don't have enough coverage --- the code path I showed above
is not hit by the regression tests, according to the code coverage
bot.  I don't think it's really acceptable for such a corner case
bit of the logrep protocol to not get tested :-(
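
Roughly, the sort of sequence that would exercise that path looks like this
(made-up table and column names, on a table that's part of a publication):

   CREATE TABLE t_toast (id int PRIMARY KEY, big bytea, flag bool);
   ALTER TABLE t_toast ALTER COLUMN big SET STORAGE EXTERNAL;  -- force out-of-line storage
   INSERT INTO t_toast VALUES (1, convert_to(repeat('x', 40000), 'UTF8'), false);
   UPDATE t_toast SET flag = true WHERE id = 1;

The update leaves the toasted column alone, so logicalrep_write_tuple sends
the 'u' marker for it.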

            regards, tom lane



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Ondřej Jirman
Дата:
On Fri, Nov 22, 2019 at 11:50:03AM -0500, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> > On Fri, Nov 22, 2019 at 01:54:01AM +0100, Ondřej Jirman wrote:
> >> On Thu, Nov 21, 2019 at 07:45:34PM -0500, Tom Lane wrote:
> >>> =?utf-8?Q?Ond=C5=99ej?= Jirman <ienieghapheoghaiwida@xff.cz> writes:
> >>>> newtup.changed
> >>>> {true, true, false, true, true, true, true, true, false <repeats 1656 times>}
> 
> >>> So column 3 is not getting replaced.
> 
> > Can you show us the attribute list as defined in the system, including
> > e.g. dropped columns? That is, something like
> >    SELECT attnum, attname, atttypid FROM pg_attribute
> >     WHERE attrelid = 'public.videos'::regclass;
> > both from the published and subscriber.
> 
> After digging around a bit more in the logrep code, it seems like we'd
> only have a situation with newtup.changed[i] = false if the publisher
> had hit this bit in logicalrep_write_tuple:
> 
>         else if (att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(values[i]))
>         {
>             pq_sendbyte(out, 'u');    /* unchanged toast column */
>             continue;
>         }
> 
> So this seems to indicate that Ondřej's case involves an update on a row
> that contained a toasted-out-of-line column that did not get updated.
> That's different from the trigger conditions that I postulated, but the
> end result is the same: slot_modify_cstrings would have a case where
> it's supposed to retain the old data for a pass-by-ref column, and it'd
> fail to do that correctly.

Yes, it was UPDATE videos SET played = TRUE; on a row that had a 38kB BYTEA
value in cover_image.

Thank you both, Tom and Tomas, for your help.

regards,
    o.

> I also discovered that before v12, the calling code looked like this:
> 
>         ExecStoreTuple(localslot->tts_tuple, remoteslot, InvalidBuffer, false);
>         slot_modify_cstrings(remoteslot, rel, newtup.values, newtup.changed);
> 
> so that at that time, "remoteslot" indeed did not have its own local
> copy of the tuple, and thus the code failed to fail.  Perhaps this was
> intentional on the part of whoever wrote this, but it's still an
> unacceptable abuse of the API if you ask me.
> 
> I added some tests involving dropped columns to what I had last night
> and pushed it.   However, looking at this now, I see that we really
> still don't have enough coverage --- the code path I showed above
> is not hit by the regression tests, according to the code coverage
> bot.  I don't think it's really acceptable for such a corner case
> bit of the logrep protocol to not get tested :-(
> 
>             regards, tom lane



Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

От
Ondřej Jirman
Дата:
On Fri, Nov 22, 2019 at 05:12:48PM +0100, Tomas Vondra wrote:
> On Fri, Nov 22, 2019 at 01:54:01AM +0100, Ondřej Jirman wrote:
> > On Thu, Nov 21, 2019 at 07:45:34PM -0500, Tom Lane wrote:
> > > =?utf-8?Q?Ond=C5=99ej?= Jirman <ienieghapheoghaiwida@xff.cz> writes:
> > > > On Thu, Nov 21, 2019 at 11:08:54PM +0100, Tomas Vondra wrote:
> > > >> 1) Print newtup.values and newtup.changed matches, so that we can check
> > > >> it contains the tuple I shared before? There should be 8 elements in
> > > >> each, with 'values' being values formatted as text, and 'changed' being
> > > >> flags which values were updated.
> > > 
> > > > newtup.changed
> > > 
> > > > {true, true, false, true, true, true, true, true, false <repeats 1656 times>}
> > > 
> > > So column 3 is not getting replaced.  That seems to jibe with my
> > > theory that slot_modify_cstrings is failing to handle that case
> > > correctly.  Can you try applying the patch I posted and see if
> > > it fixes the issue for you?
> > 
> > With the patch the replication resumes, without a crash.
> > 
> > confirmed_flush_lsn is now same for all my replication slots.
> > 
> > thank you and regards,
> 
> Can you show us the attribute list as defined in the system, including
> e.g. dropped columns? That is, something like
> 
>   SELECT attnum, attname, atttypid FROM pg_attribute
>    WHERE attrelid = 'public.videos'::regclass;
> 
> both from the published and subscriber.

I don't think there were any dropped columns, and the replica was always created
by schema copy + subscribe with copy_data=true. Anyway, the original database is
gone now (on both ends), so I can't get any more info.

regards,
    o.

> regards
> 
> -- 
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services