Re: BUG #16129: Segfault in tts_virtual_materialize in logicalreplication worker

Поиск
Список
Период
Сортировка
От Ondřej Jirman
Тема Re: BUG #16129: Segfault in tts_virtual_materialize in logicalreplication worker
Дата
Msg-id 20191122173204.hevpm4hgpk2tiyl6@core.my.home
обсуждение исходный текст
Ответ на Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-bugs
On Fri, Nov 22, 2019 at 11:50:03AM -0500, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> > On Fri, Nov 22, 2019 at 01:54:01AM +0100, Ondřej Jirman wrote:
> >> On Thu, Nov 21, 2019 at 07:45:34PM -0500, Tom Lane wrote:
> >>> =?utf-8?Q?Ond=C5=99ej?= Jirman <ienieghapheoghaiwida@xff.cz> writes:
> >>>> newtup.changed
> >>>> {true, true, false, true, true, true, true, true, false <repeats 1656 times>}
> 
> >>> So column 3 is not getting replaced.
> 
> > Can you show us the attribute list as defined in the system, including
> > e.g. dropped columns? That is, something like
> >    SELECT attnum, attname, atttypid FROM pg_attribute
> >     WHERE attrelid = 'public.videos'::regclass;
> > both from the published and subscriber.
> 
> After digging around a bit more in the logrep code, it seems like we'd
> only have a situation with newtup.changed[i] = false if the publisher
> had hit this bit in logicalrep_write_tuple:
> 
>         else if (att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(values[i]))
>         {
>             pq_sendbyte(out, 'u');    /* unchanged toast column */
>             continue;
>         }
> 
> So this seems to indicate that Ondřej's case involves an update on a row
> that contained a toasted-out-of-line column that did not get updated.
> That's different from the trigger conditions that I postulated, but the
> end result is the same: slot_modify_cstrings would have a case where
> it's supposed to retain the old data for a pass-by-ref column, and it'd
> fail to do that correctly.

Yes, it was UPDATE videos SET played = TRUE; on a row that had 38kB BYTEA
value in cover_image.

Thank you both, Tom and Tomas, for your help.

regards,
    o.

> I also discovered that before v12, the calling code looked like this:
> 
>         ExecStoreTuple(localslot->tts_tuple, remoteslot, InvalidBuffer, false);
>         slot_modify_cstrings(remoteslot, rel, newtup.values, newtup.changed);
> 
> so that at that time, "remoteslot" indeed did not have its own local
> copy of the tuple, and thus the code failed to fail.  Perhaps this was
> intentional on the part of whoever wrote this, but it's still an
> unacceptable abuse of the API if you ask me.
> 
> I added some tests involving dropped columns to what I had last night
> and pushed it.   However, looking at this now, I see that we really
> still don't have enough coverage --- the code path I showed above
> is not hit by the regression tests, according to the code coverage
> bot.  I don't think it's really acceptable for such a corner case
> bit of the logrep protocol to not get tested :-(
> 
>             regards, tom lane



В списке pgsql-bugs по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker
Следующее
От: Ondřej Jirman
Дата:
Сообщение: Re: BUG #16129: Segfault in tts_virtual_materialize in logicalreplication worker