Re: skink's test_decoding failures in 9.4 branch

From Tom Lane
Subject Re: skink's test_decoding failures in 9.4 branch
Msg-id 11035.1469058247@sss.pgh.pa.us
In response to Re: skink's test_decoding failures in 9.4 branch  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> I guess either using valgrind's gdb server on error, or putting some
> asserts checking the size would be best. I can look into it, but it'll
> not be today likely.

I believe the problem is that DecodeUpdate is not on the same page as the
WAL-writing routines about how much data there is for an old_key_tuple.
Specifically, I see this in 9.4's log_heap_update():
       if (old_key_tuple)
       {
           ...
           xlhdr_idx.t_len = old_key_tuple->t_len;

           rdata[nr].data = (char *) old_key_tuple->t_data
               + offsetof(HeapTupleHeaderData, t_bits);
           rdata[nr].len = old_key_tuple->t_len
               - offsetof(HeapTupleHeaderData, t_bits);
           ...
       }

so that the amount of tuple data that's *actually* in WAL is
offsetof(HeapTupleHeaderData, t_bits) less than what t_len says.
However, over in DecodeUpdate, this is processed with
       xl_heap_header_len xlhdr;

       memcpy(&xlhdr, data, sizeof(xlhdr));
       ...
       datalen = xlhdr.t_len + SizeOfHeapHeader;
       ...
       DecodeXLogTuple(data, datalen, change->data.tp.oldtuple);

and what DecodeXLogTuple does is
   int            datalen = len - SizeOfHeapHeader;
   (so we're back to datalen == xlhdr.t_len)
   ...
   memcpy(((char *) tuple->tuple.t_data) + offsetof(HeapTupleHeaderData, t_bits),
          data + SizeOfHeapHeader,
          datalen);

so that we are copying offsetof(HeapTupleHeaderData, t_bits) too much
data from the WAL buffer.  Most of the time this doesn't hurt but it's
making valgrind complain, and on an unlucky day we might crash entirely.
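
To make the arithmetic concrete, here is a minimal sketch of the lengths
involved.  This is not PostgreSQL code: the function names are
hypothetical, and HDR_OFF is only an assumed stand-in for
offsetof(HeapTupleHeaderData, t_bits); the real value depends on the
struct layout of the build.

```c
#include <stddef.h>

/* ASSUMPTION: stand-in for offsetof(HeapTupleHeaderData, t_bits). */
#define HDR_OFF 23

/* Bytes the writer actually emits for an old key tuple whose t_len is
 * given: log_heap_update() skips the fixed header before t_bits. */
static size_t
wal_payload_len(size_t t_len)
{
    return t_len - HDR_OFF;
}

/* Bytes the decoder ends up copying: DecodeUpdate adds the xl_heap_header
 * size to t_len, and DecodeXLogTuple subtracts it again, so the memcpy
 * length is t_len itself. */
static size_t
decoder_copy_len(size_t t_len, size_t size_of_heap_header)
{
    size_t datalen = t_len + size_of_heap_header;   /* DecodeUpdate */

    return datalen - size_of_heap_header;           /* DecodeXLogTuple */
}

/* Bytes read past the end of the tuple data present in the record. */
static size_t
overread_bytes(size_t t_len, size_t size_of_heap_header)
{
    return decoder_copy_len(t_len, size_of_heap_header)
        - wal_payload_len(t_len);
}
```

Under these assumptions the overread does not depend on t_len or on the
header size passed in; it is always HDR_OFF bytes, which matches the
mismatch described above.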

I have not looked to see if the bug also exists in > 9.4.  Also, it's
not very clear to me whether other call sites for DecodeXLogTuple might
have related bugs.
        regards, tom lane


