Re: uninitialized values in revised prepared xact code

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: uninitialized values in revised prepared xact code
Дата
Msg-id 544A5E9D.2090808@vmware.com
обсуждение исходный текст
Ответ на Re: uninitialized values in revised prepared xact code  (Bruce Momjian <bruce@momjian.us>)
Ответы Re: uninitialized values in revised prepared xact code
Список pgsql-hackers
On 10/11/2014 10:20 PM, Bruce Momjian wrote:
>
> Uh, was this fixed.  I see a cleanup commit for this C file, but this
> report is from June:
>
>     commit 07a4a93a0e35a778c77ffbbbc18de29e859e18f0
>     Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
>     Date:   Fri May 16 09:47:50 2014 +0300
>     
>         Initialize tsId and dbId fields in WAL record of COMMIT PREPARED.
>     
>         Commit dd428c79 added dbId and tsId to the xl_xact_commit struct but missed
>         that prepared transaction commits reuse that struct. Fix that.
>     
>         Because those fields were left unitialized, replaying a commit prepared WAL
>         record in a hot standby node would fail to remove the relcache init file.
>         That can lead to "could not open file" errors on the standby. Relcache init
>         file only needs to be removed when a system table/index is rewritten in the
>         transaction using two phase commit, so that should be rare in practice. In
>         HEAD, the incorrect dbId/tsId values are also used for filtering in logical
>         replication code, causing the transaction to always be filtered out.
>     
>         Analysis and fix by Andres Freund. Backpatch to 9.0 where hot standby was
>         introduced.

No, that was a different issue.

(more below)

> On Mon, Jun 30, 2014 at 11:58:59AM +0200, Andres Freund wrote:
>> Hi,
>>
>> I've just rerun valgrind for the first time in a while and saw the
>> following splat. My guess is it exists since bb38fb0d43c, but that's
>> blindly guessing:
>>
>> ==2049== Use of uninitialised value of size 8
>> ==2049==    at 0x4FE66D: EndPrepare (twophase.c:1063)
>> ==2049==    by 0x4F231B: PrepareTransaction (xact.c:2217)
>> ==2049==    by 0x4F2A38: CommitTransactionCommand (xact.c:2676)
>> ==2049==    by 0x79013E: finish_xact_command (postgres.c:2408)
>> ==2049==    by 0x78DE97: exec_simple_query (postgres.c:1062)
>> ==2049==    by 0x791FDD: PostgresMain (postgres.c:4010)
>> ==2049==    by 0x71B13B: BackendRun (postmaster.c:4113)
>> ==2049==    by 0x71A86D: BackendStartup (postmaster.c:3787)
>> ==2049==    by 0x71714C: ServerLoop (postmaster.c:1566)
>> ==2049==    by 0x716804: PostmasterMain (postmaster.c:1219)
>> ==2049==    by 0x679405: main (main.c:219)
>> ==2049==  Uninitialised value was created by a stack allocation
>> ==2049==    at 0x4FE16C: StartPrepare (twophase.c:942)
>> ==2049==
>> ==2049== Syscall param write(buf) points to uninitialised byte(s)
>> ==2049==    at 0x5C69640: __write_nocancel (syscall-template.S:81)
>> ==2049==    by 0x4FE6AE: EndPrepare (twophase.c:1064)
>> ==2049==    by 0x4F231B: PrepareTransaction (xact.c:2217)
>> ==2049==    by 0x4F2A38: CommitTransactionCommand (xact.c:2676)
>> ==2049==    by 0x79013E: finish_xact_command (postgres.c:2408)
>> ==2049==    by 0x78DE97: exec_simple_query (postgres.c:1062)
>> ==2049==    by 0x791FDD: PostgresMain (postgres.c:4010)
>> ==2049==    by 0x71B13B: BackendRun (postmaster.c:4113)
>> ==2049==    by 0x71A86D: BackendStartup (postmaster.c:3787)
>> ==2049==    by 0x71714C: ServerLoop (postmaster.c:1566)
>> ==2049==    by 0x716804: PostmasterMain (postmaster.c:1219)
>> ==2049==    by 0x679405: main (main.c:219)
>> ==2049==  Address 0x64694ed is 1,389 bytes inside a block of size 8,192 alloc'd
>> ==2049==    at 0x4C27B8F: malloc (vg_replace_malloc.c:298)
>> ==2049==    by 0x8E766E: AllocSetAlloc (aset.c:853)
>> ==2049==    by 0x8E8E04: MemoryContextAllocZero (mcxt.c:627)
>> ==2049==    by 0x8A54D3: AtStart_Inval (inval.c:704)
>> ==2049==    by 0x4F1DFC: StartTransaction (xact.c:1841)
>> ==2049==    by 0x4F28D1: StartTransactionCommand (xact.c:2529)
>> ==2049==    by 0x7900A7: start_xact_command (postgres.c:2383)
>> ==2049==    by 0x78DAF4: exec_simple_query (postgres.c:860)
>> ==2049==    by 0x791FDD: PostgresMain (postgres.c:4010)
>> ==2049==    by 0x71B13B: BackendRun (postmaster.c:4113)
>> ==2049==    by 0x71A86D: BackendStartup (postmaster.c:3787)
>> ==2049==    by 0x71714C: ServerLoop (postmaster.c:1566)
>> ==2049==  Uninitialised value was created by a stack allocation
>> ==2049==    at 0x4FE16C: StartPrepare (twophase.c:942)
>>
>> It's probably just padding - twophase.c:1063 is the CRC32 computation of
>> the record data.

Yeah. The padding bytes in TwoPhaseFileHeader were not initialized.

That's simple enough to fix, but when I run valgrind, I get a lot whole 
bunch of similar messages. A few are from pgstat: the padding bytes in 
the pgstat messages are not initialized. One comes from 
write_relcache_init_file(); again I believe it's padding bytes being 
uninitialized (in FormData_pg_attribute). And one from the XLogInsert 
from heap_insert; there's an uninitialized padding byte in 
xl_heap_insert. And so forth.. Is it worthwhile to hunt down all of 
these? If there aren't many more than these, it probably is worth it, 
but I fear this might be an endless effort. Have we been clean of these 
warnings at any point in the past?

- Heikki




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: Deferring some AtStart* allocations?
Следующее
От: Heikki Linnakangas
Дата:
Сообщение: Re: pg_receivexlog --status-interval add fsync feedback