Re: Corruption during WAL replay

From Andres Freund
Subject Re: Corruption during WAL replay
Date
Msg-id 20220325034301.htu27xf54xjgyoca@alap3.anarazel.de
In response to Re: Corruption during WAL replay  (Andres Freund <andres@anarazel.de>)
Responses Re: Corruption during WAL replay  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

On 2022-03-24 19:43:02 -0700, Andres Freund wrote:
> Just to be sure I'm going to clean out serinus' ccache dir and rerun. I'll
> leave dragonet's alone for now.

Turns out they had the same dir. But it didn't help.

I haven't yet figured out why, but I now *am* able to reproduce the problem in
the buildfarm built tree.  Wonder if there's a path length issue or such
somewhere?

Either way, I can now manipulate the tests and still repro. I made the test
abort after the first failure.

hexedit shows that the file is modified, as we'd expect:
00000000   00 00 00 00  C0 01 5B 01  16 7D 00 00  A0 03 C0 03  00 20 04 20  00 00 00 00  00 00 00 00  00 00 00 00
......[..}........ ............
 
00000020   00 9F 38 00  80 9F 38 00  60 9F 38 00  40 9F 38 00  20 9F 38 00  00 9F 38 00  E0 9E 38 00  C0 9E 38 00
..8...8.`.8.@.8..8...8...8...8.
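
For orientation, the first 24 bytes of the dump above can be read as a page header. The following is only a minimal Python sketch, assuming the standard PageHeaderData field order (pd_lsn, pd_checksum, pd_flags, pd_lower, pd_upper, pd_special, pd_pagesize_version, pd_prune_xid) and a little-endian machine. `decode_page_header` is a hypothetical helper name, not part of any PostgreSQL tool.

```python
import struct

# First 24 bytes of the modified file, copied from the hexedit output above.
HEADER_HEX = (
    "00 00 00 00 C0 01 5B 01 16 7D 00 00 A0 03 C0 03 "
    "00 20 04 20 00 00 00 00"
)


def decode_page_header(raw: bytes) -> dict:
    """Decode a 24-byte page header (hypothetical helper, little-endian assumed)."""
    (xlogid, xrecoff, checksum, flags, lower, upper,
     special, pagesize_version, prune_xid) = struct.unpack("<IIHHHHHHI", raw[:24])
    return {
        "pd_lsn": f"{xlogid:X}/{xrecoff:X}",
        "pd_checksum": checksum,
        "pd_flags": flags,
        "pd_lower": lower,
        "pd_upper": upper,
        "pd_special": special,
        # The high byte carries the page size, the low byte the layout version.
        "page_size": pagesize_version & 0xFF00,
        "layout_version": pagesize_version & 0x00FF,
        "pd_prune_xid": prune_xid,
    }


header = decode_page_header(bytes.fromhex(HEADER_HEX))
for name, value in header.items():
    print(f"{name:15} {value}")
```

Under these assumptions the stored checksum is 0x7D16, and pd_lower = 928 would mean (928 - 24) / 4 = 226 line pointers, with the line pointer array starting at byte 24.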
 

And we are checking the right file:

bf@andres-postgres-edb-buildfarm-v1:~/build/buildfarm-serinus/HEAD/pgsql.build$
tmp_install/home/bf/build/buildfarm-serinus/HEAD/inst/bin/pg_checksums --check -D
/home/bf/build/buildfarm-serinus/HEAD/pgsql.build/src/bin/pg_checksums/tmp_check/t_002_actions_node_checksum_data/pgdata
--filenode 16391 -v
 
pg_checksums: checksums verified in file
"/home/bf/build/buildfarm-serinus/HEAD/pgsql.build/src/bin/pg_checksums/tmp_check/t_002_actions_node_checksum_data/pgdata/pg_tblspc/16387/PG_15_202203241/5/16391"
Checksum operation completed
Files scanned:   1
Blocks scanned:  45
Bad checksums:  0
Data checksum version: 1

If I twiddle further bits, I see that page failing checksum verification, as
expected.

I made the script copy the file before twiddling it around:
00000000   00 00 00 00  C0 01 5B 01  16 7D 00 00  A0 03 C0 03  00 20 04 20  00 00 00 00  E0 9F 38 00  C0 9F 38 00
......[..}........ ......8...8.
 
00000020   A0 9F 38 00  80 9F 38 00  60 9F 38 00  40 9F 38 00  20 9F 38 00  00 9F 38 00  E0 9E 38 00  C0 9E 38 00
..8...8.`.8.@.8..8...8...8...8.
 

So it's indeed modified.
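
To make the difference between the two dumps explicit, here is a rough Python sketch that compares the first 64 bytes of each copy, as shown in the two hexedit excerpts in this message. The 24-byte header size and the position of pd_checksum at bytes 8-9 are assumptions based on the standard page layout; the variable names are illustrative only.

```python
# First 64 bytes of each copy, taken from the two hexedit excerpts in this message.
MODIFIED_HEX = """
00 00 00 00 C0 01 5B 01 16 7D 00 00 A0 03 C0 03 00 20 04 20 00 00 00 00 00 00 00 00 00 00 00 00
00 9F 38 00 80 9F 38 00 60 9F 38 00 40 9F 38 00 20 9F 38 00 00 9F 38 00 E0 9E 38 00 C0 9E 38 00
"""
ORIGINAL_HEX = """
00 00 00 00 C0 01 5B 01 16 7D 00 00 A0 03 C0 03 00 20 04 20 00 00 00 00 E0 9F 38 00 C0 9F 38 00
A0 9F 38 00 80 9F 38 00 60 9F 38 00 40 9F 38 00 20 9F 38 00 00 9F 38 00 E0 9E 38 00 C0 9E 38 00
"""

PAGE_HEADER_SIZE = 24            # assumed size of the fixed page header
CHECKSUM_SLICE = slice(8, 10)    # assumed location of pd_checksum

# Strip all whitespace before decoding so the multi-line strings parse cleanly.
modified = bytes.fromhex("".join(MODIFIED_HEX.split()))
original = bytes.fromhex("".join(ORIGINAL_HEX.split()))

# Offsets at which the two copies differ.
changed = [i for i, (a, b) in enumerate(zip(original, modified)) if a != b]

print("changed offsets:", changed)
print("all changes after header:", all(i >= PAGE_HEADER_SIZE for i in changed))
print("stored checksum bytes equal:",
      original[CHECKSUM_SLICE] == modified[CHECKSUM_SLICE])
```

With the bytes shown, the copies differ only at offsets 24 through 32 (bytes that were already zero do not show up as changes), all of them right after the assumed header. Bytes 8-9 are 16 7D in both copies, so the stored checksum field itself is untouched.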


The only thing I can really conclude here is that we apparently end up with
the same checksum for exactly the modifications we are doing? Just on those
two damn instances? Reliably?


Gotta make some food. Suggestions on what exactly to look at are welcome.


Greetings,

Andres Freund

PS: I should really rename the hostname of that machine one of these days...


