Re: WAL logging problem in 9.4.3?

Поиск
Список
Период
Сортировка
От Fujii Masao
Тема Re: WAL logging problem in 9.4.3?
Дата
Msg-id CAHGQGwEXta1tMF6o-FcDcjMgCF=jnZLj3v3EoePvsEN4aexJnA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: WAL logging problem in 9.4.3?  (Martijn van Oosterhout <kleptog@svana.org>)
Ответы Re: WAL logging problem in 9.4.3?
Список pgsql-hackers
On Fri, Jul 3, 2015 at 2:20 PM, Martijn van Oosterhout
<kleptog@svana.org> wrote:
> On Fri, Jul 03, 2015 at 12:21:02AM +0200, Andres Freund wrote:
>> Hi,
>>
>> On 2015-07-03 00:05:24 +0200, Martijn van Oosterhout wrote:
>> > === Start with an empty database
>>
>> My guess is you have wal_level = minimal?
>
> Default config, was just initdb'd. So yes, the default wal_level =
> minimal.
>
>> > === Note the index file is 8KB.
>> > === At this point nuke the database server (in this case it was simply
>> > === destroying the container it was running in.
>>
>> How did you continue from there? The container has persistent storage?
>> Or are you repapplying the WAL to somewhere else?
>
> The container has persistant storage on the host. What I think is
> actually unusual is that the script that started postgres was missing
> an 'exec" so postgres never gets the signal to shutdown.
>
>> > martijn@martijn-jessie:$ sudo /usr/lib/postgresql/9.4/bin/pg_xlogdump -p /data/postgres/pg_xlog/
000000010000000000000001|grep -wE '16389|16387|16393' 
>> > rmgr: XLOG        len (rec/tot):     72/   104, tx:          0, lsn: 0/016A9240, prev 0/016A9200, bkp: 0000, desc:
checkpoint:redo 0/16A9240; tli 1; prev tli 1; fpw true; xid 0/686; oid 16387; multi 1; offset 0; oldest xid 673 in DB
1;oldest multi 1 in DB 1; oldest running xid 0; shutdown 
>> > rmgr: Storage     len (rec/tot):     16/    48, tx:          0, lsn: 0/016A92D0, prev 0/016A92A8, bkp: 0000, desc:
filecreate: base/16385/16387 
>> > rmgr: Sequence    len (rec/tot):    158/   190, tx:        686, lsn: 0/016B5E50, prev 0/016B5D88, bkp: 0000, desc:
log:rel 1663/16385/16387 
>> > rmgr: Storage     len (rec/tot):     16/    48, tx:        686, lsn: 0/016B5F10, prev 0/016B5E50, bkp: 0000, desc:
filecreate: base/16385/16389 
>> > rmgr: Storage     len (rec/tot):     16/    48, tx:        686, lsn: 0/016BB028, prev 0/016BAFD8, bkp: 0000, desc:
filecreate: base/16385/16393 
>> > rmgr: Sequence    len (rec/tot):    158/   190, tx:        686, lsn: 0/016BE4F8, prev 0/016BE440, bkp: 0000, desc:
log:rel 1663/16385/16387 
>> > rmgr: Storage     len (rec/tot):     16/    48, tx:        686, lsn: 0/016BE6B0, prev 0/016BE660, bkp: 0000, desc:
filetruncate: base/16385/16389 to 0 blocks 
>> > rmgr: Storage     len (rec/tot):     16/    48, tx:        686, lsn: 0/016BE6E0, prev 0/016BE6B0, bkp: 0000, desc:
filetruncate: base/16385/16393 to 0 blocks 
>> > pg_xlogdump: FATAL:  error in WAL record at 0/16BE710: record with zero length at 0/16BE740
>>
>> Note that the truncate will lead to a new, different, relfilenode.
>
> Really? Comparing the relfilenodes gives the same values before and
> after the truncate.

Yep, the relfilenodes are not changed in this case because CREATE TABLE and
TRUNCATE were executed in the same transaction block.

>> > ctmp=# select * from test;
>> > ERROR:  could not read block 0 in file "base/16385/16393": read only 0 of 8192 bytes
>>
>> Hm. I can't reproduce this. Can you include a bit more details about how
>> to reproduce?
>
> Hmm, for me it is 100% reproducable. Are you familiar with Docker? I
> can probably construct a Dockerfile that reproduces it pretty reliably.

I could reproduce the problem in the master branch by doing
the following steps.

1. start the PostgreSQL server with wal_level = minimal
2. execute the following SQL statements    begin;    create table test(id serial primary key);    truncate table test;
 commit; 
3. shutdown the server with immediate mode
4. restart the server (crash recovery occurs)
5. execute the following SQL statement   select * from test;

The optimization of TRUNCATE opereation that we can use when
CREATE TABLE and TRUNCATE are executed in the same transaction block
seems to cause the problem. In this case, only index file truncation is
logged, and index creation in btbuild() is not logged because wal_level
is minimal. Then at the subsequent crash recovery, index file is truncated
to 0 byte... Very simple fix is to log an index creation in that case,
but not sure if that's ok to do..

Regards,

--
Fujii Masao



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Kyotaro HORIGUCHI
Дата:
Сообщение: Re: multivariate statistics / patch v7
Следующее
От: Fabien COELHO
Дата:
Сообщение: Re: Let PostgreSQL's On Schedule checkpoint write buffer smooth spread cycle by tuning IsCheckpointOnSchedule?