Creation of an empty table is not fsync'd at checkpoint

From Heikki Linnakangas
Subject Creation of an empty table is not fsync'd at checkpoint
Date
Msg-id d47d8122-415e-425c-d0a2-e0160829702d@iki.fi
Responses Re: Creation of an empty table is not fsync'd at checkpoint  (Andres Freund <andres@anarazel.de>)
Re: Creation of an empty table is not fsync'd at checkpoint  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
If you create an empty table, it is not fsync'd. As soon as you insert a
row into it, register_dirty_segment() gets called, and after that, the
next checkpoint will fsync it. But until then, the creation itself is
never fsync'd. That's obviously not great.
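
A toy model may make the sequence easier to follow. It is only a sketch:
ToyStorage and all of its methods are hypothetical names, not PostgreSQL
code, and the crash is modeled as the worst case in which everything that
was never fsync'd is lost.

```python
class ToyStorage:
    """Toy model of a storage layer with a deferred-fsync queue.

    Hypothetical names throughout; this is not PostgreSQL code.
    """

    def __init__(self):
        self.visible = set()        # files the running system can see
        self.durable = set()        # files that would survive a power loss
        self.pending_fsync = set()  # files the next checkpoint will fsync

    def create(self, name):
        # Models the behaviour described above: creating the file
        # does not queue an fsync request.
        self.visible.add(name)

    def write(self, name):
        # Models the effect of register_dirty_segment(): the first
        # write queues the file for the next checkpoint.
        if name not in self.visible:
            raise FileNotFoundError(name)
        self.pending_fsync.add(name)

    def checkpoint(self):
        # Only queued files are fsync'd and become durable.
        self.durable |= self.pending_fsync
        self.pending_fsync.clear()

    def crash(self):
        # Worst case: anything that was never fsync'd is gone.
        self.visible &= self.durable
        self.pending_fsync.clear()

    def exists(self, name):
        return name in self.visible


def survives_crash(insert_before_checkpoint):
    """Create a table file, optionally write to it, checkpoint, crash."""
    s = ToyStorage()
    s.create("t")
    if insert_before_checkpoint:
        s.write("t")
    s.checkpoint()
    s.crash()
    return s.exists("t")


if __name__ == "__main__":
    print("empty table survives:", survives_crash(False))
    print("table with one row survives:", survives_crash(True))
```

In this model the empty table is lost after the crash, while the table
that received a row before the checkpoint survives.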

The lack of an fsync is a bit hard to prove because it requires a 
hardware failure, or a simulation of it, and can be affected by 
filesystem options too. But I was able to demonstrate a problem with 
these steps:

1. Create a VM with two virtual disks. Use ext4, with 'data=writeback' 
option (I'm not sure if that's required). Install PostgreSQL on one of 
the virtual disks.

2. Start the server, and create a tablespace on the other disk:

CREATE TABLESPACE foospc LOCATION '/data/heikki';

3. Do this:

CREATE TABLE crashtest (i int) TABLESPACE foospc;
CHECKPOINT;

4. Immediately after that, kill the VM. I used:

killall -9 qemu-system-x86_64

5. Restart the VM, restart PostgreSQL. Now when you try to use the 
table, you get an error:

postgres=# select * from crashtest ;
ERROR:  could not open file "pg_tblspc/81921/PG_15_202201271/5/98304": No such file or directory

I was not able to reproduce this without the tablespace on a different
virtual disk, presumably because ext4 orders the writes so that the
checkpoint always implicitly flushes the creation of the file to disk.
Mounting with data=writeback made no difference in that single-disk
setup. But with a separate disk, it happens every time.

I think the simplest fix is to call register_dirty_segment() from
mdcreate(), as in the attached patch. Thoughts?
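
The shape of that idea can be sketched outside PostgreSQL with real
files. This is not the attached patch: toy_mdcreate,
toy_register_dirty_segment, toy_checkpoint and pending_fsync are
hypothetical stand-ins that only borrow the names for readability. The
point is that the file is queued at creation time, so the next
checkpoint fsyncs it even if nothing was ever written to it.

```python
import os
import tempfile

# Hypothetical stand-in for the checkpointer's table of fsync requests.
pending_fsync = set()


def toy_register_dirty_segment(path):
    """Ask the next toy checkpoint to fsync this file."""
    pending_fsync.add(path)


def toy_mdcreate(path):
    """Create an empty file and register it for fsync right away."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o600)
    os.close(fd)
    toy_register_dirty_segment(path)


def toy_checkpoint():
    """fsync every registered file, then clear the queue."""
    for path in sorted(pending_fsync):
        fd = os.open(path, os.O_RDWR)
        try:
            os.fsync(fd)
        finally:
            os.close(fd)
    pending_fsync.clear()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        seg = os.path.join(d, "segment")
        toy_mdcreate(seg)
        print("queued before checkpoint:", seg in pending_fsync)
        toy_checkpoint()
        print("queue empty after checkpoint:", not pending_fsync)
```

Whether fsyncing the file alone is enough to make its creation durable
on a given filesystem is not something this sketch tests.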

- Heikki