Re: [PATCH] Lazy xid assingment V2

From	Florian G. Pflug
Subject	Re: [PATCH] Lazy xid assingment V2
Date	1 September 2007, 21:13:15
Msg-id	46D9D5E4.2080309@phlo.org
In reply to	[PATCH] Lazy xid assingment V2 ("Florian G. Pflug" <fgp@phlo.org>)
Responses	Re: [PATCH] Lazy xid assingment V2 (Tom Lane <tgl@sss.pgh.pa.us>) Re: [PATCH] Lazy xid assingment V2 (August Zajonc <augustz@augustz.com>)
List	pgsql-hackers

August Zajonc wrote:
>> Yes, checkpoints would need to include a list of created-but-yet-uncommitted
>> files. I think the hardest part is figuring out a way to get that information
>> to the backend doing the checkpoint - my idea was to track them in shared
>> memory, but that would impose a hard limit on the number of concurrent
>> file creations. Not nice :-(
> I'm confused about this.
> 
> As long as we assert the rule that the file name can't change on the 
> move, then after commit the file can be in only one of two places. The 
> name of the file is known (ie, pg_class). The directories are known. 
> What needs to be carried forwarded past a checkpoint? We don't even look 
> at WAL, so checkpoints are irrelevant, it seems.
>
> If there is a crash just after commit and before the move, no harm. You 
> just move on startup. If the move fails, no harm, you can emit warning 
> and open in /pending (or simply error, even easier).
If you're going to open the file from /pending, what's the point of moving
it in the first place?

The idea would have to be that you move on commit (or on COMMIT-record
replay, in case of a crash), and then, after recovering the whole WAL,
you could remove leftover files in /pending.
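
A minimal sketch of that recovery sequence, as a toy model rather than anything resembling the real backend code. It assumes (purely for illustration) that each COMMIT record carries the name of the file it created, and that `base` and `pending` are plain directories; `recover` and `demo` are hypothetical names.

```python
import os
import tempfile


def recover(base_dir, pending_dir, wal_records):
    """Replay a toy WAL, then delete whatever is still in pending_dir."""
    for kind, filename in wal_records:
        if kind == "COMMIT":
            src = os.path.join(pending_dir, filename)
            if os.path.exists(src):
                # The commit was durable, but the crash hit before the move.
                os.rename(src, os.path.join(base_dir, filename))
    # Anything still pending belongs to a transaction that never committed.
    leftovers = sorted(os.listdir(pending_dir))
    for name in leftovers:
        os.remove(os.path.join(pending_dir, name))
    return leftovers


def demo():
    with tempfile.TemporaryDirectory() as root:
        base = os.path.join(root, "base")
        pending = os.path.join(root, "pending")
        os.mkdir(base)
        os.mkdir(pending)
        # Two files were created before the crash; only one was committed.
        for name in ("16384", "16385"):
            with open(os.path.join(pending, name), "w"):
                pass
        removed = recover(base, pending, [("COMMIT", "16384")])
        return sorted(os.listdir(base)), removed
```

Calling `demo()` moves the committed file into `base` and reports the uncommitted one as removed. The cleanup is only safe once the whole WAL has been replayed, which is why it sits after the loop.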

The main problem is that you have to do the move *after* flushing the COMMIT
record to disk - otherwise you're gonna leak the file if you crash between
moving and flushing.

But that implies that the transaction is *already* committed when you do
the move. Others won't know that yet (you do the move *after* flushing,
but *before* updating the CLOG) - but still, since the COMMIT record is
on disk, you cannot roll back anymore (since if you crash and replay the
COMMIT record, the transaction *will* be committed).

So, what are you going to do if the move fails? You cannot roll back, and
you cannot update the CLOG (because then others would see your new table,
but no datafile). The only option is to PANIC. This will lead to a server
restart, WAL recovery, and probably another PANIC once the COMMIT record
is replayed (since the move probably still won't be possible).
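
The ordering problem can be sketched with a small simulation. This is only a model of the three steps discussed here (flush COMMIT, move, update CLOG); the `state` dictionary, the `Panic` exception, and the function names are hypothetical stand-ins, not PostgreSQL APIs.

```python
class Panic(Exception):
    """Stands in for a PANIC: the server must restart and run recovery."""


def commit(state, move_file):
    # Step 1: the COMMIT record reaches disk. From here on, crash recovery
    # would replay it, so the transaction can no longer be rolled back.
    state["wal"].append("COMMIT")
    # Step 2: move the file out of the pending directory.
    try:
        move_file()
    except OSError as exc:
        # Rolling back is impossible (step 1), and updating the CLOG would
        # expose a table without a datafile, so only PANIC is left.
        raise Panic(str(exc)) from exc
    # Step 3: only now make the commit visible to other backends.
    state["clog"] = "committed"


def failing_move():
    raise OSError("rename failed")


def demo():
    state = {"wal": [], "clog": "in progress"}
    try:
        commit(state, failing_move)
    except Panic:
        pass
    return state
```

After `demo()`, the WAL contains the COMMIT record while the CLOG still says "in progress", which is exactly the state that neither rollback nor a CLOG update can resolve.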

It might be even worse - I'm not sure that a rename is an atomic operation
on most filesystems. If it's not, then you might end up with two files if
power fails *just* as you rename, or, worse, with no file at all. Even a slight
possibility of the second case seems unacceptable - it means losing
a committed transaction.
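
For context: POSIX specifies rename() as atomic with respect to other processes looking up the name, but it makes no promise about what survives a power failure; that depends on the filesystem. A common mitigation on POSIX systems is to fsync the directories involved after the rename. The sketch below shows that pattern under those assumptions; `durable_rename` is a hypothetical helper, it is POSIX-only, and it does not settle the question for every filesystem.

```python
import os
import tempfile


def durable_rename(src, dst):
    """Rename src to dst, then fsync both parent directories (POSIX only)."""
    dirs = {
        os.path.dirname(os.path.abspath(src)),
        os.path.dirname(os.path.abspath(dst)),
    }
    os.rename(src, dst)
    for d in dirs:
        # Flushing the directory asks the OS to persist the updated entries.
        fd = os.open(d, os.O_RDONLY)
        try:
            os.fsync(fd)
        finally:
            os.close(fd)


def demo():
    with tempfile.TemporaryDirectory() as root:
        pending = os.path.join(root, "pending")
        base = os.path.join(root, "base")
        os.mkdir(pending)
        os.mkdir(base)
        src = os.path.join(pending, "16384")
        with open(src, "w") as f:
            f.write("data")
            f.flush()
            os.fsync(f.fileno())  # persist the file contents first
        durable_rename(src, os.path.join(base, "16384"))
        return os.listdir(pending), os.listdir(base)
```

Even with this, a failure of the rename itself still leaves the PANIC problem described above; the fsync only narrows the window in which a power failure could leave the directory entries in doubt.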

I agree that we should eventually find a way to guarantee either no file
leakage, or at least an upper bound on the amount of wasted space. But
doing so at the cost of PANICing if the move fails seems like a bad
tradeoff...

greetings, Florian Pflug
