Re: corrupt pages detected by enabling checksums

From	Andres Freund
Subject	Re: corrupt pages detected by enabling checksums
Date	13 May 2013, 16:49:31
Msg-id	20130513134922.GB27618@awork2.anarazel.de
In reply to	Re: corrupt pages detected by enabling checksums (Jon Nelson <jnelson+pgsql@jamponi.net>)
Responses	Re: corrupt pages detected by enabling checksums
List	pgsql-hackers

On 2013-05-13 08:45:41 -0500, Jon Nelson wrote:
> On Mon, May 13, 2013 at 8:32 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > On 2013-05-12 19:41:26 -0500, Jon Nelson wrote:
> >> On Sun, May 12, 2013 at 3:46 PM, Jim Nasby <jim@nasby.net> wrote:
> >> > On 5/10/13 1:06 PM, Jeff Janes wrote:
> >> >>
> >> >> Of course the paranoid DBA could turn off restart_after_crash and do a
> >> >> manual investigation on every crash, but in that case the database would
> >> >> refuse to restart even in the case where it is perfectly clear that all the
> >> >> following WAL belongs to the recycled file and not the current file.
> >> >
> >> >
> >> > Perhaps we should also allow for zeroing out WAL files before reuse (or just
> >> > disable reuse). I know there's a performance hit there, but the reuse idea
> >> > happened before we had bgWriter. Theoretically the overhead creating a new
> >> > file would always fall to bgWriter and therefore not be a big deal.
> >>
> >> For filesystems like btrfs, re-using a WAL file is suboptimal to
> >> simply creating a new one and removing the old one when it's no longer
> >> required. Using fallocate (or posix_fallocate) (I have a patch for
> >> that!) to create a new one is - by my tests - 28 times faster than the
> >> currently-used method.
> >
> > I don't think the comparison between just fallocate()ing and what we
> > currently do is fair. fallocate() doesn't guarantee that the file is the
> > same size after a crash, so you would still need an fsync() or we
> > couldn't use fdatasync() anymore. And I'd guess the benefits aren't all
> > that big anymore in that case?
> 
> fallocate (16MB) + fsync is still almost certainly faster than
> write+write+write... + fsync.
> The test I performed at the time did exactly that .. posix_fallocate + pg_fsync.

Sure, the initial file creation will be faster. But are the actual
individual WAL writes (small, frequently fdatasync()ed) still faster?
That's the critical path currently.

Whether they are depends largely on how the filesystem manages
allocated but not initialized blocks...
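
To make the two things being compared concrete, here is a rough sketch (not
PostgreSQL code, and not the patch mentioned above). It assumes Linux and
Python 3, where os.posix_fallocate() and os.fdatasync() are available. The
function names, the 8192-byte block size, and the 512-byte record size are
made up for illustration. The first two functions are the two ways of
creating a segment; the third mimics the critical path of small writes, each
followed by fdatasync(), so it can be timed against either kind of file.

```python
import os

SEGMENT_SIZE = 16 * 1024 * 1024  # 16MB, the size mentioned in the thread
BLOCK_SIZE = 8192                # assumed chunk size for the zero-fill loop


def create_zero_filled(path, size=SEGMENT_SIZE):
    """write+write+write... + fsync: fill the file with zeros, then sync."""
    zeros = bytes(BLOCK_SIZE)
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        written = 0
        while written < size:
            # os.write() may write fewer bytes than requested; count what it reports.
            written += os.write(fd, zeros[: min(BLOCK_SIZE, size - written)])
        os.fsync(fd)
    finally:
        os.close(fd)


def create_fallocated(path, size=SEGMENT_SIZE):
    """posix_fallocate + fsync: reserve the space without writing any data."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        os.posix_fallocate(fd, 0, size)
        os.fsync(fd)  # also flushes the new file size and allocation metadata
    finally:
        os.close(fd)


def overwrite_with_fdatasync(path, record=b"x" * 512, count=64):
    """Hypothetical stand-in for WAL writes: small writes, each fdatasync()ed."""
    fd = os.open(path, os.O_WRONLY)
    try:
        for _ in range(count):
            os.write(fd, record)
            os.fdatasync(fd)
    finally:
        os.close(fd)
```

Timing overwrite_with_fdatasync() against a file from each creation function,
on the filesystem in question, is the comparison that matters here. The sketch
itself makes no claim about which one wins.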

Greetings,

Andres Freund

-- 
 Andres Freund	                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
