Re: WARNINGs after starting backup server created with PITR

From Tom Lane
Subject Re: WARNINGs after starting backup server created with PITR
Date
Msg-id 14838.1200704299@sss.pgh.pa.us
In response to Re: WARNINGs after starting backup server created with PITR  (Erik Jones <erik@myemma.com>)
Responses Re: WARNINGs after starting backup server created with PITR  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-general
Erik Jones <erik@myemma.com> writes:
>>> 2008-01-17 21:47:34 CST 7598 :WARNING:  relation "table_name" page
>>> 5728 is uninitialized --- fixing
>>
>> If you do a vacuum on the master, do you get the same warnings?

> /me runs VACUUM VERBOSE on the two tables that would matter.

> Nope.  What worries me is, that since I have a verified case of rsync
> thinking it had successfully transferred a WAL, the same may have
> happened with these files during the base backup.  Does that warning,
> in fact, entail that there were catalog entries for those files, but
> that the file was not there, and by "fixing" it the server just
> created empty files?

Not necessarily.  What the warning actually means is that VACUUM found
an all-zeroes page within a table.  There are scenarios where this is
not unexpected, particularly after a crash on the master.  The reason
is that adding a page to a table is a two-step process.  First we
write() a page of zeroes at the current EOF; this is basically to make
the filesystem reserve the space.  We don't want to report that we've
committed a page-full of new rows and then discover there's no disk
space for them.  Then we initialize the page (ie set up the page header)
and start putting rows into it.  But these latter operations happen
inside a shared buffer, and might not reach disk until the next
checkpoint.  Now, the insertions of the rows are entered into the WAL
log, and once the first such WAL entry has reached disk, the page will
be re-initialized by WAL replay if there's a crash.  But there's an
interval between the filesystem's extension of a table with zeroes and
the first WAL entry related to the page reaching disk.  If you get a
crash in that interval then the all-zeroes page will still be there
after recovery, and will go unused until VACUUM reclaims it (and
produces the WARNING).
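
As a rough way to see how many all-zeroes pages a relation file contains
without waiting for VACUUM to report them, something like the sketch
below could be run against a copy of the file (or against a stopped
server).  It is only an illustration, not how VACUUM itself does the
check.  It assumes the default 8192-byte block size, and the function
name find_zero_pages is made up for this example.  Note that large
relations are split into segment files, so the block numbers it returns
are relative to the one file it was given.

```python
BLOCK_SIZE = 8192  # assumption: default PostgreSQL block size


def find_zero_pages(path, block_size=BLOCK_SIZE):
    """Return block numbers (within this file) of pages that are all zero bytes."""
    zero_page = bytes(block_size)  # a reference page consisting only of zeroes
    zero_blocks = []
    with open(path, "rb") as f:
        blkno = 0
        while True:
            page = f.read(block_size)
            if not page:
                break  # reached end of file
            # Only full-size pages are compared; a short trailing read is skipped.
            if len(page) == block_size and page == zero_page:
                zero_blocks.append(blkno)
            blkno += 1
    return zero_blocks
```

Running the same scan on the master's copy and the slave's copy of a
file would show whether the zero pages exist on both sides or only on
the slave.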

So this would explain some zero pages (though not large numbers of
them) if you'd had crashes on the master.  I'm not sure offhand whether
there's any case in which bringing up a PITR slave is close enough to
crash recovery that the same mechanism could apply to produce a zero
page on the slave where there had been none on the master.

In any case, 125 different zeroed pages is pretty hard to explain
by such a mechanism (especially if they were scattered rather than
in contiguous clumps).  I tend to agree that it sounds like there
was something wrong with the rsync mirroring process.
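
To judge whether the zeroed pages are scattered or sit in contiguous
clumps, the page numbers from the WARNING lines can be grouped into
runs.  The helper below is a minimal sketch of that grouping; the name
contiguous_runs is hypothetical, and the page numbers in the usage
comment are placeholders apart from 5728, which is the one quoted in the
log line above.

```python
def contiguous_runs(block_numbers):
    """Group block numbers into sorted (first, last) runs of adjacent blocks."""
    runs = []
    for blk in sorted(set(block_numbers)):
        if runs and blk == runs[-1][1] + 1:
            # Extends the current run by one block.
            runs[-1] = (runs[-1][0], blk)
        else:
            # Starts a new run.
            runs.append((blk, blk))
    return runs


# Usage with placeholder numbers: a few long runs suggest clumps,
# many single-block runs suggest scattered damage.
# contiguous_runs([5728, 5729, 5730, 9000, 12])
```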

            regards, tom lane
