Re: heapgetpage() and ->takenDuringRecovery

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: heapgetpage() and ->takenDuringRecovery
Дата
Msg-id 20140303133338.GD23352@awork2.anarazel.de
обсуждение исходный текст
Ответ на Re: heapgetpage() and ->takenDuringRecovery  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: heapgetpage() and ->takenDuringRecovery  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
On 2014-03-03 06:57:00 -0500, Robert Haas wrote:
> On Sun, Mar 2, 2014 at 8:39 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > I don't think this is neccessary >= 9.2. The are two only "interestings" place
> > where PD_ALL_VISIBLE is set:
> > a) lazy_vacuum_page() where a xl_heap_clean is logged *before*
> >    PD_ALL_VISIBLE/the vm is touched and that causes recovery
> >    conflicts. The heap page is locked for cleanup at that point. As the
> >    logging of xl_heap_clean sets the page's LSN there's no way the page
> >    can appear on the standby too early.
> > b) empty pages in lazy_scan_heap(). If they always were empty, there's
> >    no need for conflicts. The only other way I can see to end up there
> >    is a previous heap_page_prune() that repaired fragmentation. But that
> >    logs a WAL record with conflict information.
> 
> I don't think there's any reason to believe that lazy_scan_heap() can
> only hit pages that are empty or have just been defragged.  Suppose
> that there's a tuple on the page which was recently inserted; the
> inserting transaction has committed but there are some backends that
> still have older snapshots.  The page won't be marked all-visible
> because it isn't.  Now, eventually those older snapshots will go away,
> and sometime after that the relation will get vacuumed again, and
> we'll once again look the page.  But this time we notice that it is
> all-visible, and mark it so.

Right now I am missing how this isn't an actual correctness problem
after a crash. Without an LSN interlock we could crash *after* the heap
page has been written out, but *before* the vm WAL record has been
flushed to disk. Combined with synchronous_commit=off there could be
transactions that appeared as safely committed for vacuum (i.e. are
below GetOldestXmin()), but which are actually aborted after the
commit.
Normal hint bits circumvent that by checking XLogNeedsFlush(commitLSN),
but that doesn't work here.

Am I missing something?

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andrew Dunstan
Дата:
Сообщение: Re: Securing "make check" (CVE-2014-0067)
Следующее
От: Kohei KaiGai
Дата:
Сообщение: Re: Triggers on foreign tables