Re: VM map freeze corruption

From Pavan Deolasee
Subject Re: VM map freeze corruption
Msg-id CABOikdOecOFE--y0i1wO0CONr54VmyyoP2Er35CVrPYOCN8hZw@mail.gmail.com
In response to VM map freeze corruption  ("Wood, Dan" <hexpert@amazon.com>)
Responses Re: VM map freeze corruption  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers


On Wed, Apr 18, 2018 at 7:37 AM, Wood, Dan <hexpert@amazon.com> wrote:


My analysis is that heap_prepare_freeze_tuple->FreezeMultiXactId() returns FRM_NOOP if the MultiXACT locked rows haven't committed.  This results in changed=false and totally_frozen=true (as initialized).  When this returns to lazy_scan_heap(), no rows are added to the frozen[] array.  Yet, tuple_totally_frozen is true.  This means the page is marked frozen in the VM, even though the MultiXACT row was left untouched.

A fix to heap_prepare_freeze_tuple() that seems to do the trick is:
        else
        {
            Assert(flags & FRM_NOOP);
+           totally_frozen = false;
        }

That's a great find! This can definitely lead to various problems and could be one of the reasons behind the issue reported here [1]. For example, if we change the script slightly at the end, we can get the same error reported in the bug report.

sleep 4;  # Wait for share locks to be released

# See if another vacuum freeze advances relminmxid beyond xmax present in the
# heap
echo "vacuum (verbose, freeze) t;" | $p
echo "select pg_check_frozen('t');" | $p

# See if a vacuum freeze scanning all pages corrects the problem
echo "vacuum (verbose, freeze, disable_page_skipping) t;" | $p
echo "select pg_check_frozen('t');" | $p
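
For readers following along, the control flow in Dan's analysis can be modeled with a small standalone C sketch. This is a toy model, not PostgreSQL source: ToyTuple, the toy_* functions, the apply_fix switch and the flag values are all hypothetical names made up for illustration. Only the FRM_NOOP name, the changed/totally_frozen variables and the one-line fix come from the quoted text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values; only the FRM_NOOP name is taken from the thread. */
#define FRM_NOOP            0x0001  /* multixact is left in place */
#define FRM_INVALIDATE_XMAX 0x0002  /* xmax can be cleared */

/* Toy stand-in for a heap tuple header (an assumption, not the real struct). */
typedef struct ToyTuple
{
    bool xmax_is_multi;          /* xmax holds a MultiXactId */
    bool multi_members_running;  /* lockers have not finished yet */
} ToyTuple;

/* Toy FreezeMultiXactId(): nothing to do while members are still running. */
static uint16_t
toy_freeze_multixact(const ToyTuple *tup)
{
    return tup->multi_members_running ? FRM_NOOP : FRM_INVALIDATE_XMAX;
}

/*
 * Toy heap_prepare_freeze_tuple(). Returns "changed" and reports through
 * *totally_frozen_p whether the tuple would be fully frozen afterwards.
 * apply_fix toggles the one-line fix proposed in the quoted message.
 */
static bool
toy_prepare_freeze_tuple(const ToyTuple *tup, bool apply_fix,
                         bool *totally_frozen_p)
{
    bool changed = false;
    bool totally_frozen = true;  /* as initialized */

    if (tup->xmax_is_multi)
    {
        uint16_t flags = toy_freeze_multixact(tup);

        if (flags & FRM_INVALIDATE_XMAX)
            changed = true;
        else
        {
            assert(flags & FRM_NOOP);
            if (apply_fix)
                totally_frozen = false;  /* the multixact is still there */
        }
    }

    *totally_frozen_p = totally_frozen;
    return changed;
}

/* Toy lazy_scan_heap() check: all-frozen only if every tuple says so. */
static bool
toy_page_all_frozen(const ToyTuple *tuples, int ntuples, bool apply_fix)
{
    bool all_frozen = true;

    for (int i = 0; i < ntuples; i++)
    {
        bool tuple_totally_frozen;

        (void) toy_prepare_freeze_tuple(&tuples[i], apply_fix,
                                        &tuple_totally_frozen);
        if (!tuple_totally_frozen)
            all_frozen = false;
    }
    return all_frozen;
}
```

With one plain tuple and one tuple whose multixact lockers are still running, toy_page_all_frozen() reports the page as all-frozen when apply_fix is false and as not all-frozen when it is true. That mirrors the window in which the script above lets the VM bit get set while an unfrozen multixact is still on the page.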

Thanks,
Pavan


--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
