Re: Freeze avoidance of very large table.

From Simon Riggs
Subject Re: Freeze avoidance of very large table.
Date
Msg-id CANP8+j+kVGu=XVoF9_tvp2hK4j82t1VSDDh4czaSmEiZTKSKuQ@mail.gmail.com
In reply to Re: Freeze avoidance of very large table.  (Sawada Masahiko <sawada.mshk@gmail.com>)
Responses Re: Freeze avoidance of very large table.  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On 3 July 2015 at 09:25, Sawada Masahiko <sawada.mshk@gmail.com> wrote:
> On Fri, Jul 3, 2015 at 1:23 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> > On 2 July 2015 at 16:30, Sawada Masahiko <sawada.mshk@gmail.com> wrote:
> >
> >>
> >> Also, the flags of each heap page header might be set PD_ALL_FROZEN,
> >> as well as all-visible
> >
> >
> > Is it possible to have VM bits set to frozen but not visible?
> >
> > The description makes those two states sound independent of each other.
> >
> > Are they? Or not? Do we test for an impossible state?
> >
>
> It's impossible to have VM bits set to frozen but not visible.
> These bits are controlled independently. But eventually, when
> all-frozen bit is set, all-visible is also set.

And my understanding is that if you clear all-visible you would also clear all-frozen...

So I don't understand why you have two separate calls to visibilitymap_clear().
Surely the logic should be to clear both bits at the same time?

In my understanding the state logic is

1. Both bits unset   ~(VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN)
which can be changed to state 2 only

2. VISIBILITYMAP_ALL_VISIBLE only
which can be changed to state 1 or state 3

3. VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN
which can be changed to state 1 only
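
To make the three states concrete, here is a minimal standalone sketch of the transition rules as I listed them. The flag values and both helper names (vm_state_is_valid, vm_transition_allowed) are assumptions for illustration only; they are not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag values, for illustration; check visibilitymap.h in the patch. */
#define VISIBILITYMAP_ALL_VISIBLE 0x01
#define VISIBILITYMAP_ALL_FROZEN  0x02
#define VISIBILITYMAP_VALID_BITS \
    (VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN)

/* Hypothetical helper: true if the bit pair is one of states 1, 2 or 3. */
static bool
vm_state_is_valid(uint8_t bits)
{
    if (bits & ~VISIBILITYMAP_VALID_BITS)
        return false;           /* unknown bits set */
    /* all-frozen without all-visible is the impossible state */
    return bits != VISIBILITYMAP_ALL_FROZEN;
}

/* Hypothetical helper: true if changing from -> to follows the list above. */
static bool
vm_transition_allowed(uint8_t from, uint8_t to)
{
    if (!vm_state_is_valid(from) || !vm_state_is_valid(to))
        return false;

    switch (from)
    {
        case 0:                             /* state 1 -> state 2 only */
            return to == VISIBILITYMAP_ALL_VISIBLE;
        case VISIBILITYMAP_ALL_VISIBLE:     /* state 2 -> state 1 or 3 */
            return to == 0 || to == VISIBILITYMAP_VALID_BITS;
        default:                            /* state 3 -> state 1 only */
            return to == 0;
    }
}
```

With these rules, going from state 3 back to state 2 is rejected, as is jumping from state 1 straight to state 3.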

If that is the case please simplify the logic for setting and unsetting the bits so they are set together efficiently. At the same time please also put in Asserts to ensure that the state logic is maintained when it is set and when it is tested.
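
Roughly what I have in mind, as a toy model rather than a proposal for the actual code: a clear operation that always drops both bits at once, and assertions on both the set and the test paths. The layout of two bits per heap block packed four to a byte, the flag values, and the names vm_get, vm_set and vm_clear are all assumptions made up for this sketch; real code would use Assert() rather than assert().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values and layout, for illustration only. */
#define VISIBILITYMAP_ALL_VISIBLE 0x01
#define VISIBILITYMAP_ALL_FROZEN  0x02
#define VM_VALID_BITS             0x03
#define VM_BITS_PER_HEAPBLOCK     2
#define VM_HEAPBLOCKS_PER_BYTE    4

/* Bit offset of a heap block's pair of bits within its map byte. */
static unsigned
vm_shift(size_t blk)
{
    return (unsigned) (blk % VM_HEAPBLOCKS_PER_BYTE) * VM_BITS_PER_HEAPBLOCK;
}

/* Test path: return the bits for blk, asserting the state is legal. */
static uint8_t
vm_get(const uint8_t *map, size_t blk)
{
    uint8_t bits = (map[blk / VM_HEAPBLOCKS_PER_BYTE] >> vm_shift(blk))
                   & VM_VALID_BITS;

    /* all-frozen must never be seen without all-visible */
    assert(bits != VISIBILITYMAP_ALL_FROZEN);
    return bits;
}

/* Set path: OR in the requested flags, then re-check the invariant. */
static void
vm_set(uint8_t *map, size_t blk, uint8_t flags)
{
    assert(flags != 0 && (flags & ~VM_VALID_BITS) == 0);
    map[blk / VM_HEAPBLOCKS_PER_BYTE] |= (uint8_t) (flags << vm_shift(blk));
    (void) vm_get(map, blk);    /* asserts if frozen was set without visible */
}

/* Clear path: both bits are dropped together in a single operation. */
static void
vm_clear(uint8_t *map, size_t blk)
{
    map[blk / VM_HEAPBLOCKS_PER_BYTE] &=
        (uint8_t) ~(VM_VALID_BITS << vm_shift(blk));
}
```

In this model there is no way to clear all-visible while leaving all-frozen behind, so the impossible state can only arise from a bad vm_set() call, which the assertion catches.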

I would also like to see the visibilitymap_test function exposed in SQL, so we can write code to examine the map contents for particular ctids. By doing that we can then write a formal test that shows the evolution of tuples through insertion, vacuuming and freezing, testing that the map has been set correctly at each stage. I guess that needs to be done as an isolationtest so we have an observer that constrains the xmin in various ways. In light of the multixact bugs, any code that changes the on-disk tuple metadata needs formal tests.

Other than that the overall concept seems sound. 

I think we need something for pg_upgrade to rewrite existing VMs. Otherwise a large read-only database would suddenly require a massive revacuum after upgrade, which seems bad. That can wait until we all agree this patch is sound.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
