Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)

From: Robert Haas
Subject: Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)
Msg-id: CA+TgmoYO1Me9mxpKriVzCCDYBSk3WSVLXCEY1d9YX-mtqDQPmQ@mail.gmail.com
In response to: Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)  (Thomas Munro <thomas.munro@enterprisedb.com>)
List: pgsql-bugs
On Mon, May 11, 2015 at 7:56 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Mon, May 11, 2015 at 2:45 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> On Sun, May 10, 2015 at 9:41 AM, Thomas Munro
>> <thomas.munro@enterprisedb.com> wrote:
>>> On Sun, May 10, 2015 at 12:43 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>>> OK. So the next question is: if you then apply the other patch,
>>>> does that prevent step 4 and thereby avoid catastrophe?
>>>
>>> Yes, in a quick test, at step 4 I couldn't proceed.  I need to prod
>>> this some more on Monday, and also see how it interacts with
>>> autovacuum's view of what work needs to be done.
>>
>> The code in master which handles regular autovacuums seems correct
>> with this patch, because it measures member space usage by calling
>> find_multixact_start itself with the oldest multixact ID (it's not
>> dependent on anything that is updated at checkpoint time).
>>
>> The code in the patch at
>> http://www.postgresql.org/message-id/CA+TgmobbaQpE6sNqT30+rz4UMH5aPraq20gko5xd2ZGajz1-Jg@mail.gmail.com
>> would become wrong though, because it would use the (new) variable
>> MultiXactState->oldestOffset (set at checkpoint) to measure the used
>> member space.  That means it would repeatedly launch autovacuums, even
>> after clearing away the problem and advancing the oldest multixact ID,
>> until the next checkpoint updates that value.  In other words, it
>> can't see its own progress immediately (which is the right approach
>> for blocking new multixact generation, ie defer until
>> checkpoint/truncation, but the wrong approach for triggering
>> autovacuums).
>>
>> I think vacuum (SetMultiXactIdLimit?) needs to update oldestOffset,
>> not checkpoint (DetermineSafeOldestOffset).  (The reason for wanting
>> this new value in shared memory is because GetNewMultiXactId needs to
>> be able to check it cheaply for every call, so calling
>> find_multixact_start every time would presumably not fly).
>
> Here's a new version of the patch to do that.  As before, it tracks
> the oldest offset in shared memory, but now that is updated in
> SetMultiXactIdLimit, so it is always updated at the same time as
> MultiXactState->oldestMultiXactId (at startup and after full scan
> vacuums).
>
> The value is used in the following places:
>
> 1.  GetNewMultiXactId uses it to see if it needs to send
> PMSIGNAL_START_AUTOVAC_LAUNCHER to request autovacuums even if
> autovacuum is set to off.  That is the main purpose of this patch.
> (GetNewMultiXactId *doesn't* use it for member wraparound prevention:
> that is based on offsetStopLimit, set by checkpoint code after
> truncation of physical storage.)
>
> 2.  SetMultiXactIdLimit itself also uses it to send a
> PMSIGNAL_START_AUTOVAC_LAUNCHER signal to the postmaster (according to
> comments this allows immediately doing some more vacuuming upon
> completion if necessary).
>
> 3.  ReadMultiXactCounts, called by regular vacuum and autovacuum,
> rather than doing its own call to find_multixact_start, now also reads
> it from shared memory.  (Incidentally, the code this replaces suffers
> from the problem fixed elsewhere: it can call find_multixact_start for
> a multixact that doesn't exist yet.)
>
> Vacuum runs as expected with autovacuum off.

Great.  I've committed this and back-patched it to 9.3, after making
your code look a little more like what I already committed for the
same task, and whacking the comments around.
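
For readers following along, here is a minimal sketch of why the update
point matters. It is a toy model in Python, not the PostgreSQL C code:
the class, its attribute names, and the threshold of 100 are all
hypothetical, and offset wraparound is ignored. It only illustrates the
point quoted above, that a trigger based on a value refreshed at
checkpoint cannot see vacuum's progress until the next checkpoint.

```python
class MultiXactModel:
    """Toy model of member-space tracking (hypothetical, not PostgreSQL code)."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.next_offset = 0
        # Refreshed when vacuum advances the oldest multixact.
        self.oldest_offset_at_vacuum = 0
        # Refreshed only when a checkpoint runs.
        self.oldest_offset_at_checkpoint = 0

    def allocate_members(self, n):
        """Consume n member slots."""
        self.next_offset += n

    def vacuum(self, new_oldest_offset):
        """Vacuum advances the oldest offset still needed."""
        self.oldest_offset_at_vacuum = new_oldest_offset

    def checkpoint(self):
        """Checkpoint copies the vacuum-time value."""
        self.oldest_offset_at_checkpoint = self.oldest_offset_at_vacuum

    def needs_autovacuum(self, use_checkpoint_value):
        """Return True if used member space exceeds the threshold."""
        if use_checkpoint_value:
            oldest = self.oldest_offset_at_checkpoint
        else:
            oldest = self.oldest_offset_at_vacuum
        return self.next_offset - oldest > self.threshold


model = MultiXactModel(threshold=100)
model.allocate_members(150)
model.vacuum(120)
# The vacuum-time value sees the progress immediately; the
# checkpoint-time value keeps requesting vacuums until checkpoint().
print(model.needs_autovacuum(use_checkpoint_value=False))
print(model.needs_autovacuum(use_checkpoint_value=True))
```

In this model the checkpoint-time value remains the right basis for
blocking new allocations, since it only moves once storage has actually
been truncated.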

> Do you think we
> should be using MULTIXACT_MEMBER_DANGER_THRESHOLD as the trigger level
> for forced vacuums instead of MULTIXACT_MEMBER_SAFE_THRESHOLD, or
> something else?

No, I think you have it right.
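
As a rough illustration of the choice being discussed, the sketch below
compares used member space against the two thresholds. It is Python,
not the server's C code. The function names are hypothetical, and the
constant values (a 32-bit offset space, "safe" at half of it, "danger"
at three quarters) are assumptions made for the example rather than
values stated in this thread.

```python
# Assumed values for illustration only.
MAX_MULTIXACT_OFFSET = 0xFFFFFFFF
SAFE_THRESHOLD = MAX_MULTIXACT_OFFSET // 2
DANGER_THRESHOLD = MAX_MULTIXACT_OFFSET - MAX_MULTIXACT_OFFSET // 4


def members_in_use(next_offset, oldest_offset):
    """Used member space, allowing next_offset to have wrapped past zero."""
    return (next_offset - oldest_offset) % (MAX_MULTIXACT_OFFSET + 1)


def should_force_autovacuum(next_offset, oldest_offset,
                            threshold=SAFE_THRESHOLD):
    """Return True once used member space exceeds the chosen threshold."""
    return members_in_use(next_offset, oldest_offset) > threshold


# Offsets that straddle the wraparound point still give a small usage.
print(members_in_use(10, 0xFFFFFFF0))
# The lower (safe) threshold fires earlier than the danger threshold.
print(should_force_autovacuum(0x90000000, 0))
print(should_force_autovacuum(0x90000000, 0, threshold=DANGER_THRESHOLD))
```

Triggering at the lower threshold leaves more headroom for vacuum to
finish before member space runs out.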

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
