Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)

From Thomas Munro
Subject Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)
Date
Msg-id CAEepm=0+Yj8Vh_mEJHqw=G2_NEdTArvi8ch3L9Y7bFhd2=szyw@mail.gmail.com
In reply to Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-bugs
On Wed, May 6, 2015 at 9:46 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Wed, May 6, 2015 at 9:26 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Tue, May 5, 2015 at 3:58 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
>>> Ok, so if you have autovacuum_multixact_freeze_max_age = 400 million
>>> multixacts before wraparound vacuum, which is ~10% of 2^32, we would
>>> interpret that to mean 400 million multixacts OR ~10% * some_constant
>>> of member space, in other words autovacuum_multixact_freeze_max_age *
>>> some_constant members, whichever comes first.  But what should
>>> some_constant be?
>>
>> some_constant should be all the member space there is.  So we trigger
>> autovac if we've used more than ~10% of the offsets OR more than ~10%
>> of the members.  Why is autovacuum_multixact_freeze_max_age
>> configurable in the first place?  It's configurable so that you can set it
>> low enough that wraparound scans complete and advance the minmxid
>> before you hit the wall, but high enough to avoid excessive scanning.
>> The only problem is that it only lets you configure the amount of
>> headroom you need for offsets, not members.  If you squint at what I'm
>> proposing the right way, it's essentially that that GUC should control
>> both of those things.
>
> But member space *always* grows at least twice as fast as offset space
> (aka active multixact IDs), because multixacts always have at least 2
> members (except in some rare cases IIUC), don't they?  So if we do
> what you just said, then we'll trigger wraparound vacuums twice as
> soon as we do now for everybody, even people who don't have any
> problem with member space management.  We don't want this patch to
> change anything for most people, let alone everyone.  So I think that
> some_constant should be at least 2, if we try to do it this way, in
> other words if you set the GUC for 10% of offset space, we also start
> triggering wraparounds at 20% of member space.  The code in
> MultiXactCheckMemberSpace would just say safe_member_count =
> autovacuum_multixact_freeze_max_age * 2, where 2 is some_constant (this
> number is the average number of multixact members below which your
> workload will be unaffected by the new autovac behaviour).

Here is a version of the patch that uses the GUC to control where
safe_member_count starts as you suggested.  But it multiplies it by a
scaling factor that I'm calling avg_multixact_size_threshold.  The key
point is this: if your average multixact size is below this number,
you will see no change in autovacuum behaviour from this patch,
because normal wraparound vacuums will occur before you ever exceed
safe_member_count, but if your average multixact size is above this
number, you'll see some extra wraparound vacuuming.
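
To make the "no change below the threshold" point concrete, here is a
minimal sketch in Python rather than the patch's C.  It assumes a
workload whose multixacts have a steady average size; the function
name first_trigger is made up for illustration, and only
safe_member_count and avg_multixact_size_threshold come from the
discussion above.

```python
# Hypothetical illustration only; this is not code from the patch.
def first_trigger(freeze_max_age, avg_multixact_size,
                  avg_multixact_size_threshold=3):
    """Return which limit a steadily growing workload reaches first."""
    safe_member_count = freeze_max_age * avg_multixact_size_threshold
    # Members consumed by the time freeze_max_age multixacts exist.
    members_at_offset_limit = freeze_max_age * avg_multixact_size
    if members_at_offset_limit > safe_member_count:
        return "members"   # member space passes safe_member_count first
    return "offsets"       # the normal wraparound vacuum comes first


print(first_trigger(400_000_000, 2.2))  # offsets
print(first_trigger(400_000_000, 5))    # members
```

With an average size of 5, safe_member_count (1.2 billion members) is
reached after only 240 million multixacts, well before the 400 million
set by the GUC.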

As for its value, I start the bidding at 3.  Unless I misunderstood,
your proposal amounted to using a value of 1, but I think that is too
small, because surely *everyone's* average multixact size is above 1.
We don't want to change the autovacuum behaviour for everyone.  I
figure the value should be a smallish number of at least 2, and I
speculate that multixacts might have one of those kinds of
distributions where there are lots of 2s, a few 3s, hardly any 4s etc,
so that the average would be somewhere not far above 2.  3 seems to
fit that description.  I could be completely wrong about that, but
even so, with the default GUC setting of 400 million, 3 gives you
(400000000 * 3) / 2^32 = ~28%, pretty close to the 25% that people
seemed to like when we were talking about a fixed constant.  Higher
numbers could work but would make us less aggressive and increase the
wraparound risk, and 8 would definitely be too high, because then
safe_member_count crashes into dangerous_member_count with the default
GUC setting, so our ramping is effectively disabled giving us a panic
mode/cliff.
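
The percentages above can be reproduced with a few lines of Python.
This is only the arithmetic from the paragraph; the helper name
safe_fraction is an assumption made for this sketch.

```python
# Fraction of the 2^32 member space at which safe_member_count would
# sit for a given GUC value and scaling factor.  Hypothetical helper.
MEMBER_SPACE = 2**32


def safe_fraction(freeze_max_age, threshold):
    return freeze_max_age * threshold / MEMBER_SPACE


for threshold in (1, 2, 3, 8):
    print(threshold, f"{safe_fraction(400_000_000, threshold):.1%}")
```

A factor of 3 lands near 28%, while a factor of 8 lands just under the
75% dangerous level, leaving essentially no room for ramping.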

The new patch also tests the dangerous level *before* the safe level,
so that our 75% threshold is still triggered even if you set the GUC
insanely high so that safe_member_count ends up higher than
dangerous_member_count.
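
Why the order of the two tests matters can be shown with a small
sketch.  The function name, the return values and the use of >= are
assumptions for illustration; the real logic lives in the C patch and
may differ in detail.

```python
# Hypothetical sketch of the check order; not the patch's actual code.
MEMBER_SPACE = 2**32
DANGEROUS_MEMBER_COUNT = MEMBER_SPACE * 3 // 4  # the 75% level


def member_space_level(members_in_use, safe_member_count):
    # Test the dangerous level first, so it still fires when the GUC
    # pushes safe_member_count above DANGEROUS_MEMBER_COUNT.
    if members_in_use >= DANGEROUS_MEMBER_COUNT:
        return "dangerous"
    if members_in_use >= safe_member_count:
        return "ramping"
    return "safe"


# A very high setting puts safe_member_count above the dangerous level.
too_high_safe = 2_000_000_000 * 3
print(member_space_level(3_500_000_000, too_high_safe))  # dangerous
```

If the safe level were tested first, 3.5 billion members would still
count as "safe" here, because it is below the inflated
safe_member_count.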

BTW I think you can estimate the average number of members per
multixact in a real database like this:

number of members = number of member segment files * 1636 * 32
over
number of multixacts = number of offsets segment files * 2048 * 32

Those numbers are from these macros with a default page size:

MULTIXACT_MEMBERS_PER_PAGE = 1636
MULTIXACT_OFFSETS_PER_PAGE = 2048
SLRU_PAGES_PER_SEGMENT = 32
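
As a rough sketch, the estimate above could be computed like this in
Python.  The function name is made up for illustration, the constants
are the default-page-size values listed above, and the result is only
an approximation because the newest segment of each kind is usually
partly filled.

```python
# Hypothetical helper; constants assume the default page size.
MULTIXACT_MEMBERS_PER_PAGE = 1636
MULTIXACT_OFFSETS_PER_PAGE = 2048
SLRU_PAGES_PER_SEGMENT = 32


def estimate_avg_members(member_segment_files, offset_segment_files):
    """Estimate average members per multixact from segment file counts."""
    members = (member_segment_files * MULTIXACT_MEMBERS_PER_PAGE
               * SLRU_PAGES_PER_SEGMENT)
    multixacts = (offset_segment_files * MULTIXACT_OFFSETS_PER_PAGE
                  * SLRU_PAGES_PER_SEGMENT)
    return members / multixacts


# Example: 10 member segment files and 4 offsets segment files.
print(round(estimate_avg_members(10, 4), 2))
```

The two counts are the numbers of segment files under
pg_multixact/members and pg_multixact/offsets respectively.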

--
Thomas Munro
http://www.enterprisedb.com
