Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts
Дата
Msg-id 20140720223010.GF5974@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-bugs
On 2014-07-20 18:16:51 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2014-07-20 17:43:04 -0400, Tom Lane wrote:
> >> No, I don't think so.  Truncation is driven off oldestMultiXid from
> >> pg_control, not from relminmxid. The only thing in-the-future values of
> >> those will do to us is prevent autovacuum from thinking it must do a full
> >> table scan.  (In particular, in-the-future values do not cause
> >> oldestMultiXid to get advanced, because we're always looking for the
> >> oldest value not the newest.)
>
> > Right. But that's the problem. If oldestMulti is set to, say, 3000000000
> > by pg_resetxlog during pg_upgrade but *minmxid = 1 those tables won't be
> > full tables scanned because of multixacts. But vac_truncate_clog() will
> >     SetMultiXactIdLimit(minMulti, minmulti_datoid);
> > regardless.
>
> > Note that it'll not notice the limit of other databases in this case
> > because vac_truncate_clog() will effectively use the in memory
> > GetOldestMultiXactId() and check if other databases are before that. But
> > there won't be any because they all appear in the future. Due to that
> > the next checkpoint will tru6ncate the clog to the cutoff multi xid used
> > by the last vacuum.
>
> Right.
>
> > Am I missing something?
>
> My point is that the cutoff multi xid won't be new enough to remove
> non-LOCKED_ONLY (ie, post-9.3) mxids.

Why not? Afaics this will continue to happen until multixacts are
wrapped around once? So the cutoff multi will be new enough for that at
some point after the pg_upgrade?

Luckily in most cases full table vacuums triggered due to normal xids
will prevent bad problems though. There have been a couple reports where
people included pg_controldata output indicating crazy rates of multixid
consumption but I think none of those were crazy enough to burn multis
so fast that important ones get truncated before a full table vacuum
occurs due to normal xids.

> >> But in any case, we both agree that setting relminmxid to equal nextMulti
> >> is completely unsafe in a 9.3 cluster that's already been up.  So the
> >> proposed fix instructions are certainly wrong.
>
> > Right. I'm pondering what to do about it instead. The best idea I have
> > is something like:
> > 1) Jot down pg_controldata|grep NextMultiXactId
> > 2) kill/wait for all existing transactions to end
> > 3) vacuum all databases with vacuum_multixact_freeze_min_age=0. That'll
> >    get rid of all old appearing multis
> > 4) Update pg_class to set relminmxid=value from 1), same with
> >    pg_database
>
> > But that sucks and doesn't deal with all the problems :(
>
> Yeah.  At this point I'm of the opinion that we should not recommend any
> manual corrective actions for this issue.  They're likely to do more harm
> than good, especially if the user misses or fat-fingers any steps.

I don't really see us coming up with something robust in time :/. It's a
bid sad, but maybe we should recommend contacting the mailing list if
pg_upgrade has been used and nextMulti is above 2^31?

Btw, we really should have txid_current() equivalent for multis...

> I'm also thinking that the lack of any complaints suggests there are few
> or no existing installations with nextMulti past 2^31, anyhow.  If it were
> even past 400000000 (default autovacuum_multixact_freeze_max_age), we'd
> have been hearing howls of anguish about full-database freezing scans
> occurring immediately after a pg_upgrade (thanks to minmxid = 1 being old
> enough to trigger that).

I think people just chalk that up to 'crazy pg vacuuming behaviour' and
not investigate further. At least that's my practical experience :(

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts
Следующее
От: Tom Lane
Дата:
Сообщение: Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts