Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts

Поиск

Список

Период

Сортировка

От	Alvaro Herrera
Тема	Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts
Дата	20 июня 2014 г. 01:04:37
Msg-id	20140619220425.GQ18688@eldon.alvh.no-ip.org обсуждение исходный текст
Ответ на	Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts (Bruce Momjian <bruce@momjian.us>)
Ответы	Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts
Список	pgsql-bugs

Дерево обсуждения

Bruce Momjian wrote:

> I wasn't happy with having that delete code added there when we do
> directory delete in the function above.  I instead broke apart the
> delete and copy code and called the delete code where needed, in the
> attached patch.

Makes sense, yeah.  I didn't look closely enough to realize that the
function that does the copying also does the rmtree part.

I also now realize why the other case (upgrade from 9.3 to 9.4) does not
have a bug: we are already deleting the files in that path.

Bruce Momjian wrote:

> > I think one reason is that not all upgrades see an issue here; for old
> > clusters that haven't gone beyond the 0000 offset file, there is no
> > problem.  For clusters that have gone beyond 0000 but not by much, the
> > file will be deleted during the first truncation.  It only becomes a
> > problem if the cluster is close enough to 2^31.  Another thing to keep
> > in consideration is that initdb initializes all databases' datminmxid to
> > 1.  If the old cluster was past the 2^31 point, it means the datminmxid
> > doesn't move from 1 until the actual wraparound.
>
> OK, so the xid has to be beyond 2^31 during pg_upgrade to trigger a
> problem?  That might explain the rare reporting of this bug.  What would
> the test query look like so we can tell people when to remove the '0000'
> files?  Would we need to see the existence of '0000' and high-numbered
> files?  How high?  What does a 2^31 file look like?

I misspoke.

I ran a few more upgrades, and then tried vacuuming all databases, which
is when the truncate code is run.  Say the original cluster had an
oldestmulti of 10 million.  If you just run VACUUM in the new cluster
after the upgrade, the 0000 file is not deleted: it's not yet old enough
in terms of multixact age.  An error is not thrown, because we're still
not attempting a truncate.  But if you lower the
vacuum_multixact_freeze_table_age to 10 million minus one, then we will
try the deletion and that will raise the error.

I think (didn't actually try) if you just let 150 million multixacts be
generated, that's the first time you will get the error.

Now if you run a VACUUM FREEZE after the upgrade, the file will be
deleted with no error.

I now think that the reason most people haven't hit the problem is that
they don't generate enough multis after upgrading a database that had
enough multis in the old database.  This seems a bit curious

> Also, is there a reason you didn't remove the 'members/0000' file in your
> patch?  I have removed it in my version.

There's no point.  That file is the starting point for new multis
anyway, and it's compatible with the new format (because it's all
zeroes).

I guess you could get in trouble if you initdb the 9.3 database,
generate a few multis there, and then drop all tables and use that
modified cluster as a "new" cluster for pg_upgrade.  I would question
the sanity of a person doing something like that, however.

Bruce Momjian wrote:

> > OK, so the xid has to be beyond 2^31 during pg_upgrade to trigger a
> > problem?  That might explain the rare reporting of this bug.  What would
> > the test query look like so we can tell people when to remove the '0000'
> > files?  Would we need to see the existence of '0000' and high-numbered
> > files?  How high?  What does a 2^31 file look like?
>
> Also, what would a legitimate 0000 file at wrap-around time look like?
> Would there have to be an 'ffff' or 'ffffff' file?

Since I was wrong, there is no point in further research here.  Anyway
the last file before wrapping around in pg_multixact/members is FFFF.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

В списке pgsql-bugs по дате отправления:

Предыдущее

От: Pontus Lundkvist
Дата: 19 июня 2014 г., 22:07:33
Сообщение: Re: BUG #10692: psql: \c service=foo only uses dbname information

Следующее

От: Alvaro Herrera
Дата: 20 июня 2014 г., 01:12:55
Сообщение: Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts

Предыдущее

Следующее