Re: pg_upgrade: Pass -j down to vacuumdb

From Justin Pryzby
Subject Re: pg_upgrade: Pass -j down to vacuumdb
Date
Msg-id 20190403212434.GY17544@telsasoft.com
In response to Re: pg_upgrade: Pass -j down to vacuumdb  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: pg_upgrade: Pass -j down to vacuumdb  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
List pgsql-hackers
On Wed, Apr 03, 2019 at 04:42:14PM -0400, Jeff Janes wrote:
> So maybe the first stage could
> be run by pg_upgrade itself, while the new server is still running on a
> linux socket in a private directory.

I think that would take too long.  It would be less of an issue if there were
feedback/progress from pg_upgrade during the analyze.

For our upgrades (which typically take ~15 min, though for several customers
they take up to ~60 min), I only analyze base tables (essentially, those which
are neither parents nor children), then start services, then ANALYZE with the
default stats target.  I would want to avoid delaying the services restart by
more than another (say) 5 minutes, and I would want to avoid even that unless
there was a progress report indicating that it's projected to take only a few
more minutes.
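
To make "neither parents nor children" concrete, here is a minimal sketch of how
such tables could be picked out and turned into ANALYZE statements.  It is an
illustration, not the script actually used for these upgrades: the function
names and the sample table names are hypothetical, and BASE_TABLES_SQL is just
one plausible catalog query (it relies on the pg_inherits columns inhrelid and
inhparent).

```python
# Hypothetical catalog query: ordinary tables that appear in pg_inherits
# neither as a child (inhrelid) nor as a parent (inhparent).
BASE_TABLES_SQL = """
SELECT c.oid::regclass
FROM pg_class c
WHERE c.relkind = 'r'
  AND NOT EXISTS (
        SELECT 1 FROM pg_inherits i
        WHERE i.inhrelid = c.oid OR i.inhparent = c.oid)
"""


def base_tables(tables, inherits):
    """Return the tables that are neither inheritance parents nor children.

    tables:   iterable of table names
    inherits: iterable of (child, parent) pairs, shaped like pg_inherits rows
    """
    related = set()
    for child, parent in inherits:
        related.add(child)
        related.add(parent)
    # Preserve the caller's ordering of tables.
    return [t for t in tables if t not in related]


def analyze_statements(tables):
    """Build one ANALYZE statement per table, quoting each identifier."""
    statements = []
    for t in tables:
        quoted = '"' + t.replace('"', '""') + '"'
        statements.append(f"ANALYZE {quoted};")
    return statements


if __name__ == "__main__":
    # Made-up example data: one standalone table, one parent with two children.
    tables = ["users", "events", "events_2019_03", "events_2019_04"]
    inherits = [("events_2019_03", "events"), ("events_2019_04", "events")]
    for stmt in analyze_statements(base_tables(tables, inherits)):
        print(stmt)
```

The statements produced this way would run before services are started; the
parents and children are left for the later ANALYZE with the default stats
target.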

I just did a test on one of our large-but-not-huge customers.  With
stats_target=1, analyzing a 145GB partitioned table looks like it'll take
perhaps an hour; they have ~1TB data, so delaying services during ANALYZE would
nullify the utility of pg_upgrade.  I can restore the essential tables from
backup in 15-30 minutes.

It might be fine if pg_upgrade took an option which enabled analyze, perhaps
instead of outputting analyze_new_cluster.sh.  But actually, a problem with
*that* is that currently pg_upgrade avoids starting the new cluster.  That
seems to be deliberate, since, with --link, that's an irreversible operation:
it's unsafe to start the old cluster afterwards.
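
As a rough illustration of why that is irreversible: with hard links, the old
and new paths name the same underlying file, so anything the new cluster writes
is also what the old cluster would see.  The sketch below only demonstrates the
filesystem behaviour with made-up file names in a temporary directory; it does
not run pg_upgrade or touch a real data directory, and it assumes a filesystem
that supports hard links.

```python
import os
import tempfile


def demo_shared_file():
    """Show that a write through a hard link is visible via the original name."""
    with tempfile.TemporaryDirectory() as root:
        old_path = os.path.join(root, "old_relfile")  # stands in for an old-cluster file
        new_path = os.path.join(root, "new_relfile")  # stands in for its linked copy

        with open(old_path, "w") as f:
            f.write("old cluster data")

        # Create a second name for the same file instead of copying it.
        os.link(old_path, new_path)
        same_inode = os.stat(old_path).st_ino == os.stat(new_path).st_ino

        # Simulate the new cluster modifying "its" file.
        with open(new_path, "a") as f:
            f.write(" + changes by new cluster")

        # The old name now shows the modified contents.
        with open(old_path) as f:
            seen_via_old = f.read()

        return same_inode, seen_via_old


if __name__ == "__main__":
    print(demo_shared_file())
```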

Tangent: I have a queued mail from ~15 months ago wherein I proposed adding to
pg_upgrade an option to remove the old data dir (or probably only the files
associated with known relations).  I realized at the time that would be pretty
scary without having first verified that the new cluster at least starts.  I'm
not sure how good an idea that is, but --startnewcluster would be needed there,
too.

Justin


