Re: BUG #7884: pg_upgradecluster is terribly slow

Поиск
Список
Период
Сортировка
От Martin Pitt
Тема Re: BUG #7884: pg_upgradecluster is terribly slow
Дата
Msg-id 20130215182959.GA2531@piware.de
обсуждение исходный текст
Ответ на BUG #7884: pg_upgradecluster is terribly slow  (rtvd@me.com)
Список pgsql-bugs
Hello,

rtvd@me.com [2013-02-15 17:41 +0000]:
> I have tried upgrading a database from 8.4 to 9.1 and pg_upgradecluster is
> terribly slow.

Please note that this is a Debian/Ubuntu specific bug,
pg_upgradecluster is not shipped by upstream PostgreSQL.

I admittedly don't have any first-hand experience with migrating large
databases, and get very little feedback about it, so thanks for your
observations!

First of all, Peter has worked on a branch that makes
pg_upgradecluster use pg_upgrade, which hopefully will be a lot more
performant.

> 1. It uses "-Fc". This is bad as it enables compression by default. Which is
> pointless and wastes CPU power.

The tar format (-Ft) has also been supported since at least 8.4
(that's the earliest version which is still supported in
Debian/Ubuntu), and documentation says it's not compressed. If you
replace -Fc with -Ft, do you see a significant performance increase?

> 2. It seems to migrate the schema first and only then it moves data. This is
> bad as indices are being re-built when the data is finally poured in.

At least in earlier versions pg_dumpall wasn't able to dump BLOBs, and
the manpage doesn't indicate otherwise for 9.2 either. AFAIK this just
supports the plain SQL text format. I don't believe pg_dumpall would
avoid the rebuilding of incides?

> 3. The migration does not happen within a transaction.

> 1. migrate the schema

pg_upgradecluster does that, too.

> 2. drop all databases (leaving roles in)
> 3. re-creating the databases (without any content)

Out of interest, why does that help?

> 4. for each database run pg_dump <database> | psql --single-transaction
> <database>

As written above, I don't want to use the text format and psql, but
pg_restore also supports --single-transaction since at least 8.4.

The question which I'm not sure about is whether it's ok to use
--single-transaction even for very large databases. I. e. is piling up
gigabytes of data in a transaction and committing it all in the end
always more efficient than the default mode (which I assume will use
one transaction by row)? Is that a safe thing to do, or could one run
into out-of-memory conditions?

http://people.canonical.com/~pitti/tmp/pg_upgradecluster is a version
with these two changes: pg_restore --single-transaction and using the
tar format. Perhaps you can try this, and compare performance?

Thanks,

Martin
--
Martin Pitt                        | http://www.piware.de
Ubuntu Developer (www.ubuntu.com)  | Debian Developer  (www.debian.org)

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Kevin Grittner
Дата:
Сообщение: Re: BUG #7884: pg_upgradecluster is terribly slow
Следующее
От: Maciek Sakrejda
Дата:
Сообщение: Re: BUG #7883: "PANIC: WAL contains references to invalid pages" on replica recovery