pg_dump slow with bytea data

Поиск
Список
Период
Сортировка
От chris r.
Тема pg_dump slow with bytea data
Дата
Msg-id 4D6E0151.20402@gmx.net
обсуждение исходный текст
Ответы Re: pg_dump slow with bytea data  ("chris r." <chricki@gmx.net>)
Re: pg_dump slow with bytea data  (Merlin Moncure <mmoncure@gmail.com>)
Re: pg_dump slow with bytea data  (Alban Hertroys <dalroi@solfertje.student.utwente.nl>)
Список pgsql-general
Dear list,

As discussed extensively in the past [1], pg_dump tends to be slow for
tables that contain bytea columns with large contents. Starting with
postgres version 8.5 the COPY format of bytea was changed from escape to
hex [1], giving ~50% performance boost.

However, we experience heavy problems during our weekly backup of our
database recently. We suspect the reason for this is that we changed
some columns from text with base64-encoded binary stuff to bytea
columns. This change affected a large fraction of the database (~400
GB). Note that we ran VACUUM FULL on the tables affected.

After this change our backup procedure heavily slowed down. Whereas it
took about 8 hours before the change, pg_dump is still busy with the
first table (keeping roughly 50GB) after 12 hours of backup. If I
approximate the time to complete the backup based on this, the backup
procedure would require factor 10 the time it required before the
change. The command we run is simply:  pg_dump -f <outputfile> -F c <db>

The main reason for this immense slow-down was identified in [1] as the
conversion of bytea into a compatible format (i.e. hex). However, given
the size of the db, a factor 10 makes backups practically infeasible.

We do not see any good solution to our problem except COPYing all data
in BINARY format. We understand there is a tough trade-off between
backup portability and backup efficiency here. As Bernd mentioned in
[1], however, not in all cases portability is required - particularly
not in ours.

A switch for binary output in pg_dump, or some alternative way to export
data *consistently* in binary format would be ideal for us, and probably
some others storing bytea data. Or do you see an alternative way how we
could get around this issue? Obviously, having no backup or deleting the
binary stuff from the database are no serious options.

Thanks for any discussion input in advance,
Chris

В списке pgsql-general по дате отправления:

Предыдущее
От: Thom Brown
Дата:
Сообщение: Re: Comparing md5 hash with md5 password hash
Следующее
От: "chris r."
Дата:
Сообщение: Re: pg_dump slow with bytea data