Re: Disable WAL logging to speed up data loading

Поиск
Список
Период
Сортировка
От Michael Paquier
Тема Re: Disable WAL logging to speed up data loading
Дата
Msg-id X+WQEgRDGHyLdH2x@paquier.xyz
обсуждение исходный текст
Ответ на RE: Disable WAL logging to speed up data loading  ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
Ответы RE: Disable WAL logging to speed up data loading
Re: Disable WAL logging to speed up data loading
Список pgsql-hackers
On Thu, Dec 03, 2020 at 03:52:47AM +0000, tsunakawa.takay@fujitsu.com wrote:
> The code looks good, and the performance seems to be nice, so I
> marked this ready for committer.

FWIW, I am extremely afraid of this proposal because this is basically
a footgun able to corrupt customer instances, and I am ready to bet
that people would switch wal_level to none because they see a lot of
benefits in doing so at first sight, until the host running the server
is plugged off and they need to use pg_resetwal in urgency to bring an
instance online.  Users have already the option to make things go bad,
just by disabling full page writes or fsync, but I really don't think
that we should put in their hands more options able to break
instances, nor should we try to spend more efforts in having more
"protections" that would trigger only once the instance is already
fried.

Perhaps this is something that Horiguchi-san pointed out upthread in
[1] (last sentence of first paragraph), but did you consider that it
is already possible to do bulk-loading with a minimal amount of WAL
generated as long as you do the COPY within the transaction that
created the table?  Quoting the docs in [2]:
"COPY is fastest when used within the same transaction as an earlier
CREATE TABLE or TRUNCATE command. In such cases no WAL needs to be
written, because in case of an error, the files containing the newly
loaded data will be removed anyway. However, this consideration only
applies when wal_level is minimal as all commands must write WAL
otherwise."

Upgrade scenarios have been mentioned in this case as being a pain
when it comes to take advantage of this optimization.  Wouldn't it be
safer if we took a client approach instead, where restores are able to
load the data with a cheap succession of commands by loading the data
in a transaction done after a TRUNCATE?

[1]: https://www.postgresql.org/message-id/20201001.115136.288898409051085426.horikyota.ntt@gmail.com
[2]: https://www.postgresql.org/docs/current/populate.html#POPULATE-COPY-FROM
--
Michael

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: create table like: ACCESS METHOD
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: [bug fix] ALTER TABLE SET LOGGED/UNLOGGED on a partitioned table does nothing silently