Re: Disable WAL logging to speed up data loading

From: Fujii Masao
Subject: Re: Disable WAL logging to speed up data loading
Msg-id: cb48d622-c41f-4cf5-b8f7-3cf4758980d7@oss.nttdata.com
In response to: Re: Disable WAL logging to speed up data loading (Laurenz Albe <laurenz.albe@cybertec.at>)
Responses: Re: Disable WAL logging to speed up data loading (Laurenz Albe <laurenz.albe@cybertec.at>)
           RE: Disable WAL logging to speed up data loading ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>)
List: pgsql-hackers

On 2020/10/28 22:05, Laurenz Albe wrote:
> On Wed, 2020-10-28 at 09:55 +0000, osumi.takamichi@fujitsu.com wrote:
>>>> I wrote and attached the first patch to disable WAL logging.
>>>> This patch passes the regression test of check-world already and is
>>>> formatted by pgindent.
>>>
>>> Without reading the code, I have my doubts about that feature.
>>> While it clearly will improve performance, it opens the door to data loss.
>>
>> Therefore, this feature must prevent that kind of
>> inconsistent server from starting up again.
>> This has been discussed in this thread already.
>>
>>> People will use it to speed up their data loads and then be unhappy if they
>>> cannot use their backups to recover from a problem.
>>> What happens if you try to do archive recovery across a time where wal_level
>>> was "none"?  Will the recovery process fail, as it should, or will you end up
>>> with data corruption?
>>> We already have a performance-related footgun in the shape of fsync = off.
>>> Do we want to add another one?
>>
>> Further, in this thread, we discuss that this feature is intended for
>> some specific opportunities, such as when the DBA wants to load data as
>> quickly as possible and/or the operation itself is easily *repeatable*.
>> So, before and after changing wal_level, the DBA needs to take a full
>> backup to prepare for an unexpected crash.
>>
>> But anyway, I'll fix and enrich the documents. Thanks.
> 
> I read through the thread and the patch now.
> 
> The only safety I see is that startup after a crash is prevented.
> 
> But what if someone sets wal_level=none, performs some data modifications,
> sets wal_level=archive and after some more processing decides to restore from
> a backup that was taken before the cluster was set to wal_level=none?
> Then they would end up with a corrupted database, right?

I think that a guard that prevents the server from starting up from
the corrupted database in that scenario is necessary.

> 
> I think the least this patch needs is that starting with wal_level=none emits
> a WAL record that will make recovery fail.
> 
> I am aware that this is intended for "specific opportunities", but we still
> should make it as hard as possible for the user to cause harm.  It may be that
> MySQL, which inspired this feature, does not care about that, but I think we
> should do better.
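
As a rough illustration of the guard suggested above, here is a toy model
in Python. It is only a sketch, not PostgreSQL code: the record kinds, the
Record class, RecoveryError and replay() are all hypothetical names made up
for this example. The assumption is that a server started with
wal_level=none writes one marker record, and that replay refuses to go past
that marker because later changes were never logged.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical record kinds for this toy model; not PostgreSQL's real WAL format.
DATA = "data"
WAL_DISABLED = "wal_disabled"  # marker written when the server starts with wal_level=none


@dataclass
class Record:
    kind: str
    payload: str = ""


class RecoveryError(Exception):
    """Raised when replay reaches a point after which WAL is known to be incomplete."""


def replay(wal: List[Record]) -> List[str]:
    """Apply records in order, refusing to continue past a WAL_DISABLED marker."""
    applied = []
    for pos, rec in enumerate(wal):
        if rec.kind == WAL_DISABLED:
            raise RecoveryError(
                f"WAL was disabled at record {pos}; later changes were not "
                "logged, so recovery cannot continue"
            )
        applied.append(rec.payload)
    return applied


# Archived WAL that spans a period in which wal_level was "none".
wal = [Record(DATA, "insert A"), Record(WAL_DISABLED), Record(DATA, "insert C")]

try:
    replay(wal)
except RecoveryError as err:
    print("recovery stopped:", err)
```

In this model, restoring a backup taken before the unlogged period fails
loudly at the marker instead of silently producing a database that is
missing the unlogged changes.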

I'm still not sure if it's worth supporting this feature in core,
because it makes it very easy for users to corrupt the whole database.

BTW, with the patch, I observed that PREPARE TRANSACTION and
COMMIT PREPARED caused an assertion failure in my environment, as I
pointed out upthread.

How does the patch handle other features that depend on the existence of
WAL, e.g., pg_logical_emit_message()?

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION


