RE: Disable WAL logging to speed up data loading

Поиск
Список
Период
Сортировка
От osumi.takamichi@fujitsu.com
Тема RE: Disable WAL logging to speed up data loading
Дата
Msg-id OSBPR01MB48882B8D091CE688C4DE2731ED170@OSBPR01MB4888.jpnprd01.prod.outlook.com
обсуждение исходный текст
Ответ на RE: Disable WAL logging to speed up data loading  ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
Ответы Re: Disable WAL logging to speed up data loading  (Laurenz Albe <laurenz.albe@cybertec.at>)
RE: Disable WAL logging to speed up data loading  ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
Список pgsql-hackers
Hi,


I wrote and attached the first patch to disable WAL logging.
This patch passes the regression test of check-world already
and is formatted by pgindent.

Also, I conducted 3 types of performance tests to
clarify the rough benefits of the patch.

I compared two wal_levels both 'minimal' and 'none'.
For both levels, I measured
(1) cluster's restore from pg_dumpall output,
(2) COPY initial data to a defined table as initial data loading, and
(3) COPY data to a defined table with tuples, 3 times each for all cases.
After that, calculated the average and the ratio of the loading speed.
The conclusion is that wal_level=none cuts about 20% of the loading speed
compared to 'minimal' in the three cases above. For sure, we could tune the configuration of
postgresql.conf further but roughly it's meaningful to share the result, I think.
'shared_buffers' was set to 40% of RAM while 'maintenance_work_mem'
was set to 20%. I set max_wal_senders = 0 this time.

The input data was generated from pgbench with 1000 scale factor.
It's about 9.3GB. For the table definition or
the initial data for appended data loading test case,
I used pgbench to set up the schema as well.
Sharing other scenario to measure is welcome.

I need to say that the current patch doesn't take the change of wal_level='none'
from/to other ones into account fully or other commands like ones for two phase commit yet.
Sorry for that.

Also, to return the value of last shut down LSN in XLogInsert(),
it acquires lock of control file every time, which was not good clearly.
I tried to replace that code like having a LSN cache as one variable
but I was not sure where I should put the function to set the initial value of the LSN.
After LocalProcessControlFile() in PostmasterMain() was not the right place.
I'd be happy if someone gives me an advice about it.


Best,
    Takamichi Osumi

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Craig Ringer
Дата:
Сообщение: Re: Internal key management system
Следующее
От: Bharath Rupireddy
Дата:
Сообщение: Re: Log message for GSS connection is missing once connection authorization is successful.