Performance and WAL on big inserts/updates

Поиск
Список
Период
Сортировка
От Marty Scholes
Тема Performance and WAL on big inserts/updates
Дата
Msg-id 4050DF07.1040301@outputservices.com
обсуждение исходный текст
Ответы Re: Performance and WAL on big inserts/updates  (Rod Taylor <pg@rbt.ca>)
Re: Performance and WAL on big inserts/updates  (Sailesh Krishnamurthy <sailesh@cs.berkeley.edu>)
Re: Performance and WAL on big inserts/updates  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
I combed the archives but could not find a discussion on this and am 
amazed this hasn't been discussed.

My experience with Oracle (and now limited experience with Pg) is that 
the major choke point in performance is not the CPU or read I/O, it is 
the log performance of big update and select statements.

Essentially, the data is written twice: first to the log and then the 
data files.  This would be ok except the transaction is regularly frozen 
while the log files sync to disk with a bunch of tiny (8KB for Oracle 
and Pg) write requests.

I realize that the logs must be there to ensure crash recovery and that 
PITR is on the way to supplement this.

If a transaction will do large updates or inserts, why don't we just log 
the parsed statements in the WAL instead of the individual data blocks 
that have changed?  Then, the data files could be fsync()ed every 
checkpoint, but essentially no write I/O takes places in the interim.

Outside of checkpoints, large updates would only need to fsync() a very 
small addition to the log files.

Recovery could be similar to how I understand it currently is:
1. Roll back in-flight changes
2. Roll forward log entries in order, either direct changes to the data 
or re-execute the parsed command in the log.

Some informal testing suggests that we get a factor of 8 improvement in 
speed here if we completely disable fsync() in large updates under Pg.

I would suspect that a big portion of those gains would be preserved if 
fsync() calls were limited to checkpoints and saving the parsed SQL 
command in the log.

Why have I not seen this in any database?

There must be a reason.

Thanks in advance.

Sincerely,
Marty



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Neil Conway
Дата:
Сообщение: Re: unsafe floats
Следующее
От: Alvaro Herrera Munoz
Дата:
Сообщение: Re: COMMENT ON [GROUP/USER]