Re: Reduce WAL logging of INSERT SELECT

Поиск
Список
Период
Сортировка
От Bruce Momjian
Тема Re: Reduce WAL logging of INSERT SELECT
Дата
Msg-id 201108060316.p763Giq09274@momjian.us
обсуждение исходный текст
Ответ на Re: Reduce WAL logging of INSERT SELECT  (Jeff Davis <pgsql@j-davis.com>)
Ответы Re: Reduce WAL logging of INSERT SELECT  (Simon Riggs <simon@2ndQuadrant.com>)
Re: Reduce WAL logging of INSERT SELECT  (Jeff Davis <pgsql@j-davis.com>)
Список pgsql-hackers
Jeff Davis wrote:
> On Thu, 2011-08-04 at 18:07 -0400, Bruce Momjian wrote:
> > I am confused how generating WAL traffic that is larger than the heap
> > file we are fsync'ing can possibly be slower.  Are you just throwing out
> > an idea to try to make me prove it?
> 
> That's worded in a slightly confusing way, but here is the trade-off:
> 
> 1. If you are using WAL, then regardless of what your transaction does,
> only the WAL needs to be fsync'd at commit time. Conveniently, that's
> being written sequentially, so it's a single fairly cheap fsync; and all
> the data page changes are deferred, collected together, and fsync'd at
> checkpoint time (rather than commit time). The cost is that you
> double-write the data.
> 
> 2. If you aren't using WAL, you need to fsync every data file the
> transaction touched, which are probably not localized with other
> activity. Also, the _entire_ data files needs to be sync'd, so perhaps
> many other transactions have made changes to one data file all over, and
> it may require _many_ seeks to accomplish the one fsync. The benefit is
> that you don't double-write the data.
> 
> So, fundamentally, WAL is (in the OLTP case, where a transaction is much
> shorter than a checkpoint interval) a big performance _win_, because it
> allows us to do nice sequential writing in a single place for all
> activities of all transactions; and defer all those random writes to
> data pages until the next checkpoint. So we shouldn't treat WAL like a
> cost burden that we want to avoid in every case we can.
> 
> But in the data load case (where many checkpoints may happen during a
> single transaction anyway), it happens that avoiding WAL is a
> performance win, because the seeks are not the dominant cost.

Well, if the table is created in the same transaction (which is the only
case under consideration), no other sessions can write to the table so
you are just writing the entire table on commit, rather than to the WAL.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: cataloguing NOT NULL constraints
Следующее
От: Gokulakannan Somasundaram
Дата:
Сообщение: Re: Reduce WAL logging of INSERT SELECT