Re: Reworking WAL locking

From Bruce Momjian
Subject Re: Reworking WAL locking
Date
Msg-id 200803230005.m2N05Gq26940@momjian.us
In reply to Reworking WAL locking  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers
Added to TODO:

* Improve WAL concurrency by increasing lock granularity
 http://archives.postgresql.org/pgsql-hackers/2008-02/msg00556.php


---------------------------------------------------------------------------

Simon Riggs wrote:
> 
> Paul van den Bogaard (Sun) suggested to me that we could use more than
> two WAL locks to improve concurrency. I think it's possible to introduce
> such a scheme with some ease. All mods would be within xlog.c.
> 
> The scheme below requires an extra LWLock per WAL buffer.
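
One lock per WAL buffer implies some mapping from an LSN to a lock slot. The
sketch below is only an illustration of that idea: the function name, the
flat 64-bit LSN, the 8192-byte page size and the modulo mapping are all
assumptions, not code from xlog.c.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for illustration only; not taken from xlog.c. */
#define WAL_PAGE_SIZE 8192u

/*
 * Hypothetical stand-in for LSNGetWALPageLockId(): returns the index of
 * the per-buffer lock covering the WAL page that contains this LSN.
 * Buffers are assumed to be reused cyclically, hence the modulo.
 */
static unsigned lsn_to_page_lock_id(uint64_t lsn, unsigned n_buffers)
{
    assert(n_buffers > 0);
    uint64_t page_no = lsn / WAL_PAGE_SIZE;   /* WAL page holding this LSN */
    return (unsigned) (page_no % n_buffers);  /* lock slot for that page */
}
```

With 8 buffers, every LSN in the first 8192 bytes maps to slot 0, the next
page to slot 1, and the ninth page wraps back to slot 0.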
> 
> Locking within XLogInsert() would look like this:
> 
> Calculate length of data to be inserted.
> Calculate initial CRC
> 
> LWLockAcquire(WALInsertLock, LW_EXCLUSIVE)
> 
> Reserve space to write into. 
> LSN = current Insert pointer
> Move pointer forward by length of data to be inserted, acquiring
> WALWriteLock if required to ensure space is available.
> 
> LWLockAcquire(LSNGetWALPageLockId(LSN), LW_SHARED);
> 
> Note that we don't lock every page, just the first one of the set we
> want, but we hold it until all page writes are complete.
> 
> LWLockRelease(WALInsertLock);
> 
> finish calculating CRC
> write xlog into reserved space
>     
> LWLockRelease(LSNGetWALPageLockId(LSN));
> 
> XLogWrite() will then try to get a conditional LW_EXCLUSIVE lock
> sequentially on each page it plans to write. It keeps going until it
> fails to get the lock, then writes. Callers of XLogWrite will never be
> able to pass a backend currently performing the wal buffer fill.
> 
> We write a whole page at a time.
> 
> Next time, we do a regular lock wait on the same page, so that we always
> get a page eventually.
> 
> This requires us to get 2 locks for an XLogInsert rather than just one.
> However the second lock is always acquired with zero wait time when the
> wal_buffers are sensibly sized. Overall this should reduce wait time for
> the WALInsertLock since it seems likely that each actual filling of WAL
> buffers will affect different cache lines and is very likely to be able
> to be performed in parallel.
> 
> Sounds good to me.
> 
> Any objections/comments before this can be tried out? 
> 
> -- 
>   Simon Riggs
>   2ndQuadrant  http://www.2ndQuadrant.com 
> 
> 

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://postgres.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +

