Re: WAL sync behaviour

From: mark@mark.mielke.cc
Subject: Re: WAL sync behaviour
Msg-id: 20051110165313.GA14444@mark.mielke.cc
In reply to: Re: WAL sync behaviour  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-performance
On Thu, Nov 10, 2005 at 11:39:34AM -0500, Tom Lane wrote:
> No, Mike is right: for WAL you shouldn't need any journaling.  This is
> because we zero out *and fsync* an entire WAL file before we ever
> consider putting live WAL data in it.  During live use of a WAL file,
> its metadata is not changing.  As long as the filesystem follows
> the minimal rule of syncing metadata about a file when it fsyncs the
> file, all the live WAL files should survive crashes OK.

Yes, with emphasis on the zero out... :-)
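
To make the quoted point concrete, here is a minimal sketch of the
"zero out and fsync first, then only overwrite in place" pattern. It is
not PostgreSQL's actual code: the names SEGMENT_SIZE, CHUNK,
preallocate_segment and write_record are hypothetical, and the sizes are
chosen only to keep the demo small.

```python
import os

SEGMENT_SIZE = 1024 * 1024  # hypothetical segment size, small for the demo
CHUNK = 8192                # hypothetical write size for the zero fill


def preallocate_segment(path, size=SEGMENT_SIZE):
    """Create `path` filled with `size` zero bytes and fsync it before use."""
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    try:
        zeros = bytes(CHUNK)
        remaining = size
        while remaining > 0:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, zeros[:min(CHUNK, remaining)])
            remaining -= written
        # All blocks are allocated now; push data and file metadata to disk.
        os.fsync(fd)
    finally:
        os.close(fd)


def write_record(fd, offset, payload):
    """Overwrite already-allocated bytes in place, then sync.

    The file never grows here, so no new blocks are allocated during
    live use; only the contents of existing blocks change.
    """
    if offset + len(payload) > os.fstat(fd).st_size:
        raise ValueError("record would extend the pre-allocated file")
    os.lseek(fd, offset, os.SEEK_SET)
    view = memoryview(payload)
    while view:
        written = os.write(fd, view)
        view = view[written:]
    os.fsync(fd)
```

The design choice to notice is that write_record refuses to extend the
file, so the block allocation only ever happens in preallocate_segment.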

> You do need metadata journaling for all non-WAL PG files, since we don't
> fsync them every time we extend them; which means the filesystem could
> lose track of which disk blocks belong to such a file, if it's not
> journaled.

I think there may be theoretical problems with regard to the ordering
of writes during fsync(), for files that are not pre-allocated. For
example, if a new block is allocated, there are two blocks that need
to be updated: the indirect reference block (or the inode block, if the
block references fit into the inode entry), and the data block itself.
If the indirect reference block is written first, before the data block,
and the system crashes partway through the fsync() operation, the state
of the disk is inconsistent. Metadata journalling can ensure that the
data block is allocated first and all the necessary references are then
updated, so that an incomplete operation is either rolled back or
committed in full.
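
The ordering concern can be illustrated with a toy model. This is only
a sketch under stated assumptions: it treats the disk as two slots (a
reference and a data block) and assumes a crash can land between any
two writes. The names disk_states, is_consistent and crash_safe are
hypothetical, and no real filesystem is implemented this way.

```python
def disk_states(write_order):
    """Yield the on-disk state before any write and after each write.

    A crash may freeze the disk in any one of these states.
    """
    # Initially the file does not reference the new block, and the
    # block still holds whatever was there before (garbage).
    disk = {"reference": None, "data": "garbage"}
    yield dict(disk)
    for step in write_order:
        if step == "data":
            disk["data"] = "valid"
        elif step == "reference":
            disk["reference"] = "new block"
        else:
            raise ValueError("unknown write: " + step)
        yield dict(disk)


def is_consistent(disk):
    """Consistent if the file never points at a block holding garbage."""
    return disk["reference"] is None or disk["data"] == "valid"


def crash_safe(write_order):
    """True if every state a crash could leave behind is consistent."""
    return all(is_consistent(state) for state in disk_states(write_order))


if __name__ == "__main__":
    print("data first:     ", crash_safe(["data", "reference"]))
    print("reference first:", crash_safe(["reference", "data"]))
```

In this model, writing the data block first is safe at every crash
point, while writing the reference first leaves a window in which the
file points at garbage.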

Or, that is my understanding, anyways, and this is why I would not use
ext2 for the database, even if it was claimed that fsync() was used.

For WAL, with pre-allocated zero blocks? Sure. Ext2... :-)

mark

--
mark@mielke.cc / markm@ncf.ca / markm@nortel.com     __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   |
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...

                           http://mark.mielke.cc/

