WAL file utilization question

From: Keaton Adams
Subject: WAL file utilization question
Date:
Msg-id: 1179332909.22514.3.camel@MXLRMT-208.corp.mxlogic.com
Replies: Re: WAL file utilization question ("Jim C. Nasby" <decibel@decibel.org>)
         Re: WAL file utilization question (Alvaro Herrera <alvherre@commandprompt.com>)
List: pgsql-admin
I have a question related to the WAL log. Our organization is looking at using log shipping to build a high-availability standby server. In our tests the amount of WAL data generated per minute is significant: on our busiest database during a WAL archive test, eight 16 MB segments were being filled and archived per minute, which translates into having to move roughly 7.6 GB of archived WAL data per hour across a WAN link from one data center to another.
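
To spell out the arithmetic behind that figure:

    8 segments/min x 16 MB/segment = 128 MB/min
    128 MB/min x 60 min/hour = 7,680 MB/hour, i.e. roughly 7.6 GB/hour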

I read this statement from the PostgreSQL 8.1 manual and have a question related to it:

"If full_page_writes is set (as is the default), there is another factor to consider. To ensure data page consistency, the first modification of a data page after each checkpoint results in logging the entire page content. In that case, a smaller checkpoint interval increases the volume of output to the WAL log, partially negating the goal of using a smaller interval, and in any case causing more disk I/O."

On most database systems I am used to, there is a physical log and a separate logical log. The physical log holds the "before images" of data pages that have been modified since the last checkpoint. The logical log holds the individual rows of data that have been inserted, updated, or deleted, as well as checkpoint records and the like. If the server crashes, fast recovery is initiated, in which (1) any pages in the physical log are put back into the data files (tablespaces/dbspaces) on disk to return to a state of physical consistency, and (2) individual transactions in the logical log since the last checkpoint are rolled forward or rolled back to reach a point of logical consistency.
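
In rough pseudocode, the fast-recovery sequence I have in mind is the following (a sketch of how I understand those other systems, not of PostgreSQL):

    -- phase 1: physical consistency
    for each before-image page in the physical log:
        write the page back into its data file
    -- phase 2: logical consistency
    for each logical-log record written since the last checkpoint:
        roll forward committed transactions
        roll back uncommitted transactions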

Even with full_page_writes set to false and checkpoints taking place every 60 seconds or so, the amount of WAL data generated per minute remains significant.
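
For reference, the relevant postgresql.conf settings in that test looked roughly like the following (values illustrative of the test setup, not recommendations; in 8.1 checkpoint_timeout is an integer number of seconds):

    full_page_writes = off      # no full-page images after checkpoints
    checkpoint_timeout = 60     # seconds between automatic checkpoints
    checkpoint_segments = 8     # hypothetical value; a checkpoint is also
                                # forced every N filled WAL segments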

So my question is this: if I kick off a transaction that loads 100-byte records, does the insert for each record take roughly 100 bytes in the WAL file, or is the data written to the WAL in page-size (8K) portions? In other words, with piggyback (group) commits, if only 2K worth of data has accumulated before the next log flush, will 2K be written to the WAL file, or will 8K be written each time regardless of the amount of transaction data actually available to flush?
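
One hedged way to measure this directly, assuming a release with pg_current_xlog_location() (added in 8.2; on 8.1 the closest equivalent is counting archived segments), using a made-up test table:

    CREATE TABLE wal_test (id int);                        -- hypothetical throwaway table
    SELECT pg_current_xlog_location();                     -- note the starting WAL position
    INSERT INTO wal_test SELECT generate_series(1, 1000);  -- sample load
    SELECT pg_current_xlog_location();                     -- the difference between the two
                                                           -- positions is the WAL bytes generated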

Since there is no separate physical log to keep track of dirty/modified pages since the last checkpoint, I assume the WAL serves the dual purpose of restoring both physical and logical database consistency. But I need to know for certain whether there is any way to reduce the amount of WAL data being written relative to the amount of transaction data we are actually writing to the database at any given point in time.

Thank you,

Keaton
