Re: Load distributed checkpoint

From Kevin Grittner
Subject Re: Load distributed checkpoint
Date
Msg-id 4577E6D9.EE98.0025.0@wicourts.gov
In reply to Load distributed checkpoint  (ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>)
Responses Re: Load distributed checkpoint  (Greg Smith <gsmith@gregsmith.com>)
Re: Load distributed checkpoint  (ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>)
Re: Load distributed checkpoint  ("Jim C. Nasby" <jim@nasby.net>)
List pgsql-hackers
>>> On Thu, Dec 7, 2006 at 12:05 AM, in message
<20061207144843.6269.ITAGAKI.TAKAHIRO@oss.ntt.co.jp>, ITAGAKI Takahiro
<itagaki.takahiro@oss.ntt.co.jp> wrote:
>
> We often encounter a performance gap during checkpoints. The reason is write
> bursts. Storage devices are so overworked during a checkpoint that they cannot
> keep up with normal transaction processing.
When we first switched our web site to PostgreSQL, this was one of our biggest problems.  Queries which normally run in
a few milliseconds were hitting the 20 second limit we impose in our web application.  These were happening in bursts
which suggested that they were caused by checkpoints.  We adjusted the background writer configuration and nearly
eliminated the problem.

 bgwriter_all_maxpages | 600
 bgwriter_all_percent  | 10
 bgwriter_delay        | 200
 bgwriter_lru_maxpages | 200
 bgwriter_lru_percent  | 20

Between the xfs caching and the battery-backed cache in the RAID controller, the disk writes seemed to settle out pretty
well.
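
As a rough sanity check on what those settings allow, the upper bounds on background writer output can be worked out in
a few lines of Python.  This is only a sketch under stated assumptions: it takes the 8.x meaning of the parameters
(bgwriter_delay as milliseconds between rounds, the *_maxpages values as per-round write caps, bgwriter_all_percent as
the share of the buffer pool scanned per round) and the default 8 kB block size.  The function name bgwriter_bounds is
made up for the example.

```python
# Rough upper bounds on background-writer output for the settings quoted above.
# Assumptions: PostgreSQL 8.x parameter semantics and the default 8 kB block size.
BLOCK_SIZE = 8192  # bytes per buffer page (assumed default)

settings = {
    "bgwriter_delay": 200,         # ms between bgwriter rounds
    "bgwriter_all_percent": 10,    # % of the buffer pool scanned per round (ALL scan)
    "bgwriter_all_maxpages": 600,  # max pages written per round by the ALL scan
    "bgwriter_lru_percent": 20,    # % of the buffer pool scanned per round (LRU scan)
    "bgwriter_lru_maxpages": 200,  # max pages written per round by the LRU scan
}


def bgwriter_bounds(s, block_size=BLOCK_SIZE):
    """Return ceilings implied by the settings; real write rates can be lower."""
    rounds_per_sec = 1000.0 / s["bgwriter_delay"]
    all_pages = s["bgwriter_all_maxpages"] * rounds_per_sec
    lru_pages = s["bgwriter_lru_maxpages"] * rounds_per_sec
    # Rounds needed for the ALL scan to cover the whole pool, converted to seconds.
    sweep_seconds = (100.0 / s["bgwriter_all_percent"]) / rounds_per_sec
    return {
        "rounds_per_sec": rounds_per_sec,
        "all_pages_per_sec": all_pages,
        "lru_pages_per_sec": lru_pages,
        "all_mib_per_sec": all_pages * block_size / (1024.0 * 1024.0),
        "min_all_sweep_seconds": sweep_seconds,
    }


if __name__ == "__main__":
    for name, value in bgwriter_bounds(settings).items():
        print(f"{name}: {value}")
```

With the values above this comes to at most 3000 pages/s (about 23.4 MiB/s) from the ALL scan, at most 1000 pages/s
from the LRU scan, and a full ALL sweep of the buffer pool no faster than once every 2 seconds.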
> A checkpoint consists of the following four steps, and the major performance
> problem is the 2nd step: all dirty buffers are written without any pause.
>
>  1. Query information (REDO pointer, next XID etc.)
>  2. Write dirty pages in buffer pool
>  3. Flush all modified files
>  4. Update control file
>
> I suggested writing pages with sleeps in the 2nd step, using the normal activity
> of the background writer. It is something like cost-based vacuum delay.
> The background writer has two pointers, 'ALL' and 'LRU', indicating where to
> write out in the buffer pool. We can wait for the ALL clock hand to go all the
> way around to guarantee that all pages are written.
>
> Here is pseudo-code for the proposed method. The internal loop is just the
> same as bgwriter's activity.
>
>   PrepareCheckPoint();  --  do step 1
>   Reset num_of_scanned_pages by ALL activity;
>   do {
>       BgBufferSync();   --  do a part of step 2
>       sleep(bgwriter_delay);
>   } while (num_of_scanned_pages < shared_buffers);
>   CreateCheckPoint();   --  do step 3 and 4
Would the background writer be disabled during this extended checkpoint?  How is it better to concentrate step 2 in an
extended checkpoint periodically rather than consistently in the background writer?
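
For reference, the loop in the quoted pseudo-code can be modelled as a small simulation.  This is a minimal sketch, not
the patch itself: spread_checkpoint and its parameters are hypothetical names, the sleep is only accounted for rather
than actually performed, and the model ignores the per-round maxpages caps and pages that get dirtied again while the
checkpoint is in progress.

```python
def spread_checkpoint(shared_buffers, dirty, pages_per_round, delay_ms):
    """Simulate step 2 spread over bgwriter-style rounds.

    shared_buffers  -- number of pages in the buffer pool
    dirty           -- set of page numbers that are dirty when the checkpoint starts
    pages_per_round -- pages scanned by the ALL clock hand in one round
    delay_ms        -- simulated sleep after each round (bgwriter_delay)

    Returns (writes_per_round, elapsed_ms).
    """
    remaining = set(dirty)
    hand = 0             # position of the ALL clock hand
    scanned = 0          # num_of_scanned_pages in the pseudo-code
    writes_per_round = []

    # do { BgBufferSync(); sleep(bgwriter_delay); } while (scanned < shared_buffers)
    while True:
        written = 0
        for _ in range(min(pages_per_round, shared_buffers - scanned)):
            if hand in remaining:
                remaining.discard(hand)  # "write" the dirty page
                written += 1
            hand = (hand + 1) % shared_buffers
            scanned += 1
        writes_per_round.append(written)
        if scanned >= shared_buffers:
            break

    elapsed_ms = len(writes_per_round) * delay_ms
    return writes_per_round, elapsed_ms


if __name__ == "__main__":
    pool = 1000
    dirty_pages = {page for page in range(pool) if page % 4 == 0}  # 250 dirty pages
    writes, elapsed = spread_checkpoint(pool, dirty_pages, pages_per_round=100, delay_ms=200)
    print("rounds:", len(writes), "largest burst:", max(writes), "elapsed ms:", elapsed)
```

In this toy case the 250 dirty pages go out as ten bursts of 25 pages over 2000 ms instead of a single burst of 250,
which is the trade the proposal makes: a longer checkpoint in exchange for smaller write bursts.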
> We could instead make the background writer more aggressive to reduce the work
> left for the checkpoint, but that introduces another performance problem: extra
> pressure is always put on the storage devices to keep the number of dirty pages low.
Doesn't the file system caching logic combined with a battery-backed cache in the controller cover this, or is your
patch to help out those who don't have battery-backed controller cache?  What would the impact of your patch be on
environments like ours?  Will there be any effect on PITR techniques, in terms of how current the copied WAL files would
be?
> I'm working on automatically adjusting the progress of a checkpoint to the
> checkpoint timeout and the WAL segment limit, to avoid overlap of two
> checkpoints.
> I'll post a patch sometime soon.
>
> Comments and suggestions welcome.
>
> Regards,
> ---
> ITAGAKI Takahiro
> NTT Open Source Software Center


