Re: Load distributed checkpoint

From Greg Smith
Subject Re: Load distributed checkpoint
Date
Msg-id Pine.GSO.4.64.0612072205250.24653@westnet.com
In reply to Re: Load distributed checkpoint  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-hackers
On Thu, 7 Dec 2006, Kevin Grittner wrote:

> Between the xfs caching and the batter backed cache in the RAID...

Mmmmm, battered cache.  You can deep fry anything nowadays.

> Would the background writer be disabled during this extended checkpoint?

The background writer is the same process that does the full buffer sweep 
at checkpoint time.  You wouldn't have to disable it because it would be 
busy doing this extended checkpoint instead of its normal job.

> How is it better to concentrate step 2 in an extended checkpoint 
> periodically rather than consistently in the background writer?

Right now, when the checkpoint flush is occurring, there is no background 
writer active--that process is handling the checkpoint.  Itagaki's 
suggestion is basically to take the current checkpoint code, which runs 
all in one burst, and spread it out over time.  I like the concept, as 
I've seen the behavior he's describing (even after tuning the background 
writer like you suggest and doing Linux disk tuning as Ron describes), but 
I think solving the problem is a little harder than suggested.
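
To make the burst-versus-spread distinction concrete, here is a minimal 
sketch in Python rather than PostgreSQL's C.  The function name 
spread_schedule, its parameters, and the numbers are all hypothetical; 
nothing here comes from the PostgreSQL source or from Itagaki's patch.  It 
only models "write a batch, pause, repeat".

```python
def spread_schedule(dirty_pages, batch_size, delay_s):
    """Return a list of (start_time_s, pages_written), one entry per batch.

    Hypothetical model: write `batch_size` pages, pause `delay_s`, repeat.
    A single batch covering everything models the all-in-one-burst sweep.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    schedule = []
    t = 0.0
    remaining = dirty_pages
    while remaining > 0:
        n = min(batch_size, remaining)
        schedule.append((t, n))
        remaining -= n
        t += delay_s
    return schedule


# One burst: every dirty page goes out at time zero.
burst = spread_schedule(1000, 1000, 0.0)

# Spread out: ten batches of 100 pages, one every 0.2 seconds.
spread = spread_schedule(1000, 100, 0.2)
```

Both schedules write the same number of pages; the only thing that changes 
is how long the sweep stays in flight.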

I have two concerns with the logic behind this approach.  The first is 
that if the background writer isn't keeping up with writing out all the 
dirty pages, what makes you think that running the checkpoint with a 
similar level of activity is going to?  If your checkpoint is taking a 
long time, it's because the background writer has an overwhelming load and 
needs to be bailed out.  Slowing down the writes with a lazier checkpoint 
process introduces the possibility that you'll hit a second checkpoint 
request before you're even finished cleaning up the first one, and then 
you're really in trouble.
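
The first concern comes down to simple arithmetic, sketched below.  The 
function checkpoint_overruns and all of the numbers are made up for 
illustration, and the model assumes a fixed throttled write rate and a 
fixed gap between checkpoint requests, which a real system will not have.

```python
def checkpoint_overruns(dirty_pages, pages_per_second, checkpoint_interval_s):
    """Return (duration_s, overruns) for a throttled checkpoint.

    Hypothetical model: the checkpoint must write `dirty_pages` pages at a
    constant `pages_per_second`, and the next checkpoint is requested
    `checkpoint_interval_s` seconds after this one starts.
    """
    if pages_per_second <= 0:
        raise ValueError("pages_per_second must be positive")
    duration_s = dirty_pages / pages_per_second
    return duration_s, duration_s > checkpoint_interval_s


# Throttled to 250 pages/s, 100,000 dirty pages need 400 s, which is longer
# than an assumed 300 s between checkpoint requests.
slow = checkpoint_overruns(100_000, 250, 300)   # (400.0, True)

# At 1000 pages/s the same work takes 100 s and finishes in time.
fast = checkpoint_overruns(100_000, 1000, 300)  # (100.0, False)
```

The lazier the write rate, the closer the checkpoint gets to colliding with 
the next request.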

Second, the assumption here is that it's writing the dirty buffers out 
that is the primary cause of the ugly slowdown.  I too believe it could 
just as easily be the fsync when it's done that's killing you, and slowing 
down the writes isn't necessarily going to make that faster.
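
One rough way to see which part hurts is to time the buffered writes and 
the final fsync separately.  The sketch below does that in Python on a 
temporary file; the function name time_write_and_fsync and the sizes are 
assumptions for illustration, and a single scratch file is only a crude 
stand-in for a real checkpoint touching many relation files.

```python
import os
import tempfile
import time


def time_write_and_fsync(total_bytes=8 * 1024 * 1024, block_size=8192):
    """Return (write_seconds, fsync_seconds) for one temporary file.

    The writes go into the OS cache; the fsync forces them to stable
    storage.  8192 bytes matches PostgreSQL's default block size.
    """
    block = b"\0" * block_size
    fd, path = tempfile.mkstemp()
    try:
        t0 = time.perf_counter()
        written = 0
        while written < total_bytes:
            written += os.write(fd, block)
        t1 = time.perf_counter()
        os.fsync(fd)  # block until the kernel has flushed the file
        t2 = time.perf_counter()
    finally:
        os.close(fd)
        os.remove(path)
    return t1 - t0, t2 - t1


write_s, fsync_s = time_write_and_fsync()
print(f"write: {write_s:.4f}s  fsync: {fsync_s:.4f}s")
```

The results depend heavily on the filesystem, the kernel's dirty-page 
settings, and whether the controller has a write cache, so treat any one 
run as a hint rather than a measurement.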

> Doesn't the file system caching logic combined with a battery backed 
> cache in the controller cover this, or is your patch to help out those 
> who don't have battery backed controller cache?

Unless your shared buffer pool is so small that you can write it all out 
onto the cache, that won't help much with this problem.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

