Re: Overflow of bgwriter's request queue

From: ITAGAKI Takahiro
Subject: Re: Overflow of bgwriter's request queue
Msg-id: 20060113131936.4E24.ITAGAKI.TAKAHIRO@lab.ntt.co.jp
In reply to: Re: Overflow of bgwriter's request queue  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

I apologize if you receive this message twice. I sent it earlier, but it
does not seem to have been delivered, so I am sending it again.


Tom Lane <tgl@sss.pgh.pa.us> wrote:

> > I encountered overflow of bgwriter's file-fsync request queue.
> I can't help thinking that this is a situation that could only be got
> into with a seriously misconfigured database --- per the comments for
> ForwardFsyncRequest, we really don't want this code to run at all,
> let alone run so often that a queue with NBuffers entries overflows.
> What exactly are the test conditions under which you're seeing this
> happen?

It happened in two environments:

  [1] TPC-C (DBT-2) / RHEL4 U1 (2.6.9-11)
      XFS, 8 S-ATA disks / 8GB memory (shmem=512MB)
  [2] TPC-C (DBT-2) / RHEL4 U2 (2.6.9-22)
      XFS, 6 SCSI disks / 6GB memory (shmem=1GB)

I don't think these are bad configurations. There seems to be a problem with
the combination of XFS and heavy update workloads, but the total throughput
on XFS with my patch was better than on ext3.

I suspect that a queue length of NBuffers is not enough. If all buffers
are dirty, ForwardFsyncRequest could be called more than NBuffers times
during BufferSync, so the queue could become full.


> If there actually is a problem that needs to be solved, I think it'd be
> better to try to do AbsorbFsyncRequests somewhere in the main checkpoint
> loops.  I don't like the idea of holding the BgWriterCommLock long
> enough to do a qsort ... especially not if this occurs only with very
> large NBuffers settings.

OK, I agree. I have sent a patch to the PATCHES list that calls
AbsorbFsyncRequests inside the loops of BufferSync and mdsync.
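
To illustrate the idea, here is a rough toy model in C. It is not the
patch itself and not PostgreSQL code: the names forward_request,
absorb_requests and simulate_checkpoint, and the constants QUEUE_SIZE
and ABSORB_INTERVAL, are hypothetical and chosen only for this sketch.
It counts how many requests find the queue full during a write loop,
with and without draining the queue inside the loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all names and sizes here are hypothetical. */
#define QUEUE_SIZE      8   /* stands in for NBuffers */
#define ABSORB_INTERVAL 4   /* drain every 4 buffers written (assumed) */

typedef struct
{
    int file_ids[QUEUE_SIZE];
    int count;
} RequestQueue;

/* Backend side: returns false when the queue is already full. */
static bool
forward_request(RequestQueue *q, int file_id)
{
    assert(q != NULL);
    if (q->count >= QUEUE_SIZE)
        return false;
    q->file_ids[q->count++] = file_id;
    return true;
}

/* Checkpoint side: empty the queue and report how many entries it held. */
static int
absorb_requests(RequestQueue *q)
{
    int absorbed = q->count;

    q->count = 0;
    return absorbed;
}

/*
 * Simulate a write loop over nbuffers buffers while backends forward
 * requests_per_write requests per buffer written.  Returns the number
 * of requests that were rejected because the queue was full.
 */
int
simulate_checkpoint(int nbuffers, int requests_per_write, bool absorb_in_loop)
{
    RequestQueue q = { {0}, 0 };
    int overflows = 0;

    for (int buf = 0; buf < nbuffers; buf++)
    {
        for (int r = 0; r < requests_per_write; r++)
        {
            if (!forward_request(&q, buf % 3))
                overflows++;
        }

        /* The proposed change: drain the queue periodically in the loop. */
        if (absorb_in_loop && (buf + 1) % ABSORB_INTERVAL == 0)
            absorb_requests(&q);
    }
    return overflows;
}
```

In this model, simulate_checkpoint(16, 2, false) rejects 24 of the 32
requests, while simulate_checkpoint(16, 2, true) rejects none. How often
the real loops would need to absorb depends on the workload; the
interval above is only a placeholder.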


> Also, what if the qsort fails to eliminate any
> duplicates, or eliminates only a few?  You could get into a scenario
> where the qsort gets repeated every few ForwardFsyncRequest calls, in
> which case it'd become a drag on performance itself.

I now think the above solution is better than qsort, but qsort would not
work badly either. NBuffers is at least one thousand, while the number of
files that need fsync is at most in the hundreds, so duplicate elimination
should work well. In fact, on my machine the queue became full twice during
one checkpoint, and duplicate elimination reduced the queue length from
65536 to *32*.
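
For reference, the duplicate elimination can be sketched roughly as
follows. This is a simplified illustration, not the code from my earlier
patch: the FsyncRequest struct with rel_id and segno fields and the
function compact_requests are hypothetical names used only here. The
queue is sorted, then adjacent equal entries are collapsed in place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical queue entry: one fsync request per (relation, segment). */
typedef struct
{
    unsigned int rel_id;
    unsigned int segno;
} FsyncRequest;

/* qsort comparator: order by rel_id, then by segno. */
static int
compare_requests(const void *a, const void *b)
{
    const FsyncRequest *ra = a;
    const FsyncRequest *rb = b;

    if (ra->rel_id != rb->rel_id)
        return (ra->rel_id < rb->rel_id) ? -1 : 1;
    if (ra->segno != rb->segno)
        return (ra->segno < rb->segno) ? -1 : 1;
    return 0;
}

/*
 * Sort the queue in place and drop duplicate entries.
 * Returns the number of distinct entries left at the front of the array.
 */
size_t
compact_requests(FsyncRequest *queue, size_t n)
{
    size_t kept;

    assert(queue != NULL || n == 0);
    if (n == 0)
        return 0;

    qsort(queue, n, sizeof(FsyncRequest), compare_requests);

    /* After sorting, duplicates are adjacent; keep the first of each run. */
    kept = 1;
    for (size_t i = 1; i < n; i++)
    {
        if (compare_requests(&queue[kept - 1], &queue[i]) != 0)
            queue[kept++] = queue[i];
    }
    return kept;
}
```

The sort costs O(n log n) on every overflow, so it only pays off when
many entries refer to the same few files, as in my test. If few
duplicates are removed, the queue fills again almost immediately, which
is the case Tom is worried about.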

---
ITAGAKI Takahiro
NTT Cyber Space Laboratories



