Re: simplify register_dirty_segment()
From | Qingqing Zhou |
---|---|
Subject | Re: simplify register_dirty_segment() |
Date | |
Msg-id | d4kcjl$1f8n$1@news.hub.org |
In reply to | simplify register_dirty_segment() ("Qingqing Zhou" <zhouqq@cs.toronto.edu>) |
Responses | Re: simplify register_dirty_segment() |
List | pgsql-hackers |
"Tom Lane" <tgl@sss.pgh.pa.us> writes
> On platforms that I'm familiar with, an fsync call causes the kernel
> to spend a significant amount of time groveling through its buffers
> to see if any are dirty. We shouldn't incur that cost to buy marginal
> speedups at the application level. (In other words, "it's only an
> open/close" is wrong.)
>

I did some tests on SunOS, Linux and Windows. Basically, I create 100 files and close them. Then I reopen them, write (dirty) or read (clean) 8192*100 bytes each, and fsync() them. I measured the fsync() time.

SunOS 5.8 + NFS + SCSI
    Fsync dirty files: duration: 2404.573 ms
    Fsync clean files: duration: 598.037 ms

Linux 2.4 + Ext3 + IDE
    Fsync dirty files: duration: 6951.793 ms
    Fsync clean files: duration: 18.132 ms

Windows 2000 + NTFS + IDE
    Fsync dirty files: duration: 3005.000 ms
    Fsync clean files: duration: 1101.000 ms

I can't figure out why it took so long on Windows and SunOS for clean files. A possible reason is that they have to fsync some inode information, like the last access time, even for clean files. Linux is quite smart in this sense.

> Also, it's not clear to me how this idea works at all, if a backend holds
> a relation open across more than one checkpoint. What will re-register
> the segment for the next cycle?
>

You are right. A possible (but not clean) solution is this: the bgwriter maintains a refcount for each file. When the file is opened, refcount++; when the file is closed, refcount--. When the refcount goes to zero, the bgwriter could safely remove the file from its PendingOpsTable after the checkpoint.

Regards,
Qingqing
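
For anyone who wants to repeat the measurement, the test procedure described in the message might be sketched roughly as below. This is not the original test program, which was not posted. The function name `time_fsync`, the helper `scratch_path`, the file naming scheme, and the `MAX_FILES` cap are all made up for illustration. It also assumes that the files are filled and synced when first created, so that the "clean" case really starts with nothing dirty, and that a POSIX system with `clock_gettime` is available.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define MAX_FILES 1024          /* arbitrary cap for this sketch */

/* Hypothetical helper: path of the i-th scratch file under dir. */
static void scratch_path(char *out, size_t len, const char *dir, int i)
{
    snprintf(out, len, "%s/fsync_test_%d.dat", dir, i);
}

/*
 * Create nfiles files of nbytes each and close them; reopen them; rewrite
 * (dirty != 0) or read back (dirty == 0) each file; then time only the
 * fsync() calls. Returns elapsed milliseconds, or -1.0 on any error.
 */
double time_fsync(const char *dir, int nfiles, size_t nbytes, int dirty)
{
    int fds[MAX_FILES];
    char path[4096];
    struct timespec start, stop;
    double elapsed_ms = -1.0;
    char *buf;
    int i, ok = 1;

    if (nfiles < 1 || nfiles > MAX_FILES)
        return -1.0;
    buf = malloc(nbytes);
    if (buf == NULL)
        return -1.0;
    memset(buf, 'x', nbytes);
    for (i = 0; i < nfiles; i++)
        fds[i] = -1;

    /* Step 1: create, fill, sync, and close (assumption: start clean). */
    for (i = 0; i < nfiles && ok; i++) {
        scratch_path(path, sizeof path, dir, i);
        int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);
        if (fd < 0) {
            ok = 0;
            break;
        }
        if (write(fd, buf, nbytes) != (ssize_t) nbytes || fsync(fd) != 0)
            ok = 0;
        close(fd);
    }

    /* Step 2: reopen, then dirty the file or just read it. */
    for (i = 0; i < nfiles && ok; i++) {
        scratch_path(path, sizeof path, dir, i);
        fds[i] = open(path, O_RDWR);
        if (fds[i] < 0) {
            ok = 0;
            break;
        }
        ssize_t n = dirty ? write(fds[i], buf, nbytes)
                          : read(fds[i], buf, nbytes);
        if (n != (ssize_t) nbytes)
            ok = 0;
    }

    /* Step 3: time the fsync() calls only. */
    if (ok) {
        clock_gettime(CLOCK_MONOTONIC, &start);
        for (i = 0; i < nfiles; i++)
            if (fsync(fds[i]) != 0)
                ok = 0;
        clock_gettime(CLOCK_MONOTONIC, &stop);
        elapsed_ms = (stop.tv_sec - start.tv_sec) * 1000.0
                   + (stop.tv_nsec - start.tv_nsec) / 1e6;
    }

    /* Step 4: close and remove the scratch files. */
    for (i = 0; i < nfiles; i++) {
        if (fds[i] >= 0)
            close(fds[i]);
        scratch_path(path, sizeof path, dir, i);
        unlink(path);
    }
    free(buf);
    return ok ? elapsed_ms : -1.0;
}
```

Calling `time_fsync(dir, 100, 8192 * 100, 1)` and then `time_fsync(dir, 100, 8192 * 100, 0)` on a scratch directory would correspond to the "dirty" and "clean" cases above. The numbers will depend heavily on the filesystem, kernel, and disk, so they should not be expected to match the figures quoted in the message.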