Re: [HACKERS] Priorities for 6.6
From | Bruce Momjian |
---|---|
Subject | Re: [HACKERS] Priorities for 6.6 |
Date | |
Msg-id | 199906070116.VAA17949@candle.pha.pa.us |
In response to | Re: [HACKERS] Priorities for 6.6 (Vadim Mikheev <vadim@krs.ru>) |
Responses |
Re: [HACKERS] Priorities for 6.6
|
List | pgsql-hackers |
> Kaare Rasmussen wrote:
> >
> > > I think we need that, and it should be the default, but few people agree
> > > with me. I have some schemes to do this.
>
> I remember this, Bruce. But I would like to see it implemented
> in right way. I'm not happy with "two sync() in postmaster" idea.
> We have to implement Shared Catalog Cache (SCC), mark all dirtied
> relation files there and than just fsync() these files, before
> fsync() of pg_log.

I see. You want to use the shared catalog cache to flag relations that
have been modified, and fsync those before the fsync of pg_log.

Another idea is to send a signal to each backend that has marked a bit
in shared memory saying it has written to a relation, and have the
signal handler fsync all its dirty relations, set a finished bit, and
have the postmaster then fsync pg_log. The shared catalog cache still
requires the postmaster to open every relation that is marked as dirty
to fsync it, which could be a performance problem. Now, if we could pass
file descriptors between processes, that would make things easy. I
think BSD can do it, but I don't believe it is portable.

My idea would be:

    backend     1 2 3 4 5 6 7
    dirtied:    1 2 3 4 5 6 7
    fsync'ed:

Each backend sets its 'dirtied' bit when it modifies a relation. Every
5 seconds, the postmaster scans the dirtied list and sends a signal to
each backend that has dirtied a relation. Each backend fsyncs its
relations, then sets its fsync'ed bit. When all have signaled fsync'ed,
the postmaster can update pg_log on disk.

Another issue is that now that we update the transaction status as part
of SELECT, pg_log is not the only representation of committed status.

Of course, we have to prevent a flush of pg_log by the OS, perhaps by
making a copy of the last two pages of pg_log before this and removing
it after. If a backend starts up and sees that pg_log copy file, it puts
that in place of the current last two pages of pg_log.

Also, for 6.6, I am going to add system table indexes so all cache
lookups use indexes.

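The dirtied/fsync'ed handshake described above can be sketched in C roughly as follows. This is only an illustration of the bookkeeping, not PostgreSQL code: the `SyncState` struct, the function names, and `MAX_BACKENDS` are all hypothetical, and the signal delivery and the actual fsync() calls are left out. Clearing the dirtied bits at the start of a round is also an assumption; the message does not say when they are reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_BACKENDS 32  /* hypothetical limit: one bit per backend */

/* Hypothetical shared-memory state; real code would need locking or
 * atomic operations, since several processes update these words. */
typedef struct {
    uint32_t dirtied;  /* bit i set: backend i modified a relation */
    uint32_t pending;  /* bit i set: backend i was signaled this round */
    uint32_t fsynced;  /* bit i set: backend i finished its fsyncs */
} SyncState;

/* Backend: called whenever it modifies a relation. */
void backend_mark_dirty(SyncState *s, int backend)
{
    assert(backend >= 0 && backend < MAX_BACKENDS);
    s->dirtied |= (uint32_t)1 << backend;
}

/* Postmaster: start a round (e.g. every 5 seconds). Returns the mask of
 * backends that must be signaled. Dirtied bits are cleared here
 * (assumption), so later writes are picked up by the next round. */
uint32_t postmaster_begin_round(SyncState *s)
{
    uint32_t to_signal = s->dirtied;
    s->pending = to_signal;
    s->fsynced = 0;
    s->dirtied &= ~to_signal;
    return to_signal;
}

/* Backend: called from its signal handler after it has fsync()ed all of
 * its dirty relation files (the fsync itself is not shown). */
void backend_report_fsynced(SyncState *s, int backend)
{
    assert(backend >= 0 && backend < MAX_BACKENDS);
    s->fsynced |= (uint32_t)1 << backend;
}

/* Postmaster: true once every signaled backend has reported back, which
 * is the point at which pg_log may be written and fsynced. */
bool postmaster_can_flush_pg_log(const SyncState *s)
{
    return (s->fsynced & s->pending) == s->pending;
}
```

In this sketch the postmaster would call postmaster_begin_round(), signal each backend in the returned mask, and poll postmaster_can_flush_pg_log() before touching pg_log.
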
I am unsure what a shared catalog cache is going to do that the buffer
cache doesn't already do. Perhaps if we just flushed the system table
cache buffers less frequently, there would be no need for a shared
system cache.

Basically, this fsync() thing is killing performance, and I think we can
come up with a smart solution to this if we discuss the options.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026