Re: [HACKERS] TODO item

From Tom Lane
Subject Re: [HACKERS] TODO item
Date
Msg-id 21589.949965409@sss.pgh.pa.us
In reply to Re: [HACKERS] TODO item  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: [HACKERS] TODO item  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-hackers
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I can't imagine how fsync could flush _only_ the file descriptor buffers
> modified by the current process.  It would have to affect all buffers
> for the file descriptor.

Yeah, you're probably right.  After thinking about it, I can't believe
that a disk block buffer inside the kernel has any record of which FD
it was written by (after all, it could have been dirtied through more
than one FD since it was last synced to disk).  All it's got is a file
inode number and a block number within the file.  Presumably fsync()
searches the buffer cache for blocks that match the FD's inode number
and schedules I/O for all the ones that are dirty.
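
A minimal sketch of that call pattern, in Python only because its os module wraps the POSIX calls directly: dirty a file through one descriptor, then fsync it through a second descriptor opened separately. The function name is hypothetical. The code cannot prove what the kernel flushes; it only shows that both descriptors resolve to the same device and inode, which is the key the buffer cache is presumed to use.

```python
import os
import tempfile


def write_via_one_fd_sync_via_another(path, payload):
    """Write through one descriptor, fsync through another.

    Returns True if both descriptors refer to the same (device, inode).
    Hypothetical helper, for illustration only.
    """
    writer = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    # Opened read-write because some platforms refuse fsync on a
    # read-only descriptor.
    syncer = os.open(path, os.O_RDWR)
    try:
        os.write(writer, payload)
        w, s = os.fstat(writer), os.fstat(syncer)
        same_file = (w.st_dev, w.st_ino) == (s.st_dev, s.st_ino)
        # The fsync is issued on a descriptor that never wrote anything.
        os.fsync(syncer)
        return same_file
    finally:
        os.close(writer)
        os.close(syncer)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(write_via_one_fd_sync_via_another(os.path.join(d, "rel"), b"page"))
```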

> So, I think we are safe if we can either keep that file descriptor open
> until commit, or re-open it and fsync it on commit.  That assumes a
> re-open is hitting the same file.  My opinion is that we should just
> fsync it on close and not worry about a reopen.

There's still the problem that your backend might never have opened the
relation file at all, still less done a write through its fd or vfd.
I think we would need to have a separate data structure saying "these
relations were dirtied in the current xact" that is not tied to fd's or
vfd's.  Maybe the relcache would be a good place to keep such a flag.
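
A rough sketch of such a structure, keyed by relation name rather than by fd or vfd. The class and method names are hypothetical, and a real version would presumably be a flag in the relcache entry rather than a separate set.

```python
class DirtyRelationTracker:
    """Remembers which relations the current transaction has dirtied.

    Hypothetical illustration: entries are keyed by relation name, so the
    record survives even if this backend never opened the relation's file.
    """

    def __init__(self):
        self._dirtied = set()

    def mark_dirty(self, rel_name):
        # Called whenever a buffer belonging to rel_name is modified.
        self._dirtied.add(rel_name)

    def relations_to_sync(self):
        # Sorted only to give commit a deterministic order.
        return sorted(self._dirtied)

    def end_of_transaction(self):
        # Commit or abort: the next transaction starts with a clean slate.
        self._dirtied.clear()
```

Marking the same relation twice is harmless, since the set keeps one entry per relation.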

Transaction commit would look like:

* scan buffer cache for dirty buffers, fwrite each one that belongs
to one of the relations I'm trying to commit;

* open and fsync each segment of each rel that I'm trying to commit
(or maybe just the dirtied segments, if we want to do the bookkeeping
at that level of detail);

* make pg_log entry;

* write and fsync pg_log.
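
The four steps above can be sketched roughly as follows. This is a toy model under loud assumptions: the buffer cache is a plain dict, each relation is a single file (no segments), the block size is 8192 bytes, and the log is a text file with one line per commit. None of the names are PostgreSQL's.

```python
import os

BLOCK_SIZE = 8192  # assumed block size for this sketch


def commit_transaction(xid, dirtied_rels, buffer_cache, rel_paths, log_path):
    """Toy commit sequence; all names and structures are hypothetical.

    buffer_cache: {(rel, block_no): {"data": bytes, "dirty": bool}}
    rel_paths:    {rel: path of the relation's single file}
    """
    # Step 1: write out dirty buffers of the relations being committed.
    for (rel, block_no), buf in buffer_cache.items():
        if buf["dirty"] and rel in dirtied_rels:
            fd = os.open(rel_paths[rel], os.O_WRONLY | os.O_CREAT, 0o600)
            try:
                os.lseek(fd, block_no * BLOCK_SIZE, os.SEEK_SET)
                os.write(fd, buf["data"])
            finally:
                os.close(fd)
            buf["dirty"] = False

    # Step 2: open and fsync each relation, whether or not this process
    # was the one that wrote its blocks.
    for rel in dirtied_rels:
        fd = os.open(rel_paths[rel], os.O_RDWR | os.O_CREAT, 0o600)
        try:
            os.fsync(fd)
        finally:
            os.close(fd)

    # Steps 3 and 4: make the log entry, then write and fsync the log.
    fd = os.open(log_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, f"{xid} committed\n".encode())
        os.fsync(fd)
    finally:
        os.close(fd)
```

The ordering is the point: the log entry is written only after the data files have been synced, so a crash before the last fsync leaves the transaction uncommitted.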

fsync-on-close is probably a waste of cycles.  The only way that would
matter is if someone else were doing a RENAME TABLE on the rel, thus
preventing you from reopening it.  I think we could just put the
responsibility on the renamer to fsync the file while he's doing it
(in fact I think that's already in there, at least to the extent of
flushing the buffer cache).
        regards, tom lane

