Re: [HACKERS] Re: [QUESTIONS] Does Storage Manager support >2GB tables?

From Bruce Momjian
Subject Re: [HACKERS] Re: [QUESTIONS] Does Storage Manager support >2GB tables?
Date
Msg-id 199803122113.QAA06519@candle.pha.pa.us
In reply to Re: [HACKERS] Re: [QUESTIONS] Does Storage Manager support >2GB tables?  (dg@illustra.com (David Gould))
Responses Re: [HACKERS] Re: [QUESTIONS] Does Storage Manager support >2GB tables?  (dg@illustra.com (David Gould))
List pgsql-hackers
> At least on the systems I am intimately familiar with, the prefetch that the
> OS does (assuming a modern OS like Linux) is pretty hard to beat. If you have
> a table that was bulk loaded in key order, a sequential scan is going to
> result in a sequential access pattern to the underlying file and the OS
> prefetch does the right thing. If you have an unindexed table with rows
> inserted at the end, the OS prefetch still works. If you are using a secondary
> index on some sort of chopped up table with rows inserted willy-nilly,
> then it may be worth doing async reads in a burst and letting the disk
> request sort make the best of it.
>
> As far as I am aware, Postgres does not do async I/O. Perhaps it should.

I am adding this to the TODO list:

    * Do async I/O to do better read-ahead of data

Because we are not threaded, we really can't do anything else while we
are waiting for I/O, but we can pre-request data we know we will need.
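
For illustration only, here is one way a single-threaded process might
pre-request pages on a present-day POSIX system, using posix_fadvise()
with POSIX_FADV_WILLNEED. This is a sketch under assumptions, not a
design for the TODO item: prefetch_blocks(), read_block(), and the
8192-byte BLCKSZ are hypothetical names and values chosen for the
example, and posix_fadvise() is not available on every platform.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define BLCKSZ 8192   /* assumed page size for this sketch */

/*
 * Hypothetical helper: hint to the kernel that `nblocks` pages starting
 * at block `first` will be needed soon. The call does not wait for the
 * data. Returns 0 on success or an errno value, as posix_fadvise() does.
 */
int prefetch_blocks(int fd, long first, long nblocks)
{
    off_t offset = (off_t) first * BLCKSZ;
    off_t length = (off_t) nblocks * BLCKSZ;

    return posix_fadvise(fd, offset, length, POSIX_FADV_WILLNEED);
}

/*
 * Hypothetical helper: read one block into buf. If the block was
 * prefetched, the data may already be in the OS cache by now.
 */
ssize_t read_block(int fd, long blkno, char *buf)
{
    return pread(fd, buf, BLCKSZ, (off_t) blkno * BLCKSZ);
}
```

The idea is to call prefetch_blocks() as soon as the block numbers are
known (for example, after fetching a batch of index entries), do other
work, and only then call read_block() for each block.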

>
> > Also nice so you can control what gets written to disk/fsync'ed and what doesn't
> > get fsync'ed.
>
> This is really the big win.

Yep, and this is what we are trying to work around in our buffered
pg_log change.  Because we have the transaction ids all compact in one
place, this seems like a workable solution to our lack of write-to-disk
control.  We just control the pg_log writes.
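
A rough sketch of that sequence (copy the active pages, sync(), write
the saved copy to pg_log, fsync()) might look like the following. It is
only an illustration under assumptions: the LogPages struct, the
function names, the page size, and the page count are all hypothetical,
and locking of the shared pages is left out entirely.

```c
#define _XOPEN_SOURCE 700
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define LOG_PAGE_SIZE    8192   /* assumed page size */
#define LOG_ACTIVE_PAGES 4      /* assumed number of active pg_log pages */

/* Hypothetical stand-in for the active pg_log pages in shared memory. */
typedef struct {
    long first_page;            /* file page number of pages[0] */
    char pages[LOG_ACTIVE_PAGES][LOG_PAGE_SIZE];
} LogPages;

/* Step 1: take a private memory copy of the current active pages. */
void snapshot_log_pages(const LogPages *active, LogPages *saved)
{
    memcpy(saved, active, sizeof(*saved));
}

/* Steps 3 and 4: write the saved pages to the log file, then fsync() it.
 * Returns 0 on success, -1 on failure. */
int write_log_snapshot(int fd, const LogPages *saved)
{
    off_t offset = (off_t) saved->first_page * LOG_PAGE_SIZE;
    const char *p = (const char *) saved->pages;
    size_t total = sizeof(saved->pages);
    size_t done = 0;

    while (done < total) {
        ssize_t n = pwrite(fd, p + done, total - done, offset + (off_t) done);
        if (n <= 0)
            return -1;          /* treat errors and zero-length writes as failure */
        done += (size_t) n;
    }
    return fsync(fd);
}

/* The whole cycle, meant to run every 30-60 seconds. */
int flush_pg_log(int fd, const LogPages *active, LogPages *saved)
{
    snapshot_log_pages(active, saved);      /* step 1: copy active pages */
    sync();                                 /* step 2: flush data pages first */
    return write_log_snapshot(fd, saved);   /* steps 3-4: then record commits */
}
```

The ordering is the point: the commit status in pg_log reaches disk only
after the sync() that covers the data changed by those transactions.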

>
> > Our idea is to control when pg_log gets written to disk.  We keep active
> > pg_log pages in shared memory, and every 30-60 seconds, we make a memory
> > copy of the current pg_log active pages, do a system sync() (which
> > happens anyway at that interval), update the pg_log file with the saved
> > changes, and fsync() the pg_log pages to disk.  That way, after a crash,
> > the current database only shows transactions as committed where we are
> > sure all the data has made it to disk.
>
> OK as far as it goes, but probably bad for concurrency if I have understood
> you.

Interested in hearing your comments.


--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)
