Discussion: disk caching for writing log

disk caching for writing log

From
flyusa2010 fly
Date:
When writing log, dbms should synchronously flush log to disk. I'm wondering, if it is possible that the logs are in disk cache, while the control is returned to dbms again, so dbms thinks logs are persistent on disk. In this case, if the disk fails, then there's incorrectness for dbms log writing, because the log is not persistent, but dbms considers it is persistent!

Am I correct?

Re: disk caching for writing log

From
Heikki Linnakangas
Date:
On 03.12.2010 13:49, flyusa2010 fly wrote:
> When writing log, dbms should synchronously flush log to disk. I'm
> wondering, if it is possible that the logs are in disk cache, while the
> control is returned to dbms again, so dbms thinks logs are persistent on
> disk. In this case, if the disk fails, then there's incorrectness for dbms
> log writing, because the log is not persistent, but dbms considers it is
> persistent!

I have no idea what you mean. The method we use to flush the WAL to disk 
should not be fallible to such failures, we wait for fsync() or 
fdatasync() to return before we assume the logs are safely on disk. If 
you can elaborate what you mean by "control is returned to dbms", maybe 
someone can explain why in more detail.
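
As an illustration of the "wait for fsync() to return" rule described above, here is a minimal sketch in Python. It is not PostgreSQL's WAL code (which is written in C); the function name `append_durably` and the file name `wal.log` are made up for this example. It only shows the ordering: write the record, call fsync(), and treat the record as persistent only after fsync() has returned.

```python
import os
import tempfile


def append_durably(path: str, record: bytes) -> None:
    """Append `record` to `path`; return only after fsync() has returned."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        view = memoryview(record)
        while len(view) > 0:
            # os.write() may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Block until the OS reports that the file's data has been flushed.
        # Only after this call returns may the caller treat the record as
        # persistent. os.fdatasync(fd) is a Unix-only alternative that skips
        # flushing metadata not needed to read the data back.
        os.fsync(fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "wal.log")  # hypothetical file name
        append_durably(log_path, b"commit 1\n")
        append_durably(log_path, b"commit 2\n")
        with open(log_path, "rb") as f:
            print(f.read())
```

Note that this guarantee is only as good as the layers below it. If the drive acknowledges the flush while the data is still in its volatile cache, fsync() returns and the caller cannot tell the difference.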

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


Re: disk caching for writing log

From
Stefan Kaltenbrunner
Date:
On 12/03/2010 06:43 PM, Heikki Linnakangas wrote:
> On 03.12.2010 13:49, flyusa2010 fly wrote:
>> When writing log, dbms should synchronously flush log to disk. I'm
>> wondering, if it is possible that the logs are in disk cache, while the
>> control is returned to dbms again, so dbms thinks logs are persistent on
>> disk. In this case, if the disk fails, then there's incorrectness for
>> dbms
>> log writing, because the log is not persistent, but dbms considers it is
>> persistent!
>
> I have no idea what you mean. The method we use to flush the WAL to disk
> should not be fallible to such failures, we wait for fsync() or
> fdatasync() to return before we assume the logs are safely on disk. If
> you can elaborate what you mean by "control is returned to dbms", maybe
> someone can explain why in more detail.

I think he is referring to the plain old "the disk/OS is lying about
whether the data really made it to stable storage" issue (especially with
the huge local caches on modern disks) - if you have such a disk and/or
an OS with broken barrier support you are doomed.
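
One rough way to spot a lying disk, sketched below in Python, is to time how many small write-plus-fsync cycles per second a device sustains. This is a heuristic and an assumption of this example, not something proposed in this thread. A single rotating disk that really waits for the platter can complete at most about one such flush per revolution (roughly 120 per second at 7200 RPM), so a much higher rate on such a disk suggests a volatile write cache is acknowledging flushes early. SSDs and controllers with battery-backed cache can legitimately exceed that bound, so a high number proves nothing there. The function names are hypothetical.

```python
import os
import tempfile
import time


def fsyncs_per_second(directory: str, iterations: int = 200) -> float:
    """Time `iterations` small write+fsync cycles on a scratch file."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(iterations):
            os.write(fd, b"x" * 512)
            os.fsync(fd)  # wait until the OS reports the data as flushed
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return iterations / elapsed if elapsed > 0 else float("inf")


def max_rotational_rate(rpm: int) -> float:
    """Rough ceiling for one-at-a-time flushes: one per disk revolution."""
    return rpm / 60.0


if __name__ == "__main__":
    measured = fsyncs_per_second(tempfile.gettempdir())
    ceiling = max_rotational_rate(7200)  # assumes a 7200 RPM drive
    print(f"measured {measured:.0f} fsyncs/s, rotational ceiling ~{ceiling:.0f}/s")
    if measured > 2 * ceiling:
        print("far above the ceiling: some cache is likely acknowledging flushes")
```

Run it against a directory on the same device that holds the log; the temporary directory used here is only a placeholder.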


Stefan


Re: disk caching for writing log

From
flyusa2010 fly
Date:
Thanks for your reply.
Yes, I mean the disk may lie to the OS.


On Fri, Dec 3, 2010 at 12:14 PM, Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:
> On 12/03/2010 06:43 PM, Heikki Linnakangas wrote:
>> On 03.12.2010 13:49, flyusa2010 fly wrote:
>>> When writing log, dbms should synchronously flush log to disk. I'm
>>> wondering, if it is possible that the logs are in disk cache, while the
>>> control is returned to dbms again, so dbms thinks logs are persistent on
>>> disk. In this case, if the disk fails, then there's incorrectness for dbms
>>> log writing, because the log is not persistent, but dbms considers it is
>>> persistent!
>>
>> I have no idea what you mean. The method we use to flush the WAL to disk
>> should not be fallible to such failures, we wait for fsync() or
>> fdatasync() to return before we assume the logs are safely on disk. If
>> you can elaborate what you mean by "control is returned to dbms", maybe
>> someone can explain why in more detail.
>
> I think he is referring to the plain old "the disk/OS is lying about
> whether the data really made it to stable storage" issue (especially with
> the huge local caches on modern disks) - if you have such a disk and/or
> an OS with broken barrier support you are doomed.
>
>
> Stefan

Re: disk caching for writing log

From
Bruce Momjian
Date:
flyusa2010 fly wrote:
> Thanks for your reply.
> Yes, I mean the disk may lie to the OS.

Our documentation covers this extensively:
http://www.postgresql.org/docs/9.0/static/wal-reliability.html

---------------------------------------------------------------------------


> 
> 
> On Fri, Dec 3, 2010 at 12:14 PM, Stefan Kaltenbrunner
> <stefan@kaltenbrunner.cc> wrote:
> 
> > On 12/03/2010 06:43 PM, Heikki Linnakangas wrote:
> >
> >> On 03.12.2010 13:49, flyusa2010 fly wrote:
> >>
> >>> When writing log, dbms should synchronously flush log to disk. I'm
> >>> wondering, if it is possible that the logs are in disk cache, while the
> >>> control is returned to dbms again, so dbms thinks logs are persistent on
> >>> disk. In this case, if the disk fails, then there's incorrectness for
> >>> dbms
> >>> log writing, because the log is not persistent, but dbms considers it is
> >>> persistent!
> >>>
> >>
> >> I have no idea what you mean. The method we use to flush the WAL to disk
> >> should not be fallible to such failures, we wait for fsync() or
> >> fdatasync() to return before we assume the logs are safely on disk. If
> >> you can elaborate what you mean by "control is returned to dbms", maybe
> >> someone can explain why in more detail.
> >>
> >
> > I think he is referring to the plain old "the disk/OS is lying about whether
> > the data really made it to stable storage" issue (especially with the huge
> > local caches on modern disks) - if you have such a disk and/or an OS with
> > broken barrier support you are doomed.
> >
> >
> > Stefan
> >

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +