Discussion: journaled FS and WAL
Hi,
two questions related to the WAL.

1) I read in the docs that a journaled FS is not important because the WAL is itself a journal. But who guarantees that the WAL is written correctly? I know that it is sequential and that a partial WAL update can be discarded after a restart. But can I be sure that, without a journaled FS, a crash during a WAL update cannot corrupt anything already written to the WAL before my commit?

2) Suppose I have one database with one table of 100,000 rows, each 256 bytes. In a single SQL commit I update row 10, row 30,000 and row 80,000. By how much should I expect the WAL to grow (assuming no WAL segments are deleted)? I would guess 8192 x 3, but I'm not sure.

Regards
Pupillo
t.dalpozzo@gmail.com wrote:
> two questions related to the WAL.
>
> 1) I read in the docs that a journaled FS is not important because the WAL
> is itself a journal. But who guarantees that the WAL is written correctly?
> I know that it is sequential and that a partial WAL update can be discarded
> after a restart. But can I be sure that, without a journaled FS, a crash
> during a WAL update cannot corrupt anything already written to the WAL
> before my commit?

At commit time, the WAL is synchronized: PostgreSQL instructs the operating system to write the data to the physical medium (not just to a memory cache) and only reports success if that write succeeded. After a successful commit, the WAL file and its metadata are on disk.

Moreover, the file metadata won't change (except for the write and access timestamps), because WAL files are created at their full size and never extended, so no WAL file should ever get "lost" because of partial metadata writes.

> 2) Suppose I have one database with one table of 100,000 rows, each 256
> bytes. In a single SQL commit I update row 10, row 30,000 and row 80,000.
> By how much should I expect the WAL to grow (assuming no WAL segments are
> deleted)? I would guess 8192 x 3, but I'm not sure.

It will be roughly that much immediately after a checkpoint, but for subsequent writes to the same disk blocks, only the actually changed parts of each data block are written to the WAL.

Yours,
Laurenz Albe
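The full-page-write behaviour described above can be put into a rough back-of-the-envelope estimate. This is only a sketch, not PostgreSQL's actual WAL accounting; the per-record overhead is an assumed illustrative figure.

```python
# Rough estimate of WAL volume for the 3-row update in the question.
# Assumptions (illustrative, not exact PostgreSQL internals):
#   - 8 kB data pages (the default block size)
#   - right after a checkpoint, each touched page is logged in full
#     (full_page_writes = on), plus a small per-record overhead
#   - later updates to the same pages log only the changed tuple data

BLOCK_SIZE = 8192          # default PostgreSQL page size
RECORD_OVERHEAD = 50       # assumed per-record header/bookkeeping bytes
ROW_SIZE = 256

def wal_estimate(pages_touched, first_after_checkpoint):
    """Estimate WAL bytes for updating one row on each of `pages_touched` pages."""
    if first_after_checkpoint:
        # one full page image per touched page
        return pages_touched * (BLOCK_SIZE + RECORD_OVERHEAD)
    # only the modified tuple data plus overhead
    return pages_touched * (ROW_SIZE + RECORD_OVERHEAD)

first = wal_estimate(3, True)    # first commit right after a checkpoint
later = wal_estimate(3, False)   # same pages updated again before the next checkpoint
print(first, later)
```

With these assumptions, the first commit after a checkpoint costs a little over 3 x 8 kB, matching the "8192 x 3" guess, while repeating the same commit before the next checkpoint costs well under 1 kB.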
On Fri, Oct 14, 2016 at 11:27 PM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
> After a successful commit, the WAL file and its metadata are on disk.
> Moreover, the file metadata won't change (except for the write and access
> timestamps), because WAL files are created at their full size and never
> extended, so no WAL file should ever get "lost" because of partial metadata
> writes.

This behavior also depends on the value of wal_sync_method. For example, with fdatasync the metadata is not flushed. That does not matter for WAL segments, as Albe has already mentioned, but the choice here does impact performance.
--
Michael
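The fsync/fdatasync distinction Michael mentions can be illustrated at the system-call level. This is a minimal sketch of the general POSIX behaviour, not PostgreSQL code; on platforms without `os.fdatasync` (e.g. Windows) it falls back to `os.fsync`, and the file path is a throwaway temporary file.

```python
import os
import tempfile

# fsync() flushes file data *and* all metadata (size, timestamps) to disk.
# fdatasync() flushes the data but skips metadata not needed to retrieve it,
# such as mtime -- cheaper, and safe for WAL segments because they are
# preallocated at full size and never grow.
fdatasync = getattr(os, "fdatasync", os.fsync)  # portable fallback

def durable_append(path, payload):
    """Append payload and make sure the bytes themselves are on disk."""
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    try:
        os.write(fd, payload)
        fdatasync(fd)  # data is durable; mtime may lag behind
    finally:
        os.close(fd)

path = os.path.join(tempfile.mkdtemp(), "segment")
durable_append(path, b"commit record\n")
print(os.path.getsize(path))
```

This is why fdatasync is a reasonable wal_sync_method: for fixed-size, preallocated segments the skipped metadata is not needed to read the data back after a crash.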
So, as for the data content of the WAL file, I see that no new pages will be allocated. I still wonder whether, during a crash, strange things can happen at the disk level, in particular on SSD devices; we have no control over those, and perhaps journaling helps there?

As for the metadata: if it is flushed during a crash (with fdatasync, only when the FS decides to do so), can anything bad happen without journaling?

Third, let's suppose the WAL cannot get corrupted. When the system flushes data pages to disk according to the WAL content and there is a crash, can I be sure that the table files' old pages and/or their metadata, inodes, etc. cannot get corrupted? If they can, there is no way to reconstruct things, even from the WAL. In this case too, perhaps journaling helps.

I don't mind about performance, but I absolutely mind about reliability, so I was wondering about the safest Linux FS and PostgreSQL settings I can use.

Thanks!
Pupillo

On 15/10/2016 07:52, Michael Paquier wrote:
> On Fri, Oct 14, 2016 at 11:27 PM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
>> After a successful commit, the WAL file and its metadata are on disk.
>> Moreover, the file metadata won't change (except for the write and access
>> timestamps), because WAL files are created at their full size and never
>> extended, so no WAL file should ever get "lost" because of partial metadata
>> writes.
> This behavior also depends on the value of wal_sync_method. For
> example, with fdatasync the metadata is not flushed. That does not matter
> for WAL segments, as Albe has already mentioned, but the choice
> here does impact performance.
t.dalpozzo@gmail.com wrote:
> I don't mind about performance, but I absolutely mind about reliability,
> so I was wondering about the safest Linux FS and PostgreSQL settings I
> can use.

Sure, use journaling then. I do it all the time.

Yours,
Laurenz Albe
-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of t.dalpozzo@gmail.com
Sent: Wednesday, October 19, 2016 11:01 AM
To: Michael Paquier <michael.paquier@gmail.com>
Cc: Albe Laurenz <laurenz.albe@wien.gv.at>; pgsql-general@postgresql.org
Subject: Re: [GENERAL] journaled FS and WAL

Hi!
PG can lose segments of its data files and nobody will know it. For PG, no file = no data and no need to recover after a crash; there is no information about which data files belong to PG. After this, don't bother about the WAL and anything else =)

Just use an FS with a journal, checksum your DB with initdb -k, keep fsync=on, do regular backups and check them thoroughly with restores. Also don't forget to praise the gods that so far the PG clog files have not been corrupted, since they are not protected by any checksums. You will never know that the PG clog is corrupted until "doomsday".

--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
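Alex's point about initdb -k (page checksums) can be sketched conceptually. This is not PostgreSQL's actual checksum algorithm (which is a custom FNV-based hash stored in each 8 kB page header); CRC32 stands in for it here just to show how a per-page checksum catches silent corruption that the filesystem would pass through unnoticed.

```python
import zlib

PAGE_SIZE = 8192  # PostgreSQL's default block size

def checksum_page(page: bytes) -> int:
    # CRC32 stands in for PostgreSQL's real page-checksum algorithm
    return zlib.crc32(page) & 0xFFFFFFFF

# Write-time: store the checksum alongside the page.
page = bytes(range(256)) * 32          # 8192 bytes of sample data
assert len(page) == PAGE_SIZE
stored_sum = checksum_page(page)

# Read-time: a single flipped bit (e.g. silent SSD corruption) is detected,
# because the recomputed checksum no longer matches the stored one.
corrupted = bytearray(page)
corrupted[100] ^= 0x01
assert checksum_page(bytes(corrupted)) != stored_sum
print("corruption detected")
```

Without checksums (the default before PostgreSQL 12), such a page would be read back and used as if it were valid.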
Hi,
let's suppose I have:
- a primary server with its own local archive location, configured for continuous archiving;
- a standby server without an archive.

These servers are configured for synchronous streaming replication. Let's suppose the standby stays down for a long time, then restarts, goes into catchup mode and now needs some old WALs from the primary's archive location.

Will the standby be able to automatically retrieve those files through replication, or only the WALs currently being written by the primary?

Regards
Pupillo
On Tuesday 25 October 2016 17:08:26 t.dalpozzo@gmail.com wrote:
> Hi,
> let's suppose I have:
> - a primary server with its own local archive location, configured for
> continuous archiving;
> - a standby server without an archive.
> These servers are configured for synchronous streaming replication.
> Let's suppose the standby stays down for a long time, then restarts, goes
> into catchup mode and now needs some old WALs from the primary's archive
> location.
> Will the standby be able to automatically retrieve those files through
> replication, or only the WALs currently being written by the primary?

It would need its own direct access to the master's archive.
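Concretely, the standby would point restore_command at the primary's archive, reachable e.g. over a shared mount. A sketch of the 9.x-era recovery.conf; the paths and connection string here are hypothetical:

```
# recovery.conf on the standby (PostgreSQL 9.x; on 12+ these settings
# live in postgresql.conf together with a standby.signal file)
standby_mode = 'on'
primary_conninfo = 'host=primary.example.com user=replicator'
# fetch archived segments the primary has already recycled;
# /archive must be the primary's archive, mounted or copied locally
restore_command = 'cp /archive/%f %p'
```

Alternatively, on 9.4 and later a physical replication slot on the primary retains WAL segments until the standby has consumed them, which avoids the need for a shared archive (at the cost of unbounded WAL growth while the standby is down).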
I may be confused but...

On Tue, Oct 25, 2016 at 5:08 PM, t.dalpozzo@gmail.com <t.dalpozzo@gmail.com> wrote:
> These servers are configured for synchronous streaming replication.
> Let's suppose the standby stays down for a long time, then restarts,

Doesn't synchronous replication plus a standby being down mean the primary will stop accepting writes?

Francisco Olarte.
Sure, you're right... my oversight, sorry. I only wanted to create a situation in which the standby falls quite far behind with updates, so we can suppose there is a list of standby servers (so the primary keeps going with the second one), or simply suppose the replication is asynchronous.
Pupillo

On 25/10/2016 18:17, Francisco Olarte wrote:
> I may be confused but...
>
> On Tue, Oct 25, 2016 at 5:08 PM, t.dalpozzo@gmail.com
> <t.dalpozzo@gmail.com> wrote:
>> These servers are configured for synchronous streaming replication.
>> Let's suppose the standby stays down for a long time, then restarts,
> Doesn't synchronous replication plus a standby being down mean the primary
> will stop accepting writes?
>
> Francisco Olarte.