Обсуждение: [ADMIN] WAL archive space planning?

Поиск
Список
Период
Сортировка

[ADMIN] WAL archive space planning?

От
Ray Stell
Дата:
I was "planning" to turn on WAL archiving on a postgresql 9.4.11 server
that currently is running with "wal_level=hot_standby" and streaming to
a standby.  I thought there would be a relationship between the rate of
pg_xlog files and archive generation.   When I turned up the
archive_command/mode I found the scale of the archive target was wrong
as I had based it on the pg_xlog file creation rate.

When I turned on the archive command for a few minutes, pg_xlog dir
contained these files for the time period:

-rw------- 1 postgres postgres 16777216 Feb 17 06:47
00000001000023AC0000007F
-rw------- 1 postgres postgres 16777216 Feb 17 06:45
00000001000023AC0000007E

The archive command wrote 126, 16MB files:

-rw------- 1 postgres postgres 16777216 Feb 17 06:47
000000010000237700000056
-rw------- 1 postgres postgres 16777216 Feb 17 06:47
000000010000237700000055
-rw------- 1 postgres postgres 16777216 Feb 17 06:47
000000010000237700000054
-rw------- 1 postgres postgres 16777216 Feb 17 06:47
000000010000237700000053
...

-rw------- 1 postgres postgres 16777216 Feb 17 06:45
0000000100002376000000DC
-rw------- 1 postgres postgres 16777216 Feb 17 06:45
0000000100002376000000DB
-rw------- 1 postgres postgres 16777216 Feb 17 06:45
0000000100002376000000DA
-rw------- 1 postgres postgres 16777216 Feb 17 06:45
0000000100002376000000D9

On servers that are not nearly as busy, I observe a one-to-one
relationship between these files/rates.

Is there a good WAL archive space planning guide?

Is the way to collect planning data for this to turn on wal_debug?

TIA!




Re: [ADMIN] WAL archive space planning?

От
Keith
Дата:


On Fri, Feb 17, 2017 at 1:30 PM, Ray Stell <stellr@vt.edu> wrote:
I was "planning" to turn on WAL archiving on a postgresql 9.4.11 server that currently is running with "wal_level=hot_standby" and streaming to a standby.  I thought there would be a relationship between the rate of pg_xlog files and archive generation.   When I turned up the archive_command/mode I found the scale of the archive target was wrong as I had based it on the pg_xlog file creation rate.

When I turned on the archive command for a few minutes, pg_xlog dir contained these files for the time period:

-rw------- 1 postgres postgres 16777216 Feb 17 06:47 00000001000023AC0000007F
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 00000001000023AC0000007E

The archive command wrote 126, 16MB files:

-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000056
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000055
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000054
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000053
...

-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DC
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DB
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DA
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000D9

On servers that are not nearly as busy, I observe a one-to-one relationship between these files/rates.

Is there a good WAL archive space planning guide?

Is the way to collect planning data for this to turn on wal_debug?

TIA!




--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin


Did you have an archive_timeout value set? This will cause the archive_command to run and create a WAL file at that time interval, no matter how little write activity there may be. These will always be 16MB as well, but if there's no activity, they will compress down quite significantly.

If you disable the timeout, I believe you will then get the matched rate of pg_xlog generation for your archive command.

Keith

Re: [ADMIN] WAL archive space planning?

От
Steven Chang
Дата:
check this, his 2nd part introduces Stream Replication Implementation and tell you px_log retention and wal archive related parameters.

2017-02-18 2:30 GMT+08:00 Ray Stell <stellr@vt.edu>:
I was "planning" to turn on WAL archiving on a postgresql 9.4.11 server that currently is running with "wal_level=hot_standby" and streaming to a standby.  I thought there would be a relationship between the rate of pg_xlog files and archive generation.   When I turned up the archive_command/mode I found the scale of the archive target was wrong as I had based it on the pg_xlog file creation rate.

When I turned on the archive command for a few minutes, pg_xlog dir contained these files for the time period:

-rw------- 1 postgres postgres 16777216 Feb 17 06:47 00000001000023AC0000007F
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 00000001000023AC0000007E

The archive command wrote 126, 16MB files:

-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000056
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000055
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000054
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000053
...

-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DC
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DB
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DA
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000D9

On servers that are not nearly as busy, I observe a one-to-one relationship between these files/rates.

Is there a good WAL archive space planning guide?

Is the way to collect planning data for this to turn on wal_debug?

TIA!




--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: [ADMIN] WAL archive space planning?

От
Ray Stell
Дата:

On 2/21/17 12:09 AM, Steven Chang wrote:

check this, his 2nd part introduces Stream Replication Implementation and tell you px_log retention and wal archive related parameters.

2017-02-18 2:30 GMT+08:00 Ray Stell <stellr@vt.edu>:
I was "planning" to turn on WAL archiving on a postgresql 9.4.11 server that currently is running with "wal_level=hot_standby" and streaming to a standby.  I thought there would be a relationship between the rate of pg_xlog files and archive generation.   When I turned up the archive_command/mode I found the scale of the archive target was wrong as I had based it on the pg_xlog file creation rate.

When I turned on the archive command for a few minutes, pg_xlog dir contained these files for the time period:

-rw------- 1 postgres postgres 16777216 Feb 17 06:47 00000001000023AC0000007F
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 00000001000023AC0000007E

The archive command wrote 126, 16MB files:

-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000056
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000055
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000054
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000053
...

-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DC
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DB
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DA
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000D9

On servers that are not nearly as busy, I observe a one-to-one relationship between these files/rates.

Is there a good WAL archive space planning guide?

Is the way to collect planning data for this to turn on wal_debug?

TIA!




--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

I'm interested in why I might observed a dramatic difference between pg_xlog files and the archive target files. Other dbs I see have them pretty much one-to-one, but not in this case here.  What might cause the large variation?  I got only a few pg_xlogs files each minute and 42 archive files/min.  

Re: [ADMIN] WAL archive space planning?

От
Keith
Дата:


On Tue, Feb 21, 2017 at 12:22 PM, Ray Stell <stellr@vt.edu> wrote:

On 2/21/17 12:09 AM, Steven Chang wrote:

check this, his 2nd part introduces Stream Replication Implementation and tell you px_log retention and wal archive related parameters.

2017-02-18 2:30 GMT+08:00 Ray Stell <stellr@vt.edu>:
I was "planning" to turn on WAL archiving on a postgresql 9.4.11 server that currently is running with "wal_level=hot_standby" and streaming to a standby.  I thought there would be a relationship between the rate of pg_xlog files and archive generation.   When I turned up the archive_command/mode I found the scale of the archive target was wrong as I had based it on the pg_xlog file creation rate.

When I turned on the archive command for a few minutes, pg_xlog dir contained these files for the time period:

-rw------- 1 postgres postgres 16777216 Feb 17 06:47 00000001000023AC0000007F
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 00000001000023AC0000007E

The archive command wrote 126, 16MB files:

-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000056
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000055
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000054
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000053
...

-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DC
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DB
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DA
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000D9

On servers that are not nearly as busy, I observe a one-to-one relationship between these files/rates.

Is there a good WAL archive space planning guide?

Is the way to collect planning data for this to turn on wal_debug?

TIA!




--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

I'm interested in why I might observed a dramatic difference between pg_xlog files and the archive target files. Other dbs I see have them pretty much one-to-one, but not in this case here.  What might cause the large variation?  I got only a few pg_xlogs files each minute and 42 archive files/min.  


Ray,

Did you see my previous response about the archive_timeout setting?

Re: [ADMIN] WAL archive space planning?

От
Ray Stell
Дата:

On 2/21/17 12:26 PM, Keith wrote:



On Tue, Feb 21, 2017 at 12:22 PM, Ray Stell <stellr@vt.edu> wrote:

On 2/21/17 12:09 AM, Steven Chang wrote:

check this, his 2nd part introduces Stream Replication Implementation and tell you px_log retention and wal archive related parameters.

2017-02-18 2:30 GMT+08:00 Ray Stell <stellr@vt.edu>:
I was "planning" to turn on WAL archiving on a postgresql 9.4.11 server that currently is running with "wal_level=hot_standby" and streaming to a standby.  I thought there would be a relationship between the rate of pg_xlog files and archive generation.   When I turned up the archive_command/mode I found the scale of the archive target was wrong as I had based it on the pg_xlog file creation rate.

When I turned on the archive command for a few minutes, pg_xlog dir contained these files for the time period:

-rw------- 1 postgres postgres 16777216 Feb 17 06:47 00000001000023AC0000007F
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 00000001000023AC0000007E

The archive command wrote 126, 16MB files:

-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000056
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000055
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000054
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000053
...

-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DC
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DB
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DA
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000D9

On servers that are not nearly as busy, I observe a one-to-one relationship between these files/rates.

Is there a good WAL archive space planning guide?

Is the way to collect planning data for this to turn on wal_debug?

TIA!




--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

I'm interested in why I might observed a dramatic difference between pg_xlog files and the archive target files. Other dbs I see have them pretty much one-to-one, but not in this case here.  What might cause the large variation?  I got only a few pg_xlogs files each minute and 42 archive files/min.  


Ray,

Did you see my previous response about the archive_timeout setting?

Lost your post somehow, but I see it in the list archive.  Thanks, Keith.

There was a 30 minute timeout set, but I would not think that would increase the archive file generation as observed, does it? 

" When this parameter is greater than zero, the server will switch to a new segment file whenever this many seconds have elapsed since the last segment file switch, and there has been any database activity, including a single checkpoint. (Increasing checkpoint_timeout will reduce unnecessary checkpoints on an idle system.) "

Your post seems to indicate having a non-zero value might increase the number of archive files, is that so?  Maybe it is in milliseconds which might make sense with the observed archive rate, but the doc says seconds and it was set to 1800.  I can set it to 0 and see if it changes things, but I'm a little confused.



Re: [ADMIN] WAL archive space planning?

От
Keith
Дата:


On Tue, Feb 21, 2017 at 2:33 PM, Ray Stell <stellr@vt.edu> wrote:

On 2/21/17 12:26 PM, Keith wrote:



On Tue, Feb 21, 2017 at 12:22 PM, Ray Stell <stellr@vt.edu> wrote:

On 2/21/17 12:09 AM, Steven Chang wrote:

check this, his 2nd part introduces Stream Replication Implementation and tell you px_log retention and wal archive related parameters.

2017-02-18 2:30 GMT+08:00 Ray Stell <stellr@vt.edu>:
I was "planning" to turn on WAL archiving on a postgresql 9.4.11 server that currently is running with "wal_level=hot_standby" and streaming to a standby.  I thought there would be a relationship between the rate of pg_xlog files and archive generation.   When I turned up the archive_command/mode I found the scale of the archive target was wrong as I had based it on the pg_xlog file creation rate.

When I turned on the archive command for a few minutes, pg_xlog dir contained these files for the time period:

-rw------- 1 postgres postgres 16777216 Feb 17 06:47 00000001000023AC0000007F
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 00000001000023AC0000007E

The archive command wrote 126, 16MB files:

-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000056
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000055
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000054
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000053
...

-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DC
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DB
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DA
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000D9

On servers that are not nearly as busy, I observe a one-to-one relationship between these files/rates.

Is there a good WAL archive space planning guide?

Is the way to collect planning data for this to turn on wal_debug?

TIA!




--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

I'm interested in why I might observed a dramatic difference between pg_xlog files and the archive target files. Other dbs I see have them pretty much one-to-one, but not in this case here.  What might cause the large variation?  I got only a few pg_xlogs files each minute and 42 archive files/min.  


Ray,

Did you see my previous response about the archive_timeout setting?

Lost your post somehow, but I see it in the list archive.  Thanks, Keith.

There was a 30 minute timeout set, but I would not think that would increase the archive file generation as observed, does it? 

" When this parameter is greater than zero, the server will switch to a new segment file whenever this many seconds have elapsed since the last segment file switch, and there has been any database activity, including a single checkpoint. (Increasing checkpoint_timeout will reduce unnecessary checkpoints on an idle system.) "

Your post seems to indicate having a non-zero value might increase the number of archive files, is that so?  Maybe it is in milliseconds which might make sense with the observed archive rate, but the doc says seconds and it was set to 1800.  I can set it to 0 and see if it changes things, but I'm a little confused.




If it was set to 30 minutes, then no, it wouldn't explain what you saw. But we didn't know what it was set to, so that's why I asked. Make sure you're checking the value actually set in the database, and not just what's in postgresql.conf, since it could've changed but never been reloaded to put in place (type "show archive_timeout" in psql).

If that's not it, not sure what may be causing what you're seeing.

Keith

Re: [ADMIN] WAL archive space planning?

От
Steven Chang
Дата:
Dears,

     Just read a posgres monitoring moudule named check-postres,  which is a ready enhance package on Debian Linux Distribution.
     
   https://bucardo.org/check_postgres/check_postgres.pl.html
   
   here is its readme about wal_files
   Maybe you can also use this module to help plan your  space usage of WAL files.

 wal_files
       ("symlink: check_postgres_wal_files") Checks how many WAL files exist in the
       pg_xlog directory, which is found off of your data_directory, sometimes as a
       symlink to another physical disk for performance reasons. This action must be
       run as a superuser, in order to access the contents of the pg_xlog directory.
       The minimum version to use this action is Postgres 8.1. The --warning and
       --critical options are simply the number of files in the pg_xlog directory. What
       number to set this to will vary, but a general guideline is to put a number
       slightly higher than what is normally there, to catch problems early.

       Normally, WAL files are closed and then re-used, but a long-running open
       transaction, or a faulty archive_command script, may cause Postgres to create
       too many files. Ultimately, this will cause the disk they are on to run out of
       space, at which point Postgres will shut down.


2017-02-22 3:45 GMT+08:00 Keith <keith@keithf4.com>:


On Tue, Feb 21, 2017 at 2:33 PM, Ray Stell <stellr@vt.edu> wrote:

On 2/21/17 12:26 PM, Keith wrote:



On Tue, Feb 21, 2017 at 12:22 PM, Ray Stell <stellr@vt.edu> wrote:

On 2/21/17 12:09 AM, Steven Chang wrote:

check this, his 2nd part introduces Stream Replication Implementation and tell you px_log retention and wal archive related parameters.

2017-02-18 2:30 GMT+08:00 Ray Stell <stellr@vt.edu>:
I was "planning" to turn on WAL archiving on a postgresql 9.4.11 server that currently is running with "wal_level=hot_standby" and streaming to a standby.  I thought there would be a relationship between the rate of pg_xlog files and archive generation.   When I turned up the archive_command/mode I found the scale of the archive target was wrong as I had based it on the pg_xlog file creation rate.

When I turned on the archive command for a few minutes, pg_xlog dir contained these files for the time period:

-rw------- 1 postgres postgres 16777216 Feb 17 06:47 00000001000023AC0000007F
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 00000001000023AC0000007E

The archive command wrote 126, 16MB files:

-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000056
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000055
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000054
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000053
...

-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DC
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DB
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DA
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000D9

On servers that are not nearly as busy, I observe a one-to-one relationship between these files/rates.

Is there a good WAL archive space planning guide?

Is the way to collect planning data for this to turn on wal_debug?

TIA!




--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

I'm interested in why I might observed a dramatic difference between pg_xlog files and the archive target files. Other dbs I see have them pretty much one-to-one, but not in this case here.  What might cause the large variation?  I got only a few pg_xlogs files each minute and 42 archive files/min.  


Ray,

Did you see my previous response about the archive_timeout setting?

Lost your post somehow, but I see it in the list archive.  Thanks, Keith.

There was a 30 minute timeout set, but I would not think that would increase the archive file generation as observed, does it? 

" When this parameter is greater than zero, the server will switch to a new segment file whenever this many seconds have elapsed since the last segment file switch, and there has been any database activity, including a single checkpoint. (Increasing checkpoint_timeout will reduce unnecessary checkpoints on an idle system.) "

Your post seems to indicate having a non-zero value might increase the number of archive files, is that so?  Maybe it is in milliseconds which might make sense with the observed archive rate, but the doc says seconds and it was set to 1800.  I can set it to 0 and see if it changes things, but I'm a little confused.




If it was set to 30 minutes, then no, it wouldn't explain what you saw. But we didn't know what it was set to, so that's why I asked. Make sure you're checking the value actually set in the database, and not just what's in postgresql.conf, since it could've changed but never been reloaded to put in place (type "show archive_timeout" in psql).

If that's not it, not sure what may be causing what you're seeing.

Keith

Re: [ADMIN] WAL archive space planning?

От
Scott Whitney
Дата:
Forgive me if this has been explained well in the past. You can probably search my name for discussions on this...

I fail to see why WAL archiving is either required or desired in a streaming replication scenario WHERE ONE NEVER NEEDS PITR.

I have 2 "production" servers. One is paid customers, and one is customer-facing test and training. Let us call the one that I care about bigDb and the other littleDb.

littleDb replicates bigDb in real-time. They both back up nightly. They are in the same data center.

At my home office, I have otherDb which replicates BOTH and runs independent backups nightly.

I have to have a secondary data center. Let's call that otherCityBigDb and otherCityLittleDb. They stream replication. 

There is nowhere in the world where I am responsible for rolling back to anything.

Why would one EVER want WAL archiving? It seems like you are planning for failure.

On Mar 13, 2017 10:55 PM, Steven Chang <stevenchang1213@gmail.com> wrote:
Dears,

     Just read a posgres monitoring moudule named check-postres,  which is a ready enhance package on Debian Linux Distribution.
     
   https://bucardo.org/check_postgres/check_postgres.pl.html
   
   here is its readme about wal_files
   Maybe you can also use this module to help plan your  space usage of WAL files.

 wal_files
       ("symlink: check_postgres_wal_files") Checks how many WAL files exist in the
       pg_xlog directory, which is found off of your data_directory, sometimes as a
       symlink to another physical disk for performance reasons. This action must be
       run as a superuser, in order to access the contents of the pg_xlog directory.
       The minimum version to use this action is Postgres 8.1. The --warning and
       --critical options are simply the number of files in the pg_xlog directory. What
       number to set this to will vary, but a general guideline is to put a number
       slightly higher than what is normally there, to catch problems early.

       Normally, WAL files are closed and then re-used, but a long-running open
       transaction, or a faulty archive_command script, may cause Postgres to create
       too many files. Ultimately, this will cause the disk they are on to run out of
       space, at which point Postgres will shut down.


2017-02-22 3:45 GMT+08:00 Keith <keith@keithf4.com>:


On Tue, Feb 21, 2017 at 2:33 PM, Ray Stell <stellr@vt.edu> wrote:

On 2/21/17 12:26 PM, Keith wrote:



On Tue, Feb 21, 2017 at 12:22 PM, Ray Stell <stellr@vt.edu> wrote:

On 2/21/17 12:09 AM, Steven Chang wrote:

check this, his 2nd part introduces Stream Replication Implementation and tell you px_log retention and wal archive related parameters.

2017-02-18 2:30 GMT+08:00 Ray Stell <stellr@vt.edu>:
I was "planning" to turn on WAL archiving on a postgresql 9.4.11 server that currently is running with "wal_level=hot_standby" and streaming to a standby.  I thought there would be a relationship between the rate of pg_xlog files and archive generation.   When I turned up the archive_command/mode I found the scale of the archive target was wrong as I had based it on the pg_xlog file creation rate.

When I turned on the archive command for a few minutes, pg_xlog dir contained these files for the time period:

-rw------- 1 postgres postgres 16777216 Feb 17 06:47 00000001000023AC0000007F
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 00000001000023AC0000007E

The archive command wrote 126, 16MB files:

-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000056
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000055
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000054
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000053
...

-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DC
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DB
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DA
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000D9

On servers that are not nearly as busy, I observe a one-to-one relationship between these files/rates.

Is there a good WAL archive space planning guide?

Is the way to collect planning data for this to turn on wal_debug?

TIA!




--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

I'm interested in why I might observed a dramatic difference between pg_xlog files and the archive target files. Other dbs I see have them pretty much one-to-one, but not in this case here.  What might cause the large variation?  I got only a few pg_xlogs files each minute and 42 archive files/min.  


Ray,

Did you see my previous response about the archive_timeout setting?

Lost your post somehow, but I see it in the list archive.  Thanks, Keith.

There was a 30 minute timeout set, but I would not think that would increase the archive file generation as observed, does it? 

" When this parameter is greater than zero, the server will switch to a new segment file whenever this many seconds have elapsed since the last segment file switch, and there has been any database activity, including a single checkpoint. (Increasing checkpoint_timeout will reduce unnecessary checkpoints on an idle system.) "

Your post seems to indicate having a non-zero value might increase the number of archive files, is that so?  Maybe it is in milliseconds which might make sense with the observed archive rate, but the doc says seconds and it was set to 1800.  I can set it to 0 and see if it changes things, but I'm a little confused.




If it was set to 30 minutes, then no, it wouldn't explain what you saw. But we didn't know what it was set to, so that's why I asked. Make sure you're checking the value actually set in the database, and not just what's in postgresql.conf, since it could've changed but never been reloaded to put in place (type "show archive_timeout" in psql).

If that's not it, not sure what may be causing what you're seeing.

Keith




Journyx, Inc.
7600 Burnet Road #300
Austin, TX 78757
www.journyx.com

p 512.834.8888 
f 512-834-8858 

Do you receive our promotional emails? You can subscribe or unsubscribe to those emails at http://go.journyx.com/emailPreference/e/4932/714/ 

Re: [ADMIN] WAL archive space planning?

От
Steven Chang
Дата:
Hi Scott,

     Do you have any experience on any  RDBMS other than postgres ?
     And this page explains it very much about WAL ---- https://www.postgresql.org/docs/9.1/static/continuous-archiving.html
     Replication is actually a disaster recovery solution, but now many other experts extend it to function others.
     For example, combining with pgpool, it could distribute offload select query to slave sites for scale or performance concerns.
     Of course you could also coding   plproxy to accomplish the same goal.

Best Regards,
Steven
     

2017-03-14 12:17 GMT+08:00 Scott Whitney <scott@journyx.com>:
Forgive me if this has been explained well in the past. You can probably search my name for discussions on this...

I fail to see why WAL archiving is either required or desired in a streaming replication scenario WHERE ONE NEVER NEEDS PITR.

I have 2 "production" servers. One is paid customers, and one is customer-facing test and training. Let us call the one that I care about bigDb and the other littleDb.

littleDb replicates bigDb in real-time. They both back up nightly. They are in the same data center.

At my home office, I have otherDb which replicates BOTH and runs independent backups nightly.

I have to have a secondary data center. Let's call that otherCityBigDb and otherCityLittleDb. They stream replication. 

There is nowhere in the world where I am responsible for rolling back to anything.

Why would one EVER want WAL archiving? It seems like you are planning for failure.

On Mar 13, 2017 10:55 PM, Steven Chang <stevenchang1213@gmail.com> wrote:
Dears,

     Just read a posgres monitoring moudule named check-postres,  which is a ready enhance package on Debian Linux Distribution.
     
   https://bucardo.org/check_postgres/check_postgres.pl.html
   
   here is its readme about wal_files
   Maybe you can also use this module to help plan your  space usage of WAL files.

 wal_files
       ("symlink: check_postgres_wal_files") Checks how many WAL files exist in the
       pg_xlog directory, which is found off of your data_directory, sometimes as a
       symlink to another physical disk for performance reasons. This action must be
       run as a superuser, in order to access the contents of the pg_xlog directory.
       The minimum version to use this action is Postgres 8.1. The --warning and
       --critical options are simply the number of files in the pg_xlog directory. What
       number to set this to will vary, but a general guideline is to put a number
       slightly higher than what is normally there, to catch problems early.

       Normally, WAL files are closed and then re-used, but a long-running open
       transaction, or a faulty archive_command script, may cause Postgres to create
       too many files. Ultimately, this will cause the disk they are on to run out of
       space, at which point Postgres will shut down.


2017-02-22 3:45 GMT+08:00 Keith <keith@keithf4.com>:


On Tue, Feb 21, 2017 at 2:33 PM, Ray Stell <stellr@vt.edu> wrote:

On 2/21/17 12:26 PM, Keith wrote:



On Tue, Feb 21, 2017 at 12:22 PM, Ray Stell <stellr@vt.edu> wrote:

On 2/21/17 12:09 AM, Steven Chang wrote:

check this, his 2nd part introduces Stream Replication Implementation and tell you px_log retention and wal archive related parameters.

2017-02-18 2:30 GMT+08:00 Ray Stell <stellr@vt.edu>:
I was "planning" to turn on WAL archiving on a postgresql 9.4.11 server that currently is running with "wal_level=hot_standby" and streaming to a standby.  I thought there would be a relationship between the rate of pg_xlog files and archive generation.   When I turned up the archive_command/mode I found the scale of the archive target was wrong as I had based it on the pg_xlog file creation rate.

When I turned on the archive command for a few minutes, pg_xlog dir contained these files for the time period:

-rw------- 1 postgres postgres 16777216 Feb 17 06:47 00000001000023AC0000007F
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 00000001000023AC0000007E

The archive command wrote 126, 16MB files:

-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000056
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000055
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000054
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000053
...

-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DC
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DB
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DA
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000D9

On servers that are not nearly as busy, I observe a one-to-one relationship between these files/rates.

Is there a good WAL archive space planning guide?

Is the way to collect planning data for this to turn on wal_debug?

TIA!




--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

I'm interested in why I might observed a dramatic difference between pg_xlog files and the archive target files. Other dbs I see have them pretty much one-to-one, but not in this case here.  What might cause the large variation?  I got only a few pg_xlogs files each minute and 42 archive files/min.  


Ray,

Did you see my previous response about the archive_timeout setting?

Lost your post somehow, but I see it in the list archive.  Thanks, Keith.

There was a 30 minute timeout set, but I would not think that would increase the archive file generation as observed, does it? 

" When this parameter is greater than zero, the server will switch to a new segment file whenever this many seconds have elapsed since the last segment file switch, and there has been any database activity, including a single checkpoint. (Increasing checkpoint_timeout will reduce unnecessary checkpoints on an idle system.) "

Your post seems to indicate having a non-zero value might increase the number of archive files, is that so?  Maybe it is in milliseconds which might make sense with the observed archive rate, but the doc says seconds and it was set to 1800.  I can set it to 0 and see if it changes things, but I'm a little confused.




If it was set to 30 minutes, then no, it wouldn't explain what you saw. But we didn't know what it was set to, so that's why I asked. Make sure you're checking the value actually set in the database, and not just what's in postgresql.conf, since it could've changed but never been reloaded to put in place (type "show archive_timeout" in psql).

If that's not it, not sure what may be causing what you're seeing.

Keith




Journyx, Inc.
7600 Burnet Road #300
Austin, TX 78757
www.journyx.com

Do you receive our promotional emails? You can subscribe or unsubscribe to those emails at http://go.journyx.com/emailPreference/e/4932/714/ 

Re: [ADMIN] WAL archive space planning?

От
Ray Stell
Дата:

On 3/13/17 11:53 PM, Steven Chang wrote:

Dears,

     Just read a posgres monitoring moudule named check-postres,  which is a ready enhance package on Debian Linux Distribution.
     
   https://bucardo.org/check_postgres/check_postgres.pl.html
   
   here is its readme about wal_files
   Maybe you can also use this module to help plan your  space usage of WAL files.

 wal_files
       ("symlink: check_postgres_wal_files") Checks how many WAL files exist in the
       pg_xlog directory, which is found off of your data_directory, sometimes as a
       symlink to another physical disk for performance reasons. This action must be
       run as a superuser, in order to access the contents of the pg_xlog directory.
       The minimum version to use this action is Postgres 8.1. The --warning and
       --critical options are simply the number of files in the pg_xlog directory. What
       number to set this to will vary, but a general guideline is to put a number
       slightly higher than what is normally there, to catch problems early.

       Normally, WAL files are closed and then re-used, but a long-running open
       transaction, or a faulty archive_command script, may cause Postgres to create
       too many files. Ultimately, this will cause the disk they are on to run out of
       space, at which point Postgres will shut down.


2017-02-22 3:45 GMT+08:00 Keith <keith@keithf4.com>:


On Tue, Feb 21, 2017 at 2:33 PM, Ray Stell <stellr@vt.edu> wrote:

On 2/21/17 12:26 PM, Keith wrote:



On Tue, Feb 21, 2017 at 12:22 PM, Ray Stell <stellr@vt.edu> wrote:

On 2/21/17 12:09 AM, Steven Chang wrote:

check this, his 2nd part introduces Stream Replication Implementation and tell you px_log retention and wal archive related parameters.

2017-02-18 2:30 GMT+08:00 Ray Stell <stellr@vt.edu>:
I was "planning" to turn on WAL archiving on a postgresql 9.4.11 server that currently is running with "wal_level=hot_standby" and streaming to a standby.  I thought there would be a relationship between the rate of pg_xlog files and archive generation.   When I turned up the archive_command/mode I found the scale of the archive target was wrong as I had based it on the pg_xlog file creation rate.

When I turned on the archive command for a few minutes, pg_xlog dir contained these files for the time period:

-rw------- 1 postgres postgres 16777216 Feb 17 06:47 00000001000023AC0000007F
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 00000001000023AC0000007E

The archive command wrote 126, 16MB files:

-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000056
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000055
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000054
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000053
...

-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DC
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DB
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DA
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000D9

On servers that are not nearly as busy, I observe a one-to-one relationship between these files/rates.

Is there a good WAL archive space planning guide?

Is the way to collect planning data for this to turn on wal_debug?

TIA!




--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

I'm interested in why I might observed a dramatic difference between pg_xlog files and the archive target files. Other dbs I see have them pretty much one-to-one, but not in this case here.  What might cause the large variation?  I got only a few pg_xlogs files each minute and 42 archive files/min.  


Ray,

Did you see my previous response about the archive_timeout setting?

Lost your post somehow, but I see it in the list archive.  Thanks, Keith.

There was a 30 minute timeout set, but I would not think that would increase the archive file generation as observed, does it? 

" When this parameter is greater than zero, the server will switch to a new segment file whenever this many seconds have elapsed since the last segment file switch, and there has been any database activity, including a single checkpoint. (Increasing checkpoint_timeout will reduce unnecessary checkpoints on an idle system.) "

Your post seems to indicate having a non-zero value might increase the number of archive files, is that so?  Maybe it is in milliseconds which might make sense with the observed archive rate, but the doc says seconds and it was set to 1800.  I can set it to 0 and see if it changes things, but I'm a little confused.




If it was set to 30 minutes, then no, it wouldn't explain what you saw. But we didn't know what it was set to, so that's why I asked. Make sure you're checking the value actually set in the database, and not just what's in postgresql.conf, since it could've changed but never been reloaded to put in place (type "show archive_timeout" in psql).

If that's not it, not sure what may be causing what you're seeing.

Keith

Thanks, Steven,

An interesting statement:

"long-running open transaction, or a faulty archive_command script, may cause Postgres to create too many files. Ultimately, this will cause the disk they are on to run out of space, at which point Postgres will shut down."  

The statement, "long-running open transaction...may cause Postgres to create too many files" makes it sound like there could be the case where there is not a 1-to-1 relationship between pg_xlog and the archive.  I've never seen that before and didn't think that was the case.  Maybe I'm reading too much into the statement. 

I have to suspect that the "faulty" nature of the archive_command was that it was just asking too much of the I/O subsystem and was just getting way behind.  Shockingly way behind.  I didn't expect it.   It is a very busy system will many long running transactions.  If I ever get back to that problem, my first pass will be to reduce the load request.  I suppose I can do some bonnie tests to simulate the archive writing.  I was doing some multiplexing of the files in the archive shell.  Let's see what writing them once locally does first.  If it can't do that and keep up, I'm going to need a "bigger boat."   It's a production system so I only get a few cracks at it a year.   I'll probably wait for the next patch release.  Going back to the last backup is within the SLA, but I don't like not having the archive.




Re: [ADMIN] WAL archive space planning?

От
Ray Stell
Дата:

On 3/14/17 12:17 AM, Scott Whitney wrote:

Forgive me if this has been explained well in the past. You can probably search my name for discussions on this...

I fail to see why WAL archiving is either required or desired in a streaming replication scenario WHERE ONE NEVER NEEDS PITR.

I have 2 "production" servers. One is paid customers, and one is customer-facing test and training. Let us call the one that I care about bigDb and the other littleDb.

littleDb replicates bigDb in real-time. They both back up nightly. They are in the same data center.

At my home office, I have otherDb which replicates BOTH and runs independent backups nightly.

I have to have a secondary data center. Let's call that otherCityBigDb and otherCityLittleDb. They stream replication. 

There is nowhere in the world where I am responsible for rolling back to anything.

Why would one EVER want WAL archiving? It seems like you are planning for failure.

On Mar 13, 2017 10:55 PM, Steven Chang <stevenchang1213@gmail.com> wrote:
Dears,

     Just read a posgres monitoring moudule named check-postres,  which is a ready enhance package on Debian Linux Distribution.
     
   https://bucardo.org/check_postgres/check_postgres.pl.html
   
   here is its readme about wal_files
   Maybe you can also use this module to help plan your  space usage of WAL files.

 wal_files
       ("symlink: check_postgres_wal_files") Checks how many WAL files exist in the
       pg_xlog directory, which is found off of your data_directory, sometimes as a
       symlink to another physical disk for performance reasons. This action must be
       run as a superuser, in order to access the contents of the pg_xlog directory.
       The minimum version to use this action is Postgres 8.1. The --warning and
       --critical options are simply the number of files in the pg_xlog directory. What
       number to set this to will vary, but a general guideline is to put a number
       slightly higher than what is normally there, to catch problems early.

       Normally, WAL files are closed and then re-used, but a long-running open
       transaction, or a faulty archive_command script, may cause Postgres to create
       too many files. Ultimately, this will cause the disk they are on to run out of
       space, at which point Postgres will shut down.


2017-02-22 3:45 GMT+08:00 Keith <keith@keithf4.com>:


On Tue, Feb 21, 2017 at 2:33 PM, Ray Stell <stellr@vt.edu> wrote:

On 2/21/17 12:26 PM, Keith wrote:



On Tue, Feb 21, 2017 at 12:22 PM, Ray Stell <stellr@vt.edu> wrote:

On 2/21/17 12:09 AM, Steven Chang wrote:

check this, his 2nd part introduces Stream Replication Implementation and tell you px_log retention and wal archive related parameters.

2017-02-18 2:30 GMT+08:00 Ray Stell <stellr@vt.edu>:
I was "planning" to turn on WAL archiving on a postgresql 9.4.11 server that currently is running with "wal_level=hot_standby" and streaming to a standby.  I thought there would be a relationship between the rate of pg_xlog files and archive generation.   When I turned up the archive_command/mode I found the scale of the archive target was wrong as I had based it on the pg_xlog file creation rate.

When I turned on the archive command for a few minutes, pg_xlog dir contained these files for the time period:

-rw------- 1 postgres postgres 16777216 Feb 17 06:47 00000001000023AC0000007F
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 00000001000023AC0000007E

The archive command wrote 126, 16MB files:

-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000056
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000055
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000054
-rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000053
...

-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DC
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DB
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DA
-rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000D9

On servers that are not nearly as busy, I observe a one-to-one relationship between these files/rates.

Is there a good WAL archive space planning guide?

Is the way to collect planning data for this to turn on wal_debug?

TIA!




--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

I'm interested in why I might observed a dramatic difference between pg_xlog files and the archive target files. Other dbs I see have them pretty much one-to-one, but not in this case here.  What might cause the large variation?  I got only a few pg_xlogs files each minute and 42 archive files/min.  


Ray,

Did you see my previous response about the archive_timeout setting?

Lost your post somehow, but I see it in the list archive.  Thanks, Keith.

There was a 30 minute timeout set, but I would not think that would increase the archive file generation as observed, does it? 

" When this parameter is greater than zero, the server will switch to a new segment file whenever this many seconds have elapsed since the last segment file switch, and there has been any database activity, including a single checkpoint. (Increasing checkpoint_timeout will reduce unnecessary checkpoints on an idle system.) "

Your post seems to indicate having a non-zero value might increase the number of archive files, is that so?  Maybe it is in milliseconds which might make sense with the observed archive rate, but the doc says seconds and it was set to 1800.  I can set it to 0 and see if it changes things, but I'm a little confused.




If it was set to 30 minutes, then no, it wouldn't explain what you saw. But we didn't know what it was set to, so that's why I asked. Make sure you're checking the value actually set in the database, and not just what's in postgresql.conf, since it could've changed but never been reloaded to put in place (type "show archive_timeout" in psql).

If that's not it, not sure what may be causing what you're seeing.

Keith




Journyx, Inc.
7600 Burnet Road #300
Austin, TX 78757
www.journyx.com

p 512.834.8888 
f 512-834-8858 

Do you receive our promotional emails? You can subscribe or unsubscribe to those emails at http://go.journyx.com/emailPreference/e/4932/714/ 

"There is nowhere in the world where I am responsible for rolling back to anything."

Logical failures.  Suppose duhveloper trashes table on production when he meant to make the change to pre-prod.  I might want to roll forward to to a PIT just before. 

Re: [ADMIN] WAL archive space planning?

От
Jerry Sievers
Дата:
Scott Whitney <scott@journyx.com> writes:

> Forgive me if this has been explained well in the past. You can probably search my name for discussions on this...
>
> I fail to see why WAL archiving is either required or desired in a streaming replication scenario WHERE ONE NEVER
NEEDSPITR. 

If you don't mind losing sync from standby to master and having to
refresh the standby from scratch after for any reason streaming
connectivity is lost (and/or standby falls out of sync due to a too
heavy volume)... then fine.

In practice you need both to allow standbys to catch up again after an a
break.

A busy master can overrun a streaming standby at times just due to
volume, perhaps complicated by IO throughput on the standby.

Your standby may go down  undetected...  it may lag beyond what the
master still has in pg_xlog to satisfy simply due to configured in
replication lag tolerance to support reporting workloads.

To name a few :-)

True in an uncomplicated setup being able to fetch WALs from a remote
archive site is not strictly necessary.

It's also the case with newer Pg releases, that if you use a binary rep
slot, your master will hold WAL forever in its own pg_xlog directory
until a down standby comes alive and reads the data but then your
production master is at risk of disk filling if the unresponsive standby
isn't dealt with eventually.

HTH

> I have 2 "production" servers. One is paid customers, and one is customer-facing test and training. Let us call the
onethat I care about bigDb and the other littleDb. 
>
> littleDb replicates bigDb in real-time. They both back up nightly. They are in the same data center.
>
> At my home office, I have otherDb which replicates BOTH and runs independent backups nightly.
>
> I have to have a secondary data center. Let's call that otherCityBigDb and otherCityLittleDb. They stream
replication. 
>
> There is nowhere in the world where I am responsible for rolling back to anything.
>
> Why would one EVER want WAL archiving? It seems like you are planning for failure.
>
> On Mar 13, 2017 10:55 PM, Steven Chang <stevenchang1213@gmail.com> wrote:
>
>     Dears,
>
>          Just read a posgres monitoring moudule named check-postres,  which is a ready enhance package on Debian
LinuxDistribution. 
>
>        https://bucardo.org/check_postgres/check_postgres.pl.html
>
>        here is its readme about wal_files
>        Maybe you can also use this module to help plan your  space usage of WAL files.
>
>      wal_files
>            ("symlink: check_postgres_wal_files") Checks how many WAL files exist in the
>            pg_xlog directory, which is found off of your data_directory, sometimes as a
>            symlink to another physical disk for performance reasons. This action must be
>            run as a superuser, in order to access the contents of the pg_xlog directory.
>            The minimum version to use this action is Postgres 8.1. The --warning and
>            --critical options are simply the number of files in the pg_xlog directory. What
>            number to set this to will vary, but a general guideline is to put a number
>            slightly higher than what is normally there, to catch problems early.
>
>            Normally, WAL files are closed and then re-used, but a long-running open
>            transaction, or a faulty archive_command script, may cause Postgres to create
>            too many files. Ultimately, this will cause the disk they are on to run out of
>            space, at which point Postgres will shut down.
>
>     2017-02-22 3:45 GMT+08:00 Keith <keith@keithf4.com>:
>
>         On Tue, Feb 21, 2017 at 2:33 PM, Ray Stell <stellr@vt.edu> wrote:
>
>             On 2/21/17 12:26 PM, Keith wrote:
>
>                 On Tue, Feb 21, 2017 at 12:22 PM, Ray Stell <stellr@vt.edu> wrote:
>
>                     On 2/21/17 12:09 AM, Steven Chang wrote:
>
>                         check this, his 2nd part introduces Stream Replication Implementation and tell you px_log
retentionand wal archive related parameters. 
>                         https://www.itenlight.com/blog/2016/05/19/PostgreSQL+HA+with+pgpool-II+-+Part+2
>
>                         2017-02-18 2:30 GMT+08:00 Ray Stell <stellr@vt.edu>:
>
>                             I was "planning" to turn on WAL archiving on a postgresql 9.4.11 server that currently is
runningwith "wal_level=hot_standby" and streaming 
>                             to a standby.  I thought there would be a relationship between the rate of pg_xlog files
andarchive generation.   When I turned up the 
>                             archive_command/mode I found the scale of the archive target was wrong as I had based it
onthe pg_xlog file creation rate. 
>
>                             When I turned on the archive command for a few minutes, pg_xlog dir contained these files
forthe time period: 
>
>                             -rw------- 1 postgres postgres 16777216 Feb 17 06:47 00000001000023AC0000007F
>                             -rw------- 1 postgres postgres 16777216 Feb 17 06:45 00000001000023AC0000007E
>
>                             The archive command wrote 126, 16MB files:
>
>                             -rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000056
>                             -rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000055
>                             -rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000054
>                             -rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000053
>                             ...
>
>                             -rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DC
>                             -rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DB
>                             -rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DA
>                             -rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000D9
>
>                             On servers that are not nearly as busy, I observe a one-to-one relationship between these
files/rates.
>
>                             Is there a good WAL archive space planning guide?
>
>                             Is the way to collect planning data for this to turn on wal_debug?
>
>                             TIA!
>
>                             --
>                             Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
>                             To make changes to your subscription:
>                             http://www.postgresql.org/mailpref/pgsql-admin
>
>                     I'm interested in why I might observed a dramatic difference between pg_xlog files and the
archivetarget files. Other dbs I see have them pretty 
>                     much one-to-one, but not in this case here.  What might cause the large variation?  I got only a
fewpg_xlogs files each minute and 42 archive files 
>                     /min.
>
>                 Ray,
>
>                 Did you see my previous response about the archive_timeout setting?
>
>             Lost your post somehow, but I see it in the list archive.  Thanks, Keith.
>
>             There was a 30 minute timeout set, but I would not think that would increase the archive file generation
asobserved, does it?  
>
>             " When this parameter is greater than zero, the server will switch to a new segment file whenever this
manyseconds have elapsed since the last segment file 
>             switch, and there has been any database activity, including a single checkpoint. (Increasing
checkpoint_timeoutwill reduce unnecessary checkpoints on an 
>             idle system.) "
>
>             Your post seems to indicate having a non-zero value might increase the number of archive files, is that
so? Maybe it is in milliseconds which might make 
>             sense with the observed archive rate, but the doc says seconds and it was set to 1800.  I can set it to 0
andsee if it changes things, but I'm a little 
>             confused.
>
>         If it was set to 30 minutes, then no, it wouldn't explain what you saw. But we didn't know what it was set
to,so that's why I asked. Make sure you're checking 
>         the value actually set in the database, and not just what's in postgresql.conf, since it could've changed but
neverbeen reloaded to put in place (type "show 
>         archive_timeout" in psql).
>
>         If that's not it, not sure what may be causing what you're seeing.
>
>         Keith
>
> Journyx, Inc.
> 7600 Burnet Road #300
> Austin, TX 78757
> www.journyx.com
>
> p 512.834.8888
> f 512-834-8858
>
> Do you receive our promotional emails? You can subscribe or unsubscribe to those emails at
http://go.journyx.com/emailPreference/e/4932/714/ 
>

--
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consulting@comcast.net
p: 312.241.7800


Re: WAL archive space planning?

От
Steven Chang
Дата:
Hello,

     Official documents (since 7.2) do give us some formulation to evaluate the  space usage of  WAL files. 
     It also depends on your how your environment configured.
    

There will always be at least one WAL segment file, and will normally not be more than (2 + checkpoint_completion_target) * checkpoint_segments + 1 or checkpoint_segments + wal_keep_segments + 1 files. Each segment file is normally 16 MB (though this size can be altered when building the server). You can use this to estimate space requirements for WAL. Ordinarily, when old log segment files are no longer needed, they are recycled (that is, renamed to become future segments in the numbered sequence). If, due to a short-term peak of log output rate, there are more than 3 * checkpoint_segments + 1 segment files, the unneeded segment files will be deleted instead of recycled until the system gets back under this limit.

2. https://www.postgresql.org/docs/9.4/static/continuous-archiving.html

It is important that the archive command return zero exit status if and only if it succeeds. Upon getting a zero result, PostgreSQL will assume that the file has been successfully archived, and will remove or recycle it. However, a nonzero status tells PostgreSQL that the file was not archived; it will try again periodically until it succeeds.

BR,
Steven

2017-03-18 5:39 GMT+08:00 Jerry Sievers <gsievers19@comcast.net>:
Scott Whitney <scott@journyx.com> writes:

> Forgive me if this has been explained well in the past. You can probably search my name for discussions on this...
>
> I fail to see why WAL archiving is either required or desired in a streaming replication scenario WHERE ONE NEVER NEEDS PITR.

If you don't mind losing sync from standby to master and having to
refresh the standby from scratch after for any reason streaming
connectivity is lost (and/or standby falls out of sync due to a too
heavy volume)... then fine.

In practice you need both to allow standbys to catch up again after an a
break.

A busy master can overrun a streaming standby at times just due to
volume, perhaps complicated by IO throughput on the standby.

Your standby may go down  undetected...  it may lag beyond what the
master still has in pg_xlog to satisfy simply due to configured in
replication lag tolerance to support reporting workloads.

To name a few :-)

True in an uncomplicated setup being able to fetch WALs from a remote
archive site is not strictly necessary.

It's also the case with newer Pg releases, that if you use a binary rep
slot, your master will hold WAL forever in its own pg_xlog directory
until a down standby comes alive and reads the data but then your
production master is at risk of disk filling if the unresponsive standby
isn't dealt with eventually.

HTH

> I have 2 "production" servers. One is paid customers, and one is customer-facing test and training. Let us call the one that I care about bigDb and the other littleDb.
>
> littleDb replicates bigDb in real-time. They both back up nightly. They are in the same data center.
>
> At my home office, I have otherDb which replicates BOTH and runs independent backups nightly.
>
> I have to have a secondary data center. Let's call that otherCityBigDb and otherCityLittleDb. They stream replication.
>
> There is nowhere in the world where I am responsible for rolling back to anything.
>
> Why would one EVER want WAL archiving? It seems like you are planning for failure.
>
> On Mar 13, 2017 10:55 PM, Steven Chang <stevenchang1213@gmail.com> wrote:
>
>     Dears,
>
>          Just read a posgres monitoring moudule named check-postres,  which is a ready enhance package on Debian Linux Distribution.
>
>        https://bucardo.org/check_postgres/check_postgres.pl.html
>
>        here is its readme about wal_files
>        Maybe you can also use this module to help plan your  space usage of WAL files.
>
>      wal_files
>            ("symlink: check_postgres_wal_files") Checks how many WAL files exist in the
>            pg_xlog directory, which is found off of your data_directory, sometimes as a
>            symlink to another physical disk for performance reasons. This action must be
>            run as a superuser, in order to access the contents of the pg_xlog directory.
>            The minimum version to use this action is Postgres 8.1. The --warning and
>            --critical options are simply the number of files in the pg_xlog directory. What
>            number to set this to will vary, but a general guideline is to put a number
>            slightly higher than what is normally there, to catch problems early.
>
>            Normally, WAL files are closed and then re-used, but a long-running open
>            transaction, or a faulty archive_command script, may cause Postgres to create
>            too many files. Ultimately, this will cause the disk they are on to run out of
>            space, at which point Postgres will shut down.
>
>     2017-02-22 3:45 GMT+08:00 Keith <keith@keithf4.com>:
>
>         On Tue, Feb 21, 2017 at 2:33 PM, Ray Stell <stellr@vt.edu> wrote:
>
>             On 2/21/17 12:26 PM, Keith wrote:
>
>                 On Tue, Feb 21, 2017 at 12:22 PM, Ray Stell <stellr@vt.edu> wrote:
>
>                     On 2/21/17 12:09 AM, Steven Chang wrote:
>
>                         check this, his 2nd part introduces Stream Replication Implementation and tell you px_log retention and wal archive related parameters.
>                         https://www.itenlight.com/blog/2016/05/19/PostgreSQL+HA+with+pgpool-II+-+Part+2
>
>                         2017-02-18 2:30 GMT+08:00 Ray Stell <stellr@vt.edu>:
>
>                             I was "planning" to turn on WAL archiving on a postgresql 9.4.11 server that currently is running with "wal_level=hot_standby" and streaming
>                             to a standby.  I thought there would be a relationship between the rate of pg_xlog files and archive generation.   When I turned up the
>                             archive_command/mode I found the scale of the archive target was wrong as I had based it on the pg_xlog file creation rate.
>
>                             When I turned on the archive command for a few minutes, pg_xlog dir contained these files for the time period:
>
>                             -rw------- 1 postgres postgres 16777216 Feb 17 06:47 00000001000023AC0000007F
>                             -rw------- 1 postgres postgres 16777216 Feb 17 06:45 00000001000023AC0000007E
>
>                             The archive command wrote 126, 16MB files:
>
>                             -rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000056
>                             -rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000055
>                             -rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000054
>                             -rw------- 1 postgres postgres 16777216 Feb 17 06:47 000000010000237700000053
>                             ...
>
>                             -rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DC
>                             -rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DB
>                             -rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000DA
>                             -rw------- 1 postgres postgres 16777216 Feb 17 06:45 0000000100002376000000D9
>
>                             On servers that are not nearly as busy, I observe a one-to-one relationship between these files/rates.
>
>                             Is there a good WAL archive space planning guide?
>
>                             Is the way to collect planning data for this to turn on wal_debug?
>
>                             TIA!
>
>                             --
>                             Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
>                             To make changes to your subscription:
>                             http://www.postgresql.org/mailpref/pgsql-admin
>
>                     I'm interested in why I might observed a dramatic difference between pg_xlog files and the archive target files. Other dbs I see have them pretty
>                     much one-to-one, but not in this case here.  What might cause the large variation?  I got only a few pg_xlogs files each minute and 42 archive files
>                     /min.
>
>                 Ray,
>
>                 Did you see my previous response about the archive_timeout setting?
>
>             Lost your post somehow, but I see it in the list archive.  Thanks, Keith.
>
>             There was a 30 minute timeout set, but I would not think that would increase the archive file generation as observed, does it?
>
>             " When this parameter is greater than zero, the server will switch to a new segment file whenever this many seconds have elapsed since the last segment file
>             switch, and there has been any database activity, including a single checkpoint. (Increasing checkpoint_timeout will reduce unnecessary checkpoints on an
>             idle system.) "
>
>             Your post seems to indicate having a non-zero value might increase the number of archive files, is that so?  Maybe it is in milliseconds which might make
>             sense with the observed archive rate, but the doc says seconds and it was set to 1800.  I can set it to 0 and see if it changes things, but I'm a little
>             confused.
>
>         If it was set to 30 minutes, then no, it wouldn't explain what you saw. But we didn't know what it was set to, so that's why I asked. Make sure you're checking
>         the value actually set in the database, and not just what's in postgresql.conf, since it could've changed but never been reloaded to put in place (type "show
>         archive_timeout" in psql).
>
>         If that's not it, not sure what may be causing what you're seeing.
>
>         Keith
>
> Journyx, Inc.
> 7600 Burnet Road #300
> Austin, TX 78757
> www.journyx.com
>
> p 512.834.8888
> f 512-834-8858
>
> Do you receive our promotional emails? You can subscribe or unsubscribe to those emails at http://go.journyx.com/emailPreference/e/4932/714/
>

--
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consulting@comcast.net
p: 312.241.7800