Re: Confused about how to enable backups (e.g. Write Ahead Log).
От | Stephen Frost |
---|---|
Тема | Re: Confused about how to enable backups (e.g. Write Ahead Log). |
Дата | |
Msg-id | 20200221162056.GG3195@tamriel.snowman.net обсуждение исходный текст |
Ответ на | Confused about how to enable backups (e.g. Write Ahead Log). (mimble9@danwin1210.me) |
Список | pgsql-novice |
Greetings, * mimble9@danwin1210.me (mimble9@danwin1210.me) wrote: > I have been attempting, without success, to create a Write Ahead Log (WAL) > backup using PostgreSQL 10.0 under 18.04 Ubuntu VPS. There isn't really any such thing as a "WAL" backup. The WAL is where PG writes before it writes to the actual heap, but you don't typically keep WAL since the start of the database and so you need to have a copy of the heap files to start from and then you use WAL to replay up to a specific point in time. > It might be that I am doing the wrong thing and my method will never work. The method you've outlined below won't work, no. > In which case: what is the best way to achieve my 'Goal' (below)? There's a few options. > Goal: > > I want to backup a database every 15 minutes. Ideally, I want to backup > only if the database has changed (perhaps this isn't possible and backups > happen whether or not the database has changed)? This is what's known as an 'incremental backup' and there's a few tools out there which support that- but I suggest you first consider if your goal is actually to backup every 15 minutes, or if your goal is to be able to minimize data loss to, at most, 15 minutes, or if your goal is to recover to any specific point in time. The reason this matters is that PG supports Point-in-time-Recovery (PITR), by using file-level backups combined with the write ahead log. This means that you can restore a backup of PG and then play forward to *any* point in time- 9:00, 9:15, 9:17, 10:20, whatever. Taking more frequent incremental backups will mean that there is less WAL to have to go through to get to a specific point in time, but unless you have a very write-heavy system, it won't take long to play through an hour of WAL, so you might consider taking backups once an hour, or even less frequently. Reasonably common setups include daily incremental backups and then regular differential and/or full backups. > Attempts: > > I edited /etc/postgresql/10/main/postgresql.conf as follows: > > wal_level = replica > archive_mode = on > archive_command = 'cp %p /test/%f' > archive_timeout = 900 I strongly recommend against trying to write your own backup software for PG. In particular, 'cp' isn't really safe to use as an archive_command as it doesn't 'fsync' the WAL file after copying it, meaning you might be missing WAL if the system crashes. When using archive_command, hopefully with a proper tool, it can make sense to have an archive_timeout set, which will cause PG to kick out a WAL segment after that long, if it hasn't already, and there's been changes since the last WAL segment. This avoids the risk that a WAL segment sits with data in it for a long and then the system crashes in an unrecoverable way and those changes are then lost. Note that this primarily makes sense when you're actually archiving the WAL to a different system. PG, as part of each commit, will fsync the data to the WAL and so as long as the filesystem recovers, no data will be lost. > This should - in my opinion - every 15 minutes backup (cp) the database > (referenced by %p) to the /test/ directory using the filename (referenced > by %f) of the database. No, that's not how archive_command works at all. The archive_command simply copies the *WAL* file (each of which are 16MB by default) to another location (ideally, a remote one, where the file is fsync'd, prior to returning to PG). The database itself is *not* copied. > I then: > > /etc/init.d/postgresql stop > /etc/init.d/postgresql start > > And checked that PostgreSQL is running with /etc/init.d/postgresql status: > > postgresql.service - PostgreSQL RDBMS > Loaded: loaded (/lib/systemd/system/postgresql.service; enabled; vendor > preset: enabled) > Active: active (exited) since Fri 2020-02-21 01:23:59 UTC; 9h ago > Process: 60534 ExecStart=/bin/true (code=exited, status=0/SUCCESS) > Main PID: 60534 (code=exited, status=0/SUCCESS) > > Feb 21 01:23:58 server systemd[1]: postgresql.service: Failed to reset > devices.list: Operation not permitted > Feb 21 01:23:58 server systemd[1]: Starting PostgreSQL RDBMS... > Feb 21 01:23:59 server systemd[1]: Started PostgreSQL RDBMS. > > However, the /test/ directory did not show any backup files You would just end up with WAL files in that directory, per the discussion above regarding the purpose of archive_command. > I then read that I needed to use pg_basebackup to make an initial backup > (before editing postgresql.conf) so I tried pg_basebackup -D /etc/backup. > This created a duplicate of /var/lib/postgresql/10/main/ in /etc/backup > but I'm not sure how this relates to or effects (if it does) the Write > Ahead Log. pg_basebackup is one tool to create a backup, but it doesn't (currently, at least) support incremental backups, so each backup is a full backup of the system. The WAL can be used to take that backup up to a specific point in time. > I still don't have any backups in /test/ (or anywhere). Based on what you wrote above, the contents of '/etc/backup' should actually be a backup of PG, with the necessary WAL to bring it up consistently. Note that it's absolutely essential that all of the WAL generated during a backup be kept, otherwise the backup is not valid. I strongly recommend using a tool which ensures that all WAL from a backup is verified to have been stored properly. Thanks, Stephen
Вложения
В списке pgsql-novice по дате отправления:
Предыдущее
От: Stefan FercotДата:
Сообщение: Re: Confused about how to enable backups (e.g. Write Ahead Log).
Следующее
От: mimble9@danwin1210.meДата:
Сообщение: Re: Confused about how to enable backups (e.g. Write Ahead Log).