Discussion: [RFC] Incremental backup v2: add backup profile to base backup
Hi Hackers, I've updated the wiki page https://wiki.postgresql.org/wiki/Incremental_backup following the result of the discussion on hackers. Compared to the first version, we switched from a timestamp+checksum based approach to one based on LSN. This patch adds an option to pg_basebackup and to the replication protocol BASE_BACKUP command to generate a backup_profile file. It is almost useless by itself, but it is the foundation on which we will build the file-based incremental backup (and hopefully a block-based incremental backup after it). Any comment will be appreciated. In particular I'd appreciate comments on the correctness of the relnode file detection and LSN extraction code. Regards, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
Attachments
On 10/03/2014 06:31 PM, Marco Nenciarini wrote: > Hi Hackers, > > I've updated the wiki page > https://wiki.postgresql.org/wiki/Incremental_backup following the result > of discussion on hackers. > > Compared to first version, we switched from a timestamp+checksum based > approach to one based on LSN. > > This patch adds an option to pg_basebackup and to replication protocol > BASE_BACKUP command to generate a backup_profile file. It is almost > useless by itself, but it is the foundation on which we will build the > file based incremental backup (and hopefully a block based incremental > backup after it). I'd suggest jumping straight to block-based incremental backup. It's not significantly more complicated to implement, and if you implement both separately, then we'll have to support both forever. If you really need to, you can implement file-level diff as a special case, where the server sends all blocks in the file, if any of them have an LSN > the cutoff point. But I'm not sure if there's a point in that, once you have block-level support. If we're going to need a profile file - and I'm not convinced of that - is there any reason to not always include it in the backup? > Any comment will be appreciated. In particular I'd appreciate comments > on correctness of relnode files detection and LSN extraction code. I didn't look at it in detail, but one future problem comes to mind: Once you implement the server-side code that only sends a file if its LSN is higher than the cutoff point that the client gave, you'll have to scan the whole file first, to see if there are any blocks with a higher LSN. At least until you find the first such block. So with a file-level implementation of this sort, you'll have to scan all files twice, in the worst case. - Heikki
On 03/10/14 17:53, Heikki Linnakangas wrote: > If we're going to need a profile file - and I'm not convinced of that - > is there any reason to not always include it in the backup? > The main reason is to have a centralized list of files that need to be present. Without a profile, you have to insert some sort of placeholder for skipped files. Moreover, the profile allows you to quickly know the size of the recovered backup (by simply summing the individual sizes). Another use could be to 'validate' the presence of all required files in a backup. >> Any comment will be appreciated. In particular I'd appreciate comments >> on correctness of relnode files detection and LSN extraction code. > > I didn't look at it in detail, but one future problem comes to mind: > Once you implement the server-side code that only sends a file if its > LSN is higher than the cutoff point that the client gave, you'll have to > scan the whole file first, to see if there are any blocks with a higher > LSN. At least until you find the first such block. So with a file-level > implementation of this sort, you'll have to scan all files twice, in the > worst case. > It's true. To solve this you have to keep a central maxLSN directory, but I think it introduces more issues than it solves. Regards, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
On Fri, Oct 3, 2014 at 1:08 PM, Marco Nenciarini <marco.nenciarini@2ndquadrant.it> wrote: >>> Any comment will be appreciated. In particular I'd appreciate comments >>> on correctness of relnode files detection and LSN extraction code. >> >> I didn't look at it in detail, but one future problem comes to mind: >> Once you implement the server-side code that only sends a file if its >> LSN is higher than the cutoff point that the client gave, you'll have to >> scan the whole file first, to see if there are any blocks with a higher >> LSN. At least until you find the first such block. So with a file-level >> implementation of this sort, you'll have to scan all files twice, in the >> worst case. >> > > It's true. To solve this you have to keep a central maxLSN directory, > but I think it introduces more issues than it solves. I see that as a worthy optimization on the server side, regardless of whether file or block-level backups are used, since it allows efficient skipping of untouched segments (common for append-only tables). Still, it would be something to do after the basic mechanism already works (i.e., it's an optimization).
On Fri, Oct 3, 2014 at 06:08:47PM +0200, Marco Nenciarini wrote: > >> Any comment will be appreciated. In particular I'd appreciate comments > >> on correctness of relnode files detection and LSN extraction code. > > > > I didn't look at it in detail, but one future problem comes to mind: > > Once you implement the server-side code that only sends a file if its > > LSN is higher than the cutoff point that the client gave, you'll have to > > scan the whole file first, to see if there are any blocks with a higher > > LSN. At least until you find the first such block. So with a file-level > > implementation of this sort, you'll have to scan all files twice, in the > > worst case. > > > > It's true. To solve this you have to keep a central maxLSN directory, > but I think it introduces more issues than it solves. The central issue Heikki is pointing out is whether we should implement a file-based system if we already know that a block-based system will be superior in every way. I agree with that and agree that implementing just file-based isn't worth it as we would have to support it forever. So, in summary, if you target just a file-based system, be prepared that it might be rejected. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Fri, Oct 3, 2014 at 12:08 PM, Marco Nenciarini <marco.nenciarini@2ndquadrant.it> wrote: > On 03/10/14 17:53, Heikki Linnakangas wrote: >> If we're going to need a profile file - and I'm not convinced of that - >> is there any reason to not always include it in the backup? > > The main reason is to have a centralized list of files that need to be > present. Without a profile, you have to insert some sort of placeholder > for skipped files. Why do you need to do that? And where do you need to do that? It seems to me that there are three interesting operations: 1. Take a full backup. Basically, we already have this. In the backup label file, make sure to note the newest LSN guaranteed to be present in the backup. 2. Take a differential backup. In the backup label file, note the LSN of the full backup to which the differential backup is relative, and the newest LSN guaranteed to be present in the differential backup. The actual backup can consist of a series of 20-byte buffer tags, those being the exact set of blocks newer than the base-backup's latest-guaranteed-to-be-present LSN. Each buffer tag is followed by an 8kB block of data. If a relfilenode is truncated or removed, you need some way to indicate that in the backup; e.g. include a buffertag with forknum = -(forknum + 1) and blocknum = the new number of blocks, or InvalidBlockNumber if removed entirely. 3. Apply a differential backup to a full backup to create an updated full backup. This is just a matter of scanning the full backup and the differential backup and applying the changes in the differential backup to the full backup. You might want combinations of these, like something that does 2+3 as a single operation, for efficiency, or a way to copy a full backup and apply a differential backup to it as you go. But that's it, right? What else do you need? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
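Robert's description of the differential record can be put into code roughly as follows. Everything here is illustrative only and does not come from any patch: the struct and function names are hypothetical, and the 20-byte tag simply mirrors the backend's buffer tag fields (tablespace, database, relfilenode, fork, block) together with the negative-fork encoding he proposes for truncation and removal.

```c
#include <assert.h>
#include <stdint.h>

#define InvalidBlockNumber 0xFFFFFFFFu

/*
 * Hypothetical on-disk header for one entry of a differential backup:
 * a 20-byte buffer tag, followed (for ordinary entries) by an 8kB
 * block image.
 */
typedef struct
{
	uint32_t	spcNode;	/* tablespace OID */
	uint32_t	dbNode;		/* database OID */
	uint32_t	relNode;	/* relfilenode */
	int32_t		forkNum;	/* fork number; negative encodes truncation */
	uint32_t	blockNum;	/* block number, or new length for truncation */
} DiffBufferTag;

/*
 * Encode "relfilenode truncated/removed" as forkNum = -(forkNum + 1),
 * with blockNum = the new number of blocks, or InvalidBlockNumber if
 * the relfilenode was removed entirely.
 */
static DiffBufferTag
make_truncation_tag(DiffBufferTag tag, uint32_t new_nblocks)
{
	tag.forkNum = -(tag.forkNum + 1);
	tag.blockNum = new_nblocks;
	return tag;
}

static int
tag_is_truncation(const DiffBufferTag *tag)
{
	return tag->forkNum < 0;
}

/* Recover the original fork number from a truncation tag. */
static int32_t
tag_original_fork(const DiffBufferTag *tag)
{
	return -(tag->forkNum) - 1;
}
```

Because forknum 0 maps to -1, an ordinary (non-negative) fork number can always be told apart from a truncation marker.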
On 2014-10-03 17:31:45 +0200, Marco Nenciarini wrote: > I've updated the wiki page > https://wiki.postgresql.org/wiki/Incremental_backup following the result > of discussion on hackers. > > Compared to first version, we switched from a timestamp+checksum based > approach to one based on LSN. > > This patch adds an option to pg_basebackup and to replication protocol > BASE_BACKUP command to generate a backup_profile file. It is almost > useless by itself, but it is the foundation on which we will build the > file based incremental backup (and hopefully a block based incremental > backup after it). > > Any comment will be appreciated. In particular I'd appreciate comments > on correctness of relnode files detection and LSN extraction code. Can you describe the algorithm you implemented in words? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Sat, Oct 4, 2014 at 12:31 AM, Marco Nenciarini <marco.nenciarini@2ndquadrant.it> wrote: > Compared to first version, we switched from a timestamp+checksum based > approach to one based on LSN. Cool. > This patch adds an option to pg_basebackup and to replication protocol > BASE_BACKUP command to generate a backup_profile file. It is almost > useless by itself, but it is the foundation on which we will build the > file based incremental backup (and hopefully a block based incremental > backup after it). Hm. I am not convinced by the backup profile file. What's wrong with having a client send only an LSN position to get a set of files (or partial files filled with blocks) newer than the position given, and have the client do all the rebuild analysis? > Any comment will be appreciated. In particular I'd appreciate comments > on correctness of relnode files detection and LSN extraction code. Please include some documentation with the patch once you consider that this is worth adding to a commit fest. This is clearly WIP yet so it does not matter much, but that's something not to forget. Regards, -- Michael
On 03/10/14 23:12, Andres Freund wrote: > On 2014-10-03 17:31:45 +0200, Marco Nenciarini wrote: >> I've updated the wiki page >> https://wiki.postgresql.org/wiki/Incremental_backup following the result >> of discussion on hackers. >> >> Compared to first version, we switched from a timestamp+checksum based >> approach to one based on LSN. >> >> This patch adds an option to pg_basebackup and to replication protocol >> BASE_BACKUP command to generate a backup_profile file. It is almost >> useless by itself, but it is the foundation on which we will build the >> file based incremental backup (and hopefully a block based incremental >> backup after it). >> >> Any comment will be appreciated. In particular I'd appreciate comments >> on correctness of relnode files detection and LSN extraction code. > > Can you describe the algorithm you implemented in words? > Here is the relnode file detection algorithm: I've added a has_relfiles parameter to the sendDir function. If has_relfiles is true, every file in the directory is tested against the validateRelfilenodeName function. If the response is true, the maxLSN value is computed for the file. The sendDir function is called with has_relfiles=true by the sendTablespace function, and by sendDir itself when recursing into a subdirectory, in two cases: if has_relfiles is already true, or if we are recursing into a "./global" or "./base" directory. The validateRelfilenodeName function has been taken from the pg_computemaxlsn patch.
It's short enough to be pasted here:

static bool
validateRelfilenodename(char *name)
{
	int			pos = 0;

	while ((name[pos] >= '0') && (name[pos] <= '9'))
		pos++;

	if (name[pos] == '_')
	{
		pos++;
		while ((name[pos] >= 'a') && (name[pos] <= 'z'))
			pos++;
	}

	if (name[pos] == '.')
	{
		pos++;
		while ((name[pos] >= '0') && (name[pos] <= '9'))
			pos++;
	}

	if (name[pos] == 0)
		return true;
	return false;
}

To compute the maxLSN for a file, as the file is sent in TAR_SEND_SIZE chunks (32kB) and each chunk is always a multiple of the block size, I've added the following code inside the send cycle:

+	char	   *page;
+
+	/* Scan every page to find the max file LSN */
+	for (page = buf; page < buf + (off_t) cnt; page += (off_t) BLCKSZ)
+	{
+		pagelsn = PageGetLSN(page);
+		if (filemaxlsn < pagelsn)
+			filemaxlsn = pagelsn;
+	}
+

Regards, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
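For illustration, the naming test above can be exercised standalone (renamed here to avoid any clash with backend code). It accepts exactly names of the form <digits>, optionally followed by _<lowercase suffix> (e.g. _fsm, _vm) and/or .<segment number>:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Standalone copy of the validateRelfilenodename logic: accepts names
 * of the form <digits>[_<lowercase letters>][.<digits>], such as
 * "16385", "16385_fsm", or "16385.1".
 */
static bool
relfilenode_name_ok(const char *name)
{
	int			pos = 0;

	while (name[pos] >= '0' && name[pos] <= '9')
		pos++;
	if (name[pos] == '_')
	{
		pos++;
		while (name[pos] >= 'a' && name[pos] <= 'z')
			pos++;
	}
	if (name[pos] == '.')
	{
		pos++;
		while (name[pos] >= '0' && name[pos] <= '9')
			pos++;
	}
	return name[pos] == '\0';
}
```

Note that names like "pg_internal.init" are rejected (no leading digits before the letters), which is what keeps non-relation files out of the LSN scan.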
On 04/10/14 08:35, Michael Paquier wrote: > On Sat, Oct 4, 2014 at 12:31 AM, Marco Nenciarini > <marco.nenciarini@2ndquadrant.it> wrote: >> Compared to first version, we switched from a timestamp+checksum based >> approach to one based on LSN. > Cool. > >> This patch adds an option to pg_basebackup and to replication protocol >> BASE_BACKUP command to generate a backup_profile file. It is almost >> useless by itself, but it is the foundation on which we will build the >> file based incremental backup (and hopefully a block based incremental >> backup after it). > Hm. I am not convinced by the backup profile file. What's wrong with > having a client send only an LSN position to get a set of files (or > partial files filled with blocks) newer than the position given, and > have the client do all the rebuild analysis? > The main problem I see is the following: how can a client detect a truncated or removed file? Regards, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
On Mon, Oct 6, 2014 at 8:59 AM, Marco Nenciarini <marco.nenciarini@2ndquadrant.it> wrote: > On 04/10/14 08:35, Michael Paquier wrote: >> On Sat, Oct 4, 2014 at 12:31 AM, Marco Nenciarini >> <marco.nenciarini@2ndquadrant.it> wrote: >>> Compared to first version, we switched from a timestamp+checksum based >>> approach to one based on LSN. >> Cool. >> >>> This patch adds an option to pg_basebackup and to replication protocol >>> BASE_BACKUP command to generate a backup_profile file. It is almost >>> useless by itself, but it is the foundation on which we will build the >>> file based incremental backup (and hopefully a block based incremental >>> backup after it). >> Hm. I am not convinced by the backup profile file. What's wrong with >> having a client send only an LSN position to get a set of files (or >> partial files filled with blocks) newer than the position given, and >> have the client do all the rebuild analysis? >> > > The main problem I see is the following: how can a client detect a > truncated or removed file? When you take a differential backup, the server needs to send some piece of information about every file so that the client can compare that list against what it already has. But a full backup does not need to include similar information. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 03/10/14 22:47, Robert Haas wrote: > On Fri, Oct 3, 2014 at 12:08 PM, Marco Nenciarini > <marco.nenciarini@2ndquadrant.it> wrote: >> On 03/10/14 17:53, Heikki Linnakangas wrote: >>> If we're going to need a profile file - and I'm not convinced of that - >>> is there any reason to not always include it in the backup? >> >> The main reason is to have a centralized list of files that need to be >> present. Without a profile, you have to insert some sort of placeholder >> for skipped files. > > Why do you need to do that? And where do you need to do that? > > It seems to me that there are three interesting operations: > > 1. Take a full backup. Basically, we already have this. In the > backup label file, make sure to note the newest LSN guaranteed to be > present in the backup. Don't we already have it in "START WAL LOCATION"? > > 2. Take a differential backup. In the backup label file, note the LSN > of the full backup to which the differential backup is relative, and the > newest LSN guaranteed to be present in the differential backup. The > actual backup can consist of a series of 20-byte buffer tags, those > being the exact set of blocks newer than the base-backup's > latest-guaranteed-to-be-present LSN. Each buffer tag is followed by > an 8kB block of data. If a relfilenode is truncated or removed, you > need some way to indicate that in the backup; e.g. include a buffertag > with forknum = -(forknum + 1) and blocknum = the new number of blocks, > or InvalidBlockNumber if removed entirely. To have a working backup you need to ship each block which is newer than latest-guaranteed-to-be-present in the full backup and not newer than latest-guaranteed-to-be-present in the current backup. Also, as a further optimization, you can think about not sending the empty space in the middle of each page. My main concern here is about how postgres can remember that a relfilenode has been deleted, in order to send the appropriate "deletion tag".
IMHO the easiest way is to send the full list of files along with the backup and leave to the client the task of deleting unneeded files. The backup profile has this purpose. Moreover, I do not like the idea of using only a stream of blocks as the actual differential backup, for the following reasons: * AFAIK, with the current infrastructure, you cannot do a backup with a block stream only. To have a valid backup you need many files for which the concept of LSN doesn't apply. * I don't like to have all the data from the various tablespace/db/whatever all mixed in the same stream. I'd prefer to have the blocks saved on a per-file basis. > > 3. Apply a differential backup to a full backup to create an updated > full backup. This is just a matter of scanning the full backup and > the differential backup and applying the changes in the differential > backup to the full backup. > > You might want combinations of these, like something that does 2+3 as > a single operation, for efficiency, or a way to copy a full backup and > apply a differential backup to it as you go. But that's it, right? > What else do you need? > Nothing else. Once we agree on the definition of the involved file and protocol formats, only the actual coding remains. Regards, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
On Mon, Oct 6, 2014 at 11:33 AM, Marco Nenciarini <marco.nenciarini@2ndquadrant.it> wrote: >> 1. Take a full backup. Basically, we already have this. In the >> backup label file, make sure to note the newest LSN guaranteed to be >> present in the backup. > > Don't we already have it in "START WAL LOCATION"? Yeah, probably. I was too lazy to go look for it, but that sounds like the right thing. >> 2. Take a differential backup. In the backup label file, note the LSN >> of the full backup to which the differential backup is relative, and the >> newest LSN guaranteed to be present in the differential backup. The >> actual backup can consist of a series of 20-byte buffer tags, those >> being the exact set of blocks newer than the base-backup's >> latest-guaranteed-to-be-present LSN. Each buffer tag is followed by >> an 8kB block of data. If a relfilenode is truncated or removed, you >> need some way to indicate that in the backup; e.g. include a buffertag >> with forknum = -(forknum + 1) and blocknum = the new number of blocks, >> or InvalidBlockNumber if removed entirely. > > To have a working backup you need to ship each block which is newer than > latest-guaranteed-to-be-present in full backup and not newer than > latest-guaranteed-to-be-present in the current backup. Also, as a > further optimization, you can think about not sending the empty space in > the middle of each page. Right. Or compressing the data. > My main concern here is about how postgres can remember that a > relfilenode has been deleted, in order to send the appropriate "deletion > tag". You also need to handle truncation. > IMHO the easiest way is to send the full list of files along the backup > and let to the client the task to delete unneeded files. The backup > profile has this purpose.
> > Moreover, I do not like the idea of using only a stream of block as the > actual differential backup, for the following reasons: > > * AFAIK, with the current infrastructure, you cannot do a backup with a > block stream only. To have a valid backup you need many files for which > the concept of LSN doesn't apply. > > * I don't like to have all the data from the various > tablespace/db/whatever all mixed in the same stream. I'd prefer to have > the blocks saved on a per file basis. OK, that makes sense. But you still only need the file list when sending a differential backup, not when sending a full backup. So maybe a differential backup looks like this: - Ship a table-of-contents file with a list of relation files currently present and the length of each in blocks. - For each block that's been modified since the original backup, ship a file called delta_<original file name> which is of the form <block number><changed block contents> [...]. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
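A minimal sketch of the delta_<original file name> layout Robert outlines. The exact encoding is not specified in the thread, so this assumes a native-endian 4-byte block number before each 8kB image; the function names are illustrative only:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BLCKSZ 8192

/*
 * Append one "<block number><changed block contents>" record to an
 * already-open delta_<original file name> stream.  The block number is
 * written in native byte order for simplicity.
 */
static int
delta_write_block(FILE *delta, uint32_t blkno, const char block[BLCKSZ])
{
	if (fwrite(&blkno, sizeof(blkno), 1, delta) != 1)
		return -1;
	if (fwrite(block, BLCKSZ, 1, delta) != 1)
		return -1;
	return 0;
}

/*
 * Read the next record; returns 1 on success, 0 at end of stream,
 * -1 on error.  Applying a delta is then just a loop that seeks to
 * blkno * BLCKSZ in the base file and overwrites the block.
 */
static int
delta_read_block(FILE *delta, uint32_t *blkno, char block[BLCKSZ])
{
	if (fread(blkno, sizeof(*blkno), 1, delta) != 1)
		return feof(delta) ? 0 : -1;
	if (fread(block, BLCKSZ, 1, delta) != 1)
		return -1;
	return 1;
}
```

One design consequence worth noting: because records carry explicit block numbers, the delta file can stay sparse and ordered however the server finds the blocks, while the table of contents carries the authoritative file lengths for truncation.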
On 06/10/14 16:51, Robert Haas wrote: > On Mon, Oct 6, 2014 at 8:59 AM, Marco Nenciarini > <marco.nenciarini@2ndquadrant.it> wrote: >> On 04/10/14 08:35, Michael Paquier wrote: >>> On Sat, Oct 4, 2014 at 12:31 AM, Marco Nenciarini >>> <marco.nenciarini@2ndquadrant.it> wrote: >>>> Compared to first version, we switched from a timestamp+checksum based >>>> approach to one based on LSN. >>> Cool. >>> >>>> This patch adds an option to pg_basebackup and to replication protocol >>>> BASE_BACKUP command to generate a backup_profile file. It is almost >>>> useless by itself, but it is the foundation on which we will build the >>>> file based incremental backup (and hopefully a block based incremental >>>> backup after it). >>> Hm. I am not convinced by the backup profile file. What's wrong with >>> having a client send only an LSN position to get a set of files (or >>> partial files filled with blocks) newer than the position given, and >>> have the client do all the rebuild analysis? >>> >> >> The main problem I see is the following: how can a client detect a >> truncated or removed file? > > When you take a differential backup, the server needs to send some > piece of information about every file so that the client can compare > that list against what it already has. But a full backup does not > need to include similar information. > I agree that a full backup does not need to include a profile. I've added the option to require the profile even for a full backup, as it can be useful for backup software. We could remove the option and build the profile only during incremental backups, if required. However, I would avoid needing to scan the whole backup to know the size of the recovered data directory, hence the backup profile. Regards, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
On Mon, Oct 6, 2014 at 11:51 AM, Marco Nenciarini <marco.nenciarini@2ndquadrant.it> wrote: > I agree that a full backup does not need to include a profile. > > I've added the option to require the profile even for a full backup, as > it can be useful for backup softwares. We could remove the option and > build the profile only during incremental backups, if required. However, > I would avoid the needing to scan the whole backup to know the size of > the recovered data directory, hence the backup profile. That doesn't seem to be buying you much. Calling stat() on every file in a directory tree is a pretty cheap operation. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
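The scan Robert refers to, a stat() over every file in the tree summing sizes, is indeed a few lines of code. A generic POSIX sketch (not from any patch; names are illustrative):

```c
#define _XOPEN_SOURCE 700
#include <ftw.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

/* Accumulator for the walk; nftw() callbacks take no user argument. */
static long long total_bytes;

static int
add_size(const char *path, const struct stat *sb, int typeflag,
		 struct FTW *ftwbuf)
{
	(void) path;
	(void) ftwbuf;
	if (typeflag == FTW_F)		/* count plain files only */
		total_bytes += sb->st_size;
	return 0;					/* keep walking */
}

/*
 * Return the total size in bytes of the files under dir, or -1 on
 * error.  This is the "recovered data directory size" computation done
 * client-side with nothing but stat() data.
 */
static long long
dir_total_size(const char *dir)
{
	total_bytes = 0;
	if (nftw(dir, add_size, 64, FTW_PHYS) != 0)
		return -1;
	return total_bytes;
}
```

This only works on an unpacked directory, which is exactly Marco's counterpoint below: for a compressed tar archive, or an incremental backup whose delta files do not record the final size up front, there is no cheap stat()-based answer.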
Hello,
2014-10-06 17:51 GMT+02:00 Marco Nenciarini <marco.nenciarini@2ndquadrant.it>:
I agree that a full backup does not need to include a profile.
I've added the option to require the profile even for a full backup, as
it can be useful for backup software. We could remove the option and
build the profile only during incremental backups, if required. However,
I would avoid needing to scan the whole backup to know the size of
the recovered data directory, hence the backup profile.
I really like this approach.
I think we should give users the ability to ship a profile file even in the case of a full backup (disabled by default).
Thanks,
Gabriele
On 06/10/14 17:55, Robert Haas wrote: > On Mon, Oct 6, 2014 at 11:51 AM, Marco Nenciarini > <marco.nenciarini@2ndquadrant.it> wrote: >> I agree that a full backup does not need to include a profile. >> >> I've added the option to require the profile even for a full backup, as >> it can be useful for backup softwares. We could remove the option and >> build the profile only during incremental backups, if required. However, >> I would avoid the needing to scan the whole backup to know the size of >> the recovered data directory, hence the backup profile. > > That doesn't seem to be buying you much. Calling stat() on every file > in a directory tree is a pretty cheap operation. > In the case of an incremental backup that is not true. You have to read the delta file to know the final size. You can optimize it by putting this information in the first few bytes, but in the case of the compressed tar format you will need to scan the whole archive. Regards, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
On 10/06/2014 06:33 PM, Marco Nenciarini wrote: > On 03/10/14 22:47, Robert Haas wrote: >> 2. Take a differential backup. In the backup label file, note the LSN >> of the full backup to which the differential backup is relative, and the >> newest LSN guaranteed to be present in the differential backup. The >> actual backup can consist of a series of 20-byte buffer tags, those >> being the exact set of blocks newer than the base-backup's >> latest-guaranteed-to-be-present LSN. Each buffer tag is followed by >> an 8kB block of data. If a relfilenode is truncated or removed, you >> need some way to indicate that in the backup; e.g. include a buffertag >> with forknum = -(forknum + 1) and blocknum = the new number of blocks, >> or InvalidBlockNumber if removed entirely. > > To have a working backup you need to ship each block which is newer than > latest-guaranteed-to-be-present in full backup and not newer than > latest-guaranteed-to-be-present in the current backup. Also, as a > further optimization, you can think about not sending the empty space in > the middle of each page. > > My main concern here is about how postgres can remember that a > relfilenode has been deleted, in order to send the appropriate "deletion > tag". > > IMHO the easiest way is to send the full list of files along the backup > and let to the client the task to delete unneeded files. The backup > profile has this purpose. Right, but the server doesn't need to send a separate backup profile file for that. Rather, anything that the server *didn't* send, should be deleted. I think the missing piece in this puzzle is that even for unmodified blocks, the server should send a note saying the blocks were present, but not modified. So for each file present in the server, the server sends a block stream. For each block, it sends either the full block contents, if it was modified, or a simple indicator that it was not modified. There's a downside to this, though.
The client has to read the whole stream, before it knows which files were present. So when applying a block stream directly over an old backup, the client cannot delete files until it has applied all the other changes. That needs more disk space. With a separate profile file that's sent *before* the rest of the backup, you could delete the obsolete files first. But that's not a very big deal. I would suggest that you leave out the profile file in the first version, and add it as an optimization later, if needed. > Moreover, I do not like the idea of using only a stream of blocks as the > actual differential backup, for the following reasons: > > * AFAIK, with the current infrastructure, you cannot do a backup with a > block stream only. To have a valid backup you need many files for which > the concept of LSN doesn't apply. Those should be sent in whole. At least in the first version. The non-relation files are small compared to relation files, so it's not too bad to just include them in full. >> 3. Apply a differential backup to a full backup to create an updated >> full backup. This is just a matter of scanning the full backup and >> the differential backup and applying the changes in the differential >> backup to the full backup. >> >> You might want combinations of these, like something that does 2+3 as >> a single operation, for efficiency, or a way to copy a full backup and >> apply a differential backup to it as you go. But that's it, right? >> What else do you need? > > Nothing else. Once we agree on definition of involved files and > protocols formats, only the actual coding remains. BTW, regarding the protocol, I have an idea. Rather than invent a whole new file format to represent the modified blocks, can we reuse some existing binary diff file format? For example, the VCDIFF format (RFC 3284).
For each unmodified block, the server would send a vcdiff COPY instruction, to "copy" the block from the old backup, and for a modified block, the server would send an ADD instruction, with the new block contents. The VCDIFF file format is quite flexible, but we would only use a small subset of it. I believe that subset would be just as easy to generate in the backend as a custom file format, but you could then use an external tool (xdelta3, open-vcdiff) to apply the diff manually, in case of emergency. In essence, the server would send a tar stream as usual, but for each relation file, it would send a VCDIFF file with name "<relfilenode>.vcdiff" instead. - Heikki
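The per-block COPY/ADD decision Heikki describes reduces conceptually to a comparison against the cutoff LSN. This toy sketch emits abstract instructions only; the real VCDIFF format (RFC 3284) has its own windows, instruction code tables and address encoding, and all names here are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

typedef enum
{
	INSTR_COPY,					/* block unchanged: copy from the old backup */
	INSTR_ADD					/* block modified: new 8kB contents follow */
} InstrKind;

typedef struct
{
	InstrKind	kind;
	uint32_t	blkno;			/* which 8kB block of the relation file */
} BlockInstr;

/*
 * Decide, for one block of a relation file, whether the diff should
 * carry the new contents (ADD) or refer back to the base backup (COPY),
 * based on the block's page LSN versus the base backup's cutoff LSN.
 */
static BlockInstr
choose_instr(uint32_t blkno, uint64_t page_lsn, uint64_t cutoff_lsn)
{
	BlockInstr	in;

	in.blkno = blkno;
	in.kind = (page_lsn > cutoff_lsn) ? INSTR_ADD : INSTR_COPY;
	return in;
}
```

The appeal of mapping this onto VCDIFF, as Heikki notes, is that the generated <relfilenode>.vcdiff files could then be applied with standard tools (xdelta3, open-vcdiff) in an emergency, rather than requiring a PostgreSQL-specific restore utility.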
On 10/06/2014 07:06 PM, Marco Nenciarini wrote: > On 06/10/14 17:55, Robert Haas wrote: >> On Mon, Oct 6, 2014 at 11:51 AM, Marco Nenciarini >> <marco.nenciarini@2ndquadrant.it> wrote: >>> I agree that a full backup does not need to include a profile. >>> >>> I've added the option to require the profile even for a full backup, as >>> it can be useful for backup softwares. We could remove the option and >>> build the profile only during incremental backups, if required. However, >>> I would avoid the needing to scan the whole backup to know the size of >>> the recovered data directory, hence the backup profile. >> >> That doesn't seem to be buying you much. Calling stat() on every file >> in a directory tree is a pretty cheap operation. >> > > In case of incremental backup it is not true. You have to read the delta > file to know the final size. You can optimize it putting this > information in the first few bytes, but in case of compressed tar format > you will need to scan the whole archive. I think you're pretty much screwed with the compressed tar format anyway. The files in the .tar can be in different order in the 'diff' and the base backup, so you need to do random access anyway when you try to apply the diff. And random access isn't very easy with the uncompressed tar format either. I think it would be acceptable to only support incremental backups with the directory format. In hindsight, our compressed tar format was not a very good choice, because it makes random access impossible. - Heikki
On 06/10/14 17:50, Robert Haas wrote: > On Mon, Oct 6, 2014 at 11:33 AM, Marco Nenciarini > <marco.nenciarini@2ndquadrant.it> wrote: >>> 2. Take a differential backup. In the backup label file, note the LSN >>> of the full backup to which the differential backup is relative, and the >>> newest LSN guaranteed to be present in the differential backup. The >>> actual backup can consist of a series of 20-byte buffer tags, those >>> being the exact set of blocks newer than the base-backup's >>> latest-guaranteed-to-be-present LSN. Each buffer tag is followed by >>> an 8kB block of data. If a relfilenode is truncated or removed, you >>> need some way to indicate that in the backup; e.g. include a buffertag >>> with forknum = -(forknum + 1) and blocknum = the new number of blocks, >>> or InvalidBlockNumber if removed entirely. >> >> To have a working backup you need to ship each block which is newer than >> latest-guaranteed-to-be-present in full backup and not newer than >> latest-guaranteed-to-be-present in the current backup. Also, as a >> further optimization, you can think about not sending the empty space in >> the middle of each page. > > Right. Or compressing the data. If we want to introduce compression on the server side, I think that compressing the whole tar stream would be more effective. > >> My main concern here is about how postgres can remember that a >> relfilenode has been deleted, in order to send the appropriate "deletion >> tag". > > You also need to handle truncation. Yes, of course. The current backup profile contains the file size, and it can be used to truncate the file to the right size. >> IMHO the easiest way is to send the full list of files along the backup >> and let to the client the task to delete unneeded files. The backup >> profile has this purpose.
>> >> Moreover, I do not like the idea of using only a stream of block as the >> actual differential backup, for the following reasons: >> >> * AFAIK, with the current infrastructure, you cannot do a backup with a >> block stream only. To have a valid backup you need many files for which >> the concept of LSN doesn't apply. >> >> * I don't like to have all the data from the various >> tablespace/db/whatever all mixed in the same stream. I'd prefer to have >> the blocks saved on a per file basis. > > OK, that makes sense. But you still only need the file list when > sending a differential backup, not when sending a full backup. So > maybe a differential backup looks like this: > > - Ship a table-of-contents file with a list relation files currently > present and the length of each in blocks. Having the size in bytes allow you to use the same format for non-block files. Am I missing any advantage of having the size in blocks over having the size in bytes? Regards, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
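Robert's 20-byte buffer tag lines up with PostgreSQL's BufferTag layout: the RelFileNode's three OIDs (tablespace, database, relation) plus fork number and block number, four bytes each. A sketch of how the data tags and the truncation/removal marker he describes could be encoded (illustrative Python, not the proposed server code):

```python
import struct

# 20-byte buffer tag as in Robert's sketch: RelFileNode (tablespace,
# database, relation OIDs) + fork number + block number, 4 bytes each.
TAG = struct.Struct("<IIIiI")  # spcNode, dbNode, relNode, forkNum, blockNum
INVALID_BLOCK = 0xFFFFFFFF     # InvalidBlockNumber

def data_tag(spc, db, rel, fork, blockno):
    """Tag preceding an 8 kB page image in the differential stream."""
    return TAG.pack(spc, db, rel, fork, blockno)

def truncate_tag(spc, db, rel, fork, new_nblocks):
    """Truncation/removal marker: forkNum = -(forkNum + 1); blockNum is
    the new length in blocks, or InvalidBlockNumber if removed."""
    return TAG.pack(spc, db, rel, -(fork + 1), new_nblocks)

def parse_tag(raw):
    spc, db, rel, fork, blockno = TAG.unpack(raw)
    if fork < 0:  # negative fork number marks a truncation/removal
        return ("truncate", spc, db, rel, -fork - 1,
                None if blockno == INVALID_BLOCK else blockno)
    return ("block", spc, db, rel, fork, blockno)
```

The -(forknum + 1) encoding keeps fork 0 distinguishable from a marker, since plain fork numbers are never negative.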
On Mon, Oct 6, 2014 at 12:06 PM, Marco Nenciarini
<marco.nenciarini@2ndquadrant.it> wrote:
> On 06/10/14 17:55, Robert Haas wrote:
>> On Mon, Oct 6, 2014 at 11:51 AM, Marco Nenciarini
>> <marco.nenciarini@2ndquadrant.it> wrote:
>>> I agree that a full backup does not need to include a profile.
>>>
>>> I've added the option to require the profile even for a full backup, as
>>> it can be useful for backup software. We could remove the option and
>>> build the profile only during incremental backups, if required. However,
>>> I would like to avoid the need to scan the whole backup to know the size
>>> of the recovered data directory; hence the backup profile.
>>
>> That doesn't seem to be buying you much. Calling stat() on every file
>> in a directory tree is a pretty cheap operation.
>
> In the case of an incremental backup that is not true. You have to read
> the delta file to know the final size. You can optimize this by putting
> the information in the first few bytes, but with the compressed tar
> format you will still need to scan the whole archive.

Well, sure. But I never objected to sending a profile in a differential backup. I'm just objecting to sending one in a full backup, at least without a more compelling reason why we need it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Oct 6, 2014 at 12:18 PM, Marco Nenciarini
<marco.nenciarini@2ndquadrant.it> wrote:
>> - Ship a table-of-contents file with a list of the relation files
>> currently present and the length of each in blocks.
>
> Having the size in bytes allows you to use the same format for non-block
> files. Am I missing any advantage of having the size in blocks over
> having the size in bytes?

Size in bytes would be fine, too.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
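With per-file sizes in bytes in the table of contents, the client can compute the size of the recovered data directory without opening any backup file, which was Marco's original motivation for the profile. A sketch, assuming a hypothetical tab-separated "<size-in-bytes>\t<path>" line format (the actual backup_profile layout is not fixed anywhere in this thread):

```python
def recovered_size(profile_lines):
    """Sum the per-file sizes (in bytes) from a table-of-contents file
    to get the size of the recovered data directory, without reading
    any delta file or scanning a tar archive."""
    total = 0
    for line in profile_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        size, _path = line.split("\t", 1)
        total += int(size)
    return total
```

Because the format carries bytes rather than blocks, the same entries work for non-block files (configuration files, the WAL, pg_control, and so on), which is the advantage Marco points out.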
On 10/06/2014 07:00 PM, Gabriele Bartolini wrote:
> Hello,
>
> 2014-10-06 17:51 GMT+02:00 Marco Nenciarini <marco.nenciarini@2ndquadrant.it>:
>
>> I agree that a full backup does not need to include a profile.
>>
>> I've added the option to require the profile even for a full backup, as
>> it can be useful for backup software. We could remove the option and
>> build the profile only during incremental backups, if required. However,
>> I would like to avoid the need to scan the whole backup to know the size
>> of the recovered data directory; hence the backup profile.
>
> I really like this approach.
>
> I think we should leave users the ability to ship a profile file even in
> the case of a full backup (disabled by default).

I don't see the point of making the profile optional. Why burden the user with that decision? I'm not convinced we need it at all, but if we're going to have a profile file, it should always be included.

- Heikki
On Mon, Oct 06, 2014 at 07:24:32PM +0300, Heikki Linnakangas wrote:
> On 10/06/2014 07:00 PM, Gabriele Bartolini wrote:
> > Hello,
> >
> > 2014-10-06 17:51 GMT+02:00 Marco Nenciarini <marco.nenciarini@2ndquadrant.it>:
> >
> >> I agree that a full backup does not need to include a profile.
> >>
> >> I've added the option to require the profile even for a full backup, as
> >> it can be useful for backup software. We could remove the option and
> >> build the profile only during incremental backups, if required. However,
> >> I would like to avoid the need to scan the whole backup to know the size
> >> of the recovered data directory; hence the backup profile.
> >
> > I really like this approach.
> >
> > I think we should leave users the ability to ship a profile file even in
> > the case of a full backup (disabled by default).
>
> I don't see the point of making the profile optional. Why burden the user
> with that decision? I'm not convinced we need it at all, but if we're
> going to have a profile file, it should always be included.

+1 for fewer user decisions, especially with something as light-weight in resource consumption as the profile.

Cheers,
David.

--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate