Обсуждение: invalid data in file backup_label problem on windows
Hi hackers. When I using a Non-Exclusive Low-Level Backup on windows, "invalid data in file backup_label" will be shown. The code are listed below if (fscanf(lfp, "START WAL LOCATION: %X/%X (file %08X%16s)%c", &hi, &lo, &tli_from_walseg, startxlogfilename, &ch) != 5 || ch != '\n') ereport(FATAL, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), errmsg("invalid data in file \"%s\"", BACKUP_LABEL_FILE))); When I wrote the backup_label on windows, CRLF will at the end of line, so the ch will be '\r'. I'm not sure that these two files(tablespace_map and backup_label) should not use CRLF new line style is mentioned in manual[1]. How about setting the text mode when reading these 2 files like patch listed in attachment? Any thought? [1] https://www.postgresql.org/docs/13/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP Best Regards, Shenhao Wang
Вложения
Hi, again: On windows, if a backup_label file contains a windows(CRLF) line ending. Recovering from this backup will failed. I think this is a problem, can I add this to commitfest? Best Regards, Shenhao Wang -----Original Message----- From: Wang, Shenhao <wangsh.fnst@cn.fujitsu.com> Sent: Sunday, January 10, 2021 8:58 PM To: pgsql-hackers@lists.postgresql.org Subject: invalid data in file backup_label problem on windows Hi hackers. When I using a Non-Exclusive Low-Level Backup on windows, "invalid data in file backup_label" will be shown. The code are listed below if (fscanf(lfp, "START WAL LOCATION: %X/%X (file %08X%16s)%c", &hi, &lo, &tli_from_walseg, startxlogfilename, &ch) != 5 || ch != '\n') ereport(FATAL, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), errmsg("invalid data in file \"%s\"", BACKUP_LABEL_FILE))); When I wrote the backup_label on windows, CRLF will at the end of line, so the ch will be '\r'. I'm not sure that these two files(tablespace_map and backup_label) should not use CRLF new line style is mentioned in manual[1]. How about setting the text mode when reading these 2 files like patch listed in attachment? Any thought? [1] https://www.postgresql.org/docs/13/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP Best Regards, Shenhao Wang
On Fri, Jan 15, 2021 at 03:29:57AM +0000, Wang, Shenhao wrote: > On windows, if a backup_label file contains a windows(CRLF) line ending. > Recovering from this backup will failed. > > I think this is a problem, can I add this to commitfest? Please feel free to, under the section "Bug fixes". This way, it won't get lost in the traffic of this list. -- Michael
Вложения
Hi, Michael >Please feel free to, under the section "Bug fixes". This way, it >won't get lost in the traffic of this list. >-- >Michael Thank you for your advise, added it Best Regards, Shenhao Wang
On 1/14/21 10:50 PM, Wang, Shenhao wrote: > >> Please feel free to, under the section "Bug fixes". This way, it >> won't get lost in the traffic of this list. >> -- >> Michael > > Thank you for your advise, added it I'm not sure how I feel about this patch (not really a Windows person) but I do think that you shouldn't modify the backup_label when writing it, i.e. you should be writing backup_label in binary mode to avoid this issue. No objections from me if it gets committed but I'm not sure it should be classified as a "bug fix" since the backup_label was modified from what postgres provided, unless I am misunderstanding. Regards, -- -David david@pgmasters.net
Hi, David Steele <david@pgmasters.net> wrote: > I'm not sure how I feel about this patch (not really a Windows person) > but I do think that you shouldn't modify the backup_label when writing > it, i.e. you should be writing backup_label in binary mode to avoid this > issue. IMO, I don't modify backup_lable, I just execute select * FROM pg_stop_backup(), copy the result from terminal to a file and save the file, but most of editor on Windows will using CRLF as default to edit file, such as notepad, notepad++. BTW, in [1] > The pg_stop_backup will return one row with three values. The second > of these fields should be written to a file named backup_label in the root > directory of the backup. The third field should be written to a file named > tablespace_map unless the field is empty. These files are vital to the backup > working, and must be written without modification. Do not use CRLF to edit a backup_label on windows is not mentioned. > No objections from me if it gets committed but I'm not sure it should be > classified as a "bug fix" since the backup_label was modified from what > postgres provided, unless I am misunderstanding. I think the backup_label is not a postgres provided file(using Non-Exclusive Low Level API), this file must be created by user. If users use Exclusive Low Level API or pg_basebakup, this file will be auto created, and users will not edit this file. Therefore, I think this is a bug instead of a misuse [1] https://www.postgresql.org/docs/13/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP ------------- Best Regards Shenhao Wang
On 3/19/21 6:11 AM, wangsh.fnst@fujitsu.com wrote: > David Steele <david@pgmasters.net> wrote: > >> I'm not sure how I feel about this patch (not really a Windows person) >> but I do think that you shouldn't modify the backup_label when writing >> it, i.e. you should be writing backup_label in binary mode to avoid this >> issue. > > IMO, I don't modify backup_lable, I just execute select * FROM pg_stop_backup(), > copy the result from terminal to a file and save the file, but most of editor on > Windows will using CRLF as default to edit file, such as notepad, notepad++. It's not clear to me what text editors have to do with this? Are you editing the file manually? > BTW, in [1] >> The pg_stop_backup will return one row with three values. The second >> of these fields should be written to a file named backup_label in the root >> directory of the backup. The third field should be written to a file named >> tablespace_map unless the field is empty. These files are vital to the backup >> working, and must be written without modification. > > Do not use CRLF to edit a backup_label on windows is not mentioned. "These files are vital to the backup working, and must be written without modification" seems pretty clear to me. >> No objections from me if it gets committed but I'm not sure it should be >> classified as a "bug fix" since the backup_label was modified from what >> postgres provided, unless I am misunderstanding. > > I think the backup_label is not a postgres provided file(using Non-Exclusive Low Level API), > this file must be created by user. It is provided by postgres via pg_stop_backup(). It should be simply saved as-is to a file. > If users use Exclusive Low Level API or pg_basebakup, this file will be auto created, and > users will not edit this file. You keep saying "edit the file" but perhaps you mean "save the file"? Regards, -- -David david@pgmasters.net
David Steele <david@pgmasters.net> wrote: > It's not clear to me what text editors have to do with this? Are you > editing the file manually? When I execute SELECT * FROM pg_stop_backup(false, true) in psql. The results will be shown like: lsn | labelfile | spcmapfile ------------+---------------------------------------------------------------------+------------ 0/2000138 | START WAL LOCATION: 0/2000028 (file 000000010000000000000002)+| | CHECKPOINT LOCATION: 0/2000060 +| | BACKUP METHOD: streamed +| | BACKUP FROM: master + ...... The results only will be shown on screen and this function will not generate any files. What I do is write the second field(labelfile) to a new file backup_label and write the third field(spcmapfile) to tablespace_map if the third field is not null. Therefore, I choose a text editor to help me write the file. I copy the a line in second field and paste this to text editor and press the 'enter' key, repeat this action util all the line have be pasted to text editor, then save the file. If this is not a good way to create the backup_label file, can you tell me how can I create this file on windows? I think the real problem is this file on windows is opened with binary mode. If I use libpq to get the result and write the result to file directly(the default action on windows is open file in text mode), this problem will be happened. So I consider this is a bug. Best Regards, Shenhao Wang
On Sat, Mar 20, 2021 at 3:10 AM wangsh.fnst@fujitsu.com <wangsh.fnst@fujitsu.com> wrote: > > David Steele <david@pgmasters.net> wrote: > > > It's not clear to me what text editors have to do with this? Are you > > editing the file manually? > > When I execute SELECT * FROM pg_stop_backup(false, true) in psql. > > The results will be shown like: > lsn | labelfile | spcmapfile > ------------+---------------------------------------------------------------------+------------ > 0/2000138 | START WAL LOCATION: 0/2000028 (file 000000010000000000000002)+| > | CHECKPOINT LOCATION: 0/2000060 +| > | BACKUP METHOD: streamed +| > | BACKUP FROM: master + > ...... > The results only will be shown on screen and this function will not generate any files. What I do is write > the second field(labelfile) to a new file backup_label and write the third field(spcmapfile) to tablespace_map if > the third field is not null. > > Therefore, I choose a text editor to help me write the file. > I copy the a line in second field and paste this to text editor and press the 'enter' key, repeat this action util > all the line have be pasted to text editor, then save the file. > > If this is not a good way to create the backup_label file, can you tell me how can I create this file on windows? These APIs are really not designed to be run manually from a CLI and copy/paste the results. Running them from literally any script or program should make that easy, by getting the actual value out and storing it. > I think the real problem is this file on windows is opened with binary mode. If I use libpq to get the result and write > the result to file directly(the default action on windows is open file in text mode), this problem will be happened. > So I consider this is a bug. No, the problem is you are using copy/paste and in doing so you are *changing'* the value that is being returned. You'll either need to update your copy/paste procedure to not mess with the newlines, or to use a better way to get the data out. If we need to clarify that in the documentation, I'm fine with that. Maybe add an extra sentence to the part about not modifying the output to mention that this includes changing newslines and also encoding (which would also break it, if you managed to find a non-ascii compatible encoding). Maybe even something along the line of "the contents have to be written in binary mode"? -- Magnus Hagander Me: https://www.hagander.net/ Work: https://www.redpill-linpro.com/
On 3/21/21 10:40 AM, Magnus Hagander wrote: > On Sat, Mar 20, 2021 at 3:10 AM wangsh.fnst@fujitsu.com > <wangsh.fnst@fujitsu.com> wrote: >> >> David Steele <david@pgmasters.net> wrote: >> >>> It's not clear to me what text editors have to do with this? Are you >>> editing the file manually? >> >> When I execute SELECT * FROM pg_stop_backup(false, true) in psql. >> >> The results will be shown like: >> lsn | labelfile | spcmapfile >> ------------+---------------------------------------------------------------------+------------ >> 0/2000138 | START WAL LOCATION: 0/2000028 (file 000000010000000000000002)+| >> | CHECKPOINT LOCATION: 0/2000060 +| >> | BACKUP METHOD: streamed +| >> | BACKUP FROM: master + >> ...... >> The results only will be shown on screen and this function will not generate any files. What I do is write >> the second field(labelfile) to a new file backup_label and write the third field(spcmapfile) to tablespace_map if >> the third field is not null. >> >> Therefore, I choose a text editor to help me write the file. >> I copy the a line in second field and paste this to text editor and press the 'enter' key, repeat this action util >> all the line have be pasted to text editor, then save the file. >> >> If this is not a good way to create the backup_label file, can you tell me how can I create this file on windows? > > These APIs are really not designed to be run manually from a CLI and > copy/paste the results. > > Running them from literally any script or program should make that > easy, by getting the actual value out and storing it. You might consider using pg_basebackup, which does all this for you and is well tested. >> I think the real problem is this file on windows is opened with binary mode. If I use libpq to get the result and write >> the result to file directly(the default action on windows is open file in text mode), this problem will be happened. >> So I consider this is a bug. > > No, the problem is you are using copy/paste and in doing so you are > *changing'* the value that is being returned. You'll either need to > update your copy/paste procedure to not mess with the newlines, or to > use a better way to get the data out. > > If we need to clarify that in the documentation, I'm fine with that. > Maybe add an extra sentence to the part about not modifying the output > to mention that this includes changing newslines and also encoding > (which would also break it, if you managed to find a non-ascii > compatible encoding). Maybe even something along the line of "the > contents have to be written in binary mode"? Perhaps something like the attached? Regards, -- -David david@pgmasters.net
Вложения
On 3/26/21 10:19 AM, David Steele wrote: > >> No, the problem is you are using copy/paste and in doing so you are >> *changing'* the value that is being returned. You'll either need to >> update your copy/paste procedure to not mess with the newlines, or to >> use a better way to get the data out. >> >> If we need to clarify that in the documentation, I'm fine with that. >> Maybe add an extra sentence to the part about not modifying the output >> to mention that this includes changing newslines and also encoding >> (which would also break it, if you managed to find a non-ascii >> compatible encoding). Maybe even something along the line of "the >> contents have to be written in binary mode"? > > Perhaps something like the attached? > > That seems a bit opaque. Let's tell them exactly what they need to avoid. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
On Fri, Mar 26, 2021 at 5:52 PM Andrew Dunstan <andrew@dunslane.net> wrote: > > > On 3/26/21 10:19 AM, David Steele wrote: > > > >> No, the problem is you are using copy/paste and in doing so you are > >> *changing'* the value that is being returned. You'll either need to > >> update your copy/paste procedure to not mess with the newlines, or to > >> use a better way to get the data out. > >> > >> If we need to clarify that in the documentation, I'm fine with that. > >> Maybe add an extra sentence to the part about not modifying the output > >> to mention that this includes changing newslines and also encoding > >> (which would also break it, if you managed to find a non-ascii > >> compatible encoding). Maybe even something along the line of "the > >> contents have to be written in binary mode"? > > > > Perhaps something like the attached? > > > > > > > That seems a bit opaque. Let's tell them exactly what they need to avoid. Yeah, it seems a bit imprecise. Maybe something like "which includes things like opening the file in binary mode"? (I want the "includes" part because it also means other things, this is not the only thing). -- Magnus Hagander Me: https://www.hagander.net/ Work: https://www.redpill-linpro.com/
On 3/26/21 1:20 PM, Magnus Hagander wrote: > On Fri, Mar 26, 2021 at 5:52 PM Andrew Dunstan <andrew@dunslane.net> wrote: >> On 3/26/21 10:19 AM, David Steele wrote: >>> >>>> No, the problem is you are using copy/paste and in doing so you are >>>> *changing'* the value that is being returned. You'll either need to >>>> update your copy/paste procedure to not mess with the newlines, or to >>>> use a better way to get the data out. >>>> >>>> If we need to clarify that in the documentation, I'm fine with that. >>>> Maybe add an extra sentence to the part about not modifying the output >>>> to mention that this includes changing newslines and also encoding >>>> (which would also break it, if you managed to find a non-ascii >>>> compatible encoding). Maybe even something along the line of "the >>>> contents have to be written in binary mode"? >>> >>> Perhaps something like the attached? >> >> That seems a bit opaque. Let's tell them exactly what they need to avoid. > > Yeah, it seems a bit imprecise. Maybe something like "which includes > things like opening the file in binary mode"? (I want the "includes" > part because it also means other things, this is not the only thing). OK, how about the attached? Regards, -- -David david@pgmasters.net
Вложения
Hi, David Steele <david@pgmasters.net> wrote: >On 3/26/21 1:20 PM, Magnus Hagander wrote: >> On Fri, Mar 26, 2021 at 5:52 PM Andrew Dunstan <andrew@dunslane.net> wrote: >>> On 3/26/21 10:19 AM, David Steele wrote: >>>> >>>>> No, the problem is you are using copy/paste and in doing so you are >>>>> *changing'* the value that is being returned. You'll either need to >>>>> update your copy/paste procedure to not mess with the newlines, or to >>>>> use a better way to get the data out. >>>>> >>>>> If we need to clarify that in the documentation, I'm fine with that. >>>>> Maybe add an extra sentence to the part about not modifying the output >>>>> to mention that this includes changing newslines and also encoding >>>>> (which would also break it, if you managed to find a non-ascii >>>>> compatible encoding). Maybe even something along the line of "the >>>>> contents have to be written in binary mode"? >>>> >>>> Perhaps something like the attached? >>> >>> That seems a bit opaque. Let's tell them exactly what they need to avoid. >> >> Yeah, it seems a bit imprecise. Maybe something like "which includes >> things like opening the file in binary mode"? (I want the "includes" >> part because it also means other things, this is not the only thing). > >OK, how about the attached? I think this is good for me to avoid this problem. And next time I will use libpq and fopen(xxx, "wb") to create this file. Thanks Best regards Shenhao Wang
On 3/26/21 2:45 PM, David Steele wrote: > On 3/26/21 1:20 PM, Magnus Hagander wrote: >> On Fri, Mar 26, 2021 at 5:52 PM Andrew Dunstan <andrew@dunslane.net> >> wrote: >>> On 3/26/21 10:19 AM, David Steele wrote: >>>> >>>>> No, the problem is you are using copy/paste and in doing so you are >>>>> *changing'* the value that is being returned. You'll either need to >>>>> update your copy/paste procedure to not mess with the newlines, or to >>>>> use a better way to get the data out. >>>>> >>>>> If we need to clarify that in the documentation, I'm fine with that. >>>>> Maybe add an extra sentence to the part about not modifying the >>>>> output >>>>> to mention that this includes changing newslines and also encoding >>>>> (which would also break it, if you managed to find a non-ascii >>>>> compatible encoding). Maybe even something along the line of "the >>>>> contents have to be written in binary mode"? >>>> >>>> Perhaps something like the attached? >>> >>> That seems a bit opaque. Let's tell them exactly what they need to >>> avoid. >> >> Yeah, it seems a bit imprecise. Maybe something like "which includes >> things like opening the file in binary mode"? (I want the "includes" >> part because it also means other things, this is not the only thing). > > OK, how about the attached? > > - vital to the backup working, and must be written without modification. + vital to the backup working and must be written without modification, which + may include opening the file in binary mode. how about ... must be written byte for byte without modification, which might require opening the file in binary mode. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
On Sun, Mar 28, 2021 at 09:29:10AM -0400, Andrew Dunstan wrote: > - vital to the backup working, and must be written without modification. > + vital to the backup working and must be written without > modification, which > + may include opening the file in binary mode. += "on Windows"? -- Michael
Вложения
On Mon, Mar 29, 2021 at 7:01 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Sun, Mar 28, 2021 at 09:29:10AM -0400, Andrew Dunstan wrote: > > - vital to the backup working, and must be written without modification. > > + vital to the backup working and must be written without > > modification, which > > + may include opening the file in binary mode. > > += "on Windows"? I'd say no, better to just tell people to always open the file in binary mode. It's not hurtful anywhere, there's really no reason ever to open it in text mode. And if we clearly tell everybody to open it in binary mode, that lowers the bar in the unlikely event that we might want to store some other non-text data in it in the future. -- Magnus Hagander Me: https://www.hagander.net/ Work: https://www.redpill-linpro.com/
On 3/29/21 4:34 AM, Magnus Hagander wrote: > On Mon, Mar 29, 2021 at 7:01 AM Michael Paquier <michael@paquier.xyz> wrote: >> >> On Sun, Mar 28, 2021 at 09:29:10AM -0400, Andrew Dunstan wrote: >>> - vital to the backup working, and must be written without modification. >>> + vital to the backup working and must be written without >>> modification, which >>> + may include opening the file in binary mode. >> >> += "on Windows"? > > I'd say no, better to just tell people to always open the file in > binary mode. It's not hurtful anywhere, there's really no reason ever > to open it in text mode. Agreed. New patch attached. Regards, -- -David david@pgmasters.net
Вложения
On Wed, Mar 31, 2021 at 09:33:25AM -0400, David Steele wrote: > Agreed. New patch attached. Thanks, David. > diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml > index c5557d5444..8c9186d277 100644 > --- a/doc/src/sgml/backup.sgml > +++ b/doc/src/sgml/backup.sgml > @@ -913,7 +913,8 @@ SELECT * FROM pg_stop_backup(false, true); > <filename>backup_label</filename> in the root directory of the backup. The > third field should be written to a file named > <filename>tablespace_map</filename> unless the field is empty. These files are > - vital to the backup working, and must be written without modification. > + vital to the backup working and must be written byte for byte without > + modification, which may require opening the file in binary mode. Okay, that sounds good to me. -- Michael
Вложения
On Thu, Apr 01, 2021 at 09:56:02AM +0900, Michael Paquier wrote: > Okay, that sounds good to me. Applied and backpatched then as 6fb66c26 & co. -- Michael
Вложения
On 4/2/21 3:43 AM, Michael Paquier wrote: > On Thu, Apr 01, 2021 at 09:56:02AM +0900, Michael Paquier wrote: >> Okay, that sounds good to me. > > Applied and backpatched then as 6fb66c26 & co. Thank you! -- -David david@pgmasters.net