Discussion: BUG #12779: pg_dump -Fd doesn't care about -Z
The following bug has been logged on the website:

Bug reference: 12779
Logged by: Christoph Berg
Email address: christoph.berg@credativ.de
PostgreSQL version: 9.4.1
Operating system: Linux (Debian 8)
Description:

pg_dump -Fd doesn't seem to care about -Z for Z > 0:

$ for i in {0..9}; do echo $i; pg_dump -Z$i -Fd -f $i.fd postgres; done
$ du ?.fd
5392 0.fd
1164 1.fd
1164 2.fd
1164 3.fd
1164 4.fd
1164 5.fd
1164 6.fd
1164 7.fd
1164 8.fd
1164 9.fd

In contrast with -Fc, where it works:

$ for i in {0..9}; do echo $i; pg_dump -Z$i -Fc -f $i.fc postgres; done
$ du ?.fc
7548 0.fc
1488 1.fc
1440 2.fc
1392 3.fc
1240 4.fc
1148 5.fc
1160 6.fc
1156 7.fc
1160 8.fc
1160 9.fc
Hi Christoph:

On Tue, Feb 17, 2015 at 4:34 PM, <christoph.berg@credativ.de> wrote:
> pg_dump -Fd doesn't seem to care about -Z for Z > 0:

With such a small dump, are you sure du's block granularity is not masking
the differences in -Fd? (Have you tried du -b, or just comparing every
size?) (In -Fc it is a single file, harder to mask.)

Regards.
Francisco Olarte.
Re: To pgsql-bugs@postgresql.org 2015-02-17 <20150217153446.2590.24945@wrigleys.postgresql.org>
> 1164 1.fd
> 1164 9.fd
>
> 1488 1.fc
> 1160 9.fc

Re: Francisco Olarte 2015-02-17 <CA+bJJbza-+UOuofSVCoHdn6kMa3u4zWzgKZhyU=Qab46R4vY-w@mail.gmail.com>
> > pg_dump -Fd doesn't seem to care about -Z for Z > 0:
>
> With such a small dump, are you sure du's block granularity is not masking
> the differences in -Fd? (Have you tried du -b, or just comparing every
> size?) (In -Fc it is a single file, harder to mask.)

I noticed when trying to compress an assumed-to-be-Z4 file a bit more
using -Z7. The directory was 241GB in both cases. And if you compare
with -Fc, the difference is a lot more than a few FS blocks.

Christoph
--
cb@df7cb.de | http://www.df7cb.de/
Hi Christoph.

On Tue, Feb 17, 2015 at 4:54 PM, Christoph Berg <cb@df7cb.de> wrote:
> Re: To pgsql-bugs@postgresql.org 2015-02-17 <20150217153446.2590.24945@wrigleys.postgresql.org>
> > 1164 1.fd
> > 1164 9.fd
> >
> > 1488 1.fc
> > 1160 9.fc
>
> Re: Francisco Olarte 2015-02-17 <CA+bJJbza-+UOuofSVCoHdn6kMa3u4zWzgKZhyU=Qab46R4vY-w@mail.gmail.com>
> > > pg_dump -Fd doesn't seem to care about -Z for Z > 0:
> >
> > With such a small dump, are you sure du's block granularity is not
> > masking the differences in -Fd? (Have you tried du -b, or just
> > comparing every size?) (In -Fc it is a single file, harder to mask.)
>
> I noticed when trying to compress an assumed-to-be-Z4 file a bit more
> using -Z7. The directory was 241GB in both cases.

I understand, but how did you measure it? I mean, from this assertion one
could be 240.01 and the other 241, and the difference you are quoting is
between -Z4 and -Z7, not -Z1 and -Z9 as in the above quoted text. Because
if you used du -h it may well be masking it again. It seems rather strange
that it distinguishes -Z0 from -Z1 but masks the rest of them (also, it
seems they use the same routines for both modes, but compressors,
especially in the high-compression modes which use larger windows, are
much better when fed a large file).

> And if you compare with -Fc, the difference is a lot more than a few
> FS blocks.

That is what I was pointing out. With -Fc you have a single file, so
anything bigger than a block is going to show (these may be extents rather
than blocks). With -Fd you need a difference bigger than a block in a
single file. With du -b you just need a byte of difference, so testing it
that way (or asserting you've done so, I'm not going to doubt it) will
make your point stand out more.

Regards.
Francisco Olarte.
Another run with 9.1.15:

$ for i in {0..9}; do /usr/lib/postgresql/9.1/bin/pg_dump -p 5434 -Z$i -t t -Fd -f $i.fd; done
$ l ?.fd/2495.dat*
-rw-rw-r-- 1 cbe cbe 78888902 Feb 17 18:56 0.fd/2495.dat
-rw-rw-r-- 1 cbe cbe 21100852 Feb 17 18:56 1.fd/2495.dat.gz
-rw-rw-r-- 1 cbe cbe 21100852 Feb 17 18:56 2.fd/2495.dat.gz
-rw-rw-r-- 1 cbe cbe 21100852 Feb 17 18:56 3.fd/2495.dat.gz
-rw-rw-r-- 1 cbe cbe 21100852 Feb 17 18:56 4.fd/2495.dat.gz
-rw-rw-r-- 1 cbe cbe 21100852 Feb 17 18:56 5.fd/2495.dat.gz
-rw-rw-r-- 1 cbe cbe 21100852 Feb 17 18:56 6.fd/2495.dat.gz
-rw-rw-r-- 1 cbe cbe 21100852 Feb 17 18:56 7.fd/2495.dat.gz
-rw-rw-r-- 1 cbe cbe 21100852 Feb 17 18:56 8.fd/2495.dat.gz
-rw-rw-r-- 1 cbe cbe 21100852 Feb 17 18:56 9.fd/2495.dat.gz

Christoph
--
cb@df7cb.de | http://www.df7cb.de/
christoph.berg@credativ.de writes:
> pg_dump -Fd doesn't seem to care about -Z for Z > 0:

Yeah, you're right: whoever wrote pg_dump/compress_io.c seems to have
been utterly clueless about the idea that zlib needs to be told which
compression level to use. That logic is just treating the compression
level as a binary compress-or-not flag. cfopen_write() is obviously
losing the compression level info, and even if it weren't, cfopen()
thinks that argument is binary, not a number to pass down. There may be
other subroutines in there and/or in pg_backup_directory.c that missed
the memo as well.

This doesn't look tremendously hard to fix, but it's not a one-liner
either. Don't have time for it personally right now.

regards, tom lane
On Wed, Feb 18, 2015 at 3:52 AM, Tom Lane wrote:
> This doesn't look tremendously hard to fix, but it's not a one-liner
> either. Don't have time for it personally right now.

Interesting. The call to gzopen is missing the compression level, always
passing either 'w' or 'wb', while for example to write with a compression
level of 6 we should pass 'w6' or 'wb6'. The attached patch addresses
that.
--
Michael
Attachments
Re: Michael Paquier 2015-02-18 <CAB7nPqTLSv5qES=9hODzkX62V5XcivP1ij=_iahR_-SriD2AOA@mail.gmail.com>
> On Wed, Feb 18, 2015 at 3:52 AM, Tom Lane wrote:
> > This doesn't look tremendously hard to fix, but it's not a one-liner
> > either. Don't have time for it personally right now.
>
> Interesting. The call to gzopen is missing the compression level,
> always passing either 'w' or 'wb', while for example to write with a
> compression level of 6 we should pass w6 or wb6.
>
> The patch attached addresses that.

The patch works for me:

$ paste <(du ?.fd) <(du ?.fd.patched)
5392 0.fd    5392 0.fd.patched
1164 1.fd    1492 1.fd.patched
1164 2.fd    1448 2.fd.patched
1164 3.fd    1400 3.fd.patched
1164 4.fd    1244 4.fd.patched
1164 5.fd    1156 5.fd.patched
1164 6.fd    1164 6.fd.patched
1164 7.fd    1164 7.fd.patched
1164 8.fd    1164 8.fd.patched
1164 9.fd    1164 9.fd.patched

(I guess the next step would be to support other compressors; xz yields
an 860 kB backup on this trivial test data...)

Best regards,
Christoph Berg
--
Senior Consultant, Tel.: +49 (0)21 61 / 46 43-187
credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209
Hohenzollernstr. 133, 41061 Mönchengladbach
Managing directors: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
pgp fingerprint: 5C48 FE61 57F4 9179 5970 87C6 4C5A 6BAB 12D2 A7AE
Michael Paquier <michael.paquier@gmail.com> writes:
> On Wed, Feb 18, 2015 at 3:52 AM, Tom Lane wrote:
> > This doesn't look tremendously hard to fix, but it's not a one-liner
> > either. Don't have time for it personally right now.
>
> Interesting. The call to gzopen is missing the compression level,
> always passing either 'w' or 'wb', while for example to write with a
> compression level of 6 we should pass w6 or wb6.
>
> The patch attached addresses that.

Committed. I also fixed another problem in this code, which was an
overoptimistic assumption that free() doesn't change errno; we've heard
that it can on OS X at least. That would've resulted in faulty error
reports for file-open failures.

regards, tom lane