Thread: BUG #12779: pg_dump -Fd doesn't care about -Z

BUG #12779: pg_dump -Fd doesn't care about -Z

From
christoph.berg@credativ.de
Date:
The following bug has been logged on the website:

Bug reference:      12779
Logged by:          Christoph Berg
Email address:      christoph.berg@credativ.de
PostgreSQL version: 9.4.1
Operating system:   Linux (Debian 8)
Description:

pg_dump -Fd doesn't seem to care about -Z for Z > 0:

$ for i in {0..9}; do echo $i; pg_dump -Z$i -Fd -f $i.fd postgres ; done
$ du ?.fd
5392    0.fd
1164    1.fd
1164    2.fd
1164    3.fd
1164    4.fd
1164    5.fd
1164    6.fd
1164    7.fd
1164    8.fd
1164    9.fd

In contrast with -Fc, where it works:

$ for i in {0..9}; do echo $i; pg_dump -Z$i -Fc -f $i.fc postgres ; done
$ du ?.fc
7548    0.fc
1488    1.fc
1440    2.fc
1392    3.fc
1240    4.fc
1148    5.fc
1160    6.fc
1156    7.fc
1160    8.fc
1160    9.fc

Re: BUG #12779: pg_dump -Fd doesn't care about -Z

From
Francisco Olarte
Date:
Hi Christoph:


On Tue, Feb 17, 2015 at 4:34 PM, <christoph.berg@credativ.de> wrote:

>
> pg_dump -Fd doesn't seem to care about -Z for Z > 0:
>

With such a small dump, are you sure du's block granularity is not masking
the differences in -Fd ? ( Have you tried  du -b, or just comparing every
size ?) ( In -Fc it is a single file, harder to mask ).

Regards.
    Francisco Olarte.

Re: BUG #12779: pg_dump -Fd doesn't care about -Z

From
Christoph Berg
Date:
Re: To pgsql-bugs@postgresql.org 2015-02-17 <20150217153446.2590.24945@wrigleys.postgresql.org>
> 1164    1.fd
> 1164    9.fd
>
> 1488    1.fc
> 1160    9.fc

Re: Francisco Olarte 2015-02-17 <CA+bJJbza-+UOuofSVCoHdn6kMa3u4zWzgKZhyU=Qab46R4vY-w@mail.gmail.com>
> > pg_dump -Fd doesn't seem to care about -Z for Z > 0:
> >
>
> With such a small dump, are you sure du's block granularity is not masking
> the differences in -Fd ? ( Have you tried  du -b, or just comparing every
> size ?) ( In -Fc it is a single file, harder to mask ).

I noticed when trying to compress an assumed-to-be-Z4 file a bit more
using -Z7. The directory was 241GB in both cases.

And if you compare with -Fc, the difference is a lot more than a few
FS blocks.

Christoph
--
cb@df7cb.de | http://www.df7cb.de/

Re: BUG #12779: pg_dump -Fd doesn't care about -Z

From
Francisco Olarte
Date:
Hi Christoph.

On Tue, Feb 17, 2015 at 4:54 PM, Christoph Berg <cb@df7cb.de> wrote:

> Re: To pgsql-bugs@postgresql.org 2015-02-17 <20150217153446.2590.24945@wrigleys.postgresql.org>
> > 1164    1.fd
> > 1164    9.fd
> >
> > 1488    1.fc
> > 1160    9.fc
>
> Re: Francisco Olarte 2015-02-17 <CA+bJJbza-+UOuofSVCoHdn6kMa3u4zWzgKZhyU=Qab46R4vY-w@mail.gmail.com>
> > > pg_dump -Fd doesn't seem to care about -Z for Z > 0:
> > >
> >
> > With such a small dump, are you sure du's block granularity is not masking
> > the differences in -Fd ? ( Have you tried  du -b, or just comparing every
> > size ?) ( In -Fc it is a single file, harder to mask ).
>
> I noticed when trying to compress an assumed-to-be-Z4 file a bit more
> using -Z7. The directory was 241GB in both cases.
>

I understand, but how did you measure it? From this assertion, one could be
240.01 and the other 241, and the difference you are quoting is between -Z4
and -Z7, not -Z1 and -Z9 as in the text quoted above. If you used du -h, it
may well be masking it again. It seems rather strange that it is handling
-Z0 and -Z1 but ignoring the rest of them ( also, it seems they use the same
routines for both modes, but compressors, especially in the high-compression
modes which use larger windows, do much better when fed a large file ).


> And if you compare with -Fc, the difference is a lot more than a few
> FS blocks.
>

That is what I was pointing out. With -Fc you have a single file, so anything
bigger than a block is going to show ( this may be extents rather than
blocks ). With -Fd you need a difference bigger than a block in a single
file. With du -b you just need a byte of difference, so testing with it
( or asserting you've done it, I'm not going to doubt it ) will make your
point stand out more.

Regards.
    Francisco Olarte.
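
The block-granularity concern raised in this message can be checked directly. Below is a minimal sketch in Python, assuming a POSIX system where `os.lstat` reports `st_blocks` in 512-byte units. The function name `dir_sizes` is made up for this illustration and is not part of pg_dump or du. It reports the byte-exact apparent size (roughly what `du -b` sums) next to the block-rounded allocated size (roughly what plain `du` sums) for a dump directory.

```python
import os


def dir_sizes(path):
    """Return (apparent_bytes, allocated_bytes) for all non-directory entries under path.

    apparent_bytes sums st_size, which is byte-exact.
    allocated_bytes sums st_blocks * 512, which is rounded up to whole blocks.
    Assumption: POSIX platform where st_blocks is counted in 512-byte units.
    """
    apparent = 0
    allocated = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            st = os.lstat(os.path.join(root, name))
            apparent += st.st_size
            allocated += st.st_blocks * 512
    return apparent, allocated
```

Calling `dir_sizes("1.fd")` and `dir_sizes("9.fd")` and comparing the first element of each result would reveal even a one-byte difference between the two dumps, while the second element can stay identical when every per-file difference fits inside a block.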

Re: BUG #12779: pg_dump -Fd doesn't care about -Z

From
Christoph Berg
Date:
Another run with 9.1.15:

$ for i in {0..9}; do /usr/lib/postgresql/9.1/bin/pg_dump -p 5434 -Z$i -t t -Fd -f $i.fd; done
$ l ?.fd/2495.dat*
-rw-rw-r-- 1 cbe cbe 78888902 Feb 17 18:56 0.fd/2495.dat
-rw-rw-r-- 1 cbe cbe 21100852 Feb 17 18:56 1.fd/2495.dat.gz
-rw-rw-r-- 1 cbe cbe 21100852 Feb 17 18:56 2.fd/2495.dat.gz
-rw-rw-r-- 1 cbe cbe 21100852 Feb 17 18:56 3.fd/2495.dat.gz
-rw-rw-r-- 1 cbe cbe 21100852 Feb 17 18:56 4.fd/2495.dat.gz
-rw-rw-r-- 1 cbe cbe 21100852 Feb 17 18:56 5.fd/2495.dat.gz
-rw-rw-r-- 1 cbe cbe 21100852 Feb 17 18:56 6.fd/2495.dat.gz
-rw-rw-r-- 1 cbe cbe 21100852 Feb 17 18:56 7.fd/2495.dat.gz
-rw-rw-r-- 1 cbe cbe 21100852 Feb 17 18:56 8.fd/2495.dat.gz
-rw-rw-r-- 1 cbe cbe 21100852 Feb 17 18:56 9.fd/2495.dat.gz

Christoph
--
cb@df7cb.de | http://www.df7cb.de/

Re: BUG #12779: pg_dump -Fd doesn't care about -Z

From
Tom Lane
Date:
christoph.berg@credativ.de writes:
> pg_dump -Fd doesn't seem to care about -Z for Z > 0:

Yeah, you're right: whoever wrote pg_dump/compress_io.c seems to have
been utterly clueless about the idea that zlib needs to be told which
compression level to use.  That logic is just treating the compression
level as a binary compress-or-not flag.  cfopen_write() is obviously
losing the compression level info, and even if it weren't, cfopen()
thinks that argument is binary not a number to pass down.  There may be
other subroutines in there and/or in pg_backup_directory.c that missed
the memo as well.

This doesn't look tremendously hard to fix, but it's not a one-liner
either.  Don't have time for it personally right now.

            regards, tom lane
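
The failure mode described in this message, a compression level that is only ever tested for zero versus non-zero, can be modelled in a few lines. This is a sketch in Python using the standard `zlib` module, not the pg_dump code. The names `compress_flag_only` and `compress_with_level` are hypothetical, and the sample data is synthetic.

```python
import zlib


def compress_flag_only(data, level):
    """Model of the bug: level acts only as an on/off switch."""
    if level == 0:
        return data  # no compression requested
    # The requested level is dropped here; zlib falls back to its default.
    return zlib.compress(data)


def compress_with_level(data, level):
    """Model of the fix: the requested level is passed down to zlib."""
    if level == 0:
        return data
    return zlib.compress(data, level)


if __name__ == "__main__":
    # Synthetic, compressible sample data (an assumption for this demo).
    sample = "".join(
        f"row {i}, value {i * i % 97}\n" for i in range(20000)
    ).encode()
    for level in range(10):
        print(level,
              len(compress_flag_only(sample, level)),
              len(compress_with_level(sample, level)))
```

In the printed table, the flag-only column should show one size for level 0 and a single repeated size for levels 1 through 9, which is the same pattern as the -Fd listing in the report. The second column varies with the level, as the -Fc listing does.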

Re: BUG #12779: pg_dump -Fd doesn't care about -Z

From
Michael Paquier
Date:
On Wed, Feb 18, 2015 at 3:52 AM, Tom Lane wrote:
> This doesn't look tremendously hard to fix, but it's not a one-liner
> either.  Don't have time for it personally right now.

Interesting. The call to gzopen is missing the compression level,
always passing either 'w' or 'wb', while for example to write with a
compression level of 6 we should pass w6 or wb6.

The patch attached addresses that.
--
Michael

Attachments
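
For readers who want to try the idea without building pg_dump: in zlib's C API the level travels as a digit appended to the gzopen mode string, as the message above says ("wb6"). Python's standard `gzip` module exposes the same setting as a separate `compresslevel` argument, so the sketch below is an analogue of the idea, not the attached patch. The helper name `write_gz` is hypothetical.

```python
import gzip


def write_gz(path, data, level):
    """Write data to a gzip file, passing the requested level (1-9) to the opener."""
    with gzip.open(path, "wb", compresslevel=level) as f:
        f.write(data)
```

Writing the same compressible data with `write_gz(path, data, 1)` and `write_gz(path, data, 9)` and comparing the file sizes is a quick way to confirm that the level actually reaches the compressor.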

Re: BUG #12779: pg_dump -Fd doesn't care about -Z

From
Christoph Berg
Date:
Re: Michael Paquier 2015-02-18 <CAB7nPqTLSv5qES=9hODzkX62V5XcivP1ij=_iahR_-SriD2AOA@mail.gmail.com>
> On Wed, Feb 18, 2015 at 3:52 AM, Tom Lane wrote:
> > This doesn't look tremendously hard to fix, but it's not a one-liner
> > either.  Don't have time for it personally right now.
>
> Interesting. The call to gzopen is missing the compression level,
> always passing either 'w' or 'wb', while for example to write with a
> compression level of 6 we should pass w6 or wb6.
>
> The patch attached addresses that.

The patch works for me:

$ paste <(du ?.fd) <(du ?.fd.patched)
5392    0.fd    5392    0.fd.patched
1164    1.fd    1492    1.fd.patched
1164    2.fd    1448    2.fd.patched
1164    3.fd    1400    3.fd.patched
1164    4.fd    1244    4.fd.patched
1164    5.fd    1156    5.fd.patched
1164    6.fd    1164    6.fd.patched
1164    7.fd    1164    7.fd.patched
1164    8.fd    1164    8.fd.patched
1164    9.fd    1164    9.fd.patched

(I guess the next step would be to support other compressors, xz
yields a 860kB backup on this trivial test data...)

Best regards,
Christoph Berg
--
Senior Consultant, Tel.: +49 (0)21 61 / 46 43-187
credativ GmbH, HRB Mönchengladbach 12080, VAT ID number: DE204566209
Hohenzollernstr. 133, 41061 Mönchengladbach
Management: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
pgp fingerprint: 5C48 FE61 57F4 9179 5970  87C6 4C5A 6BAB 12D2 A7AE

Re: BUG #12779: pg_dump -Fd doesn't care about -Z

From
Tom Lane
Date:
Michael Paquier <michael.paquier@gmail.com> writes:
> On Wed, Feb 18, 2015 at 3:52 AM, Tom Lane wrote:
>> This doesn't look tremendously hard to fix, but it's not a one-liner
>> either.  Don't have time for it personally right now.

> Interesting. The call to gzopen is missing the compression level,
> always passing either 'w' or 'wb', while for example to write with a
> compression level of 6 we should pass w6 or wb6.

> The patch attached addresses that.

Committed.  I also fixed another problem in this code, which was an
overoptimistic assumption that free() doesn't change errno; we've heard
that it can on OS X at least.  That would've resulted in faulty error
reports for file-open failures.

            regards, tom lane