Thread: pg_basebackup bug: base backup is double the size of the database

pg_basebackup bug: base backup is double the size of the database

From: Craig James

We've encountered a serious bug with pg_basebackup. It seems to be following hard links and duplicating all files in the tablespaces rather than preserving links.

Drilling down into one specific tablespace, we find this:

# ls -l /data/postgres-9.3/main/pg_tblspc/16747
lrwxrwxrwx 1 postgres postgres 27 2014-08-18 11:28 /data/postgres-9.3/main/pg_tblspc/16747 -> /postgres/tablespaces/uorsy/

# du -sh /data/postgres-9.3/tablespaces/uorsy
35G     /data/postgres-9.3/tablespaces/uorsy

# du -sh /data/postgres-9.3/tablespaces/uorsy/*
35G     /data/postgres-9.3/tablespaces/uorsy/8208624
8.1M    /data/postgres-9.3/tablespaces/uorsy/PG_9.3_201306121
4.0K    /data/postgres-9.3/tablespaces/uorsy/pgsql_tmp
4.0K    /data/postgres-9.3/tablespaces/uorsy/PG_VERSION

# find /data/postgres-9.3/tablespaces/uorsy \! -links 1 -type f | wc -l
740

In other words, this tablespace has 35G of real data, plus 740 hard links that effectively duplicate each data file.
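
A count alone does not show which names belong together. As a minimal sketch (not part of the original post), the helper below prints each multiply-linked file with its inode number, so that names sharing an inode sort next to each other. The function name `list_hardlinked` is made up for illustration, and the approach assumes file names without embedded newlines.

```shell
# List regular files under DIR that have more than one name (link count > 1).
# Each output line is "<inode> <path>"; sorting by inode puts the names that
# share the same data next to each other.
list_hardlinked() {
    find "$1" -type f ! -links 1 -exec ls -i {} + | sort -n
}

# Example call (path taken from the post above):
#   list_hardlinked /data/postgres-9.3/tablespaces/uorsy
```

Two adjacent lines with the same inode number are two names for one file, which shows where the "other half" of each link lives.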

When we look at the same data in the archive that pg_basebackup creates (invoked via barman), we find this:

# du -sh /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747
70G     /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747

# du -sh /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747/*
35G     /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747/8208624
35G     /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747/PG_9.3_201306121
4.0K    /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747/pgsql_tmp
4.0K    /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747/PG_VERSION

# find /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747 \! -links 1 -type f | wc -l
0

That is, no hard links, and all of the data files are duplicated. And of course, when we try to actually use this archive to recover, it's twice the size of the original database and doesn't fit on our disks.

My guess is that pg_basebackup is using (or doing the equivalent of) rsync(1) without the --hard-links option, and that these hard links were created by pg_upgrade when we went from 8.4.17 to 9.3.5.
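
The doubling is what one would expect if the copy step treats every name as a separate file. A rough way to estimate the effect before taking a backup is to compare the number of file names with the number of distinct inodes. The helper name `count_names_and_inodes` is hypothetical, and the sketch assumes the whole tree lives on one filesystem (inode numbers are only unique per filesystem).

```shell
# Print "<names> <inodes>" for the regular files under DIR.
# names  = how many files a copy that ignores hard links would write
# inodes = how many files a link-preserving copy would need to store
count_names_and_inodes() {
    find "$1" -type f -exec ls -i {} + |
        awk '{ names++; if (!($1 in seen)) { seen[$1] = 1; inodes++ } }
             END { print names + 0, inodes + 0 }'
}

# Example call (path taken from the post above):
#   count_names_and_inodes /data/postgres-9.3/tablespaces/uorsy
```

If the two numbers differ, a copy that does not preserve hard links will be larger than the source as reported by du.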

What can we do to fix this? The whole cluster is about 350 databases and 800GB.

Thanks,
Craig

Re: pg_basebackup bug: base backup is double the size of the database

From: Craig James

One clarification:

On Wed, Jan 21, 2015 at 9:32 AM, Craig James <cjames@emolecules.com> wrote:
We've encountered a serious bug with pg_basebackup. It seems to be following hard links and duplicating all files in the tablespaces rather than preserving links.

It could be barman, not pg_basebackup, that has the bug.  I assumed that barman was using pg_basebackup, but one of my colleagues pointed out that we're using barman, which may or may not use pg_basebackup to do its work.

Craig
 


--
---------------------------------
Craig A. James
Chief Technology Officer
eMolecules, Inc.
---------------------------------

Re: pg_basebackup bug: base backup is double the size of the database

From: Matheus de Oliveira

On Wed, Jan 21, 2015 at 3:32 PM, Craig James <cjames@emolecules.com> wrote:
> # ls -l /data/postgres-9.3/main/pg_tblspc/16747
> lrwxrwxrwx 1 postgres postgres 27 2014-08-18 11:28 /data/postgres-9.3/main/pg_tblspc/16747 -> /postgres/tablespaces/uorsy/
>
> # du -sh /data/postgres-9.3/tablespaces/uorsy
> 35G     /data/postgres-9.3/tablespaces/uorsy
>
> # du -sh /data/postgres-9.3/tablespaces/uorsy/*
> 35G     /data/postgres-9.3/tablespaces/uorsy/8208624
> 8.1M    /data/postgres-9.3/tablespaces/uorsy/PG_9.3_201306121
> 4.0K    /data/postgres-9.3/tablespaces/uorsy/pgsql_tmp
> 4.0K    /data/postgres-9.3/tablespaces/uorsy/PG_VERSION
>
> # find /data/postgres-9.3/tablespaces/uorsy \! -links 1 -type f | wc -l
> 740


Am I overlooking something, or is there something really wrong here?

First, all files of a tablespace should be inside the PG_9.3_201306121 directory; why do you have those other files? Second, there shouldn't be any hard links inside a tablespace. PostgreSQL does not create them, so someone must have made them by hand. I'm guessing everything inside PG_9.3_201306121 is linked to the root path of the tablespace, which is wrong.

If I'm not overlooking something, then neither barman nor pg_basebackup is to blame, but whoever created the hard links; if PostgreSQL did this (which I doubt), then it is a bug.

Regards,
--
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres

Re: pg_basebackup bug: base backup is double the size of the database

From: Magnus Hagander

On Wed, Jan 21, 2015 at 7:19 PM, Matheus de Oliveira <matioli.matheus@gmail.com> wrote:
> If I'm not overlooking something, then neither barman nor pg_basebackup is to blame, but whoever created the hard links; if PostgreSQL did this (which I doubt), then it is a bug.


Might it be hardlinks created by pg_upgrade? If so, they can just be removed... 

Re: pg_basebackup bug: base backup is double the size of the database

From: Matheus de Oliveira

On Wed, Jan 21, 2015 at 4:42 PM, Magnus Hagander <magnus@hagander.net> wrote:
> Might it be hardlinks created by pg_upgrade? If so, they can just be removed...

Hm... does pg_upgrade create the links in the tablespace root instead of the versioned directory? That seems odd to me.



Re: pg_basebackup bug: base backup is double the size of the database

From: Craig James

On Wed, Jan 21, 2015 at 10:19 AM, Matheus de Oliveira <matioli.matheus@gmail.com> wrote:
> First, all files of a tablespace should be inside the PG_9.3_201306121 directory; why do you have those other files?

They're not mine. Postgres created them.

> Second, there shouldn't be any hard links inside a tablespace. PostgreSQL does not create them, so someone must have made them by hand.

No. Nobody did this. I am quite certain of this. I am the only one with root access to the server, and I guarantee I never did anything like this.

> I'm guessing everything inside PG_9.3_201306121 is linked to the root path of the tablespace, which is wrong.
>
> If I'm not overlooking something, then neither barman nor pg_basebackup is to blame, but whoever created the hard links; if PostgreSQL did this (which I doubt), then it is a bug.

As I said, I absolutely didn't create these links. I never, ever monkey with the innards of the Postgres tablespaces. Never have, never will. It had to be done by Postgres, pg_upgrade or some other Postgres binary.

Thanks,
Craig
 


Re: pg_basebackup bug: base backup is double the size of the database

From: Craig James

On Wed, Jan 21, 2015 at 10:42 AM, Magnus Hagander <magnus@hagander.net> wrote:
> Might it be hardlinks created by pg_upgrade? If so, they can just be removed...

OK, but which one? I'm pretty reluctant to do something that would destroy my entire database...

Craig
 


Re: pg_basebackup bug: base backup is double the size of the database

From: Bruce Momjian

On Wed, Jan 21, 2015 at 04:50:07PM -0200, Matheus de Oliveira wrote:
>
> On Wed, Jan 21, 2015 at 4:42 PM, Magnus Hagander <magnus@hagander.net> wrote:
>
>     Might it be hardlinks created by pg_upgrade? If so, they can just be
>     removed... 
>
>
> hm... Does pg_upgrade create the links in the root instead of versioned
> directory? That seems odd to me.

Not that I know of.  pg_upgrade does whatever pg_dump does, which is
CREATE TABLESPACE.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +


Re: pg_basebackup bug: base backup is double the size of the database

From: Bruce Momjian

On Wed, Jan 21, 2015 at 11:01:15AM -0800, Craig James wrote:
>     If I'm not overlooking, then neither barman nor pg_basebackup is to blame,
>     but whoever created the hard links; if PostgreSQL did this (which I doubt)
>     then it is a bug.
>
>
> As I said, I absolutely didn't create these links. I never, ever monkey with
> the innards of the Postgres tablespaces. Never have, never will.  It had to be
> done by Postgres, pg_upgrade or some other Postgres binary.

If you used --link mode, it does create hard links between the old and
new clusters, which is clearly documented.
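
For readers unsure what removing a hard link does at the filesystem level: each name is an equal reference to the same inode, and the data is freed only when the last name is removed. The throwaway demo below (function name hypothetical, run entirely in a temporary directory) illustrates just that. It says nothing about which directory is safe to delete in the layout discussed in this thread.

```shell
# Create one file with two names, remove one name, and show that the
# contents are still reachable through the remaining name.
demo_unlink_one_name() {
    (
        tmp=$(mktemp -d) || exit 1
        echo "table data" > "$tmp/old_name"
        ln "$tmp/old_name" "$tmp/new_name"   # second name, same inode
        rm "$tmp/old_name"                   # removes a name, not the data
        cat "$tmp/new_name"                  # contents are still there
        rm -rf "$tmp"
    )
}
```

The subshell keeps the temporary variable out of the calling shell.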



Re: pg_basebackup bug: base backup is double the size of the database

From: David G Johnston

Craig James-2 wrote
> We've encountered a serious bug with pg_basebackup. It seems to be
> following hard links and duplicating all files in the tablespaces rather
> than preserving links.

This entire sentence doesn't make sense to me.  How does one "follow" a
hard-link?  A soft-link yes, but a hard-link is an alias to the actual data.  I'm
not sure directory hard-linking is even allowed or used, so following in that
sense doesn't compute...


> # ls -l /data/postgres-9.3/main/pg_tblspc/16747
> lrwxrwxrwx 1 postgres postgres 27 2014-08-18 11:28 /data/postgres-9.3/main/pg_tblspc/16747 -> /postgres/tablespaces/uorsy/
>
> # du -sh /data/postgres-9.3/tablespaces/uorsy
> *35G*     /data/postgres-9.3/tablespaces/uorsy

Your tablespace points to "/postgres/tablespaces/uorsy/" yet you proceed to
show us the contents of "/data/postgres-9.3/tablespaces/uorsy"...


> # du -sh /data/postgres-9.3/tablespaces/uorsy/*
> *35G*     /data/postgres-9.3/tablespaces/uorsy/8208624
> *8.1M*    /data/postgres-9.3/tablespaces/uorsy/PG_9.3_201306121
> 4.0K    /data/postgres-9.3/tablespaces/uorsy/pgsql_tmp
> 4.0K    /data/postgres-9.3/tablespaces/uorsy/PG_VERSION
>
> # find /data/postgres-9.3/tablespaces/uorsy \! -links 1 -type f | wc -l
> *740*
>
> In other words, this tablespace has 35G of real data, plus 740 hard links
> that effectively duplicate each data file.

I can't quite figure out what to make of the above. As others have said, it
looks like user error at first glance, and we do not have the benefit of
exploring the system or a failing test case to rule that out and start
exploring how pg_upgrade (if indeed that is even the culprit) could be at
fault.  Even if you didn't manually create the hard links, some configuration
allowed them to be created where they didn't belong.  It very well could be
something incorrectly allowed but unusual enough that it isn't accounted for
in pg_upgrade et al.  Guessing what exactly that might be is likely a
futile effort, especially since it could be something as simple as an errant
copy command that caused the situation to exist.


> When we look at the same data in the archive that pg_basebackup creates
> (invoked via barman), we find this:
>
> # du -sh /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747
> *70G*     /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747
>
> # du -sh /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747/*
> *35G*     /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747/8208624
> *35G*     /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747/PG_9.3_201306121
> 4.0K    /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747/pgsql_tmp
> 4.0K    /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747/PG_VERSION
>
> # find /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747 \! -links 1 -type f | wc -l
> *0*
>
> That is, no hard links, and all of the data files are duplicated.

Of course the backup is going to create its own copy of the files... if it
were to store hard (or soft) links, the restoration would fail if the data
being pointed to were to become corrupt.


> And of course, when we try to actually use this archive to recover, it's
> twice the size as the original database and doesn't fit on our disks.
>
> My guess is that pg_basebackup is using (or doing the equivalent of)
> rsync(1) without the --hard-links option, and that these hard links were
> created by pg_upgrade when we went from 8.4.17 to 9.3.5.

And how, exactly, did you perform the pg_upgrade?  As mentioned down-thread,
pg_upgrade does use hard links in --link mode, specifically to avoid
duplication of data (in exchange you lose the ability to easily fall back to
the old database version).  I'm doubtful that it, by itself, is contributing
to this problem, but again my experience in this area is limited.  What you
have shown us to this point is far from conclusive.


> What can we do to fix this? The whole cluster is about 350 databases and
> 800GB.

Unfortunately I've gotten as far as I can with the limited, and slightly
conflicting, information provided and the documentation for pg_upgrade and
tablespaces/physical-database-layout.  At first glance there seem to be
some gaps in the documentation, but without actually exploring the capability
it's only a gut feeling from trying to answer some questions while reading
your post.  Some of that could come from not knowing whether what you show is
"normal".  Specifically, what is uorsy/8208624 in [...]9.3/tablespaces?

There are two questions that can be answered here:
Is there a bug in pg_upgrade or some other tool that you are using?
How do you manually fix whatever went wrong with your installation?

You likely care more about the former, but that likely requires more
interaction than is convenient to provide via e-mail.  You might have better
luck on IRC or with actual support people.  If you truly think this was
caused by a bug, then reproducing it in a self-contained script would be most
helpful to the community.

The other fix, though obviously more costly (in terms of time), is to pg_dump
and restore to a clean setup.  That is likely not necessary: your
database is currently operational, so some of what you are seeing must be
garbage somehow dumped there at some point in the past.  Others have already
hinted that the hard links are said garbage; now you get to decide whether
to act on that assumption or obtain more information first.

David J.







Re: Re: pg_basebackup bug: base backup is double the size of the database

From: Jerome VANANDRUEL -CAMPUS-

Hi,

I have already encountered this kind of problem (sym links become hard links, so the backup is twice the size of the DB) with an older version of Barman. Which version are you using?

Jérôme


Re: Re: pg_basebackup bug: base backup is double the size of the database

From: Craig James


On Wed, Jan 21, 2015 at 10:02 PM, David G Johnston <david.g.johnston@gmail.com> wrote:
> Craig James-2 wrote
>> We've encountered a serious bug with pg_basebackup. It seems to be
>> following hard links and duplicating all files in the tablespaces rather
>> than preserving links.
>
> This entire sentence doesn't make sense to me.  How does one "follow" a
> hard-link?  A soft-link yes but a hard-link is an alias to actual data.  I'm
> not sure directory hard-linking is even allowed or used so following in that
> sense doesn't compute...

See the man page for rsync, the -H option, which explains it better:

       -H, --hard-links
              This  tells  rsync to look for hard-linked files in the transfer
              and link together the corresponding files on the receiving side.
              Without  this  option,  hard-linked  files  in  the transfer are
              treated as though they were separate files.

>> My guess is that pg_basebackup is using (or doing the equivalent of)
>> rsync(1) without the --hard-links option, and that these hard links were
>> created by pg_upgrade when we went from 8.4.17 to 9.3.5.
>
> And how, exactly, did you perform the pg_upgrade?  As mentioned down-thread
> pg_upgrade does use hard links; specifically to avoid duplication of data
> (in exchange you lose the ability to easily fall back to the old database
> version).  I'm doubtful that it, by itself, is contributing to this problem
> but again my experience in this area is limited.  But what you have shown us
> to this point is far from conclusive.

I'm pretty sure I understand how this happened, but it's speculation.

This database lives in /data/postgres-9.3, but PGDATA points to /postgres, which is a symbolic link to /data/postgres, which is in turn a symbolic link to postgres-9.3. The tablespaces are all in /data/postgres-9.3/tablespaces, but the entries in the pg_tblspc directory are symbolic links to /postgres/tablespaces (which do in fact resolve correctly), for example:

# ls -l /data/postgres-9.3/main/pg_tblspc/16747
lrwxrwxrwx 1 postgres postgres 27 2014-08-18 11:28 /data/postgres-9.3/main/pg_tblspc/16747 -> /postgres/tablespaces/uorsy/

Normally when pg_upgrade runs, you end up with two parallel directory hierarchies, and $PGDATA points to the new one when you're done. But because of the way our symbolic links work, both the new and the old directories are in the /data/postgres-9.3/tablespaces directory. You can't simply delete the old $PGDATA directory, because that would erase the entire database.
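
Because the tablespace symlinks go through /postgres, the same directory is reachable under two different path names. A small sketch for checking whether two paths end up in the same physical directory, using only `cd` and `pwd -P`; the function names `resolve_dir` and `same_dir` are hypothetical.

```shell
# Print the physical path of a directory with all symbolic links resolved.
# Runs in a subshell so the caller's working directory is left alone.
resolve_dir() {
    ( CDPATH= cd -- "$1" 2>/dev/null && pwd -P )
}

# Succeed (exit status 0) if both arguments resolve to the same directory.
same_dir() {
    a=$(resolve_dir "$1") || return 1
    b=$(resolve_dir "$2") || return 1
    [ "$a" = "$b" ]
}

# Example call (paths taken from this thread):
#   same_dir /postgres/tablespaces/uorsy /data/postgres-9.3/tablespaces/uorsy
```

If `same_dir` succeeds for the old and the new tablespace locations, the two hierarchies are the same directory on disk rather than two parallel copies.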

I'll have to dig around to prove to myself that this is the case.

Craig

Re: Re: pg_basebackup bug: base backup is double the size of the database

From: Craig James

On Thu, Jan 22, 2015 at 2:00 AM, Jerome VANANDRUEL -CAMPUS- <jerome.vanandruel@decathlon.com> wrote:
> Hi,
>
> I already encountered this kind of problem (sym links become hard links, so twice the size of the DB), with older version of Barman, which one are you using ?

I'll have to check the version.

In the meantime we fixed the problem by editing barman and adding "-H" to its rsync options.

Craig
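
The thread does not show the actual barman change, so the following is only a sketch of what an rsync copy with hard-link preservation looks like in general. The wrapper name `copy_preserving_hardlinks` is hypothetical; `-a` (archive mode) and `-H` (preserve hard links) are standard rsync options, and `-H` can only link files that are part of the same transfer.

```shell
# Copy the contents of SRC into DST, keeping hard-linked files linked
# on the receiving side instead of writing one copy per name.
copy_preserving_hardlinks() {
    rsync -a -H "$1"/ "$2"/
}

# Example call (destination path is illustrative only):
#   copy_preserving_hardlinks /data/postgres-9.3/tablespaces/uorsy /backup/uorsy
```

Running the same `find ... \! -links 1 -type f | wc -l` check on the destination afterwards is a quick way to confirm the links survived.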
 


Jérôme

On Thu, Jan 22, 2015 at 7:02 AM, David G Johnston <david.g.johnston@gmail.com> wrote:
Craig James-2 wrote
> We've encountered a serious bug with pg_basebackup. It seems to be
> following hard links and duplicating all files in the tablespaces rather
> than preserving links.

This entire sentence doesn't make sense to me.  How does one "follow" a
hard-link?  A soft-link yes but a hard-link is an alias to actual data.  I'm
not sure directory hard-linking is even allowed or used so following in that
sense don't compute...


> # ls -l /data/postgres-9.3/main/pg_tblspc/16747
> lrwxrwxrwx 1 postgres postgres 27 2014-08-18 11:28
> /data/postgres-9.3/main/pg_tblspc/16747 -> /postgres/tablespaces/uorsy/
>
> # du -sh /data/postgres-9.3/tablespaces/uorsy
> *35G*     /data/postgres-9.3/tablespaces/uorsy

Your tablespace points to "/postgres/tablespaces/uorsy/" yet you proceed to
show us the contents of "/data/postgres-9.3/tablesapces/uorsy"...


> # du -sh /data/postgres-9.3/tablespaces/uorsy/*
> *35G*     /data/postgres-9.3/tablespaces/uorsy/8208624
> *8.1M*    /data/postgres-9.3/tablespaces/uorsy/PG_9.3_201306121
> 4.0K    /data/postgres-9.3/tablespaces/uorsy/pgsql_tmp
> 4.0K    /data/postgres-9.3/tablespaces/uorsy/PG_VERSION
>
> # find /data/postgres-9.3/tablespaces/uorsy \! -links 1 -type f | wc -l
> *740*
>
> In other words, this tablespace has 35G of real data, plus 740 hard links
> that effectively duplicate each data file.

I can't quite figure out what to make of the above - as others have said it
looks like user error at first glance and we do not have the benefit of
exploring the system or a failing test case to reject that and start
exploring how pg_upgrade (if indeed that is even the culprit) could be at
fault.  Even if you didn't manually create the hard-links some configuration
allowed them to be created where they didn't belong.  It very well could be
something incorrectly allowed but unusual enough that it isn't accounted for
in pg_upgrade et al.  Guessing what exactly that might be is going to be
seen as likely futile effort.  Especially since it could be something as
simple as an errant copy command gone wrong that caused the situation to
exist.


> When we look at the same data in the archive that pg_basebackup creates
> (invoked via barman), we find this:
>
> # du -sh /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747
> *70G*     /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747
>
> # du -sh /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747/*
> *35G*     /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747/8208624
> *35G*     /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747/PG_9.3_201306121
> 4.0K    /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747/pgsql_tmp
> 4.0K    /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747/PG_VERSION
>
> # find /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747 \! -links 1 -type f | wc -l
> *0*
>
> That is, no hard links, and all of the data files are duplicated.

Of course the backup is going to create its own copy of the files...if it
were to store hard (or soft) links the restoration would fail if the data
being pointed to were to become corrupt.


> And of course, when we try to actually use this archive to recover, it's
> twice the size of the original database and doesn't fit on our disks.
>
> My guess is that pg_basebackup is using (or doing the equivalent of)
> rsync(1) without the --hard-links option, and that these hard links were
> created by pg_upgrade when we went from 8.4.17 to 9.3.5.

And how, exactly, did you perform the pg_upgrade?  As mentioned down-thread,
pg_upgrade can use hard links (its --link mode), specifically to avoid
duplication of data (in exchange you lose the ability to easily fall back to
the old database version).  I'm doubtful that it, by itself, is contributing
to this problem, but again my experience in this area is limited.  What you
have shown us to this point is far from conclusive.


> What can we do to fix this? The whole cluster is about 350 databases and
> 800GB.

Unfortunately I've gotten as far as I can with the limited, and slightly
conflicting, information provided and the documentation for pg_upgrade and
tablespaces/physical-database-layout.  At first glance there seem to be
some gaps in the documentation, but without actually exploring the capability
it's only a gut feeling from trying to answer some questions while reading
your post.  Some of that could come from not knowing whether what you show is
"normal".  Specifically, what is uorsy/8208624 in [...]9.3/tablespaces?

There are two things that can be discovered here:
Is there a bug in pg_upgrade or some other tool that you are using?
How do you manually fix whatever went wrong with your installation?

You likely care more about the former, but that likely requires more
interaction than is convenient to provide via e-mail.  You might have better
luck on IRC or with actual support people.  If you truly think this was
caused by a bug then reproducing it in a self-contained script would be most
helpful to the community.

The other fix, though obviously more costly (in terms of time), is to pg_dump
and restore to a clean setup.  That likely is not necessary: your database is
currently operational, so some of what you are seeing must be garbage somehow
dumped there at some point in the past.  Others have already hinted that the
hard links are said garbage - now you get to decide whether to act on that
assumption or obtain more information first.

David J.





--
View this message in context: http://postgresql.nabble.com/pg-basebackup-bug-base-backup-is-double-the-size-of-the-database-tp5834912p5834987.html
Sent from the PostgreSQL - admin mailing list archive at Nabble.com.


--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin




--
---------------------------------
Craig A. James
Chief Technology Officer
eMolecules, Inc.
---------------------------------

Re: Re: pg_basebackup bug: base backup is double the size of the database

From
David Johnston
Date:
On Thursday, January 22, 2015, Craig James <cjames@emolecules.com> wrote:


On Wed, Jan 21, 2015 at 10:02 PM, David G Johnston <david.g.johnston@gmail.com> wrote:
Craig James-2 wrote
> We've encountered a serious bug with pg_basebackup. It seems to be
> following hard links and duplicating all files in the tablespaces rather
> than preserving links.

This entire sentence doesn't make sense to me.  How does one "follow" a
hard-link?  A soft-link yes, but a hard-link is an alias to actual data.  I'm
not sure directory hard-linking is even allowed or used, so following in that
sense doesn't compute...

See the man page for rsync, the -H option, which explains it better:

       -H, --hard-links
              This  tells  rsync to look for hard-linked files in the transfer
              and link together the corresponding files on the receiving side.
              Without  this  option,  hard-linked  files  in  the transfer are
              treated as though they were separate files.


Which makes sense in a full system backup, but a single-cluster backup should not (I think) have any situations where a file and a matching hard link are both within the same source structure.  The -H option should not be needed because the scenario it solves is not expected to exist.  That it does exist means either user error or a use case that hasn't been considered.  It seems improvements could be made here, but a reliable test case describing the specific setup is needed first.
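
The difference the rsync man page describes can be reproduced on a small scale.
The following is only a sketch of the two copy behaviours, written in Python;
copy_tree is a hypothetical helper, it handles regular files only, and it says
nothing about how pg_basebackup or Barman are implemented:

```python
import os
import shutil
import stat


def copy_tree(src, dst, preserve_hard_links):
    """Copy regular files from src to dst, optionally re-linking hard links."""
    seen = {}  # (device, inode) in src -> first path written under dst
    for dirpath, _dirnames, filenames in os.walk(src):
        rel = os.path.relpath(dirpath, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            source = os.path.join(dirpath, name)
            target = os.path.join(target_dir, name)
            st = os.lstat(source)
            if not stat.S_ISREG(st.st_mode):
                continue  # symlinks and special files are out of scope here
            key = (st.st_dev, st.st_ino)
            if preserve_hard_links and st.st_nlink > 1 and key in seen:
                # Same inode already copied: link to that copy instead of
                # writing the data a second time (the -H behaviour).
                os.link(seen[key], target)
            else:
                # Without link tracking every path becomes a separate file.
                shutil.copy2(source, target)
                seen[key] = target
```

With preserve_hard_links=False a tree that contains a file and a hard link to
it takes twice the space at the destination, which is the doubling described
in the original report.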

David J.

 

Re: Re: pg_basebackup bug: base backup is double the size of the database

From
Jerome VANANDRUEL -CAMPUS-
Date:
Hi,

I have already encountered this kind of problem (symlinks become hard links, so twice the size of the DB) with an older version of Barman. Which version are you using?

Jérôme
