Обсуждение: Data duplication when moving datafiles from one server to another.

Поиск
Список
Период
Сортировка

Data duplication when moving datafiles from one server to another.

От
Iñigo Martinez Lasala
Дата:
Good evening.

Yesterday we experienced some data duplication in several database tables after one sysadmin decided to test in a production environment an rsync script in order to migrate a database from one server to another one.
Postgresql (8.2) was running in source server and rsync script was launched from second one. Second one server had a one day old copy of the same database. Rsync script create a datafile replica in destination server.
Our sysadmin swear he didn't launch the script in a reverse way (that is, from destination to source)... so my question is
How this data duplication could happen? Due to an rsync lock in checkpoint segment or transaction logs? Or he has mistaken source with destination server?

Thans in advance.


Re: Data duplication when moving datafiles from one server to another.

От
"Kevin Grittner"
Дата:
IñigoMartinez Lasala <imartinez@vectorsf.com> wrote:

> Yesterday we experienced some data duplication in several database
> tables after one sysadmin decided to test in a production
> environment an rsync script in order to migrate a database from
> one server to another one.
> Postgresql (8.2) was running in source server and rsync script was
> launched from second one. Second one server had a one day old copy
> of the same database. Rsync script create a datafile replica in
> destination server.
> Our sysadmin swear he didn't launch the script in a reverse way
> (that is, from destination to source)... so my question is
> How this data duplication could happen?

Where did this duplication happen -- source or destination?  If in
the source, I would suspect operator error by your sysadmin; I have
never seen rsync cause problems with the source files.  If on the
target, I would suspect that you didn't follow PITR recovery
techniques properly.

If you need to follow up on this, please include a few more details,
like the exact PostgreSQL version from `select version();` and the
exact version of the OS.  The exact rsync statement(s) used would be
good to see, too -- maybe from the bash history (or similar) on the
machines.

-Kevin

Re: Data duplication when moving datafiles from one server to another.

От
Scott Marlowe
Дата:
On Mon, Dec 20, 2010 at 12:22 PM, Iñigo Martinez Lasala
<imartinez@vectorsf.com> wrote:
> Good evening.
>
> Yesterday we experienced some data duplication in several database tables
> after one sysadmin decided to test in a production environment an rsync
> script in order to migrate a database from one server to another one.
> Postgresql (8.2) was running in source server and rsync script was launched
> from second one. Second one server had a one day old copy of the same
> database. Rsync script create a datafile replica in destination server.
> Our sysadmin swear he didn't launch the script in a reverse way (that is,
> from destination to source)... so my question is
> How this data duplication could happen? Due to an rsync lock in checkpoint
> segment or transaction logs? Or he has mistaken source with destination
> server?

I use this method all the time to backup big dbs from one machine to another:

(on remote machine):
/etc/init.d/postgresql stop

(on main server):
rsync -avl --delete /data/* backupserver:/data/* # takes forever
sudo /etc/init.d/postgresql stop
rsync -avl --delete /data/* backupserver:/data/* # fast, we're just catching up
sudo /etc/init.d/postgresql start

(on remote machine):
/etc/init.d/postgresql start

You can run such a script at 2 in the morning with a few minutes
downtime without having to jump through a lot of hoops.

So, can you show us what commands your sysadmin / dba actually typed?
My guess is he left out the --delete or left the backup server db up
and running at the time of the rsync.  It needs to be shut down to do
this right.

Re: Data duplication when moving datafiles from one server to another.

От
Iñigo Martinez Lasala
Дата:
Hi Kevin, Scott.

Thank you for your response.

Data duplication happened on source server (that is, on production environment). This data duplication affected only on some tables, with some duplicated PKs and altered sequences (sequences value didn't match "select max()" and we had to fix it manually).
Postgres version is 8.2.7 64bit on Ubuntu Server 8.04 LTS

Rsync command was:
rsync -avt -lHpogDtx /srv/postgresql/8.2/main/ root@destination_server:/srv/postgresql/8.2/main/

and was launched from source server.

Remember: source server was running.

If rsync sentence was launched correctly (I assume yes), could a temporary lock over a checkpoint segment cause this behaviour? 

We have managed to delete duplicated registers and restore incorrect sequences, but I don't really know what happened. Since I've been indirect responsable from this (sysadmin reports me) I need to clarify the cause. I need an valid explanation for our client.


-----Original Message-----
From: Kevin Grittner <Kevin.Grittner@wicourts.gov>
To: pgsql-admin@postgresql.org, IñigoMartinez Lasala <imartinez@vectorsf.com>
Subject: Re: [ADMIN] Data duplication when moving datafiles from one server to another.
Date: Mon, 20 Dec 2010 13:43:01 -0600

IñigoMartinez Lasala <imartinez@vectorsf.com> wrote:
> Yesterday we experienced some data duplication in several database
> tables after one sysadmin decided to test in a production
> environment an rsync script in order to migrate a database from
> one server to another one.
> Postgresql (8.2) was running in source server and rsync script was
> launched from second one. Second one server had a one day old copy
> of the same database. Rsync script create a datafile replica in
> destination server. 
> Our sysadmin swear he didn't launch the script in a reverse way
> (that is, from destination to source)... so my question is
> How this data duplication could happen?
Where did this duplication happen -- source or destination?  If in
the source, I would suspect operator error by your sysadmin; I have
never seen rsync cause problems with the source files.  If on the
target, I would suspect that you didn't follow PITR recovery
techniques properly.
If you need to follow up on this, please include a few more details,
like the exact PostgreSQL version from `select version();` and the
exact version of the OS.  The exact rsync statement(s) used would be
good to see, too -- maybe from the bash history (or similar) on the
machines.
-Kevin

Re: Data duplication when moving datafiles from one server to another.

От
"Kevin Grittner"
Дата:
IñigoMartinez Lasala <imartinez@vectorsf.com> wrote:

> Data duplication happened on source server (that is, on production
> environment). This data duplication affected only on some tables,
> with some duplicated PKs and altered sequences (sequences value
> didn't match "select max()" and we had to fix it manually).

We rsync from production servers all the time, and have never seen
such problems.  Maybe someone else has ideas beyond my long shots
(below).

> Postgres version is 8.2.7 64bit on Ubuntu Server 8.04 LTS

There have been 12 bug fix releases for the 8.2 major release since
8.2.7.  It's not impossible that you somehow hit a bug which has
been fixed in the last 33 months.

http://www.postgresql.org/support/versioning

http://www.postgresql.org/docs/8.2/static/release.html

> Rsync command was:
> rsync -avt -lHpogDtx /srv/postgresql/8.2/main/
> root@destination_server:/srv/postgresql/8.2/main/

Did it run successfully on the first try, or might there have been
an attempt where the target was mis-specified?  A space instead of a
colon between the target server and path might have caused the kind
of corruption you're seeing.

Also -- I would avoid using root for such things.  We have remote
root login disabled entirely on our machines, and limit the use of
that login to situations where there is no reasonable alternative.

> and was launched from source server.

I find it's harder to accidentally damage the remote copy, so I run
rsync from the target where possible.

-Kevin