Обсуждение: How do i compare 2 postgresql instanses ?

Поиск
Список
Период
Сортировка

How do i compare 2 postgresql instanses ?

От
Potluri Srikanth
Дата:

How do i compare 2 postgresql instanses ?
1) compare all objects including data in tables a) production box b) DR site ?

can you help me with the query, through which i can get all the object of each schema/database ?



Regards,
Srikanth k Potluri

Re: How do i compare 2 postgresql instanses ?

От
Achilleas Mantzios
Дата:
Στις Tuesday 08 July 2008 15:42:28 ο/η Potluri Srikanth έγραψε:
>
>  How do i compare 2 postgresql instanses ?
>  1) compare all objects including data in tables a) production box b)
> DR site ?
>  can you help me with the query, through which i can get all the
> object of each schema/database ?
>  Regards,
>  Srikanth k Potluri
>

you could write a little program (awk) to produce an sql file of the form:
select count(*) from airlines;
select count(*) from airports;
select count(*) from airticket;
select count(*) from airticketflights;
....
select count(*) from zzzobjects;

and then run it against the two databases, and diff the outputs.
If the outputs are not identical then you have 2 different databases,
otherwise you have to search more to know the answer.
So this technique can prove that yours DBs are *not* identical, but does not say anything
about the opposite.

--
Achilleas Mantzios

Re: How do i compare 2 postgresql instanses ?

От
"Michael Nacos"
Дата:
Hello there. Simple counts can certainly prove two databases are not the same version. But what if you actually want to look at the data? I have been thinking about row checksums, would you say this would be a reasonable way of verifying two databases are equivalent? I have many other things on my mind, so I haven't tried implementing this yet.

Michael


you could write a little program (awk) to produce an sql file of the form:
select count(*) from airlines;
select count(*) from airports;
select count(*) from airticket;
select count(*) from airticketflights;
....
select count(*) from zzzobjects;

and then run it against the two databases, and diff the outputs.
If the outputs are not identical then you have 2 different databases,
otherwise you have to search more to know the answer.
So this technique can prove that yours DBs are *not* identical, but does not say anything
about the opposite.

--
Achilleas Mantzios


Re: How do i compare 2 postgresql instanses ?

От
"Kevin Grittner"
Дата:
>>> "Michael Nacos" <m.nacos@gmail.com> wrote:
> what if you actually want to look at the data? I have been
> thinking about row checksums, would you say this would be a
reasonable way
> of verifying two databases are equivalent?

Our home-grown replication technique does synchronization checks
during idle time.  It works pretty well for us, and we do use md5sum
techniques to minimize bandwidth.

In summary, a process requests a middle tier on a "source" machine to
send "sync" information; that middle tier reads some set of rows and
sends the md5 sum of all data in the rows, along with information on
what rows from which table were used to develop the number.  The
requesting process then asks similar middle tiers on all replication
targets to return their md5sum for the same set of rows.  If there are
any differences, the source and all differing replication targets are
queried for the actual values, which are compared using match-merge
type logic on the primary key sequence.

Something similar might work for you.

-Kevin

Re: How do i compare 2 postgresql instanses ?

От
"Michael Nacos"
Дата:

Thanks Kevin... I am working on something different right now but I'll soon have to give it a go, one way or the other...

Michael

On Tue, Jul 8, 2008 at 3:28 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:

Our home-grown replication technique does synchronization checks
during idle time.  It works pretty well for us, and we do use md5sum
techniques to minimize bandwidth.

In summary, a process requests a middle tier on a "source" machine to
send "sync" information; that middle tier reads some set of rows and
sends the md5 sum of all data in the rows, along with information on
what rows from which table were used to develop the number.  The
requesting process then asks similar middle tiers on all replication
targets to return their md5sum for the same set of rows.  If there are
any differences, the source and all differing replication targets are
queried for the actual values, which are compared using match-merge
type logic on the primary key sequence.

Something similar might work for you.

-Kevin