Re: Scriptable way to validate a pg_dump restore ?

Поиск
Список
Период
Сортировка
От Adrian Klaver
Тема Re: Scriptable way to validate a pg_dump restore ?
Дата
Msg-id e0fa9915-9356-45e6-891f-d24b6a274e7e@aklaver.com
обсуждение исходный текст
Ответ на Scriptable way to validate a pg_dump restore ?  (Laura Smith <n5d9xq3ti233xiyif2vp@protonmail.ch>)
Ответы Re: Scriptable way to validate a pg_dump restore ?
Список pgsql-general
On 1/29/24 00:12, Laura Smith wrote:
> Hi
> 
> Let's say I've got a scenario where I'm doing a pg_dump replication rather than online streaming, e.g. due to air-gap
orwhatever.
 
> 
> Is there a scriptable way to validate the restore ?  e.g. using doing something clever with ctid or something to
ensureboth the schema and all its rows were restored to the same point at which the dump was taken ?
 

Assuming you are using pg_restore on a non-plain text dump file then
from pg_restore.c:

* pg_restore.c

*      pg_restore is an utility extracting postgres database definitions
  *      from a backup archive created by pg_dump using the archiver
  *      interface.
  *
  *      pg_restore will read the backup archive and
  *      dump out a script that reproduces
  *      the schema of the database in terms of
  *                user-defined types
  *                user-defined functions
  *                tables
  *                indexes
  *                aggregates
  *                operators
  *                ACL - grant/revoke
  *
  * the output script is SQL that is understood by PostgreSQL
  *
  * Basic process in a restore operation is:
  *
  *      Open the Archive and read the TOC.
  *      Set flags in TOC entries, and *maybe* reorder them.
  *      Generate script to stdout
  *      Exit

Then:

    pg_restore -l -f <output_file> <dump_file>

to get the TOC mentioned above. Walk through that to verify schema is 
the same in the restored database.

This will not tell you whether all the data was transferred. You will 
either have to trust from pg_dump.c:

  *      pg_dump will read the system catalogs in a database and dump out a
  *      script that reproduces the schema in terms of SQL that is 
understood
  *      by PostgreSQL
  *
  *      Note that pg_dump runs in a transaction-snapshot mode transaction,
  *      so it sees a consistent snapshot of the database including system
  *      catalogs. However, it relies in part on various specialized backend
  *      functions like pg_get_indexdef(), and those things tend to look at
  *      the currently committed state.  So it is possible to get 'cache
  *      lookup failed' error if someone performs DDL changes while a 
dump is
  *      happening. The window for this sort of thing is from the 
acquisition
  *      of the transaction snapshot to getSchemaData() (when pg_dump 
acquires
  *      AccessShareLock on every table it intends to dump). It isn't 
very large,
  *      but it can happen.

Or come up with way to capture the state of the data at the time of dump 
and then compare to restored database. Something like Ron posted.

> 
> Thanks !
> 
> Laura
> 
> 

-- 
Adrian Klaver
adrian.klaver@aklaver.com




В списке pgsql-general по дате отправления:

Предыдущее
От: Ron Johnson
Дата:
Сообщение: Re: Query performance in 9.6.24 vs 14.10
Следующее
От: Ron Johnson
Дата:
Сообщение: Re: Query performance in 9.6.24 vs 14.10