Thread: pg_dump -n switch lock schema from dml/ddl?

pg_dump -n switch lock schema from dml/ddl?

From: Tony Capobianco
Date: Thu, 2011-11-10 11:46 -0500
I'm testing out various pg_dump scenarios using the -n switch and I have
a few questions:

- When using the -n switch, is the whole schema locked from all non-read
DML/DDL operations?

- If the whole schema is locked, once each table is dumped, is it then
released for non-read DML/DDL operations?

- Once the final table is dumped
(i.e. pg_dump: dumping contents of table zip_data),
are there additional background processes that are still performing
maintenance tasks?  Or is the entire process complete and all objects
are released?


I'm asking because I have a schema with a large table with many indexes
that accounts for the majority of the dump time.  This version of the
dump takes about 4 hours.
As a workaround, we run 2 separate dumps in parallel: one of the schema
excluding the large table, and one of only the large table (the two
commands are sketched below).
The dump of just the large table takes 2.5 hours.  However, the dump of
the schema excluding the large table still takes 4 hours.
If pg_dump locks each table individually and releases the lock once that
table's dump completes, then I must be encountering lock contention.
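
Roughly, the two runs look like this (the schema and database names here
are placeholders; zip_data is the large table):

  # Session 1: everything in the schema except the large table
  pg_dump -v -n myschema -T myschema.zip_data -f rest.dump mydb &
  # Session 2: only the large table
  pg_dump -v -t myschema.zip_data -f zip_data.dump mydb &
  wait
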
Also, I use the -v switch; however, I'm not getting any information on
how long the dump of each object takes.  Is there an option that would
let me collect this information in the log file?
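
In the meantime, a workaround I'm trying is to timestamp each line of
the verbose output externally (a rough sketch, with placeholder names
again):

  # -v messages go to stderr; -f keeps stdout free of dump data
  pg_dump -v -n myschema -f myschema.dump mydb 2>&1 \
    | while IFS= read -r line; do
        printf '%s %s\n' "$(date +%T)" "$line"
      done > dump_timing.log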

Thanks.
Tony


Re: pg_dump -n switch lock schema from dml/ddl?

From: Jeff Davis
On Thu, 2011-11-10 at 11:46 -0500, Tony Capobianco wrote:
> I'm testing out various pg_dump scenarios using the -n switch and I have
> a few questions:
>
> - When using the -n switch, is the whole schema locked from all non-read
> DML/DDL operations?

No.

> - Once the final table is dumped
> (i.e. pg_dump: dumping contents of table zip_data),
> are there additional background processes that are still performing
> maintenance tasks?  Or is the entire process complete and all objects
> are released?

No background work results from using pg_dump.

> I'm asking because I have a schema with a large table with many indexes
> that accounts for the majority of the dump time.  This version of the
> dump takes about 4 hours.
> As a workaround, we run 2 separate dumps in parallel: one of the schema
> excluding the large table, and one of only the large table.

FYI: you need to be a little careful running two pg_dumps in parallel.
It may (though not necessarily) increase the speed, but it also means
that you get different snapshots for the big table and all the rest of
the data.

Ordinarily, you want only one snapshot so that all of the dumped data
reflects a single point in time. Otherwise, you may end up with
inconsistent data.
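
To make that concrete, here's a hypothetical transaction (the column and
the second table are invented for illustration) that would be split
across the two snapshots if it commits after one dump starts but before
the other:

  BEGIN;
  -- touches the big table, captured by the zip_data-only dump
  UPDATE zip_data SET region_id = 42 WHERE zip = '02134';
  -- touches a table captured only by the rest-of-schema dump
  UPDATE regions SET zip_count = zip_count + 1 WHERE id = 42;
  COMMIT;
  -- If this commits between the two dumps' start times, a restore from
  -- the pair of dumps sees one UPDATE without the other.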

> The dump of just the large table takes 2.5 hours.  However, the dump of
> the schema excluding the large table still takes 4 hours.
> If pg_dump locks each table individually and releases the lock once that
> table's dump completes, then I must be encountering lock contention.
> Also, I use the -v switch; however, I'm not getting any information on
> how long the dump of each object takes.  Is there an option that would
> let me collect this information in the log file?

If you'd like to know what's happening on your system, the best way is
to start out with (while the pg_dumps are running):

  SELECT * FROM pg_stat_activity;

in a separate client connection. If the "waiting" flag is true on one
query for a significant amount of time, it may be lock-related.
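
If you do see a backend stuck with waiting = true, you can also check
which lock requests have not yet been granted (one way to do it, using
the pg_locks system view):

  SELECT pid, locktype, relation::regclass AS relation, mode, granted
  FROM pg_locks
  WHERE NOT granted;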

Regards,
    Jeff Davis