Re: pg_dump and thousands of schemas

From: Tom Lane
Subject: Re: pg_dump and thousands of schemas
Date:
Msg-id: 28392.1337976170@sss.pgh.pa.us
In reply to: Re: pg_dump and thousands of schemas  (Jeff Janes <jeff.janes@gmail.com>)
Responses: Re: pg_dump and thousands of schemas  (Jeff Janes <jeff.janes@gmail.com>)
List: pgsql-performance
Jeff Janes <jeff.janes@gmail.com> writes:
> For dumping entire databases, it looks like the biggest problem is
> going to be LockReassignCurrentOwner in the server.  And that doesn't
> seem to be easy to fix, as any change to it to improve pg_dump will
> risk degrading normal use cases.

> If we want to be able to efficiently dump entire databases in a
> scalable way, it seems like there should be some way to obtain a
> database-wide AccessShare lock, which blocks AccessExclusive locks on
> any object in the database, and turns ordinary object-level
> AccessShare lock requests into no-ops.

I thought a little bit about that, but it seems fairly unworkable.
In the first place, pg_dump doesn't necessarily want a lock on every table
in the database.  In the second, such a lock mechanism would have
logical difficulties, notably whether it would be considered to apply to
tables created after the lock request occurs.  If it does, then it would
effectively block all such creations (since creation takes exclusive
locks that ought to conflict).  If it doesn't, how would you implement
that?  In any case, we'd be adding significant cost and complexity to
lock acquisition operations, for something that only whole-database
pg_dump operations could conceivably make use of.

As far as the specific problem at hand goes, I think there might be a
less invasive solution.  I poked into the behavior with gdb (and you're
right, LockReassignCurrentOwner does get called during portal drop)
and noted that although pg_dump is indeed holding thousands of locks,
any given statement that it issues touches only a few of them.  So the
loop in LockReassignCurrentOwner iterates over the whole lock table but
does something useful at only a few entries.
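
To put a number on the waste, here's a toy standalone sketch of that
loop shape (not the actual lock.c code; LocalLock, the table, and the
constants are simplified stand-ins invented for illustration):

#include <stdio.h>

typedef struct LocalLock
{
    int owner;              /* which resource owner holds this entry */
} LocalLock;

#define NUM_LOCKS 5000      /* pg_dump of thousands of schemas */
#define XACT_OWNER 1        /* stand-in for the transaction's owner */
#define PORTAL_OWNER 2      /* stand-in for the current portal's owner */

static LocalLock local_lock_table[NUM_LOCKS];

/*
 * Shape of the current reassignment pass: a full scan of the local
 * lock table on every call, even though only a few entries belong to
 * the portal's resource owner.
 */
static void
lock_reassign_current_owner(void)
{
    int useful = 0;

    for (int i = 0; i < NUM_LOCKS; i++)
    {
        if (local_lock_table[i].owner != PORTAL_OWNER)
            continue;                       /* wasted iteration */
        local_lock_table[i].owner = XACT_OWNER;
        useful++;
    }
    printf("scanned %d entries, %d needed work\n", NUM_LOCKS, useful);
}

int
main(void)
{
    for (int i = 0; i < NUM_LOCKS; i++)
        local_lock_table[i].owner = XACT_OWNER;

    /* a single pg_dump statement touches only a handful of its locks */
    local_lock_table[17].owner = PORTAL_OWNER;
    local_lock_table[42].owner = PORTAL_OWNER;

    lock_reassign_current_owner();          /* runs at every portal drop */
    return 0;
}

With thousands of locks held, nearly every iteration is a no-op, and
the whole scan repeats at every portal drop.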

We could fix things for this usage pattern with what seems to me to
be a pretty low-overhead method: add a fixed-size array to
ResourceOwners in which we can remember up to N LOCALLOCKs, for N around
10 or so.  Add a LOCALLOCK to that array when we add the ResourceOwner to
that LOCALLOCK, so long as the array hasn't overflowed.  (If the array
does overflow, we mark it as overflowed and stop adding entries.)  Then,
in LockReassignCurrentOwner, we only iterate over the whole hash table
if the ResourceOwner's array has overflowed.  If it hasn't, use the
array to visit just the LOCALLOCKs that need work.
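
A rough sketch of what I have in mind (much simplified; the real
LOCALLOCK and ResourceOwner structures carry far more state, and the
two helper functions below stand for logic that already exists):

#include <stdbool.h>
#include <stddef.h>

#define MAX_RESOWNER_LOCKS 10           /* "N around 10 or so" */

typedef struct LOCALLOCK LOCALLOCK;     /* opaque for this sketch */

typedef struct ResourceOwnerData
{
    size_t     nlocks;                  /* valid entries in locks[] */
    bool       locks_overflowed;        /* gave up tracking? */
    LOCALLOCK *locks[MAX_RESOWNER_LOCKS];
} ResourceOwnerData;

/* Placeholders for logic that already exists today. */
static void reassign_one(LOCALLOCK *ll) { (void) ll; }
static void scan_whole_lock_table(ResourceOwnerData *o) { (void) o; }

/*
 * Call this wherever we currently add the ResourceOwner to a
 * LOCALLOCK.  Once the array overflows, we set the flag and stop
 * remembering entries.
 */
static void
resowner_remember_lock(ResourceOwnerData *owner, LOCALLOCK *locallock)
{
    if (owner->locks_overflowed)
        return;
    if (owner->nlocks >= MAX_RESOWNER_LOCKS)
    {
        owner->locks_overflowed = true;
        return;
    }
    owner->locks[owner->nlocks++] = locallock;
}

/*
 * LockReassignCurrentOwner then needs the full hash-table scan only
 * in the overflowed case; otherwise it visits just the remembered
 * entries (the parent owner would remember them in turn; omitted).
 */
static void
lock_reassign_current_owner(ResourceOwnerData *owner)
{
    if (owner->locks_overflowed)
        scan_whole_lock_table(owner);   /* today's behavior, now rare */
    else
    {
        for (size_t i = 0; i < owner->nlocks; i++)
            reassign_one(owner->locks[i]);
        owner->nlocks = 0;
    }
}

int
main(void)
{
    ResourceOwnerData owner = {0};

    resowner_remember_lock(&owner, NULL);   /* dummy entry for the demo */
    lock_reassign_current_owner(&owner);    /* takes the fast path */
    return 0;
}

The fast path is then proportional to the locks the portal actually
touched, not to everything the backend holds, which is exactly the
pg_dump case.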

Comments?

            regards, tom lane
