Thread: Re: [PERFORM] pg_dump and thousands of schemas

Re: [PERFORM] pg_dump and thousands of schemas

From: Tatsuo Ishii
Date:
>> Yeah, Jeff's experiments indicated that the remaining bottleneck is lock
>> management in the server.  What I fixed so far on the pg_dump side
>> should be enough to let partial dumps run at reasonable speed even if
>> the whole database contains many tables.  But if psql is taking
>> AccessShareLock on lots of tables, there's still a problem.
>
> OK, I modified the part of pg_dump where a tremendous number of LOCK
> TABLE statements are issued, replacing them with a single LOCK TABLE
> naming multiple tables. With 100k tables, the LOCK statements took 13
> minutes in total; now they take only 3 seconds. Comments?
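
For illustration, a minimal sketch of the idea with hypothetical table
names (not the exact statements pg_dump generates):

    -- before: one LOCK statement, i.e. one round trip, per table
    LOCK TABLE s1.t1 IN ACCESS SHARE MODE;
    LOCK TABLE s1.t2 IN ACCESS SHARE MODE;
    LOCK TABLE s1.t3 IN ACCESS SHARE MODE;

    -- after: a single LOCK statement naming many tables at once
    LOCK TABLE s1.t1, s1.t2, s1.t3 IN ACCESS SHARE MODE;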

Shall I commit to master and all supported branches?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

Re: [PERFORM] pg_dump and thousands of schemas

From: Tom Lane
Date:
Tatsuo Ishii <ishii@postgresql.org> writes:
>> OK, I modified the part of pg_dump where a tremendous number of LOCK
>> TABLE statements are issued, replacing them with a single LOCK TABLE
>> naming multiple tables. With 100k tables, the LOCK statements took 13
>> minutes in total; now they take only 3 seconds. Comments?

> Shall I commit to master and all supported branches?

I'm not excited by this patch.  It dodges the O(N^2) lock behavior for
the initial phase of acquiring the locks, but it does nothing for the
lock-related slowdown occurring in all pg_dump's subsequent commands.
I think we really need to get in the server-side fix that Jeff Janes is
working on, and then re-measure to see if something like this is still
worth the trouble.  I am also a tad concerned about whether we might not
have problems with parsing memory usage, or some such, with thousands of
tables being listed in a single command.

            regards, tom lane

Re: [PERFORM] pg_dump and thousands of schemas

From: Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Tatsuo Ishii <ishii@postgresql.org> writes:
> > Shall I commit to master and all supported branches?
>
> I'm not excited by this patch.  It dodges the O(N^2) lock behavior for
> the initial phase of acquiring the locks, but it does nothing for the
> lock-related slowdown occurring in all pg_dump's subsequent commands.
> I think we really need to get in the server-side fix that Jeff Janes is
> working on, and then re-measure to see if something like this is still
> worth the trouble.  I am also a tad concerned about whether we might not
> have problems with parsing memory usage, or some such, with thousands of
> tables being listed in a single command.

I can't imagine a case where it's actually better to incur the latency
penalty (which is apparently on the order of *minutes* of additional
time here..) than to worry about the potential memory usage of having to
parse such a command.

If that's really a concern, where is that threshold, and could we simply
cap pg_dump's operations based on it?  Is 1000 alright?  Doing a 'lock'
w/ 1000 tables at a time is still going to be hugely better than doing
them individually and the amount of gain between every-1000 and
all-at-once is likely to be pretty minimal anyway...
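
For example, a hypothetical capped variant (the cap and table names are
made up here) would issue one statement per chunk of 1000 relations:

    -- batch 1: relations 1..1000 in a single round trip
    LOCK TABLE s1.t1, s1.t2, /* ..., */ s1.t1000 IN ACCESS SHARE MODE;
    -- batch 2: relations 1001..2000, and so on
    LOCK TABLE s1.t1001, s1.t1002, /* ..., */ s1.t2000 IN ACCESS SHARE MODE;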

The current situation where the client-to-server latency accounts for
multiple minutes of time is just ridiculous, however, so I feel we need
some form of this patch, even if the server side is magically made much
faster.  The constant back-and-forth isn't cheap.

    Thanks,

        Stephen

Re: [PERFORM] pg_dump and thousands of schemas

From: Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> The current situation where the client-to-server latency accounts for
> multiple minutes of time is just ridiculous, however, so I feel we need
> some form of this patch, even if the server side is magically made much
> faster.  The constant back-and-forth isn't cheap.

No, you're missing my point.  I don't believe that client-to-server
latency, or any other O(N) cost, has anything to do with the problem
here.  The problem, as Jeff has demonstrated, is the O(N^2) costs
associated with management of the local lock table.  It is utterly
pointless to worry about O(N) costs until that's fixed; and it's just
wrong to claim that you've created a significant speedup by eliminating
a constant factor when all you've done is staved off occurrences of the
O(N^2) problem.
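
(To see the scale: if, because of the local lock table, the i-th lock
acquisition costs on the order of i, then taking N locks costs about
N(N+1)/2 steps -- roughly 5 * 10^9 for N = 100,000 -- whereas any
per-lock constant cost contributes only N = 100,000 steps.)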

Once we've gotten rid of the local lock table problem, we can re-measure
and see what the true benefit of this patch is.  I'm of the opinion
that it will be in the noise compared to the overall runtime of pg_dump.
I could be wrong, but you won't convince me of that with measurements
taken while the local lock table problem is still there.

            regards, tom lane

Re: [PERFORM] pg_dump and thousands of schemas

From: Tatsuo Ishii
Date:
> I'm not excited by this patch.  It dodges the O(N^2) lock behavior for
> the initial phase of acquiring the locks, but it does nothing for the
> lock-related slowdown occurring in all pg_dump's subsequent commands.
> I think we really need to get in the server-side fix that Jeff Janes is
> working on, and then re-measure to see if something like this is still
> worth the trouble.

Well, even with the current backend, locking 100,000 tables took only
3 seconds in my test. So even if Jeff Janes's fix succeeds, I guess it
will just save those 3 seconds in my case; and if the number of tables
is smaller, the saving will be smaller too. This suggests that most of
the time spent processing LOCK goes into communication between pg_dump
and the backend. Of course, this is just my guess.

> I am also a tad concerned about whether we might not
> have problems with parsing memory usage, or some such, with thousands of
> tables being listed in a single command.

That's easy to fix: just split the single big LOCK statement into
several smaller LOCK statements.

My big concern is that even if the locking part is fixed (either by
Jeff Janes's fix or by mine), much of pg_dump's time is still spent on
SELECTs against the system catalogs. The fix would be to turn many
SELECTs into a single SELECT, probably using a big IN clause on table
OIDs.
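
A rough sketch of the shape of that change, using pg_class and made-up
OIDs (the real per-object queries in pg_dump are more involved):

    -- instead of one catalog SELECT per table ...
    SELECT relname, relnamespace
    FROM pg_catalog.pg_class WHERE oid = 16385;
    SELECT relname, relnamespace
    FROM pg_catalog.pg_class WHERE oid = 16388;

    -- ... fetch all of them in a single round trip
    SELECT oid, relname, relnamespace
    FROM pg_catalog.pg_class
    WHERE oid IN (16385, 16388, 16391);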
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

Re: [PERFORM] pg_dump and thousands of schemas

From: Bruce Momjian
Date:
On Thu, May 31, 2012 at 09:20:43AM +0900, Tatsuo Ishii wrote:
> >> Yeah, Jeff's experiments indicated that the remaining bottleneck is lock
> >> management in the server.  What I fixed so far on the pg_dump side
> >> should be enough to let partial dumps run at reasonable speed even if
> >> the whole database contains many tables.  But if psql is taking
> >> AccessShareLock on lots of tables, there's still a problem.
> >
> > OK, I modified the part of pg_dump where a tremendous number of LOCK
> > TABLE statements are issued, replacing them with a single LOCK TABLE
> > naming multiple tables. With 100k tables, the LOCK statements took 13
> > minutes in total; now they take only 3 seconds. Comments?
>
> Shall I commit to master and all supported branches?

Was this applied?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +


Re: [PERFORM] pg_dump and thousands of schemas

From: Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> On Thu, May 31, 2012 at 09:20:43AM +0900, Tatsuo Ishii wrote:
>>> OK, I modified the part of pg_dump where a tremendous number of LOCK
>>> TABLE statements are issued, replacing them with a single LOCK TABLE
>>> naming multiple tables. With 100k tables, the LOCK statements took 13
>>> minutes in total; now they take only 3 seconds. Comments?

>> Shall I commit to master and all supported branches?

> Was this applied?

No, we fixed the server side instead.

            regards, tom lane


Re: [PERFORM] pg_dump and thousands of schemas

From: Bruce Momjian
Date:
On Thu, Aug 30, 2012 at 04:51:56PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > On Thu, May 31, 2012 at 09:20:43AM +0900, Tatsuo Ishii wrote:
> >>> OK, I modified the part of pg_dump where a tremendous number of LOCK
> >>> TABLE statements are issued, replacing them with a single LOCK TABLE
> >>> naming multiple tables. With 100k tables, the LOCK statements took 13
> >>> minutes in total; now they take only 3 seconds. Comments?
>
> >> Shall I commit to master and all supported branches?
>
> > Was this applied?
>
> No, we fixed the server side instead.

Again, thanks.  I knew we had fixed the server, but it wasn't clear to
me that this made the client changes unnecessary; I think I now do
remember the discussion about that.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +


Re: [PERFORM] pg_dump and thousands of schemas

From: Robert Haas
Date:
On Thu, Aug 30, 2012 at 4:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Bruce Momjian <bruce@momjian.us> writes:
>> On Thu, May 31, 2012 at 09:20:43AM +0900, Tatsuo Ishii wrote:
>>>> OK, I modified the part of pg_dump where a tremendous number of LOCK
>>>> TABLE statements are issued, replacing them with a single LOCK TABLE
>>>> naming multiple tables. With 100k tables, the LOCK statements took 13
>>>> minutes in total; now they take only 3 seconds. Comments?
>
>>> Shall I commit to master and all supported branches?
>
>> Was this applied?
>
> No, we fixed the server side instead.

But only for 9.2, right?  So people running back branches are still screwed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [PERFORM] pg_dump and thousands of schemas

From: Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Aug 30, 2012 at 4:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Bruce Momjian <bruce@momjian.us> writes:
>>> On Thu, May 31, 2012 at 09:20:43AM +0900, Tatsuo Ishii wrote:
>>>> OK, I modified the part of pg_dump where a tremendous number of LOCK
>>>> TABLE statements are issued, replacing them with a single LOCK TABLE
>>>> naming multiple tables. With 100k tables, the LOCK statements took 13
>>>> minutes in total; now they take only 3 seconds. Comments?

>>> Was this applied?

>> No, we fixed the server side instead.

> But only for 9.2, right?  So people running back branches are still screwed.

Yeah, but they're screwed anyway, because there are a bunch of O(N^2)
behaviors involved here, not all of which are masked by what Tatsuo-san
suggested.

Six months or a year from now, we might have enough confidence in that
batch of 9.2 fixes to back-port them en masse.  Don't want to do it
today though.

            regards, tom lane


Re: [PERFORM] pg_dump and thousands of schemas

From: Jeff Janes
Date:
On Thu, Aug 30, 2012 at 8:17 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Thu, Aug 30, 2012 at 4:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Bruce Momjian <bruce@momjian.us> writes:
>>>> On Thu, May 31, 2012 at 09:20:43AM +0900, Tatsuo Ishii wrote:
>>>>> OK, I modified the part of pg_dump where a tremendous number of LOCK
>>>>> TABLE statements are issued, replacing them with a single LOCK TABLE
>>>>> naming multiple tables. With 100k tables, the LOCK statements took 13
>>>>> minutes in total; now they take only 3 seconds. Comments?
>
>>>> Was this applied?
>
>>> No, we fixed the server side instead.
>
>> But only for 9.2, right?  So people running back branches are still screwed.
>
> Yeah, but they're screwed anyway, because there are a bunch of O(N^2)
> behaviors involved here, not all of which are masked by what Tatsuo-san
> suggested.

All of the other ones that I know of were associated with pg_dump
itself, and since it is recommended to run the newer version of
pg_dump against the older version of the server, no back patching
would be necessary to get the benefits of those particular fixes.

> Six months or a year from now, we might have enough confidence in that
> batch of 9.2 fixes to back-port them en masse.  Don't want to do it
> today though.


What would be the recommendation for people trying to upgrade, but who
can't get their data out in a reasonable window?

Putting Tatsuo-san's change into a future pg_dump might be more
conservative than back-porting the server's lock-table change to the
server version they are trying to get rid of.

Cheers,

Jeff


Re: [PERFORM] pg_dump and thousands of schemas

From: Robert Haas
Date:
On Sun, Sep 2, 2012 at 5:39 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Thu, Aug 30, 2012 at 8:17 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> On Thu, Aug 30, 2012 at 4:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> Bruce Momjian <bruce@momjian.us> writes:
>>>>> On Thu, May 31, 2012 at 09:20:43AM +0900, Tatsuo Ishii wrote:
>>>>>> OK, I modified the part of pg_dump where a tremendous number of LOCK
>>>>>> TABLE statements are issued, replacing them with a single LOCK TABLE
>>>>>> naming multiple tables. With 100k tables, the LOCK statements took 13
>>>>>> minutes in total; now they take only 3 seconds. Comments?
>>
>>>>> Was this applied?
>>
>>>> No, we fixed the server side instead.
>>
>>> But only for 9.2, right?  So people running back branches are still screwed.
>>
>> Yeah, but they're screwed anyway, because there are a bunch of O(N^2)
>> behaviors involved here, not all of which are masked by what Tatsuo-san
>> suggested.
>
> All of the other ones that I know of were associated with pg_dump
> itself, and since it is recommended to run the newer version of
> pg_dump against the older version of the server, no back patching
> would be necessary to get the benefits of those particular fixes.
>
>> Six months or a year from now, we might have enough confidence in that
>> batch of 9.2 fixes to back-port them en masse.  Don't want to do it
>> today though.
>
>
> What would be the recommendation for people trying to upgrade, but who
> can't get their data out in a reasonable window?
>
> Putting Tatsuo-san's change into a future pg_dump might be more
> conservative than back-porting the server's lock-table change to the
> server version they are trying to get rid of.

What he said.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company