Discussion: remote pg_dump hangs always at same table

remote pg_dump hangs always at same table

From
Axel Rau
Date:
Hi all,

I have a daily remote backup session like
   pg_dump -h db -i -Fp database | gzip > file
which hangs after producing about 200 MB of compressed output:
----------
     TIME CMD
  0:00.05 /usr/local/pgsql/bin/pg_dumpall -h db -i
  0:21.45 /usr/local/pgsql/bin/pg_dump -h db -i -Fp database
----------
It always happens while copying this table:
------------
                    Table "archiveopteryx.bodyparts"
  Column |  Type   |                     Modifiers
--------+---------+----------------------------------------------------
  id     | integer | not null default nextval('bodypart_ids'::regclass)
  bytes  | integer | not null
  hash   | text    | not null
  text   | text    |
  data   | bytea   |
Indexes:
     "bodyparts_pkey" PRIMARY KEY, btree (id)
     "b_h" btree (hash)
Referenced by:
     TABLE "unparsed_messages" CONSTRAINT "unparsed_messages_bodypart_fkey" FOREIGN KEY (bodypart) REFERENCES bodyparts(id) ON DELETE CASCADE
-----------
Interesting: I have a test and a production database with the same
schema but different data and it happens with both databases at
exactly this table.

Dumping the databases locally on the server and fetching the dump
from the client via scp works without problems.

Versions: Server: 8.3.6 on FreeBSD 7.0; Client: 8.3.7 on darwin 9.8.0.
Network: 2 adjacent LANs via firewall (OpenBSD 4.6/pf). State table
looks ok; extract:
---------
em0 tcp xxx.yyy.zzz.10:5432 <-     .10:63169       ESTABLISHED:ESTABLISHED
    [615902410 + 524280] wscale 3  [396592834 + 66176] wscale 3
    age 10:47:05, expires in 23:15:07, 199433:386306 pkts, 10820556:561637949 bytes, rule 77
    id: 4b8ecbd8001ddfbd creatorid: cd2329ff
dc0 tcp uuu.vvv.www.10:63169 -> xxx.yyy.zzz.10:5432       ESTABLISHED:ESTABLISHED
    [396592834 + 66176] wscale 3  [615902410 + 524280] wscale 3
    age 10:47:05, expires in 23:15:07, 199433:386306 pkts, 10820556:561637949 bytes, rule 172
    id: 4b8ecbd8001ddfbe creatorid: cd2329ff
---------
Netstat on client: tcp4 0 0 z.63169 db.postg ESTABLISHED
Netstat on Server: tcp4 0 0 db.postgresql z.63169 ESTABLISHED

Any help appreciated,
Axel
---
axel.rau@chaos1.de  PGP-Key:29E99DD6  +49 151 2300 9283  computing @
chaos claudius


Re: remote pg_dump hangs always at same table

From
Tom Lane
Date:
Axel Rau <Axel.Rau@chaos1.de> writes:
> I have a daily remote backup session like
>    pg_dump -h db -i -Fp database | gzip > file
> which hangs after producing about 200 MB of compressed output:

Is it CPU-busy, or idle?  If the latter, is it blocked on a lock
according to pg_locks?

The most informative thing you could do is attach to both pg_dump and
its connected backend with gdb and get stack traces.  But looking at
pg_locks might solve the mystery without that.
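
A minimal sketch of the pg_locks check suggested above, assuming the rows are fetched with psql or any other client. The query text uses only the documented `pg_locks` columns; the helper name `waiting_pids` and the sample rows are hypothetical and are not taken from the thread.

```python
# Query listing lock requests that have NOT been granted.
# Run it in the database being dumped while pg_dump appears to hang.
UNGRANTED_LOCKS_SQL = """
SELECT pid, locktype, relation::regclass AS relation, mode
FROM pg_locks
WHERE NOT granted;
"""


def waiting_pids(rows):
    """Given (pid, mode, granted) tuples read from pg_locks,
    return the sorted pids that still wait for at least one lock."""
    return sorted({pid for pid, _mode, granted in rows if not granted})


# Hypothetical sample: one backend holding several granted locks.
sample = [
    (4711, "AccessShareLock", True),
    (4711, "AccessShareLock", True),
    (4711, "ExclusiveLock", True),
]
print(waiting_pids(sample))  # prints []
```

An empty result means no backend is waiting on a lock, so the hang has to be looked for elsewhere, for example with the gdb stack traces mentioned above.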

            regards, tom lane

Re: remote pg_dump hangs always at same table

From
Axel Rau
Date:
On 13.03.2010 at 17:33, Tom Lane wrote:

> Is it CPU-busy, or idle?
Idle.
>  If the latter, is it blocked on a lock
> according to pg_locks?
It has acquired a lot of shared and one exclusive lock which all have
been granted.
>
> The most informative thing you could do is attach to both pg_dump and
> its connected backend with gdb and get stack traces.  But looking at
> pg_locks might solve the mystery without that.
I will try to do that.

Thanks for responding,
Axel
---
axel.rau@chaos1.de  PGP-Key:29E99DD6  +49 151 2300 9283  computing @
chaos claudius


Re: remote pg_dump hangs always at same table

From
Axel Rau
Date:
On 13.03.2010 at 19:45, Axel Rau wrote:

>> The most informative thing you could do is attach to both pg_dump and
>> its connected backend with gdb and get stack traces.  But looking at
>> pg_locks might solve the mystery without that.
> I will try to do that.

-----------
client:
#0  0x9405de0e in poll$UNIX2003 ()
#1  0x00044cfa in pqSocketCheck ()
#2  0x00045101 in pqWaitTimed ()
#3  0x00045167 in pqWait ()
#4  0x0004b9b3 in pqGetCopyData3 ()
#5  0x0001295f in dumpTableData_copy ()
#6  0x00022a19 in _PrintTocData ()
#7  0x0001ecd1 in RestoreArchive ()
#8  0x00015daa in main ()
-----------
server:
#0  0x00000008014b3d5a in read () from /lib/libc.so.7
#1  0x00000008012c7760 in read () from /lib/libthr.so.3
#2  0x0000000800e91142 in BIO_new_socket () from /lib/libcrypto.so.5
#3  0x000000000053039e in secure_write ()
#4  0x0000000800f0d9df in BIO_read () from /lib/libcrypto.so.5
#5  0x0000000800cdfe60 in ssl3_read_n () from /usr/lib/libssl.so.5
#6  0x0000000800ce0352 in ssl3_read_bytes () from /usr/lib/libssl.so.5
#7  0x0000000800ce6f11 in ssl3_get_message () from /usr/lib/libssl.so.5
#8  0x0000000800cd1989 in ssl3_get_client_hello () from /usr/lib/libssl.so.5
#9  0x0000000800cd263c in ssl3_accept () from /usr/lib/libssl.so.5
#10 0x00000000005301af in secure_write ()
#11 0x0000000000535235 in pq_setkeepalivesidle ()
#12 0x0000000000535363 in pq_flush ()
#13 0x000000000053540c in pq_putmessage ()
#14 0x00000000004d62c8 in CreateCopyDestReceiver ()
#15 0x00000000004d648a in CreateCopyDestReceiver ()
#16 0x00000000004d8076 in CreateCopyDestReceiver ()
#17 0x00000000004da398 in DoCopy ()
#18 0x00000000005abd8b in ProcessUtility ()
#19 0x00000000005a87db in PostgresMain ()
#20 0x00000000005a9695 in FreeQueryDesc ()
#21 0x00000000005a9e04 in PortalRun ()
#22 0x00000000005a5b94 in pg_parse_query ()
#23 0x00000000005a6947 in PostgresMain ()
#24 0x000000000057d012 in ClosePostmasterPorts ()
#25 0x000000000057db54 in PostmasterMain ()
#26 0x00000000005374ee in main ()
-----------
Axel
---
axel.rau@chaos1.de  PGP-Key:29E99DD6  +49 151 2300 9283  computing @
chaos claudius


Re: remote pg_dump hangs always at same table

From
Tom Lane
Date:
Axel Rau <Axel.Rau@chaos1.de> writes:
> client:
> #0  0x9405de0e in poll$UNIX2003 ()
> #1  0x00044cfa in pqSocketCheck ()
> #2  0x00045101 in pqWaitTimed ()
> #3  0x00045167 in pqWait ()
> #4  0x0004b9b3 in pqGetCopyData3 ()
> #5  0x0001295f in dumpTableData_copy ()
> #6  0x00022a19 in _PrintTocData ()
> #7  0x0001ecd1 in RestoreArchive ()
> #8  0x00015daa in main ()
> -----------
> server:
> #0  0x00000008014b3d5a in read () from /lib/libc.so.7
> #1  0x00000008012c7760 in read () from /lib/libthr.so.3
> #2  0x0000000800e91142 in BIO_new_socket () from /lib/libcrypto.so.5
> #3  0x000000000053039e in secure_write ()
> #4  0x0000000800f0d9df in BIO_read () from /lib/libcrypto.so.5
> #5  0x0000000800cdfe60 in ssl3_read_n () from /usr/lib/libssl.so.5
> #6  0x0000000800ce0352 in ssl3_read_bytes () from /usr/lib/libssl.so.5
> #7  0x0000000800ce6f11 in ssl3_get_message () from /usr/lib/libssl.so.5
> #8  0x0000000800cd1989 in ssl3_get_client_hello () from /usr/lib/libssl.so.5
> #9  0x0000000800cd263c in ssl3_accept () from /usr/lib/libssl.so.5
> #10 0x00000000005301af in secure_write ()
> #11 0x0000000000535235 in pq_setkeepalivesidle ()
> #12 0x0000000000535363 in pq_flush ()
> #13 0x000000000053540c in pq_putmessage ()
> #14 0x00000000004d62c8 in CreateCopyDestReceiver ()
> #15 0x00000000004d648a in CreateCopyDestReceiver ()
> #16 0x00000000004d8076 in CreateCopyDestReceiver ()
> #17 0x00000000004da398 in DoCopy ()

Hmm.  This backtrace is not tremendously trustworthy (it'd be better if
you had debug symbols installed for these libraries) but it seems fairly
clear that the backend is stuck trying to transmit data to the client,
rather than in any internal operation.  And evidently you're using SSL.
I wonder whether this is a case of the SSL-renegotiation-broken-by-
security-"fixes" issue.  Have you recently updated the openssl library
on either the client or the server?  Is it practical for you to try a
dump across a non-SSL connection to see if that works, or is your
network too insecure for that?  (Possibly you could try dumping
across an SSH tunnel instead of a direct connection, if so.)
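
A minimal sketch of the non-SSL test run, assuming the server accepts unencrypted connections from the client (its pg_hba.conf must allow this). It only builds the command and environment and does not start pg_dump. `PGSSLMODE` is the libpq environment variable that selects the SSL mode; the function name `non_ssl_dump_command` is hypothetical, and the argument list mirrors the command from the original post.

```python
import os


def non_ssl_dump_command(host, database, base_env=None):
    """Return (argv, env) for a pg_dump run that asks libpq not to use SSL."""
    # Copy the environment so the caller's mapping is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["PGSSLMODE"] = "disable"  # libpq: never attempt an SSL connection
    # Same options as in the original post: plain-text format, remote host.
    argv = ["pg_dump", "-h", host, "-i", "-Fp", database]
    return argv, env


argv, env = non_ssl_dump_command("db", "database", base_env={})
print(" ".join(argv), "with PGSSLMODE =", env["PGSSLMODE"])
```

The pair can be handed to `subprocess.Popen(argv, env=env, stdout=...)` when the dump should actually run. If this dump completes while the SSL one hangs, that points at the SSL layer rather than at the table data.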

If I'm right in blaming this on broken SSL renegotiation, you'll want
to update to Monday's minor releases, which will have a knob to allow
SSL renegotiation to be disabled.

            regards, tom lane

Re: remote pg_dump hangs always at same table

From
Axel Rau
Date:
On 13.03.2010 at 22:35, Tom Lane wrote:

> Have you recently updated the openssl library
> on either the client or the server?
Yes, the client side got an update.
>  Is it practical for you to try a
> dump across a non-SSL connection to see if that works, or is your
> network too insecure for that?
I will try cleartext this night.

Axel
---
axel.rau@chaos1.de  PGP-Key:29E99DD6  +49 151 2300 9283  computing @
chaos claudius


Re: remote pg_dump hangs always at same table

From
Tom Lane
Date:
BTW, if the SSL-renegotiation theory is correct, what that should mean
is that the dump fails after transmitting 512MB worth of data.  Is that
consistent with what you're seeing?

            regards, tom lane

Re: remote pg_dump hangs always at same table

From
Axel Rau
Date:
On 13.03.2010 at 22:59, Tom Lane wrote:

> BTW, if the SSL-renegotiation theory is correct, what that should mean
> is that the dump fails after transmitting 512MB worth of data.  Is that
> consistent with what you're seeing?
561637949 is not too far away from it:

On 13.03.2010 at 13:24, Axel Rau wrote:
>   age 10:47:05, expires in 23:15:07, 199433:386306 pkts, 10820556:561637949 bytes, rule 172
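
The comparison can be made explicit with a little arithmetic. This is a sketch under two assumptions: that "512MB" means 512 * 2**20 bytes, and that the larger pf byte counter is the server-to-client direction. The pf counter also includes packet headers and SSL framing, so it is expected to sit somewhat above the payload limit.

```python
# Larger of the two byte counters in the pf state entry quoted in the thread.
observed_bytes = 561_637_949

# 512 MB renegotiation threshold, taken here as 512 * 2**20 bytes (assumption).
renegotiation_limit = 512 * 1024 * 1024

excess = observed_bytes - renegotiation_limit
excess_pct = 100 * excess / renegotiation_limit

print(f"limit:    {renegotiation_limit} bytes")
print(f"observed: {observed_bytes} bytes")
print(f"excess:   {excess} bytes ({excess_pct:.1f}% over the limit)")
```

The connection stalled roughly 4.6% past the 512 MB mark by this count, which fits a hang at the first renegotiation attempt.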

(-; Axel
---
axel.rau@chaos1.de  PGP-Key:29E99DD6  +49 151 2300 9283  computing @
chaos claudius


Re: remote pg_dump hangs always at same table

From
Axel Rau
Date:
On 13.03.2010 at 23:34, Axel Rau wrote:

>> if the SSL-renegotiation theory is correct
It was completely correct: the cleartext dump worked last night. (-:

Thank you for this excellent diagnosis.

Axel
---
axel.rau@chaos1.de  PGP-Key:29E99DD6  +49 151 2300 9283  computing @
chaos claudius