Re: PG_DUMP very slow because of STDOUT ??

From: Andras Fabian
Subject: Re: PG_DUMP very slow because of STDOUT ??
Msg-id: B1A1AD14D5F9D647BD2A00988C53B8220ACA2643@atradaex03.nbg.atrada.net
In response to: PG_DUMP very slow because of STDOUT ??  (Andras Fabian <Fabian@atrada.net>)
Responses: Re: PG_DUMP very slow because of STDOUT ??  (Greg Smith <greg@2ndquadrant.com>)
Re: PG_DUMP very slow because of STDOUT ??  (Scott Marlowe <scott.marlowe@gmail.com>)
Re: PG_DUMP very slow because of STDOUT ??  (Craig Ringer <craig@postnewspapers.com.au>)
List: pgsql-general
This STDOUT issue gets even weirder. I have now set up our two new servers (identical hw/sw), as I would have needed to
do so anyway. Once PG was running, I set up the same test scenario as on our problematic servers and started the
COPY-to-STDOUT experiment. And you know what? Both new servers are performing well. No hanging, and the
3 GByte test dump was written in around 3 minutes (as expected). To make things even more complicated ... I went back to
our production servers. The first one, which I froze with oprofile this morning and which needed a REBOOT, is now
performing well too! It needed 3 minutes for the test case ... WTF? BUT the second production server, which was not
rebooted, is still behaving badly.
Now I tried to dig deeper (without killing a production server again) ... and compared the output of ps (first with
'-fax', then with '-axl'). I found something interesting:
- all fast servers show the COPY process in state Rs ("runnable (on run queue)")
- on the still-slow server, this process is in state Ds ("uninterruptible sleep (usually IO)") in 9 out of 10 samples

Now, this "Ds" state seems to be something unhealthy, especially if it is there almost all the time, as far as my
first reads on Google show (and although it points to IO, there is seemingly very little IO, and IO-wait is minimal
too). I have also run ps with "-axl", which gives the following line for our process:
F   UID   PID  PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND
1  5551  2819  4201  20   0 5941068 201192 conges Ds ?          2:05 postgres: postgres musicload_cache [local] COPY

Now, as far as I understood from my Google searches, the WCHAN column shows where in the kernel my process is hanging.
Here it says "conges". Can somebody tell me what "conges" means? Or do I have other options to get even
more info out of the system (maybe without oprofile, as it already burned my hand :-)?
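
(Editorial sketch.) The WCHAN column in "ps -axl" output is narrow, so "conges" is likely a truncated kernel symbol name rather than the whole name. A minimal sketch, assuming a Linux /proc filesystem and enough privileges to read another process's entries: it samples the state letter from /proc/<pid>/stat and the wait channel from /proc/<pid>/wchan, then tallies them, much like the "9 out of 10 samples" comparison above. The function names, sample count, and interval are illustrative choices, not something from this thread.

```python
import sys
import time
from collections import Counter
from pathlib import Path


def parse_state(stat_line: str) -> str:
    """Return the one-letter state field from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split after the LAST closing parenthesis.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def sample(pid: int, samples: int = 10, interval: float = 1.0) -> Counter:
    """Tally (state, wchan) pairs for one process over several samples."""
    tally = Counter()
    proc = Path("/proc") / str(pid)
    for _ in range(samples):
        state = parse_state((proc / "stat").read_text())
        # Unlike the narrow ps column, this file holds the full symbol name
        # (assumption: the kernel exposes it and we are allowed to read it).
        wchan = (proc / "wchan").read_text().strip()
        tally[(state, wchan)] += 1
        time.sleep(interval)
    return tally


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python sample_wchan.py <pid of the COPY backend>
    for (state, wchan), count in sample(int(sys.argv[1])).most_common():
        print(f"{count:3d}  state={state}  wchan={wchan}")
```

Run against the PID of the COPY backend (2819 in the ps line above), this would print one line per distinct state and wait-channel pair together with how often it was seen.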

And yes, I now see a reboot as a possible "fix", but that would not assure me that the problem will not resurface. So,
for the time being, I will leave my second production server as it is ... so I can further narrow down the
potential reasons for this strange STDOUT slowdown (especially if someone has a tip for me :-)

Andras Fabian

(In the meantime my "slow" server finished the COPY ... it took 46 minutes instead of 3 minutes on the fast machines
... a slowdown by a factor of about 15.)




-----Original Message-----
From: Andras Fabian
Sent: Monday, 12 July 2010 10:45
To: 'Tom Lane'
Cc: pgsql-general@postgresql.org
Subject: AW: [GENERAL] PG_DUMP very slow because of STDOUT ??

Hi Tom (or others),

are there any recommended settings or ways to use oprofile in a situation like this? I got it working and have seen a
first profile report, but then managed to completely freeze the server on a second try with different oprofile settings
(the next tests will run against the newly installed, identical new servers).

Andras Fabian

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Friday, 9 July 2010 15:39
To: Andras Fabian
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] PG_DUMP very slow because of STDOUT ??

Andras Fabian <Fabian@atrada.net> writes:
> Now I ask, what's going on here ???? Why is COPY via STDOUT so much slower on our new machine?

Something weird about the network stack on the new machine, maybe.
Have you compared the transfer speeds for Unix-socket and TCP connections?
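
(Editorial sketch, not part of the original message.) One way to run such a comparison is to time the same COPY over both connection types. The code below is a minimal sketch under several assumptions: psql is on the PATH, no PGHOST environment variable is set (so omitting -h makes libpq use the local Unix socket), authentication works without a prompt, and the table and database names passed in are hypothetical placeholders.

```python
import subprocess
import sys
import time
from typing import List, Optional, Tuple


def copy_command(table: str, dbname: str, host: Optional[str] = None) -> List[str]:
    """Build a psql command line that streams a table via COPY ... TO STDOUT."""
    cmd = ["psql", "-X", "-q", "-d", dbname, "-c", f"COPY {table} TO STDOUT"]
    if host is not None:
        # A host name or IP address selects a TCP connection.
        cmd += ["-h", host]
    return cmd


def mib_per_second(n_bytes: int, seconds: float) -> float:
    """Convert a byte count and a duration into MiB/s."""
    return n_bytes / (1024 * 1024) / seconds


def time_copy(cmd: List[str]) -> Tuple[int, float]:
    """Run the command, discard its output, return (bytes read, seconds)."""
    start = time.monotonic()
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    total = 0
    while True:
        chunk = proc.stdout.read(1 << 20)  # read in 1 MiB pieces
        if not chunk:
            break
        total += len(chunk)
    proc.wait()
    return total, time.monotonic() - start


if __name__ == "__main__" and len(sys.argv) > 2:
    # Usage: python copy_speed.py <table> <dbname>   (both are placeholders)
    table, dbname = sys.argv[1], sys.argv[2]
    for label, host in (("unix socket", None), ("tcp", "127.0.0.1")):
        n_bytes, secs = time_copy(copy_command(table, dbname, host))
        print(f"{label:12s} {mib_per_second(n_bytes, secs):8.1f} MiB/s")
```

For scale, taking the 3 GByte test dump as 3 GiB, 3 minutes corresponds to roughly 17 MiB/s and 46 minutes to roughly 1.1 MiB/s.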

On a Red Hat box I would try using oprofile to see where the bottleneck
is ... don't know if that's available for Ubuntu.

            regards, tom lane
