Обсуждение: Does this perf output seem 'normal'?
Hi all, I am trying to work out why a piece of software, Mapnik, is executing slowly. All it is doing is loading a config file which causes about 12 preparation queries (i.e. with LIMIT 0) to be executed. I can see from pg_stat_statements these only take ~1ms in their totality. So next I ran "perf record -a -g sleep 0.4" and quickly loaded my PHP page to start the process (I am interfacing with mapnik via a PHP Extension), this is a sample of the result: https://cloud.githubusercontent.com/assets/4375326/14836015/08b0a8b4-0c05-11e6-96e5-f1836497cdfe.png The PHP page returns in ~60ms (well within the 400ms used for the perf recording). If possible, please can anyone with more familiarity with perf and Postgres tell me whether the result of the recording looks healthy? I am wondering if there is some problem with the way in which Mapnik is implementing libpq -- I already moved from Mapnik 2.2 because it was effectively DOSing Postgres by leaving client connections open. Any help would be greatly appreciated. Kind regards Peter
On Apr 26, 2016 4:29 PM, "Peter Devoy" <peter@3xe.co.uk> wrote:
>
> Hi all,
>
> I am trying to work out why a piece of software, Mapnik, is executing
> slowly. All it is doing is loading a config file which causes about
> 12 preparation queries (i.e. with LIMIT 0) to be executed. I can see
> from pg_stat_statements these only take ~1ms in their totality.
>
> So next I ran "perf record -a -g sleep 0.4" and quickly loaded my PHP
> page to start the process (I am interfacing with mapnik via a PHP
> Extension), this is a sample of the result:
>
> https://cloud.githubusercontent.com/assets/4375326/14836015/08b0a8b4-0c05-11e6-96e5-f1836497cdfe.png
I'm no expert in perf, but to me that looks like the profile of a system which is bored. Which matches your description, of spending 0.4 seconds watching something which only takes 0.06 seconds to happen. If you really want to profile this, you should fire it off in a tight loop, using wget or ab2 or curl.
Cheers,
Jeff
On 27.04.2016 2:27, Peter Devoy wrote: > Hi all, > > I am trying to work out why a piece of software, Mapnik, is executing > slowly. All it is doing is loading a config file which causes about > 12 preparation queries (i.e. with LIMIT 0) to be executed. I can see > from pg_stat_statements these only take ~1ms in their totality. > > So next I ran "perf record -a -g sleep 0.4" and quickly loaded my PHP > page to start the process (I am interfacing with mapnik via a PHP > Extension), this is a sample of the result: > > https://cloud.githubusercontent.com/assets/4375326/14836015/08b0a8b4-0c05-11e6-96e5-f1836497cdfe.png > > The PHP page returns in ~60ms (well within the 400ms used for the perf > recording). If possible, please can anyone with more familiarity with > perf and Postgres tell me whether the result of the recording looks > healthy? > > I am wondering if there is some problem with the way in which Mapnik > is implementing libpq -- I already moved from Mapnik 2.2 because it > was effectively DOSing Postgres by leaving client connections open. > > Any help would be greatly appreciated. > > > Kind regards > > Peter > > Hi! What do you want to see in perf stats? Maybe you can explain your problem more in details? -- Alex Ignatov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
> If you really want to profile this, you should fire it off in a tight loop, using wget or ab2 or curl. Thanks Jeff, that sounds like a smart idea. I will try later when I have access to the server. >Hi! What do you want to see in perf stats? Maybe you can explain your problem more in details? Hi Alex ) I am hoping to find out which function calls are taking the longest because to the code is taking too long to only parse a small XML file and do a few quick database queries. The last version of this software was able to do a lot more in the same period of time. I realise this most likely a client software issue but was hoping it may be obvious whether Mapnik is spending too much time with Postgres. Unfortunately I may just have to try again to build Mapnik from source so that I can get debugging/profiling output, etc. -- I was hoping to avoid it because it is quite painful.
I have now done a recording for 60 seconds during a batch of 1000 requests and posted the results on a new issue on the Mapnik repo. Although Postgres still comes out on top in the perf results I struggle to believe this is a Postgres issue. But, if anyone is curious, the issue is here: https://github.com/mapnik/mapnik/issues/3414 I may come back here if I don't have luck with the Mapnik developers. All the best Peter 3XE P: 01326 567155 M: 07770 693662 A: 3XE Ltd Tremough Innovation Centre PENRYN TR10 9TA 3XE Ltd · Registered in England and Wales · 9356871 On 27 April 2016 at 16:45, Peter Devoy <peter@3xe.co.uk> wrote: >> If you really want to profile this, you should fire it off in a tight loop, using wget or ab2 or curl. > > Thanks Jeff, that sounds like a smart idea. I will try later when I > have access to the server. > >>Hi! What do you want to see in perf stats? Maybe you can explain your problem more in details? > > Hi Alex ) I am hoping to find out which function calls are taking the > longest because to the code is taking too long to only parse a small > XML file and do a few quick database queries. The last version of > this software was able to do a lot more in the same period of time. I > realise this most likely a client software issue but was hoping it may > be obvious whether Mapnik is spending too much time with Postgres. > > Unfortunately I may just have to try again to build Mapnik from source > so that I can get debugging/profiling output, etc. -- I was hoping to > avoid it because it is quite painful.
On 28.04.2016 1:11, Peter Devoy wrote: > I have now done a recording for 60 seconds during a batch of 1000 > requests and posted the results on a new issue on the Mapnik repo. > > Although Postgres still comes out on top in the perf results I > struggle to believe this is a Postgres issue. But, if anyone is > curious, the issue is here: > https://github.com/mapnik/mapnik/issues/3414 > > I may come back here if I don't have luck with the Mapnik developers. > > All the best > > > Peter > 3XE > P: 01326 567155 > M: 07770 693662 > A: 3XE Ltd > Tremough Innovation Centre > PENRYN > TR10 9TA > 3XE Ltd · Registered in England and Wales · 9356871 > > > On 27 April 2016 at 16:45, Peter Devoy <peter@3xe.co.uk> wrote: >>> If you really want to profile this, you should fire it off in a tight loop, using wget or ab2 or curl. >> Thanks Jeff, that sounds like a smart idea. I will try later when I >> have access to the server. >> >>> Hi! What do you want to see in perf stats? Maybe you can explain your problem more in details? >> Hi Alex ) I am hoping to find out which function calls are taking the >> longest because to the code is taking too long to only parse a small >> XML file and do a few quick database queries. The last version of >> this software was able to do a lot more in the same period of time. I >> realise this most likely a client software issue but was hoping it may >> be obvious whether Mapnik is spending too much time with Postgres. >> >> Unfortunately I may just have to try again to build Mapnik from source >> so that I can get debugging/profiling output, etc. -- I was hoping to >> avoid it because it is quite painful. Can you expand node from perf report where 33.55% and 13.37% and 38.04% 38.04%. PS Also do you know that in perf report second column is the total cpu time share and not the first ? And that is why total sum of second column is 100% but sum from first can be != 100%? First column is the % of time when child function(procedure) is working relative to main proc time. -- Alex Ignatov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
>Also do you know that in perf report second column is the total cpu time share and not the first Thank you, I did notice they did not add up -- that makes more sense now. >Can you expand node from perf report Yes, my pleasure. I added "--call-graph fractal,2,callee" to the command, hopefully it seems more useful: https://cloud.githubusercontent.com/assets/4375326/14889937/a1378982-0d58-11e6-89b6-aac5e1620fbb.png To my amateur eye it seems like there is suspicious amount of memory & thread management going on...? -- Peter
On 28.04.2016 17:50, Peter Devoy wrote: >> Also do you know that in perf report second column is the total cpu time share and not the first > Thank you, I did notice they did not add up -- that makes more sense now. > >> Can you expand node from perf report > Yes, my pleasure. I added "--call-graph fractal,2,callee" to the > command, hopefully it seems more useful: > > https://cloud.githubusercontent.com/assets/4375326/14889937/a1378982-0d58-11e6-89b6-aac5e1620fbb.png > > To my amateur eye it seems like there is suspicious amount of memory & > thread management going on...? > > > -- > Peter Or this report is much more infomative and it is looks healthy =)! At first glance i see that postgres is simply forking i.e. spawning child process for the new session. Creating new PG connection is slow process, even if your select is light fast total response time can be very long. Have you ever heard about pgbouncer- the light Postgres connection pooler? It is placed between you app server and pg and proxing app server queries to pg server. pgbouncer have some constant number of connections to pg server and have some sort of queue on incoming connection from app server. TL;DR Use pgbouncer between app server and Postgres =) Alex Ignatov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
I believe the library is spawning a new connection for each query so that makes perfect sense. I will see what improvements can be made at the application layer and look at using pgbouncer if required =) Thank you for sharing your knowledge!