Обсуждение: Does this perf output seem 'normal'?

Поиск
Список
Период
Сортировка

Does this perf output seem 'normal'?

От
Peter Devoy
Дата:
Hi all,

I am trying to work out why a piece of software, Mapnik, is executing
slowly.  All it is doing is loading a config file which causes about
12 preparation queries (i.e. with LIMIT 0) to be executed.  I can see
from pg_stat_statements these only take ~1ms in their totality.

So next I ran "perf record -a -g sleep 0.4" and quickly loaded my PHP
page to start the process (I am interfacing with mapnik via a PHP
Extension), this is a sample of the result:

https://cloud.githubusercontent.com/assets/4375326/14836015/08b0a8b4-0c05-11e6-96e5-f1836497cdfe.png

The PHP page returns in ~60ms (well within the 400ms used for the perf
recording).  If possible, please can anyone with more familiarity with
perf and Postgres tell me whether the result of the recording looks
healthy?

I am wondering if there is some problem with the way in which Mapnik
is implementing libpq -- I already moved from Mapnik 2.2 because it
was effectively DOSing Postgres by leaving client connections open.

Any help would be greatly appreciated.


Kind regards

Peter


Re: Does this perf output seem 'normal'?

От
Jeff Janes
Дата:


On Apr 26, 2016 4:29 PM, "Peter Devoy" <peter@3xe.co.uk> wrote:
>
> Hi all,
>
> I am trying to work out why a piece of software, Mapnik, is executing
> slowly.  All it is doing is loading a config file which causes about
> 12 preparation queries (i.e. with LIMIT 0) to be executed.  I can see
> from pg_stat_statements these only take ~1ms in their totality.
>
> So next I ran "perf record -a -g sleep 0.4" and quickly loaded my PHP
> page to start the process (I am interfacing with mapnik via a PHP
> Extension), this is a sample of the result:
>
> https://cloud.githubusercontent.com/assets/4375326/14836015/08b0a8b4-0c05-11e6-96e5-f1836497cdfe.png

I'm no expert in perf, but to me that looks like the profile of a system which is bored.  Which matches your description, of spending 0.4 seconds watching something which only takes 0.06 seconds to happen.  If you really want to profile this, you should fire it off in a tight loop, using wget or ab2 or curl.

Cheers,

Jeff

Re: Does this perf output seem 'normal'?

От
Alex Ignatov
Дата:

On 27.04.2016 2:27, Peter Devoy wrote:
> Hi all,
>
> I am trying to work out why a piece of software, Mapnik, is executing
> slowly.  All it is doing is loading a config file which causes about
> 12 preparation queries (i.e. with LIMIT 0) to be executed.  I can see
> from pg_stat_statements these only take ~1ms in their totality.
>
> So next I ran "perf record -a -g sleep 0.4" and quickly loaded my PHP
> page to start the process (I am interfacing with mapnik via a PHP
> Extension), this is a sample of the result:
>
> https://cloud.githubusercontent.com/assets/4375326/14836015/08b0a8b4-0c05-11e6-96e5-f1836497cdfe.png
>
> The PHP page returns in ~60ms (well within the 400ms used for the perf
> recording).  If possible, please can anyone with more familiarity with
> perf and Postgres tell me whether the result of the recording looks
> healthy?
>
> I am wondering if there is some problem with the way in which Mapnik
> is implementing libpq -- I already moved from Mapnik 2.2 because it
> was effectively DOSing Postgres by leaving client connections open.
>
> Any help would be greatly appreciated.
>
>
> Kind regards
>
> Peter
>
>
Hi! What do you want to see in perf stats? Maybe you can explain your
problem more in details?

--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Does this perf output seem 'normal'?

От
Peter Devoy
Дата:
> If you really want to profile this, you should fire it off in a tight loop, using wget or ab2 or curl.

Thanks Jeff, that sounds like a smart idea.  I will try later when I
have access to the server.

>Hi! What do you want to see in perf stats? Maybe you can explain your problem more in details?

Hi Alex )  I am hoping to find out which function calls are taking the
longest because to the code is taking too long to only parse a small
XML file and do a few quick database queries.  The last version of
this software was able to do a lot more in the same period of time.  I
realise this most likely a client software issue but was hoping it may
be obvious whether Mapnik is spending too much time with Postgres.

Unfortunately I may just have to try again to build Mapnik from source
so that I can get debugging/profiling output, etc. -- I was hoping to
avoid it because it is quite painful.


Re: Does this perf output seem 'normal'?

От
Peter Devoy
Дата:
I have now done a recording for 60 seconds during a batch of 1000
requests and posted the results on a new issue on the Mapnik repo.

Although Postgres still comes out on top in the perf results I
struggle to believe this is a Postgres issue.  But, if anyone is
curious, the issue is here:
https://github.com/mapnik/mapnik/issues/3414

I may come back here if I don't have luck with the Mapnik developers.

All the best


Peter
3XE
P: 01326 567155
M: 07770 693662
A: 3XE Ltd
Tremough Innovation Centre
PENRYN
TR10 9TA
3XE Ltd · Registered in England and Wales · 9356871


On 27 April 2016 at 16:45, Peter Devoy <peter@3xe.co.uk> wrote:
>> If you really want to profile this, you should fire it off in a tight loop, using wget or ab2 or curl.
>
> Thanks Jeff, that sounds like a smart idea.  I will try later when I
> have access to the server.
>
>>Hi! What do you want to see in perf stats? Maybe you can explain your problem more in details?
>
> Hi Alex )  I am hoping to find out which function calls are taking the
> longest because to the code is taking too long to only parse a small
> XML file and do a few quick database queries.  The last version of
> this software was able to do a lot more in the same period of time.  I
> realise this most likely a client software issue but was hoping it may
> be obvious whether Mapnik is spending too much time with Postgres.
>
> Unfortunately I may just have to try again to build Mapnik from source
> so that I can get debugging/profiling output, etc. -- I was hoping to
> avoid it because it is quite painful.


Re: Does this perf output seem 'normal'?

От
Alex Ignatov
Дата:

On 28.04.2016 1:11, Peter Devoy wrote:
> I have now done a recording for 60 seconds during a batch of 1000
> requests and posted the results on a new issue on the Mapnik repo.
>
> Although Postgres still comes out on top in the perf results I
> struggle to believe this is a Postgres issue.  But, if anyone is
> curious, the issue is here:
> https://github.com/mapnik/mapnik/issues/3414
>
> I may come back here if I don't have luck with the Mapnik developers.
>
> All the best
>
>
> Peter
> 3XE
> P: 01326 567155
> M: 07770 693662
> A: 3XE Ltd
> Tremough Innovation Centre
> PENRYN
> TR10 9TA
> 3XE Ltd · Registered in England and Wales · 9356871
>
>
> On 27 April 2016 at 16:45, Peter Devoy <peter@3xe.co.uk> wrote:
>>> If you really want to profile this, you should fire it off in a tight loop, using wget or ab2 or curl.
>> Thanks Jeff, that sounds like a smart idea.  I will try later when I
>> have access to the server.
>>
>>> Hi! What do you want to see in perf stats? Maybe you can explain your problem more in details?
>> Hi Alex )  I am hoping to find out which function calls are taking the
>> longest because to the code is taking too long to only parse a small
>> XML file and do a few quick database queries.  The last version of
>> this software was able to do a lot more in the same period of time.  I
>> realise this most likely a client software issue but was hoping it may
>> be obvious whether Mapnik is spending too much time with Postgres.
>>
>> Unfortunately I may just have to try again to build Mapnik from source
>> so that I can get debugging/profiling output, etc. -- I was hoping to
>> avoid it because it is quite painful.
Can you expand node from perf report where 33.55% and 13.37% and 38.04%
38.04%.
PS Also do you  know that in perf report second column is the total cpu
time share and not the first ? And that is why total sum of second
column is 100% but sum from first can be != 100%? First column is the %
of time when child function(procedure) is working relative to main proc
time.



--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Does this perf output seem 'normal'?

От
Peter Devoy
Дата:
>Also do you  know that in perf report second column is the total cpu  time share and not the first
Thank you, I did notice they did not add up -- that makes more sense now.

>Can you expand node from perf report
Yes, my pleasure.  I added "--call-graph fractal,2,callee" to the
command, hopefully it seems more useful:

https://cloud.githubusercontent.com/assets/4375326/14889937/a1378982-0d58-11e6-89b6-aac5e1620fbb.png

To my amateur eye it seems like there is suspicious amount of memory &
thread management going on...?


--
Peter


Re: Does this perf output seem 'normal'?

От
Alex Ignatov
Дата:
On 28.04.2016 17:50, Peter Devoy wrote:
>> Also do you  know that in perf report second column is the total cpu  time share and not the first
> Thank you, I did notice they did not add up -- that makes more sense now.
>
>> Can you expand node from perf report
> Yes, my pleasure.  I added "--call-graph fractal,2,callee" to the
> command, hopefully it seems more useful:
>
> https://cloud.githubusercontent.com/assets/4375326/14889937/a1378982-0d58-11e6-89b6-aac5e1620fbb.png
>
> To my amateur eye it seems like there is suspicious amount of memory &
> thread management going on...?
>
>
> --
> Peter
Or this report is much more infomative and it is looks healthy =)!
At first glance i see that postgres is simply  forking i.e. spawning
child process for the new session. Creating new PG connection is slow
process, even if your select is light fast total response time can be
very long.

Have you ever heard about pgbouncer- the light Postgres connection
pooler? It is placed between you app server and pg and proxing app
server queries to pg server.  pgbouncer have some constant number of
connections to pg server and have some sort of queue on incoming
connection from app server.

TL;DR Use pgbouncer between app server and Postgres =)

Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Does this perf output seem 'normal'?

От
Peter Devoy
Дата:
I believe the library is spawning a new connection for each query so
that makes perfect sense.

I will see what improvements can be made at the application layer and
look at using pgbouncer if required =)

Thank you for sharing your knowledge!