Re: Re : Solaris Performance - Profiling

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Re : Solaris Performance - Profiling
Дата
Msg-id 25091.1017763371@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Re : Solaris Performance - Profiling  (Mark kirkwood <markir@slingshot.co.nz>)
Ответы Re: Re : Solaris Performance - Profiling (Solved)  (Mark kirkwood <markir@slingshot.co.nz>)
Список pgsql-general
Mark kirkwood <markir@slingshot.co.nz> writes:
> I am a little concerned that the cpu time recored by the profiler seems
> to be a bit on the short side at 4047.53 seconds (67 minutes), as
> opposed to 119.2 for the backend itself. Are these two numbers supposed
> to match closely ? (they did for shorter queries..)

Hmm.  Where exactly did you get those numbers from?  I see 4118.54 sec
as the total CPU accounted for in the profile.

Some versions of gprof hide the time spent in the profiler subroutine,
leading to discrepancies between the time accounted for in the profile
and the actual process CPU time.  However, yours doesn't seem to be one
of them --- internal_mcount is right there.

> Anyway in the hope that this captured output is interesting enough, here
> it is...or rather here is some of it (biggish file), so I am leaving a
> complete copy under :
> http://homepages.slingshot.co.nz/~markir/tar/solaris/
> Let me know if further information is likely to be helpful. I have
> profiled the same query (on the same dataset) on Freebsd and have the
> output available at :
> http://homepages.slingshot.co.nz/~markir/tar/freebsd/

Hmm.  Assuming that the profile data is trustworthy and the queries are
indeed the same (did you compare EXPLAIN output?), it seems that
Solaris' problem is a spectacularly bad qsort() implementation.
The FreeBSD profile shows:

                                       1             qsort [29]
                0.00        0.31       1/2           load_password_cache [44]
                0.00        0.31       1/2           tuplesort_performsort [45]
[29]     9.8    0.01        0.62       2+1       qsort [29]
                0.01        0.62  148033/148033      qsort_comparetup [30]
                                       1             qsort [29]

where Solaris has

                0.00     1387.23       1/2           load_password_cache [18]
                0.00     1387.23       1/2           tuplesort_performsort [19]
[4]     68.5    0.00     2774.46       2         qsort [4]
               19.73     2754.58       1/1           qst [5]
                0.01        0.14   88012/1680561053     qsort_comparetup [6]
                0.00        0.00       1/61473       .umul [154]

-----------------------------------------------

               19.73     2754.58       1/1           qsort [4]
[5]     68.5   19.73     2754.58       1         qst [5]
              167.74     2586.81 1680473041/1680561053     qsort_comparetup [6]
                0.02        0.00   59630/59703       .udiv [114]
                0.01        0.00   59630/61473       .umul [154]

and all the rest of the top profile entries are explained by the fact
that qsort_comparetup is called 1.68 billion times instead of 148K
times.

Can these really be the same dataset?  Can Solaris' qsort really be that
outstandingly incompetent?  How many rows are actually being sorted
here, anyway?

It might be entertaining to snarf a qsort off the net (from glibc,
perhaps) and link it into Postgres to see if you get better results.

            regards, tom lane

В списке pgsql-general по дате отправления:

Предыдущее
От: Doug McNaught
Дата:
Сообщение: Re: v7.2.1 Released: Critical Bug Fix
Следующее
От: "David Esposito"
Дата:
Сообщение: Re: Core dump on PG 7.1.3