Performance issue with 8.2.3 - "C" application

От: Karl Denninger
Тема: Performance issue with 8.2.3 - "C" application
Дата: ,
Msg-id: 46A6A549.1040606@denninger.net
(см: обсуждение, исходный текст)
Ответы: Re: Performance issue with 8.2.3 - "C" application  (Tom Lane)
Re: Performance issue with 8.2.3 - "C" application  (Nis Jørgensen)
Re: Performance issue with 8.2.3 - "C" application  (Gregory Stark)
Список: pgsql-performance

Скрыть дерево обсуждения

Performance issue with 8.2.3 - "C" application  (Karl Denninger, )
 Re: Performance issue with 8.2.3 - "C" application  (Tom Lane, )
  Re: Performance issue with 8.2.3 - "C" application  (Karl Denninger, )
   Re: Performance issue with 8.2.3 - "C" application  ("Merlin Moncure", )
    Re: Performance issue with 8.2.3 - "C" application  (Karl Denninger, )
     Re: Performance issue with 8.2.3 - "C" application  (Tom Lane, )
      Re: Performance issue with 8.2.3 - "C" application  (Karl Denninger, )
       Re: Performance issue with 8.2.3 - "C" application  (Karl Denninger, )
 Re: Performance issue with 8.2.3 - "C" application  (Nis Jørgensen, )
 Re: Performance issue with 8.2.3 - "C" application  (Gregory Stark, )

I've got an interesting issue here that I'm running into with 8.2.3

This is an application that has run quite well for a long time, and has
been operating without significant changes (other than recompilation)
since back in the early 7.x Postgres days.  But now we're seeing a LOT
more load than we used to with it, and suddenly, we're seeing odd
performance issues.

It APPEARS that the problem isn't query performance per-se.  That is,
while I can find a few processes here and there in a run state when I
look with a PS, I don't see them consistently churning.

But.... here's the query that has a habit of taking the most time....

select forum, * from post where toppost = 1 and (replied > (select
lastview from forumlog where login='theuser' and forum=post.forum and
number is null)) is not false AND (replied > (select lastview from
forumlog where login='theuser' and forum=post.forum and
number=post.number)) is not f
alse order by pinned desc, replied desc offset 0 limit 20

Now the numeric and "login" fields may change; when I plug it into
explain what I get back is:

                                             QUERY
PLAN
-------------------------------------------------------------------------------------------------------
 Limit  (cost=57270.22..57270.27 rows=20 width=757)
   ->  Sort  (cost=57270.22..57270.50 rows=113 width=757)
         Sort Key: pinned, replied
         ->  Index Scan using post_top on post  (cost=0.00..57266.37
rows=113 width=757)
               Index Cond: (toppost = 1)
               Filter: (((replied > (subplan)) IS NOT FALSE) AND
((replied > (subplan)) IS NOT FALSE))
               SubPlan
                 ->  Index Scan using forumlog_composite on forumlog
(cost=0.00..8.29 rows=1 width
                       Index Cond: (("login" = 'theuser'::text) AND
(forum = $0) AND (number = $1))
                 ->  Bitmap Heap Scan on forumlog  (cost=4.39..47.61
rows=1 width=8)
                       Recheck Cond: (("login" = 'theuser'::text) AND
(forum = $0))
                       Filter: (number IS NULL)
                       ->  Bitmap Index Scan on forumlog_composite
(cost=0.00..4.39 rows=12 width=0)
                             Index Cond: (("login" = 'theuser'::text)
AND (forum = $0))
(14 rows)

And indeed, it returns a fairly reasonable number of rows.

This takes a second or two to return - not all that bad - although this
is one that people hit a LOT.

One thing that DOES bother me is this line from the EXPLAIN output:
->  Index Scan using post_top on post  (cost=0.00..57266.53 rows=113
width=757)

This is indexed using:

  "post_top" btree (toppost)

Ain't nothing fancy there.  So how come the planner thinks this is going
to take that long?!?

More interesting, if I do a simple query on that line, I get....

ticker=> explain select forum from post where toppost='1';
                                QUERY PLAN
---------------------------------------------------------------------------
 Index Scan using post_top on post  (cost=0.00..632.03 rows=1013 width=11)
   Index Cond: (toppost = 1)

Hmmmmmm; that's a bit more reasonable.  So what's up with the above line?

What I'm seeing is that as concurrency increases, I see the CPU load
spike.  Disk I/O is low to moderate at less than 10% of maximum
according to systat -vm, no swap in use, 300mb dedicated to shared
memory buffers for Postgres (machine has 1GB of RAM and is a P4/3G/HT
running FreeBSD 6.2-STABLE)  It does not swap at all, so it does not
appear I've got a problem with running out of physical memory.  shmem is
pinned to physical memory via the sysctl tuning parameter to prevent
page table thrashing.

However, load goes up substantially and under moderate to high
concurrency gets into the 4-5 range with response going somewhat to
crap.  The application is still usable, but its not "crisp".  If I do a
"ps" during times that performance is particularly bad, I don't see any
particular overrepresentation of this query .vs. others (I have the
application doing a "setproctitle" so I know what command - and thus
what sets of queries - it is executing.)

Not sure where to start here.  It appears that I'm CPU limited and the
problem may be that this is a web-served application that must connect
to the Postgres backend for each transaction, perform its queries, and
then close the connection down - in other words the load may be coming
not from Postgres but rather from places I can't fix at the application
layer (e.g. fork() overhead, etc).  The DBMS and Apache server are on
the same machine, so there's no actual network overhead involved.

If that's the case the only solution is to throw more hardware at it.  I
can do that, but before I go tossing more CPU at the problem I'd like to
know I'm not just wasting money.

The application uses the "C" language interface and just calls
"Connectdb" - the only parameter is the dbname, so it should be
defaulting to the local socket.  It appears that this is the case.

The system load is typically divided up as about 2:1 system .vs. user
time on the CPU meter.

Ideas on where to start in trying to profile where the bottlenecks are?

The indices are btrees - I'm wondering if perhaps I should be using
something different here?

Thanks in advance.....

--
Karl Denninger ()
http://www.denninger.net




%SPAMBLOCK-SYS: Matched [@postgresql.org+], message ok


В списке pgsql-performance по дате сообщения:

От: Nis Jørgensen
Дата:
Сообщение: Re: Performance issue with 8.2.3 - "C" application
От: Gregory Stark
Дата:
Сообщение: Re: Performance issue with 8.2.3 - "C" application