Обсуждение: My server is oddly very slow

Поиск
Список
Период
Сортировка

My server is oddly very slow

От
adrien ducos
Дата:
I have 2 servers 1 for development (called dev) 1 for production (called
prod). The server for development is very basic  (a dual core 2.1ghtz on
a basic 300MB hard drive with 2GB of memory (this server is not
dedicated to postgres, it also uses php, mapserver and so on).

I assigned the memory as follow
max co 40
shared buffers 258MB
work_mem 8MB
maitainance_work_mem: 32MB
wal_buffers 8MB
check_point_seg 16
autovacuum on
effective-cache-size 744MB

Dev is installed on centos 4 32bits (on a virtual server openvz)

My production server is a QUAD CORE XEON E5310 (1,6Ghtz 3GB of RAM and
the hard drive is a very fast SAN with fiberchanel: the copy of files on
the SAN is about 3 times faster than on the development hard drive. This
is a fully dedicated server for postgres.

I assigned the memory as follow
max co 40
shared buffers 512MB
work_mem 16MB
maitainance_work_mem: 128MB
wal_buffers 8MB
check_point_seg 16
autovacuum on
effective-cache-size 1536MB

prod is installed on Red Hat Enterprise Linux ES release 4 (Nahant
Update 6) 32bits (not on a virtual server)
Those 2 distribution are very similar, and I use the same proprams RPM
on both systems

The version of both databases is postgres 8.4.1, I installed the same
database (in fact prod comes form a dump of dev once every week) The
dump is very big about 16GB compressed. Once it is dumped, I make a full
analyze on the database before I use it.

Now according to the hardware, I would suspect dev should be slower than
prod, but almose all queries are slower on prod, some times 10 times slower.

Today I made a query with an explain analyse (same query on both
servers) and I got for dev:


   1.
      "Limit  (cost=396689.94..399801.92 rows=2000 width=512) (actual
      time=8696.467..10028.179 rows=2000 loops=1)"
   2.
      "  ->  GroupAggregate  (cost=396689.94..424470.56 rows=17854
      width=512) (actual time=8696.463..10024.632 rows=2000 loops=1)"
   3.
      "        Filter: (sum(effectif_max) >= 10)"
   4.
      "        ->  Sort  (cost=396689.94..397088.34 rows=159360
      width=512) (actual time=8693.396..8745.436 rows=8344 loops=1)"
   5.
      "              Sort Key: implantation_company_id, virtual_building_id"
   6.
      "              Sort Method:  external merge  Disk: 125928kB"
   7.
      "              ->  Bitmap Heap Scan on gen_establishment_search
       (cost=8156.66..346425.20 rows=159360 width=512) (actual
      time=2545.813..6688.078 rows=152200 loops=1)"
   8.
      "                    Recheck Cond: (gis_departement_id =
      '75'::bpchar)"
   9.
      "                    Filter: (((telephone IS NOT NULL) OR
      (telephone_siege IS NOT NULL)) AND (NOT etablissement_masque) AND
      (ref_zone_permission_id = ANY ('{2,1}'::integer[])) AND
      (created_by_user_group = 1) AND (ref_establishment_type_id = ANY
      ('{1,2,3}'::integer[])))"
  10.
      "                    ->  Bitmap Index Scan on
      gen_establishment_search_gis_departement_id  (cost=0.00..8116.82
      rows=498680 width=0) (actual time=2435.952..2435.952 rows=502304
      loops=1)"
  11.
      "                          Index Cond: (gis_departement_id =
      '75'::bpchar)"
  12.
      "Total runtime: 10109.194 ms"


-> so 10 seconds






and for prod

   1.
      Limit  (cost=397787.02..400793.19 rows=2000 width=513) (actual
      time=94188.493..97457.042 rows=2000 loops=1)
   2.
         ->  GroupAggregate  (cost=397787.02..425288.95 rows=18297
      width=513) (actual time=94188.489..97452.268 rows=2000 loops=1)
   3.
               Filter: (sum(effectif_max) >= 10)
   4.
               ->  Sort  (cost=397787.02..398179.69 rows=157069
      width=513) (actual time=94187.660..94530.617 rows=8312 loops=1)
   5.
                     Sort Key: implantation_company_id, virtual_building_id
   6.
                     Sort Method:  external merge  Disk: 124192kB
   7.
                     ->  Bitmap Heap Scan on gen_establishment_search
       (cost=8231.64..347992.15 rows=157069 width=513) (actual
      time=836.994..91818.746 rows=149851 loops=1)
   8.
                           Recheck Cond: (gis_departement_id = '75'::bpchar)
   9.
                           Filter: (((telephone IS NOT NULL) OR
      (telephone_siege IS NOT NULL)) AND (NOT etablissement_masque) AND
      (ref_zone_permission_id = ANY ('{2,1}'::int
  10.
      eger[])) AND (created_by_user_group = 1) AND
      (ref_establishment_type_id = ANY ('{1,2,3}'::integer[])))
  11.
                           ->  Bitmap Index Scan on
      gen_establishment_search_gis_departement_id  (cost=0.00..8192.37
      rows=502886 width=0) (actual time=699.530..699.530 rows=
  12.
      507213 loops=1)
  13.
                                 Index Cond: (gis_departement_id =
      '75'::bpchar)
  14.
       Total runtime: 97496.574 ms


-> 90 seconds

for some other queries is is worst. None of the queries I have tested
are faster on prod.

So I checked the memory on prod during my query execution:
vmstat
procs -----------memory---------- ---swap-- -----io---- --system--
----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy
id wa
 0  1   1280  23320   6356 2975956    0    0    29    45    1     0  2
0 95  3

The swap is not used but something is a bit odd: the cache is using 98%
of the RAM

if I do
SHOW shared_buffers;
I get

shared_buffers
----------------
512MB


as I would expect from my configuration.


I've got out of idea, any idea?


--

Logo_HBS_mail.jpg
   Adrien DUCOS
   Analyste développeur
   aducos@hbs-research.com <mailto:aducos@hbs-research.com>
   www.hbs-research.com <http://www.hbs-research.com/>
   +33 (0)9 70 44 64 10
   24 rue de l'Est
   75020 Paris




Re: My server is oddly very slow

От
"Kevin Grittner"
Дата:
adrien ducos <aducos@hbs-research.com> wrote:

[rearranged somewhat]

> The version of both databases is postgres 8.4.1

[sigh]  You really should upgrade.

http://www.postgresql.org/support/versioning

http://www.postgresql.org/docs/8.4/static/release.html

> So I checked the memory on prod during my query execution:
> vmstat

> procs -----------memory----------
>  r  b   swpd   free   buff  cache
>  0  1   1280  23320   6356 2975956

> The swap is not used but something is a bit odd: the cache is
> using 98% of the RAM

The first line in vmstat (which is all you get if you run it without
a number) is averages since the OS was booted.  That's not usually
very helpful.  What does this show?:

free -m

or run the query while watching the output from:

vmstat 1

> the copy of files on the SAN is about 3 times faster than on the
> development hard drive.

Copy time doesn't always correlate real well with database
performance.  Have you tried running bonnie++ or similar?

-Kevin

Re: My server is oddly very slow

От
adrien ducos
Дата:
Kevin Grittner a écrit :
> adrien ducos <aducos@hbs-research.com> wrote:
>
> [rearranged somewhat]
>
>
>> The version of both databases is postgres 8.4.1
>>
>
> [sigh]  You really should upgrade.
>
> http://www.postgresql.org/support/versioning
>
> http://www.postgresql.org/docs/8.4/static/release.html
>
I know I should but only my system administrator is allowed to do so, we
will soon change to postgres 9.1 anyway.

>
>
>> So I checked the memory on prod during my query execution:
>> vmstat
>>
>
>
>> procs -----------memory----------
>>  r  b   swpd   free   buff  cache
>>  0  1   1280  23320   6356 2975956
>>
>
>
>> The swap is not used but something is a bit odd: the cache is
>> using 98% of the RAM
>>
>
> The first line in vmstat (which is all you get if you run it without
> a number) is averages since the OS was booted.  That's not usually
> very helpful.  What does this show?:
>
> free -m
>
> or run the query while watching the output from:
>
> vmstat 1
>
>
So I did that also, and have the same results:

free -m
total used free shared buffers cached
Mem: 3042 3000 41 0 4 2896
-/+ buffers/cache: 99 2942
Swap: 1983 1 1982

same for vmstat 1 while runing the query

vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system--
----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 1280 45752 4496 2963460 0 0 29 45 0 1 2 0 95 3


Almost everything is "cached", I read on some documents that cache means
linux uses the free memory to cache disk data (and avoid waisting the
free memory).

On the other hand, when I do ps -aux while the query is running I see
the query is using 17% of the RAM, so I guess it is using the RAM ok.

The question could be, why is linux marking that 95% of the memory is
cache while in the same time ps -aux shows my query uses 17% of the RAM.


>> the copy of files on the SAN is about 3 times faster than on the
>> development hard drive.
>>
>
> Copy time doesn't always correlate real well with database
> performance.  Have you tried running bonnie++ or similar?
>
Ok, we didn't check on that but we will. I'll see with my system admin
next week (he is on hollydays now). On the paper thought it is a SAN
composed of several velociraptors. It is supposed to be faster than a
normal and rather old hard drive (speed and access time).


Thanks,

Adrien


--

Logo_HBS_mail.jpg
Adrien DUCOS
Analyste développeur
aducos@hbs-research.com <mailto:aducos@hbs-research.com>
www.hbs-research.com <http://www.hbs-research.com/>
+33 (0)9 70 44 64 10
24 rue de l'Est
75020 Paris


Re: My server is oddly very slow

От
"Kevin Grittner"
Дата:
adrien ducos <aducos@hbs-research.com> wrote:
> Kevin Grittner a écrit :

>> What does this show?:
>>
>> free -m

>              total     used     free   shared  buffers   cached
> Mem:          3042     3000       41        0        4     2896
> -/+ buffers/cache:       99     2942
> Swap:         1983        1     1982

This was while the PostgreSQL server was running and showing
shared_buffers = 512MB?  If so, there's something seriously wrong
with your OS.

>> The first line in vmstat (which is all you get if you run it
>> without a number) is averages since the OS was booted.  That's
>> not usually very helpful.  What does this show?:
>>
>> vmstat 1

> same for vmstat 1 while runing the query
>
> vmstat 1
> procs -----------memory---------- ---swap-- -----io---- -system--
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs
>  0  0   1280  45752   4496 2963460    0    0   29    45    0    1

Those look like the sort of useless numbers you get on the *first*
line of vmstat, even with a number.  What did a few of the *other*
lines look like while the query was running?

-Kevin

Re: My server is oddly very slow

От
adrien ducos
Дата:
Kevin Grittner a écrit :
> adrien ducos <aducos@hbs-research.com> wrote:
>
>> Kevin Grittner a écrit :
>>
>
>
>>> What does this show?:
>>>
>>> free -m
>>>
>
>
>>              total     used     free   shared  buffers   cached
>> Mem:          3042     3000       41        0        4     2896
>> -/+ buffers/cache:       99     2942
>> Swap:         1983        1     1982
>>
>
> This was while the PostgreSQL server was running and showing
> shared_buffers = 512MB?  If so, there's something seriously wrong
> with your OS.
>
>
>>> The first line in vmstat (which is all you get if you run it
>>> without a number) is averages since the OS was booted.  That's
>>> not usually very helpful.  What does this show?:
>>>
>>> vmstat 1
>>>
>
>
>> same for vmstat 1 while runing the query
>>
>> vmstat 1
>> procs -----------memory---------- ---swap-- -----io---- -system--
>>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs
>>  0  0   1280  45752   4496 2963460    0    0   29    45    0    1
>>
>
> Those look like the sort of useless numbers you get on the *first*
> line of vmstat, even with a number.  What did a few of the *other*
> lines look like while the query was running?
>
> -Kevin
> .
>
>
They Look exactly the same as the first one, with cache changing form

2963460 to 2961000 so not much difference at all...

So what you say is what I thought, there is something wrong with the OS... we will try to put centos (the same OS as in
development)instead of redhat the OS installed by the internet provider and see how it goes. Thanks for your feedback. 

Adrien


--

Logo_HBS_mail.jpg
   Adrien DUCOS
   Analyste développeur
   aducos@hbs-research.com <mailto:aducos@hbs-research.com>
   www.hbs-research.com <http://www.hbs-research.com/>
   +33 (0)9 70 44 64 10
   24 rue de l'Est
   75020 Paris




Prepared statments: partial indexes are avoided!

От
adrien ducos
Дата:
I have postgres 8.4

I have a table "foo" with 16 million lines 99% of those lines have a
column "bar" = 1.

I had an index on this table:
CREATE INDEX index_foo_bar ON foo using btree (bar);

The thing is that the query
select count(*) from foo where bar = 1;
is not using the query (it is useless that is normal, using this index
would be slower than not using it)

the query
select count(*) from foo where bar = 2; uses the index I have the answer
in 20ms.

With a prepared statement I have

PREPARE fooplan (int) AS
    select count(*) from foo where bar = $1;
execute fooplan (2);

also a few milliseconds

Now in order to optimise this last request I droped the first index and
added a new index:

CREATE INDEX index_foo_bar ON foo using btree (bar) where
created_by_user_group <> 1;

since the query is only using the index in this case anyway.

with the query
explain analyze select count(*) from foo where bar = 2; it uses the new
index :

"Aggregate  (cost=8.29..8.30 rows=1 width=0) (actual time=0.119..0.119
rows=1 loops=1)"
"  ->  Index Scan using index_foo_bar on foo  (cost=0.00..8.29 rows=1
width=0) (actual time=0.017..0.084 rows=63 loops=1)"
"        Index Cond: (bar = 2)"
"Total runtime: 0.144 ms"

so great improvement from 20ms to 0.144ms

and with the prepared statement... things becomes very bad:
PREPARE fooplan (int) AS
    select count(*) from foo where bar = $1;
explain analyze execute fooplan (2);


"Aggregate  (cost=627406.08..627406.09 rows=1 width=0) (actual
time=11627.315..11627.316 rows=1 loops=1)"
"  ->  Seq Scan on foo  (cost=0.00..603856.80 rows=9419712 width=0)
(actual time=7070.334..11627.266 rows=63 loops=1)"
"        Filter: (bar = $1)"
"Total runtime: 11627.357 ms"

No index uses and the time becomes very bad.

This is probably due to the fact the prepared statement could have "1"
as an input and so it avoids the index completely, but this is not very
nice for software optimization since I am using pdo which uses prepared
statements all the time and is unable to use all the partial indexes.

The problem is I have 90 GB of indexes in the database and partial
indexes could help me to save some of this space on my server, in
addition to improve the speed of the queries.

Adrien DUCOS

--

Logo_HBS_mail.jpg

Adrien DUCOS
Analyste développeur
aducos@hbs-research.com <mailto:aducos@hbs-research.com>
www.hbs-research.com <http://www.hbs-research.com/>
+33 (0)9 70 44 64 10
11-15 quai du Président Paul Doumer
92400 Courbevoie



Re: Prepared statments: partial indexes are avoided!

От
Simon Riggs
Дата:
On Mon, Jun 20, 2011 at 3:01 PM, adrien ducos <aducos@hbs-research.com> wrote:
> I have postgres 8.4
>
> I have a table "foo" with 16 million lines 99% of those lines have a column
> "bar" = 1.
>
> I had an index on this table:
> CREATE INDEX index_foo_bar ON foo using btree (bar);
>
> The thing is that the query
> select count(*) from foo where bar = 1;
> is not using the query (it is useless that is normal, using this index would
> be slower than not using it)
>
> the query
> select count(*) from foo where bar = 2; uses the index I have the answer in
> 20ms.
>
> With a prepared statement I have
>
> PREPARE fooplan (int) AS
>   select count(*) from foo where bar = $1;
> execute fooplan (2);
>
> also a few milliseconds
>
> Now in order to optimise this last request I droped the first index and
> added a new index:
>
> CREATE INDEX index_foo_bar ON foo using btree (bar) where
> created_by_user_group <> 1;
>
> since the query is only using the index in this case anyway.
>
> with the query
> explain analyze select count(*) from foo where bar = 2; it uses the new
> index :
>
> "Aggregate  (cost=8.29..8.30 rows=1 width=0) (actual time=0.119..0.119
> rows=1 loops=1)"
> "  ->  Index Scan using index_foo_bar on foo  (cost=0.00..8.29 rows=1
> width=0) (actual time=0.017..0.084 rows=63 loops=1)"
> "        Index Cond: (bar = 2)"
> "Total runtime: 0.144 ms"
>
> so great improvement from 20ms to 0.144ms
>
> and with the prepared statement... things becomes very bad:
> PREPARE fooplan (int) AS
>   select count(*) from foo where bar = $1;
> explain analyze execute fooplan (2);
>
>
> "Aggregate  (cost=627406.08..627406.09 rows=1 width=0) (actual
> time=11627.315..11627.316 rows=1 loops=1)"
> "  ->  Seq Scan on foo  (cost=0.00..603856.80 rows=9419712 width=0) (actual
> time=7070.334..11627.266 rows=63 loops=1)"
> "        Filter: (bar = $1)"
> "Total runtime: 11627.357 ms"
>
> No index uses and the time becomes very bad.
>
> This is probably due to the fact the prepared statement could have "1" as an
> input and so it avoids the index completely, but this is not very nice for
> software optimization since I am using pdo which uses prepared statements
> all the time and is unable to use all the partial indexes.
>
> The problem is I have 90 GB of indexes in the database and partial indexes
> could help me to save some of this space on my server, in addition to
> improve the speed of the queries.

Unfortunately, prepared statements do act in the way you have seen.

I have a patch into 9.2 under discussion to improve upon this
situation, but don't hold your breath for that.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Prepared statments: partial indexes are avoided!

От
adrien ducos
Дата:
Simon Riggs a écrit :
> On Mon, Jun 20, 2011 at 3:01 PM, adrien ducos <aducos@hbs-research.com> wrote:
>
>> I have postgres 8.4
>>
>> I have a table "foo" with 16 million lines 99% of those lines have a column
>> "bar" = 1.
>>
>> I had an index on this table:
>> CREATE INDEX index_foo_bar ON foo using btree (bar);
>>
>> The thing is that the query
>> select count(*) from foo where bar = 1;
>> is not using the query (it is useless that is normal, using this index would
>> be slower than not using it)
>>
>> the query
>> select count(*) from foo where bar = 2; uses the index I have the answer in
>> 20ms.
>>
>> With a prepared statement I have
>>
>> PREPARE fooplan (int) AS
>>   select count(*) from foo where bar = $1;
>> execute fooplan (2);
>>
>> also a few milliseconds
>>
>> Now in order to optimise this last request I droped the first index and
>> added a new index:
>>
>> CREATE INDEX index_foo_bar ON foo using btree (bar) where
>> created_by_user_group <> 1;
>>
>> since the query is only using the index in this case anyway.
>>
>> with the query
>> explain analyze select count(*) from foo where bar = 2; it uses the new
>> index :
>>
>> "Aggregate  (cost=8.29..8.30 rows=1 width=0) (actual time=0.119..0.119
>> rows=1 loops=1)"
>> "  ->  Index Scan using index_foo_bar on foo  (cost=0.00..8.29 rows=1
>> width=0) (actual time=0.017..0.084 rows=63 loops=1)"
>> "        Index Cond: (bar = 2)"
>> "Total runtime: 0.144 ms"
>>
>> so great improvement from 20ms to 0.144ms
>>
>> and with the prepared statement... things becomes very bad:
>> PREPARE fooplan (int) AS
>>   select count(*) from foo where bar = $1;
>> explain analyze execute fooplan (2);
>>
>>
>> "Aggregate  (cost=627406.08..627406.09 rows=1 width=0) (actual
>> time=11627.315..11627.316 rows=1 loops=1)"
>> "  ->  Seq Scan on foo  (cost=0.00..603856.80 rows=9419712 width=0) (actual
>> time=7070.334..11627.266 rows=63 loops=1)"
>> "        Filter: (bar = $1)"
>> "Total runtime: 11627.357 ms"
>>
>> No index uses and the time becomes very bad.
>>
>> This is probably due to the fact the prepared statement could have "1" as an
>> input and so it avoids the index completely, but this is not very nice for
>> software optimization since I am using pdo which uses prepared statements
>> all the time and is unable to use all the partial indexes.
>>
>> The problem is I have 90 GB of indexes in the database and partial indexes
>> could help me to save some of this space on my server, in addition to
>> improve the speed of the queries.
>>
>
> Unfortunately, prepared statements do act in the way you have seen.
>
> I have a patch into 9.2 under discussion to improve upon this
> situation, but don't hold your breath for that.
>
>
Ok, Thanks for your answer. I think this should at least be writen in
this documentation:
http://www.postgresql.org/docs/9.0/static/indexes-partial.html
as a known limitation to avoid people spending hours of search for why
is the application becoming slow.
--

Logo_HBS_mail.jpg

Adrien DUCOS
Analyste développeur
aducos@hbs-research.com <mailto:aducos@hbs-research.com>
www.hbs-research.com <http://www.hbs-research.com/>
+33 (0)9 70 44 64 10
11-15 quai du Président Paul Doumer
92400 Courbevoie



Re: Prepared statments: partial indexes are avoided!

От
Greg Smith
Дата:
On 06/20/2011 10:55 AM, adrien ducos wrote:
> I think this should at least be writen in this documentation:
> http://www.postgresql.org/docs/9.0/static/indexes-partial.html
> as a known limitation to avoid people spending hours of search for why
> is the application becoming slow.

It's documented in the "Notes" section of
http://www.postgresql.org/docs/9.0/static/sql-prepare.html because it's
not specific to partial indexes.  Prepared statements are very risky
when you run them against queries with a very skewed distribution, like
your case where 99% of the values for a column are the same.  They help
to prevent SQL injection and can reduce parsing/planning overhead, but
prepared statements can have terrible performance in some unfortunately
common situations.  Just recently I had one clients end up switching to
another ORM specifically because they couldn't get the one they started
with to stop using prepared statements, and those gave them terrible
query plans.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books


Re: Prepared statments: partial indexes are avoided!

От
bricklen
Дата:
On Mon, Jun 20, 2011 at 7:06 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Unfortunately, prepared statements do act in the way you have seen.
>
> I have a patch into 9.2 under discussion to improve upon this
> situation, but don't hold your breath for that.
>
> --
>  Simon Riggs                   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>

I haven't used Oracle in years, but I seem to recall that they had
"bind variable peeking" to work around this type of problem. That
would be interesting to have in PG.