Discussion: PostgreSQL performance issues

PostgreSQL performance issues

From: "Tiago J. Adami"
Hi all, I'm working for a Brazilian company developing and maintaining an ERP
application that uses PostgreSQL as its main OLTP database system. We're about
to start migrating to IBM DB2 because of many performance issues. I have
searched for solutions to these performance problems and can't find anything
on other web forums. I'll present them in this post as topics, but first I'll
describe the ERP database schema:

- More than 200 tables;
- About 10 tables with about 10,000 transactions and 15,000 new rows per day;
- Each of these 10 tables has at least 12 indexes and 3 triggers;
- Many of our customer servers use AMD64 processors running Red Hat Enterprise
  Linux (with ext3), but some use Xeon processors and Windows 2003 Server
  (NTFS), and the issues are the same;
- All servers have at least 1 GB of dedicated RAM, with no virtualization;
- All servers use at least 2 disks in RAID 0 (Ultra-SCSI disks);
- Database encoding: LATIN1.

The issue topics:

1) As the database grows at our customers, performance drops. After one week
of use, the I/O on the database is extremely high. It appears that VACUUM FULL
and/or VACUUM ANALYZE doesn't work on these databases.
2) We have a very complex view built on top of other views. When we cancel a
simple SELECT on this top-level view (expecting a maximum of 100 rows, for
example), the PostgreSQL process enters an infinite loop (we left it for more
than 4 days and the loop didn't stop), using 100% of all processors on the
server.
3) On these servers, the disk usage grows much less than the records loaded
into the database would suggest. For example, after restoring a backup, the
database directory takes about 40 GB (with all indexes created). After one
week of use, and about 500,000 new records in the tables, the database size
grows to about 42 GB, but on Windows 2003 Server we can see heavy disk
fragmentation (maybe this happens on Linux too).
4) VACUUM FULL and/or VACUUM ANALYZE appears not to work properly.

The temporary (but lengthy) solution:

I have seen that one procedure solves these problems for a short time. It is
as follows (a minimal SQL sketch appears after this message):

1) Create a database dump into a plain-text .SQL file;
2) Drop the original OLTP database;
3) Create a new database using the original name;
4) Restore the .SQL file with psql.

The cost of running PostgreSQL under our software has reached a very high
level, so we're evaluating the same database schema and data on other
databases such as IBM DB2 9, and these issues do not appear there. However, we
need to solve these problems on PgSQL, as there are many customers who will
not migrate to DB2. Can anyone help me?

View this message in context:
http://www.nabble.com/PostgreSQL-performance-issues-tf4670379.html#a13341797
Sent from the PostgreSQL - hackers mailing list archive
(http://www.nabble.com/PostgreSQL---hackers-f780.html) at Nabble.com.

Re: PostgreSQL performance issues

From: "Heikki Linnakangas"
Tiago J. Adami wrote:
> The issue topics:
> 1) As the database grows at our customers, performance drops. After
> one week of use, the I/O on the database is extremely high. It appears that
> VACUUM FULL and/or VACUUM ANALYZE doesn't work on these databases.

That sounds like a classic case of "you're not running vacuum often
enough". VACUUM FULL shouldn't be necessary in normal operation. Have
you investigated which queries are causing the I/O?

What version of PostgreSQL are you running? In recent versions, just
enabling autovacuum does a reasonable job in most scenarios.
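A minimal illustration of what that involves on an 8.1/8.2-era server, where
autovacuum still depends on the row-level statistics collector (values are
examples only; in 8.3 and later autovacuum is enabled by default and
track_counts replaces the stats settings):

    # postgresql.conf
    stats_start_collector = on   # required by autovacuum on 8.1/8.2
    stats_row_level = on         # required by autovacuum on 8.1/8.2
    autovacuum = on
    autovacuum_naptime = 60      # seconds between autovacuum checks (example)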

Check that you don't have any connections forgotten in "idle in
transaction" state. That would prevent VACUUM from recovering dead space.

> 2) We have a very complex view built on top of other views. When we cancel a
> simple SELECT on this top-level view (expecting a maximum of 100 rows, for
> example), the PostgreSQL process enters an infinite loop (we left it for more
> than 4 days and the loop didn't stop), using 100% of all processors on the
> server.

PostgreSQL has a multi-process, single-thread architecture, which means
that only a single CPU can be executing a single query at a time. That
makes it hard to believe that canceling a query uses 100% of *all*
processors.

Have you tried EXPLAINing that query to see why it takes so long? Can you
post the query and the EXPLAIN output?
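For example, with a placeholder view name (plain EXPLAIN only shows the plan;
EXPLAIN ANALYZE actually executes the query, so avoid it if the query never
finishes):

    -- plan only
    EXPLAIN SELECT * FROM top_level_view LIMIT 100;
    -- plan plus measured execution times (runs the query)
    EXPLAIN ANALYZE SELECT * FROM top_level_view LIMIT 100;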

> 3) On these servers, the disk usage grows much less than the records loaded
> into the database would suggest. For example, after restoring a backup, the
> database directory takes about 40 GB (with all indexes created). After one
> week of use, and about 500,000 new records in the tables, the database size
> grows to about 42 GB, but on Windows 2003 Server we can see heavy disk
> fragmentation (maybe this happens on Linux too).

Do you think the fragmentation causes you problems? Do you do large
sequential scans regularly? I suppose you could run a defragmenter if
you suspect that's behind the increase in I/O.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com


Re: PostgreSQL performance issues

From: Gregory Stark
"Tiago J. Adami" <adamitj@gmail.com> writes:

> The issue topics:
> 1) As the database grows at our customers, performance drops. After
> one week of use, the I/O on the database is extremely high. It appears that
> VACUUM FULL and/or VACUUM ANALYZE doesn't work on these databases.

VACUUM FULL is a last-resort utility for recovering from a bad situation. It
shouldn't be necessary under normal operation. The intended mode is to run
VACUUM (or VACUUM ANALYZE) frequently -- possibly several times a day -- to
maintain the data density.

How frequently are you running VACUUM (or VACUUM ANALYZE)? How many updates
and deletes are you executing in that interval?

If you run VACUUM (or VACUUM ANALYZE) interactively what does it print at the
end of the operation?
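A minimal way to capture that output, with a placeholder table name:

    -- per-table detail: pages and dead tuples removed, index cleanup, etc.
    VACUUM VERBOSE ANALYZE erp_orders;
    -- or database-wide; on 8.x the final lines also report whether the
    -- free space map (max_fsm_pages) is large enough
    VACUUM VERBOSE ANALYZE;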

> 2) We have a very complex view built on top of other views. When we cancel a
> simple SELECT on this top-level view (expecting a maximum of 100 rows, for
> example), the PostgreSQL process enters an infinite loop (we left it for more
> than 4 days and the loop didn't stop), using 100% of all processors on the
> server.

That does sound bad. Would it be possible to attach to the process when it's
spinning and get a back trace? Also, what version is this precisely? Have you
taken all the bug-fix updates for the major version you're using?

> 3) On these servers, the disk usage grows much less than the records loaded
> into the database would suggest. For example, after restoring a backup, the
> database directory takes about 40 GB (with all indexes created). After one
> week of use, and about 500,000 new records in the tables, the database size
> grows to about 42 GB, but on Windows 2003 Server we can see heavy disk
> fragmentation (maybe this happens on Linux too).

Postgres does extend files as needed and some filesystems may deal better with
this than others. I think this is something we don't know much about on
Windows.

You might find running a CLUSTER on the fragmented tables improves matters.
CLUSTER effectively does a full vacuum too so it would leave you in a good
situation to monitor the growth and vacuum frequency necessary from that point
forward too. The downsides are that CLUSTER locks the table while it runs and
it requires enough space to store a whole second copy of the table and its
indexes.
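A sketch of that, with placeholder table and index names; the CLUSTER ... ON
form is the syntax used in the 8.x releases (later versions prefer
CLUSTER table USING index):

    -- rewrite the table in index order, dropping dead space in the process
    CLUSTER erp_orders_pkey ON erp_orders;
    -- CLUSTER does not update planner statistics, so analyze afterwards
    ANALYZE erp_orders;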

--
Gregory Stark
EnterpriseDB          http://www.enterprisedb.com


Re: PostgreSQL performance issues

From: Deblauwe Gino
Tiago J. Adami wrote:
> Hi all, I'm working for a Brazilian company developing and
> maintaining an ERP application that uses PostgreSQL as its main OLTP
> database system. We're about to start migrating to IBM DB2 because of many
> performance issues. I have searched for solutions to these performance
> problems and can't find anything on other web forums. I'll present them in
> this post as topics, but first I'll describe the ERP database schema:
>
> - More than 200 tables;
> - About 10 tables with about 10,000 transactions and 15,000 new rows per day;
> - Each of these 10 tables has at least 12 indexes and 3 triggers;
> - Many of our customer servers use AMD64 processors running Red Hat
>   Enterprise Linux (with ext3), but some use Xeon processors and Windows
>   2003 Server (NTFS), and the issues are the same;
> - All servers have at least 1 GB of dedicated RAM, with no virtualization;
> - All servers use at least 2 disks in RAID 0 (Ultra-SCSI disks);
> - Database encoding: LATIN1.
>
> The issue topics:
> 1) As the database grows at our customers, performance drops. After one week
> of use, the I/O on the database is extremely high. It appears that VACUUM
> FULL and/or VACUUM ANALYZE doesn't work on these databases.
> 2) We have a very complex view built on top of other views. When we cancel a
> simple SELECT on this top-level view (expecting a maximum of 100 rows, for
> example), the PostgreSQL process enters an infinite loop (we left it for more
> than 4 days and the loop didn't stop), using 100% of all processors on the
> server.
> 3) On these servers, the disk usage grows much less than the records loaded
> into the database would suggest. For example, after restoring a backup, the
> database directory takes about 40 GB (with all indexes created). After one
> week of use, and about 500,000 new records in the tables, the database size
> grows to about 42 GB, but on Windows 2003 Server we can see heavy disk
> fragmentation (maybe this happens on Linux too).
> 4) VACUUM FULL and/or VACUUM ANALYZE appears not to work properly.
>
> The temporary (but lengthy) solution: I have seen that one procedure solves
> these problems for a short time. It is as follows:
> 1) Create a database dump into a plain-text .SQL file;
> 2) Drop the original OLTP database;
> 3) Create a new database using the original name;
> 4) Restore the .SQL file with psql.
>
> The cost of running PostgreSQL under our software has reached a very high
> level, so we're evaluating the same database schema and data on other
> databases such as IBM DB2 9, and these issues do not appear there. However,
> we need to solve these problems on PgSQL, as there are many customers who
> will not migrate to DB2. Can anyone help me?
> ------------------------------------------------------------------------
> View this message in context: PostgreSQL performance issues 
> <http://www.nabble.com/PostgreSQL-performance-issues-tf4670379.html#a13341797>
> Sent from the PostgreSQL - hackers mailing list archive 
> <http://www.nabble.com/PostgreSQL---hackers-f780.html> at Nabble.com.
a) I didn't see a REINDEX in your mail. That's why a backup and
restore works and a VACUUM doesn't:
http://www.postgresql.org/docs/current/static/sql-reindex.html
Do this at least daily with that many inserts (see the example after this list).

b) Which version of Postgres do you run? pg82 is a lot faster than
previous versions, and pg83 will again be a lot faster when it comes out.

c) Fragmentation won't happen on Linux as far as I know, but if you want
to be faster, why do you use the slowest filesystem ever?
ReiserFS is a lot better. On Windows, defragment frequently; it's
said that with NTFS it's not necessary anymore, but I don't believe it.
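A minimal example of the REINDEX suggested in (a), with placeholder table and
index names; note that REINDEX locks out writes on the table while it runs:

    -- rebuild every index on one heavily updated table
    REINDEX TABLE erp_orders;
    -- or rebuild a single index
    REINDEX INDEX erp_orders_customer_idx;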




Re: PostgreSQL performance issues

From: Cédric Villemain
Gregory Stark wrote:
> "Tiago J. Adami" <adamitj@gmail.com> writes:
>
>   
>> The issue topics:
>> 1) As the database grows at our customers, performance drops. After
>> one week of use, the I/O on the database is extremely high. It appears that
>> VACUUM FULL and/or VACUUM ANALYZE doesn't work on these databases.
>>     
>
> VACUUM FULL is a last-resort utility for recovering from a bad situation. It
> shouldn't be necessary under normal operation. The intended mode is to run
> VACUUM (or VACUUM ANALYZE) frequently -- possibly several times a day -- to
> maintain the data density.
>
> How frequently are you running VACUUM (or VACUUM ANALYZE)? How many updates
> and deletes are you executing in that interval?
>
> If you run VACUUM (or VACUUM ANALYZE) interactively what does it print at the
> end of the operation?
>
>   
>> 2) We have a very complex view built on top of other views. When we cancel a
>> simple SELECT on this top-level view (expecting a maximum of 100 rows, for
>> example), the PostgreSQL process enters an infinite loop (we left it for more
>> than 4 days and the loop didn't stop), using 100% of all processors on the
>> server.
>>     
>
> That does sound bad. Would it be possible to attach to the process when it's
> spinning and get a back trace? Also, what version is this precisely? Have you
> taken all the bug-fix updates for the major version you're using?
>
>   
>> 3) On these servers, the disk usage grows much less than the records loaded
>> into the database would suggest. For example, after restoring a backup, the
>> database directory takes about 40 GB (with all indexes created). After one
>> week of use, and about 500,000 new records in the tables, the database size
>> grows to about 42 GB, but on Windows 2003 Server we can see heavy disk
>> fragmentation (maybe this happens on Linux too).
>>     
>
> Postgres does extend files as needed and some filesystems may deal better with
> this than others. I think this is something we don't know much about on
> Windows.
>   
Hmm, kernel 2.6.23 introduces fallocate...
(I am perhaps about to relaunch a flamewar.)
Does PostgreSQL use posix_fallocate?

> You might find running a CLUSTER on the fragmented tables improves matters.
> CLUSTER effectively does a full vacuum too so it would leave you in a good
> situation to monitor the growth and vacuum frequency necessary from that point
> forward too. The downsides are that CLUSTER locks the table while it runs and
> it requires enough space to store a whole second copy of the table and its
> indexes.
>
>   



Re: PostgreSQL performance issues

From: Simon Riggs
On Mon, 2007-10-22 at 05:13 -0700, Tiago J. Adami wrote:

> Hi all, I'm working for a Brazilian company developing and
> maintaining an ERP application that uses PostgreSQL as its main OLTP
> database system. We're about to start migrating to IBM DB2 because of many
> performance issues. I have searched for solutions to these performance
> problems and can't find anything on other web forums.

If you are migrating to DB2 then I'm guessing you'll have spent some
time discussing things with a DB2 consultant/pre-sales guy. I've noticed
they've started to implement some PostgreSQL features now and probably
have a list of "Ten Questions to Ask the PostgreSQL Community". It's a
standard sales technique. I bet they haven't mentioned the things that
Postgres does better than DB2 and that they are playing catch-up on. 

If you are having problems with PostgreSQL you should do a little more
than search the forums. If you want to compare two products then you
shouldn't just ask an expert in one product, you should ask an expert in
each product. I'm available to help, but so are many others; my
understanding is that the Brazilian PostgreSQL community has many good
people. You may need to pay for the advice, but remember the pre-sales
advice on DB2 isn't free either, you just pay for it later.

The database schema you describe is very likely to be within the
capabilities of PostgreSQL on reasonable hardware. Limiting your
hardware so you can use the free version of DB2 could well put a strain
on your implementation. I would also note that running any database
using RAID-0 is going to lose your data fairly soon, whichever database
you choose.

I hope that helps your general issues.

--
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com



Re: PostgreSQL performance issues

From: Rafael Martinez
Deblauwe Gino wrote:

> a) I didn't see a REINDEX in your mail. That's why a backup and
> restore works and a VACUUM doesn't:
> http://www.postgresql.org/docs/current/static/sql-reindex.html
> Do this at least daily with that many inserts.
> 

Hello

I'm sorry to say that this advice does not sound correct.

He is talking about 10,000 transactions and 15,000 new rows per
table/day. That should not require a daily REINDEX of 12 indexes. It has
to be something else, and more information will help to find the problem.

1 GB of RAM sounds like it may be too little for a 40 GB+ database. It is
difficult to say more without seeing the EXPLAIN output for the problematic
SELECTs. It would also be good to know some of the memory values in
postgresql.conf.
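The settings usually asked about can be reported straight from psql (on 8.0
and later; older releases call work_mem sort_mem):

    -- report the current memory-related settings
    SHOW shared_buffers;
    SHOW work_mem;
    SHOW maintenance_work_mem;
    SHOW effective_cache_size;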

I can count on one hand the times I have had to run a REINDEX to
fix something in the past years. We have databases with 9 million+
transactions and 2 million+ inserts/updates a day, and we have never had
to run a REINDEX to get good performance.

PS.- RAID-0 for a database is a disaster waiting to happen.

regards
--
Rafael Martinez, <r.m.guerrero@usit.uio.no>
Center for Information Technology Services
University of Oslo, Norway
PGP Public Key: http://folk.uio.no/rafael/


Re: PostgreSQL performance issues

From: Neil Conway
On Mon, 2007-10-22 at 15:40 +0200, Cédric Villemain wrote:
> Does PostgreSQL use posix_fallocate?

No.

-Neil