Thread: postgres slowdown question

postgres slowdown question

From:
"Shachindra Agarwal"
Date:

Dear Postgres Masters:

 

We are using postgres 7.4 in our java application on RedHat linux. The Java application connects to Postgres via JDBC. The application goes through a 'discovery' phase, in which it adds a large amount of data to postgres. Typically, we are adding about a million records across various tables. The application also issues multiple queries to the database at the same time. We do not delete any records during the discovery phase. Both the java application and postgres are installed on the same machine.

 

At the beginning, the application is able to add on the order of 100 records per minute. Gradually (after several hours), it slows down to less than 10 records per minute. At this point, the postgres processes take 80-99% of the CPU. When we reindex the database, the speed bumps back up to about 30 records per minute, and the postgres server then takes 50-70% of the CPU.

 

We have the following in postgresql.conf:

 

max_fsm_pages = 500000

fsync = false

 

We certainly cannot live with this kind of performance. I believe postgres should be able to handle much larger datasets, but I cannot put my finger on what we are doing wrong. Can somebody please point me in the right direction?

 

With kind regards,

 

-- Shachindra Agarwal.

Re: postgres slowdown question

From:
John A Meinel
Date:
Shachindra Agarwal wrote:

> Dear Postgres Masters:
>
> We are using postgres 7.4 in our java application on RedHat linux. The
> Java application connects to Postgres via JDBC. The application goes
> through a ‘discovery’ phase, whereas it adds large amount of data into
> postgres. Typically, we are adding about a million records in various
> tables. The application also issues multiple queries to the database
> at the same time. We do not delete any records during the discovery
> phase. Both the java application and the postgres are installed on the
> same machine.
>
> At the beginning, the application is able to add in the order of 100
> record per minute. Gradually (after several hours), it slows down to
> less than 10 records per minute. At this time, postgres processes take
> between 80-99% of CPU. When we reindex the database, the speed bumps
> up to about 30 records per minute. Now, postgres server takes between
> 50-70% CPU.
>
> We have the following in the postgresql.conf :
>
> max_fsm_pages = 500000
>
> fsync = false
>
> We certainly can not live with this kind of performance. I believe
> postgres should be able to handle much larger datasets but I can not
> point my finger as to what are we doing wrong. Can somebody please
> point me to the right direction.
>
> With kind regards,
>
> -- Shachindra Agarwal.
>
A few questions first. How are you loading the data? Are you using
INSERT or COPY? Are you using a transaction, or are you autocommitting
each row?

You really need a transaction, and preferably use COPY. Both can help
performance a lot. (In some tests, single-row inserts can be
10-100x slower than loading the same rows in bulk.)
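
For illustration, a minimal JDBC sketch of doing the inserts as one
batched transaction (the table, columns, and Item class here are made up;
adapt them to your schema):

    // assume "conn" is the application's java.sql.Connection
    conn.setAutoCommit(false);                      // one explicit transaction
    PreparedStatement ps = conn.prepareStatement(
        "INSERT INTO discovered_item (id, name) VALUES (?, ?)");
    for (Item item : items) {
        ps.setInt(1, item.getId());
        ps.setString(2, item.getName());
        ps.addBatch();                              // queue the row client-side
    }
    ps.executeBatch();                              // send the whole batch at once
    conn.commit();                                  // one commit for all the rows
    ps.close();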

Also, it sounds like you may have a foreign key issue: as things fill
up, the foreign key reference checks are slowing you down.

Are you using ANALYZE as you go? A lot of times when you only have <1000
rows a sequential scan is faster than using an index, and if you don't
inform postgres that you have more rows, it might still use the old seqscan.
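
A cheap way to do that is to issue ANALYZE on the growing tables every few
thousand rows; a rough sketch through JDBC (the table name is a placeholder):

    // refresh planner statistics so foreign key lookups can use the indexes
    Statement st = conn.createStatement();
    st.execute("ANALYZE discovered_item");          // per-table ANALYZE is cheap
    st.close();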

There are other possibilities, but it would be nice to know about your
table layout, and possibly an EXPLAIN ANALYZE of the inserts that are
going slow.
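
If I remember right, EXPLAIN ANALYZE works on INSERT as well (note that it
really executes the statement), and you can read the plan back through JDBC.
A sketch, with a made-up table and values:

    // capture the plan and actual timings of one slow insert
    Statement st = conn.createStatement();
    st.execute("EXPLAIN ANALYZE INSERT INTO discovered_item (id, name) "
             + "VALUES (42, 'example')");
    ResultSet rs = st.getResultSet();               // EXPLAIN returns plan rows
    while (rs.next()) {
        System.out.println(rs.getString(1));        // one plan line per row
    }
    rs.close();
    st.close();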

John
=:->

PS> I don't know if JDBC supports COPY, but it certainly should support
transactions.
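
For what it's worth, newer versions of the PostgreSQL JDBC driver do expose
a COPY API via org.postgresql.copy.CopyManager; whether the driver you have
with 7.4 includes it, I can't say. A sketch, assuming a recent driver and a
made-up table:

    import java.io.StringReader;
    import org.postgresql.PGConnection;
    import org.postgresql.copy.CopyManager;

    // stream rows through COPY instead of row-by-row INSERTs
    CopyManager copier = ((PGConnection) conn).getCopyAPI();
    String rows = "1\tfirst item\n2\tsecond item\n";    // tab-separated lines
    long loaded = copier.copyIn(
        "COPY discovered_item (id, name) FROM STDIN", new StringReader(rows));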




Re: postgres slowdown question

From:
"Shachindra Agarwal"
Date:
Thanks for the note. Please see my responses below:

-----Original Message-----
From: John A Meinel [mailto:john@arbash-meinel.com]
Sent: Wednesday, April 20, 2005 3:48 PM
To: Shachindra Agarwal
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] postgres slowdown question

Shachindra Agarwal wrote:

> Dear Postgres Masters:
>
> We are using postgres 7.4 in our java application on RedHat linux. The
> Java application connects to Postgres via JDBC. The application goes
> through a 'discovery' phase, whereas it adds large amount of data into
> postgres. Typically, we are adding about a million records in various
> tables. The application also issues multiple queries to the database
> at the same time. We do not delete any records during the discovery
> phase. Both the java application and the postgres are installed on the
> same machine.
>
> At the beginning, the application is able to add in the order of 100
> record per minute. Gradually (after several hours), it slows down to
> less than 10 records per minute. At this time, postgres processes take
> between 80-99% of CPU. When we reindex the database, the speed bumps
> up to about 30 records per minute. Now, postgres server takes between
> 50-70% CPU.
>
> We have the following in the postgresql.conf :
>
> max_fsm_pages = 500000
>
> fsync = false
>
> We certainly can not live with this kind of performance. I believe
> postgres should be able to handle much larger datasets but I can not
> point my finger as to what are we doing wrong. Can somebody please
> point me to the right direction.
>
> With kind regards,
>
> -- Shachindra Agarwal.
>
A few questions first. How are you loading the data? Are you using
INSERT or COPY? Are you using a transaction, or are you autocommitting
each row?

You really need a transaction, and preferably use COPY. Both can help
performance a lot. (In some of the tests, single row inserts can be
10-100x slower than doing it in bulk.)

>> We are using JDBC, which supports 'inserts' and 'transactions'. We are
using both. The business logic adds one business object at a time. Each
object is added within its own transaction. Each object added results in 5
records in various tables in the database. So, a commit is performed
after every 5 inserts.
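
For reference, the current pattern is roughly the following (the object and
helper names are invented; "conn" is our JDBC connection):

    // current approach: one transaction per business object (5 rows each)
    conn.setAutoCommit(false);
    for (BusinessObject obj : discoveredObjects) {
        insertObjectRows(conn, obj);    // the 5 INSERTs into related tables
        conn.commit();                  // commit after every object
    }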

Also, it sounds like you have a foreign key issue. That as things fill
up, the foreign key reference checks are slowing you down.
Are you using ANALYZE as you go? A lot of times when you only have <1000
rows a sequential scan is faster than using an index, and if you don't
inform postgres that you have more rows, it might still use the old
seqscan.

>> This could be the issue. I will start 'analyze' in a cron job. I will
update you with the results.
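
Something along the lines of this crontab entry, I assume (the database name
is just a placeholder):

    # re-ANALYZE the whole database every 30 minutes during discovery
    */30 * * * *  psql -d mydb -c "ANALYZE"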

There are other possibilities, but it would be nice to know about your
table layout, and possibly an EXPLAIN ANALYZE of the inserts that are
going slow.

John
=:->

PS> I don't know if JDBC supports COPY, but it certainly should support
transactions.



Re: postgres slowdown question

From:
John A Meinel
Date:
Shachindra Agarwal wrote:

>Thanks for the note. Please see my responses below:
>
>
...

>
>> We are using JDBC, which supports 'inserts' and 'transactions'. We are
>> using both. The business logic adds one business object at a time. Each
>> object is added within its own transaction. Each object added results in 5
>> records in various tables in the database. So, a commit is performed
>> after every 5 inserts.
>
Well, 5 inserts per commit is pretty low. It would be nice to see more
like 100 inserts per commit. Would it be possible, during the "discovery"
phase, to move the begin/commit logic a little higher up?

Remember, each COMMIT normally requires at least one fsync (I realize you
have fsync off for now), and COMMIT is still fairly expensive.
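
A rough sketch of what that could look like in the discovery loop (the
object and helper names are invented; only the commit placement matters):

    // commit once per ~20 business objects (~100 rows) instead of per object
    conn.setAutoCommit(false);
    int sinceCommit = 0;
    for (BusinessObject obj : discoveredObjects) {
        insertObjectRows(conn, obj);    // the existing 5 INSERTs per object
        if (++sinceCommit >= 20) {
            conn.commit();              // one COMMIT per 20 objects
            sinceCommit = 0;
        }
    }
    conn.commit();                      // flush the final partial batch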

>Also, it sounds like you have a foreign key issue. That as things fill
>up, the foreign key reference checks are slowing you down.
>Are you using ANALYZE as you go? A lot of times when you only have <1000
>rows a sequential scan is faster than using an index, and if you don't
>inform postgres that you have more rows, it might still use the old
>seqscan.
>
>
>
>> This could be the issue. I will start 'analyze' in a cron job. I will
>> update you with the results.
>
>There are other possibilities, but it would be nice to know about your
>table layout, and possibly an EXPLAIN ANALYZE of the inserts that are
>going slow.
>
>John
>=:->
>
>PS> I don't know if JDBC supports COPY, but it certainly should support
>transactions.
>
>
Let us know if ANALYZE helps. If you are not deleting or updating
anything, you probably don't need VACUUM ANALYZE, but you might
think about it. It is a little more expensive, since it has to visit
every tuple rather than just take a random sample.
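
One thing to keep in mind if you try it: VACUUM cannot run inside a
transaction block, so through JDBC it needs a connection with autocommit
on. A minimal sketch (the table name is a placeholder):

    // VACUUM ANALYZE cannot run inside an explicit transaction
    conn.setAutoCommit(true);
    Statement st = conn.createStatement();
    st.execute("VACUUM ANALYZE discovered_item");
    st.close();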

John
=:->

