Обсуждение: Performance issue with Insert

Поиск
Список
Период
Сортировка

Performance issue with Insert

От
Jenish
Дата:
Hi All,

I am facing some performance issue with insert into some table.

I am using postgres 8.4.x

Table is having 3 before insert trigger and one after insert trigger.

With all triggers enable it is inserting only 4-5 record per second.

But if I disable after insert trigger it is able to insert 667 records per second.

After insert trigger is recursive trigger.


My question.

How to avoid the bottleneck?

Parallel processing is possible in Postgres? How?


Please give you suggestion.  
--
Thanks & regards,
JENISH VYAS


Re: Performance issue with Insert

От
"Kevin Grittner"
Дата:
Jenish <jenishvyas@gmail.com> wrote:

> I am using postgres 8.4.x

With x being what?  On what OS and hardware?

> Table is having 3 before insert trigger and one after insert
> trigger.
>
> With all triggers enable it is inserting only 4-5 record per
> second.
>
> But if I disable after insert trigger it is able to insert 667
> records per second.

http://wiki.postgresql.org/wiki/SlowQueryQuestions

> After insert trigger is recursive trigger.

So are you counting only the top level inserts or also the ones
generated by the recursive inserts?

> My question.
>
> How to avoid the bottleneck?

First you need to find out what the bottleneck is.

> Parallel processing is possible in Postgres? How?

To achieve parallel processing in PostgreSQL you need to use
multiple connections.

-Kevin

Re: Performance issue with Insert

От
Jenish
Дата:
Hi,

DB : POSTGRES 8.4.8
OS  : Debian
HD : SAS 10k rpm

Shared_buffer is 4096 25 % of RAM , effective_cache is 8GB 75% of RAM

After insert trigger is again calling 2 more trigger and insert record in another table depends on condition.

with all trigger enable there are 8 insert and 32 updates(approx. update is depends on hierarchy)

Plz explain multiple connections. Current scenario application server is sending all requests.

-- 
Thanks & regards,
JENISH VYAS

On Mon, Jun 27, 2011 at 5:37 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
Jenish <jenishvyas@gmail.com> wrote:

> I am using postgres 8.4.x

With x being what?  On what OS and hardware?

> Table is having 3 before insert trigger and one after insert
> trigger.
>
> With all triggers enable it is inserting only 4-5 record per
> second.
>
> But if I disable after insert trigger it is able to insert 667
> records per second.

http://wiki.postgresql.org/wiki/SlowQueryQuestions

> After insert trigger is recursive trigger.

So are you counting only the top level inserts or also the ones
generated by the recursive inserts?

> My question.
>
> How to avoid the bottleneck?

First you need to find out what the bottleneck is.

> Parallel processing is possible in Postgres? How?

To achieve parallel processing in PostgreSQL you need to use
multiple connections.

-Kevin





Re: Performance issue with Insert

От
tv@fuzzy.cz
Дата:
> Hi,
>
> DB : POSTGRES 8.4.8
> OS  : Debian
> HD : SAS 10k rpm
>
> Shared_buffer is 4096 25 % of RAM , effective_cache is 8GB 75% of RAM
>
> After insert trigger is again calling 2 more trigger and insert record in
> another table depends on condition.
>
> with all trigger enable there are 8 insert and 32 updates(approx. update
> is
> depends on hierarchy)

Hi,

it's very difficult to give you reliable recommendations with this little
info, but the triggers are obviously the bottleneck. We have no idea what
queries are executed in them, but I guess there are some slow queries.

Find out what queries are executed in the triggers, benchmark each of them
and make them faster. Just don't forget that those SQL queries are
executed as prepared statements, so they may behave a bit differently than
plain queries. So use 'PREPARE' and 'EXPLAIN EXECUTE' to tune them.

> Plz explain multiple connections. Current scenario application server is
> sending all requests.

PostgreSQL does not support parallel queries (i.e. a query distributed on
multiple CPUs) so each query may use just a single CPU. If you're CPU
bound (one CPU is 100% utilized but the other CPUs are idle), you can
usually parallelize the workload on your own - just use multiple
connections.

But if you're using an application server and there are multiple
connections used, this is not going to help you. How many connections are
active at the same time? Are the CPUs idle or utilized?

Tomas


Re: Performance issue with Insert

От
Jenish
Дата:

Hi,

I have already checked all the statements present in the trigger, no one is taking more then 20 ms.

I am using 8-Processor, Quad-Core Server ,CPU utilization is more then 90-95 % for all. (htop result)

DB has 960 concurrent users. 

io : writing 3-4 MB per second or less (iotop result).

Scenario :  All insert are waiting for previous insert to complete. Cant we avoid this situation ?  
What is the "max_connections" postgresql support? 

Plz help....


-- 
Thanks & regards,
JENISH VYAS
  





On Mon, Jun 27, 2011 at 6:32 PM, <tv@fuzzy.cz> wrote:
> Hi,
>
> DB : POSTGRES 8.4.8
> OS  : Debian
> HD : SAS 10k rpm
>
> Shared_buffer is 4096 25 % of RAM , effective_cache is 8GB 75% of RAM
>
> After insert trigger is again calling 2 more trigger and insert record in
> another table depends on condition.
>
> with all trigger enable there are 8 insert and 32 updates(approx. update
> is
> depends on hierarchy)

Hi,

it's very difficult to give you reliable recommendations with this little
info, but the triggers are obviously the bottleneck. We have no idea what
queries are executed in them, but I guess there are some slow queries.

Find out what queries are executed in the triggers, benchmark each of them
and make them faster. Just don't forget that those SQL queries are
executed as prepared statements, so they may behave a bit differently than
plain queries. So use 'PREPARE' and 'EXPLAIN EXECUTE' to tune them.

> Plz explain multiple connections. Current scenario application server is
> sending all requests.

PostgreSQL does not support parallel queries (i.e. a query distributed on
multiple CPUs) so each query may use just a single CPU. If you're CPU
bound (one CPU is 100% utilized but the other CPUs are idle), you can
usually parallelize the workload on your own - just use multiple
connections.

But if you're using an application server and there are multiple
connections used, this is not going to help you. How many connections are
active at the same time? Are the CPUs idle or utilized?

Tomas




Re: Performance issue with Insert

От
Merlin Moncure
Дата:
On Mon, Jun 27, 2011 at 9:22 AM, Jenish <jenishvyas@gmail.com> wrote:
> Hi All,
>
> I am facing some performance issue with insert into some table.
>
> I am using postgres 8.4.x
>
> Table is having 3 before insert trigger and one after insert trigger.
>
> With all triggers enable it is inserting only 4-5 record per second.
>
> But if I disable after insert trigger it is able to insert 667 records per
> second.
>
> After insert trigger is recursive trigger.
>
> My question.
>
> How to avoid the bottleneck?
>
> Parallel processing is possible in Postgres? How?
>
> Please give you suggestion.

this sounds like a coding issue -- to get to the bottom of this we are
going to need to see the table and the triggers.

merlin

Re: Performance issue with Insert

От
Tomas Vondra
Дата:
Dne 27.6.2011 17:58, Jenish napsal(a):
>
> Hi,
>
> I have already checked all the statements present in the trigger, no one
> is taking more then 20 ms.
>
> I am using 8-Processor, Quad-Core Server ,CPU utilization is more then
> 90-95 % for all. (htop result)

So all cores are 95% utilized? That means you're CPU bound and you need
to fix that somehow.

How much of that belongs to postgres? Are there other processes
consuming significant portion of CPU? And what do you mean by
'utilized'? Does that mean user/sys time, or wait time?

> DB has 960 concurrent users.

Whad does that mean? Does that mean there are 960 active connections?

> io : writing 3-4 MB per second or less (iotop result).

Sequential or random? Post a few lines of 'iostat -x 1' and a few lines
of 'vmstat 1' (collected when the database is busy).

> Scenario :  All insert are waiting for previous insert to complete. Cant
> we avoid this situation ?

What do you mean by 'previous'? Does that mean another insert in the
same session (connection), or something performed in another session?

> What is the "max_connections" postgresql support?

That limits number of background processes - each connection is served
by a dedicated posgres process. You can see that in top / ps output.

High values usually mean you need some kind of pooling (you probably
already have one as you're using application server). And if the
connections are really active (doing something all the time), this
should not be significantly higher than the number of cores.

See, you have 8 cores, which means 8 seconds of CPU time each second. No
matter how many connections you allow, you still have just those 8
seconds. So if you need to perform 100x something that takes 1 second,
you need to spend 100 seconds of CPU time. So with those 8 cores, you
can do that in about 12,5 seconds.

Actually if you create too many connections, you'll notice it takes much
more - there's an overhead with process management, context switching,
locking etc.

regards
Tomas

Re: Performance issue with Insert

От
Tomas Vondra
Дата:
Dne 27.6.2011 17:01, Jenish napsal(a):
> Hi,
>
> DB : POSTGRES 8.4.8
> OS  : Debian
> HD : SAS 10k rpm
>
> Shared_buffer is 4096 25 % of RAM , effective_cache is 8GB 75% of RAM

How much data are we talking about? Does that fit into the shared
buffers or is it significantly larger? Do the triggers touch the whole
database or just a small part of it (active part)?

regards
Tomas

Re: Performance issue with Insert

От
Jenish
Дата:
Hi ,

This server is the dedicated database server.

 And I am testing the limit for the concurrent active users. When I am running my test for 400 concurrent user ie. Active connection. I am getting good performance but when I am running the same the same test for 950 concurrent users I am getting very bad performance.

 

>> Scenario :  All insert are waiting for previous insert to complete.

I don’t know whether it is the same session or different session.

 

DB id huge but Triggers are not touching the whole database.

I’ll provide the result set of vmstat and iostat tomorrow.


-- 
Thanks & regards,
JENISH VYAS

 

On Mon, Jun 27, 2011 at 10:48 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
Dne 27.6.2011 17:01, Jenish napsal(a):
> Hi,
>
> DB : POSTGRES 8.4.8
> OS  : Debian
> HD : SAS 10k rpm
>
> Shared_buffer is 4096 25 % of RAM , effective_cache is 8GB 75% of RAM

How much data are we talking about? Does that fit into the shared
buffers or is it significantly larger? Do the triggers touch the whole
database or just a small part of it (active part)?

regards
Tomas

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance




Re: Performance issue with Insert

От
Tomas Vondra
Дата:
Dne 27.6.2011 22:14, Jenish napsal(a):
>  And I am testing the limit for the concurrent active users. When I am
> running my test for 400 concurrent user ie. Active connection. I am
> getting good performance but when I am running the same the same test
> for 950 concurrent users I am getting very bad performance.

This is typical behaviour - the performance is good up until some point,
then it degrades much faster.

Why do you even need such number of connections? Does that really
improve performance (e.g. how many inserts do you do with 100 and 400
connections)?

Such number of active connections is not going to give you any advantage
I guess ...

regards
Tomas

Re: Performance issue with Insert

От
"Kevin Grittner"
Дата:
Jenish <jenishvyas@gmail.com> wrote:

> This server is the dedicated database server.
>
> And I am testing the limit for the concurrent active users. When I
> am running my test for 400 concurrent user ie. Active connection.
> I am getting good performance but when I am running the same the
> same test for 950 concurrent users I am getting very bad
> performance.

To serve a large number of concurrent users you need to use a
connection pooler which limits the number of database connections to
a small number.  Typically the most effective number of database
connections is somewhere between the number of actual cores on your
server and twice that plus the number of disk drives.  (It depends
on the details of your hardware and your load.)  The connection
pooler should queue requests which arrive when all database
connections are busy and release them for execution as transactions
complete.  Restricting the active database connections in this way
improves both throughput and latency and will allow you to serve a
much larger number of users without getting into bad performance;
and when you do "hit the wall" performance will degrade more
gracefully.

-Kevin