Обсуждение: recommendations for web/db connection pooling or DBD::Gofer reviews
When traffic to our PostgreSQL-backed website spikes, the first resource we see being exhausted is the DB slots on the master server (currently set to about 400). I expect that as new Apache/mod_perl children are being put to us, they are creating new database connections. I'm interested in recommendations to funnel more of that traffic through fewer DB slots, if that's possible. (We could also consider increasing the handles available, since the DB server has some CPU and memory to spare). I'm particularly interested in review of DBD::Gofer, which seems like it would help with this in our Perl application: http://search.cpan.org/dist/DBI/lib/DBD/Gofer.pm I realize it has limitations, like "no transactions", but I think we would still able to use it selectively in our application. Mark
On Mon, 07 Apr 2008 14:36:00 -0400 Mark Stosberg <mark@summersault.com> wrote: > I'm particularly interested in review of DBD::Gofer, which seems like > it would help with this in our Perl application: > http://search.cpan.org/dist/DBI/lib/DBD/Gofer.pm > > I realize it has limitations, like "no transactions", but I think we > would still able to use it selectively in our application. I would stick to proven postgresql technologies such as pgbouncer. Sincerely, Joshua D. Drake > > Mark > > -- The PostgreSQL Company since 1997: http://www.commandprompt.com/ PostgreSQL Community Conference: http://www.postgresqlconference.org/ United States PostgreSQL Association: http://www.postgresql.us/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
> I would stick to proven postgresql technologies such as pgbouncer.
Thanks for the fast recommendation, Joshua.  I'll consider it.
Our application is Slony-replicated web/db project with two slaves.
Does this design seem sensible?
- Run one pgbouncer server on the master, with settings to
   service the master and both slaves.
- We already handle balancing traffic between the slaves separately, so
that can remain unchanged.
- Use Session Pooling both both the masters and the slaves. In theory,
the slaves should just be doing transaction-less SELECT statements, so a
more aggressive setting might be possible, but I believe there might be
a "leak" in the logic where we create a temporary table on the slave in
one case.
- Redirect all application connections through pgbouncer
###
 From graphs we keep, we can see that the slaves currently use a max of
about 64 connections...they are far from maxing out what's possible. So
I was trying to think through if made sense to bother using the
pgBouncer layer with them. I through of two potential reasons to still
use it:
  - In the event of a major traffic spike on the web servers, pgbouncer
would keep the number of db slots under control.
  - Potentially there's a performance gain in having PgBouncer hold the
connections open.
Does that analysis seem correct?
For the master's pool size, I thought I would just choose a number
that's a little larger that the daily max number of DB slots in use.
    Mark
			
		> When traffic to our PostgreSQL-backed website spikes, the first resource > we see being exhausted is the DB slots on the master server (currently > set to about 400). > > I expect that as new Apache/mod_perl children are being put to us, they > are creating new database connections. > > I'm interested in recommendations to funnel more of that traffic through > fewer DB slots, if that's possible. (We could also consider increasing > the handles available, since the DB server has some CPU and memory to > spare). > > I'm particularly interested in review of DBD::Gofer, which seems like it > would help with this in our Perl application: > http://search.cpan.org/dist/DBI/lib/DBD/Gofer.pm > > I realize it has limitations, like "no transactions", but I think we > would still able to use it selectively in our application. Under heavy load, Apache has the usual failure mode of spawning so many threads/processes and database connections that it just exhausts all the memory on the webserver and also kills the database. As usual, I would use lighttpd as a frontend (also serving static files) to handle the large number of concurrent connections to clients, and then have it funnel this to a reasonable number of perl backends, something like 10-30. I don't know if fastcgi works with perl, but with PHP it certainly works very well. If you can't use fastcgi, use lighttpd as a HTTP proxy and apache with mod_perl behind. Recipe for good handling of heavy load is using an asynchronous server (which by design can handle any number of concurrent connections up to the OS' limit) in front of a small number of dynamic webpage generating threads/processes.
>     Under heavy load, Apache has the usual failure mode of spawning so
> many threads/processes and database connections that it just exhausts
> all the memory on the webserver and also kills the database.
>     As usual, I would use lighttpd as a frontend (also serving static
> files) to handle the large number of concurrent connections to clients,
> and then have it funnel this to a reasonable number of perl backends,
> something like 10-30. I don't know if fastcgi works with perl, but with
> PHP it certainly works very well. If you can't use fastcgi, use lighttpd
> as a HTTP proxy and apache with mod_perl behind.
>     Recipe for good handling of heavy load is using an asynchronous
> server (which by design can handle any number of concurrent connections
> up to the OS' limit) in front of a small number of dynamic webpage
> generating threads/processes.
Thanks for the response.
To be clear, it sounds like you are advocating solving the problem with
scaling the number of connections with a different approach, by limiting
the number of web server processes.
So, the front-end proxy would have a number of max connections, say 200,
  and it would connect to another httpd/mod_perl server behind with a
lower number of connections, say 20. If the backend httpd server was
busy, the proxy connection to it would just wait in a queue until it was
available.
Is that the kind of design you had in mind?
That seems like a reasonable option as well. We already have some
lightweight Apache servers in use on the project which currently just
serve static content.
    Mark
			
		On Apr 10, 2008, at 5:28 PM, Mark Stosberg wrote: > So, the front-end proxy would have a number of max connections, say > 200, and it would connect to another httpd/mod_perl server behind > with a lower number of connections, say 20. If the backend httpd > server was busy, the proxy connection to it would just wait in a > queue until it was available. If you read the mod_perl performance tuning guide, it will tell you to do exactly this. These are solved problems for many, many years now. The apache mod_proxy really does wonders...