Обсуждение: connections

Поиск
Список
Период
Сортировка

connections

От
"Roberts, Jon"
Дата:
With pgAdmin, every window is a new connection to the database and this
alone wastes resources when you have lots of users.

select count(distinct usename) as unique_users,
       count(*) as total_count,
       count(*)/cast(count(distinct usename) as float8)
  from pg_stat_activity;

For our system, we currently have 9 distinct users connected with a
total of 30 connections which translates to over 3 connections per user.
This makes sense because pgAdmin creates 1 connection for the main UI, 1
connection to the Maintenance database, and 1 connection per SQL window.


Now my system is actually running Greenplum which means I have 16 more
(17 total) connections per user's connection because each segment host
running gets a connection.  So my measly 9 users actually are consuming
510 database connections!

EnterpriseDB now has a similar product to Greenplum
http://www.enterprisedb.com/community/projects/gridsql.do and I'm
assuming it does the same thing.  It certainly does look that way based
on the fact both products make use of parallel execution with shared
nothing parallel database servers.

I would greatly prefer if pgAdmin only created a connection if it needed
to.  If a query window is busy and a user attempts to execute SQL while
the other window is busy, then it would create a new session.  A simpler
solution would be only to allow one connection per pgAdmin instance.

Is this even on the radar for EnterpriseDB or pgAdmin users?  Is this
even considered a problem?


Jon

Re: connections

От
"Dave Page"
Дата:
On Fri, Apr 18, 2008 at 7:30 PM, Roberts, Jon <Jon.Roberts@asurion.com> wrote:

> Now my system is actually running Greenplum which means I have 16 more
> (17 total) connections per user's connection because each segment host
> running gets a connection.  So my measly 9 users actually are consuming
> 510 database connections!
>
> EnterpriseDB now has a similar product to Greenplum
> http://www.enterprisedb.com/community/projects/gridsql.do and I'm
> assuming it does the same thing.  It certainly does look that way based
> on the fact both products make use of parallel execution with shared
> nothing parallel database servers.

I'm awaiting confirmation from the lead architect, but I believe that
a single connection from pgAdmin will result in 1 connection to each
node in the cluster in GridSQL. However I believe there is also a
certain amount of pooling going on so I don't believe it's such a
problem. Besides.... that may be 16 connections in a 16 node system,
but each of those is a complete PostgreSQL/EnterpriseDB server capable
of handling as many connections as it could if it were a standalone
server (which could be hundreds or thousands), so it's no more an
issue than if you had a single server.

> I would greatly prefer if pgAdmin only created a connection if it needed
> to.  If a query window is busy and a user attempts to execute SQL while
> the other window is busy, then it would create a new session.  A simpler
> solution would be only to allow one connection per pgAdmin instance.

pgAdmin does only create connections when it needs to. The problem
that you're glossing over is determining when a connection is 'idle'
and could be reused. What of temporary tables, or even more basic
things like sequences which can display different characteristics the
first time they're used in a given session compared to subsequent
times?

> Is this even on the radar for EnterpriseDB or pgAdmin users?  Is this
> even considered a problem?

No - it's not a problem that I've heard from anyone other than you
(and a few newbies who don't understand why we use multiple
connections), and it's certainly not something the GridSQL guys have
complained about, either internally in EnterpriseDB or out here in
public.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com

Re: connections

От
"Dave Page"
Дата:
On Sun, Apr 20, 2008 at 6:00 PM, Dave Page <dpage@pgadmin.org> wrote:
> On Fri, Apr 18, 2008 at 7:30 PM, Roberts, Jon <Jon.Roberts@asurion.com> wrote:
>
> I'm awaiting confirmation from the lead architect, but I believe that
> a single connection from pgAdmin will result in 1 connection to each
> node in the cluster in GridSQL. However I believe there is also a
> certain amount of pooling going on so I don't believe it's such a
> problem. Besides.... that may be 16 connections in a 16 node system,
> but each of those is a complete PostgreSQL/EnterpriseDB server capable
> of handling as many connections as it could if it were a standalone
> server (which could be hundreds or thousands), so it's no more an
> issue than if you had a single server.

The word from the GridSQL guru is:

I am not sure I understand why he has so many connections because each
segment should be on a different host. Maybe they have something set
up where there are multiple logical nodes (er, segments) running on a
single postmaster instance.

Maybe that further explodes if each segment creates a connection to
every other one and persists it.

In GridSQL, we have a pool of connections that are used, starting with
5 for each underlying node. These are shared amongst the users, unless
somethings stateful happens in which case the underlying connection
needs to become persistent for a user session. We also have an extra
connection for the metadata database, and some vestiges of old code
for a pool of "coordinator" connections, but they are not used for
most queries.

Anyway, the bottom line is, GridSQL will not consume as many
connections as in the Greenplum case.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com

Re: connections

От
"Roberts, Jon"
Дата:
> On Sun, Apr 20, 2008 at 6:00 PM, Dave Page <dpage@pgadmin.org> wrote:
> > On Fri, Apr 18, 2008 at 7:30 PM, Roberts, Jon
<Jon.Roberts@asurion.com>
> wrote:
> >
> > I'm awaiting confirmation from the lead architect, but I believe
that
> > a single connection from pgAdmin will result in 1 connection to each
> > node in the cluster in GridSQL. However I believe there is also a
> > certain amount of pooling going on so I don't believe it's such a
> > problem. Besides.... that may be 16 connections in a 16 node system,
> > but each of those is a complete PostgreSQL/EnterpriseDB server
capable
> > of handling as many connections as it could if it were a standalone
> > server (which could be hundreds or thousands), so it's no more an
> > issue than if you had a single server.
>
> The word from the GridSQL guru is:
>
> I am not sure I understand why he has so many connections because each
> segment should be on a different host. Maybe they have something set
> up where there are multiple logical nodes (er, segments) running on a
> single postmaster instance.
>

They run a postmaster instance per core so we have 4 databases running
per segment host.  We have four segment hosts so that means we have 16
databases running.  Then there is one more database which is the
"master" which is equivalent to the GridSQL Coordinator.


> Maybe that further explodes if each segment creates a connection to
> every other one and persists it.
>

I'm just a customer so I don't know for sure but I'm guessing it does.
They recommend each segment database have max_connections set to 5-10
higher than the master and they have a large interconnect network
between the servers.


> In GridSQL, we have a pool of connections that are used, starting with
> 5 for each underlying node. These are shared amongst the users, unless
> somethings stateful happens in which case the underlying connection
> needs to become persistent for a user session. We also have an extra
> connection for the metadata database, and some vestiges of old code
> for a pool of "coordinator" connections, but they are not used for
> most queries.
>
> Anyway, the bottom line is, GridSQL will not consume as many
> connections as in the Greenplum case.

Dave, thanks for the detailed response.  I really do appreciate it.



Jon