Обсуждение: Sthange things happen: SkyTools pgbouncer is NOT a balancer
Hello.
We discovered some time ago that pgbouncer is NOT a balancer, because it cannot spread connections/queries to the same database to multiple servers. It's unbeliveable, but it's a fact! So, database name in the config MUST be unique.
E.g. if we write
bardb = host=192.168.0.1 dbname=bardb
bardb = host=192.168.0.2 dbname=bardb
bardb = host=192.168.0.3 dbname=bardb
in the config, pgbouncer always uses the first connection, and others are ignored. Here is the part of the source code:
== loader.c:
void parse_database(char *name, char *connstr) {
...
db = add_database(name);
...
}
== objects.c:
PgDatabase *add_database(const char *name)
{
PgDatabase *db = find_database(name);
/* create new object if needed */
if (db == NULL) {
db = zmalloc(sizeof(*db));
...
}
return db;
}
In these functions "name" is a key from the config ("bardb" in our example). We see that it's useless to create duplicate keys in config elements in [databases] sections, because only the first one is accepted.
So, it's completely magical for me why "Session pooling", "Transaction pooling" and "Statement pooling" options are exist (see https://developer.skype.com/SkypeGarage/DbProjects/PgBouncer). If pgbouncer is not a balancer, what purpose is to use "Statement pooling" - if we sent 100 queries (e.g.) in the same connection, they will always be directed to the SAME MACHINE in its different connections, no balancing optimization at all.
We discovered some time ago that pgbouncer is NOT a balancer, because it cannot spread connections/queries to the same database to multiple servers. It's unbeliveable, but it's a fact! So, database name in the config MUST be unique.
E.g. if we write
bardb = host=192.168.0.1 dbname=bardb
bardb = host=192.168.0.2 dbname=bardb
bardb = host=192.168.0.3 dbname=bardb
in the config, pgbouncer always uses the first connection, and others are ignored. Here is the part of the source code:
== loader.c:
void parse_database(char *name, char *connstr) {
...
db = add_database(name);
...
}
== objects.c:
PgDatabase *add_database(const char *name)
{
PgDatabase *db = find_database(name);
/* create new object if needed */
if (db == NULL) {
db = zmalloc(sizeof(*db));
...
}
return db;
}
In these functions "name" is a key from the config ("bardb" in our example). We see that it's useless to create duplicate keys in config elements in [databases] sections, because only the first one is accepted.
So, it's completely magical for me why "Session pooling", "Transaction pooling" and "Statement pooling" options are exist (see https://developer.skype.com/SkypeGarage/DbProjects/PgBouncer). If pgbouncer is not a balancer, what purpose is to use "Statement pooling" - if we sent 100 queries (e.g.) in the same connection, they will always be directed to the SAME MACHINE in its different connections, no balancing optimization at all.
On Tue, 2007-09-11 at 20:02 +0400, Dmitry Koterov wrote: > Hello. > > We discovered some time ago that pgbouncer is NOT a balancer, because > it cannot spread connections/queries to the same database to multiple > servers. It's unbeliveable, but it's a fact! So, database name in the > config MUST be unique. > You might look at pgpool-II, that can load balance queries. Regards, Jeff Davis
AFAIK PgBouncer is not a balancer but a connection pooler. Skype said nothing about load balancing in its docs, so they are fair in this sense. Why did you decide it should balance the load? Regards, Ivan On 9/11/07, Dmitry Koterov <dmitry@koterov.ru> wrote: > Hello. > > We discovered some time ago that pgbouncer is NOT a balancer, because it > cannot spread connections/queries to the same database to multiple servers. > It's unbeliveable, but it's a fact! So, database name in the config MUST be > unique. > > E.g. if we write > > bardb = host=192.168.0.1 dbname=bardb > bardb = host=192.168.0.2 dbname=bardb > bardb = host=192.168.0.3 dbname=bardb > > in the config, pgbouncer always uses the first connection, and others are > ignored. Here is the part of the source code: > > == loader.c: > void parse_database(char *name, char *connstr) { > ... > db = add_database(name); > ... > } > > == objects.c: > PgDatabase *add_database(const char *name) > { > PgDatabase *db = find_database(name); > /* create new object if needed */ > if (db == NULL) { > db = zmalloc(sizeof(*db)); > ... > } > return db; > } > > In these functions "name" is a key from the config ("bardb" in our example). > We see that it's useless to create duplicate keys in config elements in > [databases] sections, because only the first one is accepted. > > So, it's completely magical for me why "Session pooling", "Transaction > pooling" and "Statement pooling" options are exist (see > https://developer.skype.com/SkypeGarage/DbProjects/PgBouncer). > If pgbouncer is not a balancer, what purpose is to use "Statement pooling" - > if we sent 100 queries (e.g.) in the same connection, they will always be > directed to the SAME MACHINE in its different connections, no balancing > optimization at all. > >
On 9/11/07, Dmitry Koterov <dmitry@koterov.ru> wrote: > We discovered some time ago that pgbouncer is NOT a balancer, because it > cannot spread connections/queries to the same database to multiple servers. > It's unbeliveable, but it's a fact! So, database name in the config MUST be > unique. Indeed, pgbouncer is just a connection pooler, not load balancer. There has been some talk about adding load balancing to it, but as it can be done as well in network level, I don't see much point. > So, it's completely magical for me why "Session pooling", "Transaction > pooling" and "Statement pooling" options are exist (see > https://developer.skype.com/SkypeGarage/DbProjects/PgBouncer). > If pgbouncer is not a balancer, what purpose is to use "Statement pooling" - > if we sent 100 queries (e.g.) in the same connection, they will always be > directed to the SAME MACHINE in its different connections, no balancing > optimization at all. To use less connections on server side. -- marko
On Tue, Sep 11, 2007 at 08:02:34PM +0400, Dmitry Koterov wrote: > So, it's completely magical for me why "Session pooling", "Transaction > pooling" and "Statement pooling" options are exist (see > https://developer.skype.com/SkypeGarage/DbProjects/PgBouncer). If pgbouncer > is not a balancer, what purpose is to use "Statement pooling" - if we sent > 100 queries (e.g.) in the same connection, they will always be directed to > the SAME MACHINE in its different connections, no balancing optimization at > all. i dont get where you get the idea that pgbouncer is loadbalancer. it was never said so in no documentation i have ever found. as for the purpose - of course there is purpose. with your "100 queries" example - if you will choose statement pooling they might be sent using different database connections (to the same database though). and if you're using autocommit, then the same will happen in "transaction pooling" mode. the whole point of pgbouncer is to pool database connections and reuse the same db-connection for many application connections. which it does simply great. depesz -- quicksil1er: "postgres is excellent, but like any DB it requires a highly paid DBA. here's my CV!" :) http://www.depesz.com/ - blog dla ciebie (i moje CV)
Hi
One of PgBouncer original design goals is to create small and robust connection pooler.
If we start adding more and more fancy features like load balancing then we can easily end up in the same place from which we wanted to get away (pgPool).
We do our load balancing on ip level when the servers are duplicates of each other. Good for read only stuff.
Next level of load balancing we use is splitting databases into partitons. That way we can distriobute one database over several servers. We use plProxy to do this. Downside is raised complexity. Benefit all the partitons can be read write. Also good for reducing risks. If one part goes down others are still functioning.
Asko
One of PgBouncer original design goals is to create small and robust connection pooler.
If we start adding more and more fancy features like load balancing then we can easily end up in the same place from which we wanted to get away (pgPool).
We do our load balancing on ip level when the servers are duplicates of each other. Good for read only stuff.
Next level of load balancing we use is splitting databases into partitons. That way we can distriobute one database over several servers. We use plProxy to do this. Downside is raised complexity. Benefit all the partitons can be read write. Also good for reducing risks. If one part goes down others are still functioning.
Asko