Discussion: "Dynamic routing" to different databases


"Dynamic routing" to different databases

From:
Alex Balashov
Date:
Hello all,

We have a rather complex service delivery application that is presently
designed with a "single-tenant" use case in mind: one self-contained
PostgreSQL database with all necessary tables, and one service consuming
that database, configured via a static connection string.

We are trying to figure out the most economical technical possibilities
for retrofitting "multi-tenant" functionality into it; you might think
of it as a kind of virtual hosting, where different requests coming into
the consuming service, segregated by, say, request URI domain, result in
very different behaviour.

Obviously, one way to do this is to add an additional layer of
abstraction at the database schema and the application level itself,
creating nested entities and so forth. This is impractical, however; it
would be nothing short of a wholesale overhaul of the entire codebase
(which consists in large part of PostgreSQL stored procedures for the
business layer).

So, what we're trying to figure out is the feasibility of grafting
multi-tenant functionality into the existing paradigm. For instance, we
could have a "meta-data" database with a table that defines DNS domains
and maps them to other database names/connection info on the same
server. We could then create a new and self-contained database for each
"tenant". I think that would make the most sense by far.

The major limitation is that the consuming service can connect to one
database and one database only, so what is needed is some way to present
foreign databases through one PostgreSQL connection handle to one
particular database.

The dblink facility seems to provide ways of doing this, but what I am
not clear on is: 1) how transparently the foreign database can be
rendered, vs. having to graft in lots of dblink_*() calls and 2) whether
there is a significant performance hit in using dblink, assuming the
"remote" tenant databases are located on the same cluster. Can this be
done using foreign data wrappers or something of the sort, for instance?
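
For what it's worth, the foreign-data-wrapper route I am imagining would
look roughly like this with postgres_fdw, assuming a hypothetical tenant
database 'tenant_a' on the same host and an illustrative table 'tbl'
whose column list is invented for the example:

    -- In the single database the service connects to:
    CREATE EXTENSION IF NOT EXISTS postgres_fdw;

    CREATE SERVER tenant_a_srv
        FOREIGN DATA WRAPPER postgres_fdw
        OPTIONS (host 'localhost', dbname 'tenant_a');

    CREATE USER MAPPING FOR CURRENT_USER
        SERVER tenant_a_srv
        OPTIONS (user 'app_user', password 'secret');

    -- Expose one of the tenant's tables under a local schema:
    CREATE SCHEMA tenant_a;
    CREATE FOREIGN TABLE tenant_a.tbl (
        id   integer,
        data text
    )
    SERVER tenant_a_srv
    OPTIONS (schema_name 'public', table_name 'tbl');

    -- Queries then go through the one connection handle:
    SELECT * FROM tenant_a.tbl;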

Failing that, is there some functionality in PgPool or another
connection proxy which can effectively take

    SELECT * FROM tbl META_ATTRIB = 'domain.com';

and transform this into

    SELECT * FROM tbl;

on the appropriate database for 'domain.com', which is a mapping that
would be maintained in something that is, at least in principle,
dynamically reloadable?

Advice much appreciated!

-- Alex

--
Alex Balashov | Principal | Evariste Systems LLC
303 Perimeter Center North, Suite 300
Atlanta, GA 30346
United States

Tel: +1-800-250-5920 (toll-free) / +1-678-954-0671 (direct)
Web: http://www.evaristesys.com/, http://www.csrpswitch.com/


Re: "Dynamic routing" to different databases

From:
Alex Balashov
Date:
After some more research, my conclusion - unless anyone has better
suggestions - is to just bite the bullet and partition these multiple
databases into schemas.

That's going to take a lot of code modification, but not nearly as much
as a full-on application-layer multitenant alternative.
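
Roughly speaking, the layout I am now envisioning is one schema per
tenant, each containing an identical set of objects. Schema and table
names below are invented for illustration, and public.tbl stands in for
an existing table used as the template:

    -- One schema per tenant; the same DDL is applied in each.
    CREATE SCHEMA tenant_domain_com;
    CREATE SCHEMA tenant_example_net;

    -- e.g. clone an existing table definition into a tenant schema:
    CREATE TABLE tenant_domain_com.tbl
        (LIKE public.tbl INCLUDING ALL);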

On 09/30/2015 11:23 AM, Alex Balashov wrote:

> [...]




Re: "Dynamic routing" to different databases

From:
Igor Neyman
Date:

-----Original Message-----
From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Alex Balashov
Sent: Wednesday, September 30, 2015 7:24 PM
To: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] "Dynamic routing" to different databases

After some more research, my conclusion - unless anyone has better suggestions - is to just bite the bullet and
partition these multiple databases into schemas.

That's going to take a lot of code modification, but not nearly as much as a full-on application-layer multitenant
alternative.

On 09/30/2015 11:23 AM, Alex Balashov wrote:

> [...]

Why should it "take a lot of code modification"?
Modifying "search_path" is not enough?

Regards,
Igor Neyman


Re: "Dynamic routing" to different databases

From:
Alex Balashov
Date:
On 10/01/2015 09:44 AM, Igor Neyman wrote:

> Why should it " take a lot of code modification"?
> Modifying "search_path" is not enough?

You're right, it's not that big of a deal. I had been under the
impression that

    SET search_path TO schema;

is a psql-level construct, not something that can be embedded in live
queries. So, I was expecting to have to rewrite thousands of lines of
stored procedures to take a schema parameter and construct dynamic query
strings with dynamic table names and so on.

Assuming one can simply do:

    SET search_path TO <schema>; SELECT query_business_as_usual();

I'm in a pretty good place!
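
If that holds, the per-request flow over the single connection would
look something like this (schema name invented; query_business_as_usual()
as in the example above), with SET LOCAL keeping the setting scoped to
one transaction:

    BEGIN;
    -- Scoped to this transaction only; reverts at COMMIT/ROLLBACK.
    SET LOCAL search_path TO tenant_domain_com;
    SELECT query_business_as_usual();   -- existing stored procedure, unchanged
    COMMIT;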

--
Alex Balashov | Principal | Evariste Systems LLC
303 Perimeter Center North, Suite 300
Atlanta, GA 30346
United States

Tel: +1-800-250-5920 (toll-free) / +1-678-954-0671 (direct)
Web: http://www.evaristesys.com/, http://www.csrpswitch.com/


Re: "Dynamic routing" to different databases

From:
"Nicholson, Brad (Toronto, ON, CA)"
Date:
> -----Original Message-----
> From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-
> owner@postgresql.org] On Behalf Of Alex Balashov
> Sent: Thursday, October 01, 2015 10:47 AM
> To: pgsql-admin@postgresql.org
> Subject: Re: [ADMIN] "Dynamic routing" to different databases
>
> I had been under the impression that
>
>     SET search_path TO schema;
>
> is a psql-level construct, not something that can be embedded in live queries.
> So, I was expecting to have to rewrite thousands of lines of stored procedures
> to take a schema parameter and construct dynamic query strings with dynamic
> table names and so on.
> 
> Assuming one can simply do:
> 
>     SET search_path TO <schema>; SELECT query_business_as_usual();
> 
> I'm in a pretty good place!

You can set the default search path on a per-role/user basis.  What you could do is map a DB user (or set of users) to
a specific schema, then set that user's search path to the appropriate schema, and queries will automatically get
routed to the appropriate schema.  Segmenting your different applications' users also allows you to use the role system
to restrict access to the data between applications.
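
Roughly along these lines, with placeholder role and schema names:

    -- One login role per tenant, pinned to that tenant's schema:
    CREATE ROLE tenant_a_user LOGIN PASSWORD 'secret';
    ALTER ROLE tenant_a_user SET search_path = tenant_a;

    -- Optionally restrict the role to its own schema's data:
    GRANT USAGE ON SCHEMA tenant_a TO tenant_a_user;
    GRANT SELECT, INSERT, UPDATE, DELETE
        ON ALL TABLES IN SCHEMA tenant_a TO tenant_a_user;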

The potential downside to this approach is that you won't be able to use something like pgbouncer to share DB
connections between the different users/schemas.  This can, depending on your usage and access patterns, lead to
problems with high connection counts to the DB being needed.

Brad.   

Re: "Dynamic routing" to different databases

From:
Alex Balashov
Date:
Brad,

On 10/01/2015 12:48 PM, Nicholson, Brad (Toronto, ON, CA) wrote:

> You can set the default search path on a per-role/user basis.  What
> you could do is map a DB user (or set of users) to a specific schema,
> then set that user's search path to the appropriate schema, and
> queries will automatically get routed to the appropriate schema.
> Segmenting your different applications' users also allows you to use
> the role system to restrict access to the data between applications.

I'd love to, but unfortunately the service consuming the database is
very, very strictly limited to one connection string: one set of
credentials, one server, and one database. Additional connections
cannot be created as needed at runtime without restarting the service,
which in our industry is not acceptable.

So, unfortunately I cannot avail myself of such an elegant solution;
instead, I must access all the schemas via one role.

-- Alex
