Thread: Multi-tenancy in Postgres

Multi-tenancy in Postgres

From
Emrul Islam
Date:
Hello,

I've just read through a paper here: http://www.edbt.org/Proceedings/2011-Uppsala/papers/edbt/a12-schiller.pdf about multi-tenancy.

They used Postgres for their work, and while it is academic and would need further work, I'm just wondering whether anyone on the Postgres team is looking at implementing some of the functionality described?


Thank you

Re: Multi-tenancy in Postgres

From
Greg Smith
Date:
Emrul Islam wrote:
> I've just read through a paper
> here: http://www.edbt.org/Proceedings/2011-Uppsala/papers/edbt/a12-schiller.pdf
> about multi-tenancy.
>
> They used Postgres for their work and while it is academic and would
> need further work I'm just wondering if anyone in the Postgres team is
> looking at implementing some of the functionality described?

They seem fuzzy about what actual businesses who implement multi-tenant
environments, such as hosting companies, actually want and find missing
in PostgreSQL right now.  They've tried to solve a problem I never would
have considered interesting in the first place.

On the "Shared Machine" side of things, we find complaints like how
individual PostgreSQL instances use too much power.  See "Latch
implementation that wakes on postmaster death", currently under
development for 9.2, which is aimed right at kicking that one around.

This "Shared Table" approach they spend so much time worrying about and
improving?  No one cares about that except companies hosting a single
application on their giant box.  This idea that there are a large number
of tenants running the same application, but who need to be isolated
from one another in some way, is not the normal state of things.  Yes,
it happens on the big servers at Salesforce.com, which all run the same
application; that is not a common situation, however.

What the hosting companies actually want from PostgreSQL is a good
implementation of "Shared Process".  One database install, where every
tenant gets their own schema and tables and is expected to use some resources.
You can do this right now; I believe the infrastructure at Heroku is
built that way for example.  How do the ideas in this paper actually
solve the problems they're seeing with that approach?  I don't know for
sure, but I don't see anything exciting there.
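
For readers who want something concrete, here is a minimal sketch of what
"every tenant gets their own schema" could look like in practice.  It is
not taken from any poster's setup: the helper name tenant_setup_sql and
the naming rule are assumptions made for illustration.  It only builds
the SQL text (CREATE ROLE, CREATE SCHEMA ... AUTHORIZATION, and
ALTER ROLE ... SET search_path are standard PostgreSQL statements) and
leaves running it to whatever client you use.

```python
import re

# Assumed naming rule for this sketch: lowercase, starts with a letter,
# and short enough to fit PostgreSQL's 63-byte identifier limit.
_TENANT_NAME = re.compile(r"[a-z][a-z0-9_]{0,62}")


def tenant_setup_sql(tenant: str) -> list[str]:
    """Return statements that give `tenant` its own role and schema.

    Hypothetical helper: the one-role-per-tenant layout is an assumption,
    not something prescribed in the thread.
    """
    if not _TENANT_NAME.fullmatch(tenant):
        raise ValueError(f"unsafe tenant name: {tenant!r}")
    ident = f'"{tenant}"'  # safe to quote because the name was validated
    return [
        f"CREATE ROLE {ident} LOGIN;",
        f"CREATE SCHEMA {ident} AUTHORIZATION {ident};",
        # Unqualified table names now resolve inside the tenant's schema.
        f"ALTER ROLE {ident} SET search_path = {ident};",
    ]


if __name__ == "__main__":
    for statement in tenant_setup_sql("acme"):
        print(statement)
```

With this layout the application can use the same unqualified table names
for every tenant, and the role's search_path decides which schema they
resolve to.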

It makes me kind of sad when people put a lot of work into doing a good
job on a problem that doesn't really matter very much in the real world,
and that's the overwhelming feel I get from reading this paper.
Advanced schema inheritance stuff?  Don't care.  Providing query cost
constraint limits for individual tenants?  Now that's a useful problem
to talk about, one that people deploying multi-tenant databases are
actually being killed by.  And discussing aspects of that problem does
flare up among the PostgreSQL developers regularly.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
Comprehensive and Customized PostgreSQL Training Classes:
http://www.2ndquadrant.us/postgresql-training/


Re: Multi-tenancy in Postgres

From
Rob Sargent
Date:
I think Greg might be forgetting that some of us don't always get to
choose what we work on.  I was in a shop that decided to go with
multi-tenancy for reasons both technical and, um, er, envious. One schema
to update versus n, for an example of the former.  Amazon does it, for
the other example.  But at the end of the day I believe the "shot
callers" simply wanted to have "multi-tenant" on a power-point slide.
We went down the vpd-route with veils, but only to proof of concept.
Speaking of which, has that project simply withered on the vine?

On 06/28/2011 03:03 PM, Greg Smith wrote:
> Emrul Islam wrote:
>> I've just read through a paper here:
>> http://www.edbt.org/Proceedings/2011-Uppsala/papers/edbt/a12-schiller.pdf
>> about multi-tenancy.
>>
>> They used Postgres for their work and while it is academic and would
>> need further work I'm just wondering if anyone in the Postgres team
>> is looking at implementing some of the functionality described?
>
> They seem fuzzy about what actual businesses who implement
> multi-tenant environments, such as hosting companies, actually want
> and find missing in PostgreSQL right now.  They've tried to solve a
> problem I never would have considered interesting in the first place.
>
> On the "Shared Machine" side of things, we find complaints like how
> individual PostgreSQL instances use too much power.  See "Latch
> implementation that wakes on postmaster death", currently under
> development aimed at 9.2, aimed right at kicking that one around.
>
> This "Shared Table" approach they spend so much time worrying about
> and improving?  No one cares about that except companies hosting a
> single application on their giant box.  This idea that there are a
> large number of tenants running the same application, but who need to
> be isolated from one another in some way, is not the normal state of
> things.  Yes, it happens on the big servers at Salesforce.com, which
> all run the same application; that is not a common situation, however.
>
> What the hosting companies actually want from PostgreSQL is a good
> implementation of "Shared Process".  One database install, where every
> tenant gets their own schema and tables and is expected to use some
> resources.  You can do this right now; I believe the infrastructure at
> Heroku is built that way for example.  How do the ideas in this paper
> actually solve the problems they're seeing with that approach?  I
> don't know for sure, but I don't see anything exciting there.
>
> It makes me kind of sad when people put a lot of work into doing a good
> job on a problem that doesn't really matter very much in the real
> world, and that's the overwhelming feel I get from reading this
> paper.  Advanced schema inheritance stuff?  Don't care.  Providing
> query cost constraint limits for individual tenants?  Now that's a
> useful problem to talk about, one that people deploying multi-tenant
> databases are actually being killed by.  And discussing aspects of
> that problem does flare up among the PostgreSQL developers regularly.
>

Re: Multi-tenancy in Postgres

From
Greg Smith
Date:
On 06/28/2011 05:45 PM, Rob Sargent wrote:
> I think Greg might be forgetting that some of us don't always get to
> choose what we work on.  I was in a shop that decided to go with
> multi-tenancy for reasons both technical and, um, er, envious.

There are certainly successful deployments of multi-tenant PostgreSQL
out there, ones that make sense.  What I was trying to communicate is
that the particular variation proposed by this academic paper doesn't
seem to me like the right direction for PostgreSQL development to head in.
This project is stubborn about resolving the problems people actually
have, and the ones the paper tries to solve are not the ones I've seen
in my own experiments in multi-tenant deployments.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
Comprehensive and Customized PostgreSQL Training Classes:
http://www.2ndquadrant.us/postgresql-training/


Re: Multi-tenancy in Postgres

From
Rob Sargent
Date:

On 06/28/2011 04:52 PM, Greg Smith wrote:
> On 06/28/2011 05:45 PM, Rob Sargent wrote:
>> I think Greg might be forgetting that some of us don't always get to
>> choose what we work on.  I was in a shop that decided to go with
>> multi-tenancy for reasons both technical and, um, er, envious.
>
> There are certainly successful deployments of multi-tenant PostgreSQL
> out there, ones that make sense.  What I was trying to communicate is
> that the particular variation proposed by this academic paper doesn't
> seem the right direction for PostgreSQL development to head in to me.
> This project is stubborn about resolving the problems people actually
> have, and the ones the paper tries to solve are not the ones I've seen
> in my own experiments in multi-tenant deployments.
>
Yes, your point is well taken here, and that wasn't even hinted at in my
previous (top! oops) post.  My point was that hacks in the field (i.e.
me) will have to do multi-tenancy on Postgres, and though this
implementation may not become the answer, any leg up would be appreciated.



Re: Multi-tenancy in Postgres

From
Radosław Smogura
Date:
 On Tue, 28 Jun 2011 17:04:54 -0600, Rob Sargent wrote:
> On 06/28/2011 04:52 PM, Greg Smith wrote:
>> On 06/28/2011 05:45 PM, Rob Sargent wrote:
>>> I think Greg might be forgetting that some of us don't always get to
>>> choose what we work on.  I was in a shop that decided to go with
>>> multi-tenancy for reasons both technical and, um, er, envious.
>>
>> There are certainly successful deployments of multi-tenant PostgreSQL
>> out there, ones that make sense.  What I was trying to communicate is
>> that the particular variation proposed by this academic paper doesn't
>> seem the right direction for PostgreSQL development to head in to me.
>> This project is stubborn about resolving the problems people actually
>> have, and the ones the paper tries to solve are not the ones I've seen
>> in my own experiments in multi-tenant deployments.
>>
> Yes, your point is well taken here, and that wasn't even hinted at in
> my previous (top! oops) post.  My point was that hacks in the field
> (i.e. me) will have to do multi-tenancy on postgres and though this
> implementation may not become the answer, any leg up would be
> appreciated.

 I think this may be quite an interesting solution. I actually built such
 an approach myself, for many reasons, but it is hard-coded: everywhere
 a query is executed I add this "tenancy" id (I called it something
 different), and it works perfectly.
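
 A minimal sketch of that kind of hard-coded approach, for illustration
 only; it is not the poster's actual code.  The class TenantScopedDb, the
 orders table and its columns are all made up, and Python's built-in
 sqlite3 module stands in for PostgreSQL so the example runs on its own.
 The point is only that every statement goes through one place that adds
 the tenant id.

```python
import sqlite3


class TenantScopedDb:
    """Hypothetical wrapper: every read and write carries the tenant id."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: int):
        self.conn = conn
        self.tenant_id = tenant_id

    def add_order(self, order_id: int, item: str) -> None:
        self.conn.execute(
            "INSERT INTO orders (tenant_id, id, item) VALUES (?, ?, ?)",
            (self.tenant_id, order_id, item),
        )

    def get_order(self, order_id: int):
        # The tenant filter is added here, so callers cannot forget it.
        return self.conn.execute(
            "SELECT item FROM orders WHERE tenant_id = ? AND id = ?",
            (self.tenant_id, order_id),
        ).fetchone()


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one shared, tenant-keyed table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        " tenant_id INTEGER NOT NULL,"
        " id INTEGER NOT NULL,"
        " item TEXT NOT NULL,"
        " PRIMARY KEY (tenant_id, id))"
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    tenant_a, tenant_b = TenantScopedDb(conn, 1), TenantScopedDb(conn, 2)
    tenant_a.add_order(100, "widget")
    tenant_b.add_order(100, "gadget")  # same id, different tenant
    print(tenant_a.get_order(100), tenant_b.get_order(100))
```

 Making the tenant id part of the primary key is what lets two tenants
 reuse the same order id without colliding.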

 But such a feature will not grow very fast until the PostgreSQL
 "ecosystem" grows. For example, I see problems with Java + Hibernate +
 caching when the "tenancy" id is hidden: you may actually query for
 two "different" objects with the same id if we allow a dynamic tenancy
 switch (which should be allowed, as otherwise you lose the benefits of
 connection pooling).
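
 To illustrate the caching problem in isolation: if a cache is keyed by
 object id alone, two tenants' objects with the same id overwrite each
 other.  The sketch below is a plain in-memory dictionary, not Hibernate's
 cache API, and the class name TenantAwareCache is made up; it only shows
 the idea of putting the tenant id into the cache key.

```python
class TenantAwareCache:
    """Hypothetical cache whose keys always include the tenant id."""

    def __init__(self):
        self._entries = {}

    def put(self, tenant_id, object_id, value):
        # Key on the (tenant, id) pair, never on the object id alone.
        self._entries[(tenant_id, object_id)] = value

    def get(self, tenant_id, object_id, default=None):
        return self._entries.get((tenant_id, object_id), default)


if __name__ == "__main__":
    cache = TenantAwareCache()
    cache.put(1, 100, "tenant 1's order")
    cache.put(2, 100, "tenant 2's order")  # same id, separate entry
    print(cache.get(1, 100))
    print(cache.get(2, 100))
```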

 Regards,
 Radek

Re: Multi-tenancy in Postgres

From
Emrul Islam
Date:
Thank you so far for your perspectives on this.

I especially agree with some of the things raised by Radosław and Rob.

While it may not be common to come across a scenario where this type of approach fits, I would like to point out that this type of solution is built into some commercial DBMS solutions already (SQL Azure and Progress).  We also see application-level solutions for this today (Tungsten's replicator, Grails' Multi-Tenant plugin and many others).  I do think some of what's being done at the application layer today should be handled by the DB (as Rob says, any leg-up would be appreciated).  If it was only for the likes of SalesForce then there wouldn't be in-db and application-level solutions being created to solve this.

I made the original post because I too am searching for ways to achieve this with more assistance from the DB (currently looking at Postgres SCHEMAs for per-customer separation and table inheritance to achieve something akin to a 'shared table' that can be extended to suit each customer's customisations).  Sure, I could have one database per customer, but that creates overhead (which the paper goes into).  Reducing the per-instance overhead would help address some of this and is very welcome, but even then, if I have one application and thousands of databases in a cluster, I'd have to alter each schema when I update the application, and deal with the error conditions that could arise.  Hosting companies may want better isolation for each customer, but SaaS applications operate on a different level where they control the database and the queries that execute on them, so the same level of isolation may not be needed.
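
To make the "alter each schema and deal with the error conditions" problem concrete, here is a minimal sketch of a per-tenant migration loop that keeps going when one tenant fails and reports which ones need attention.  Everything in it is an assumption for illustration: the names migrate_all and make_demo_tenants are made up, and Python's built-in sqlite3 module (one in-memory database per tenant) stands in for PostgreSQL so that the example runs on its own.

```python
import sqlite3


def migrate_all(tenants: dict, ddl: str) -> dict:
    """Apply `ddl` to every tenant database; return {tenant: error message}.

    Hypothetical helper: one failing tenant does not stop the others.
    """
    failures = {}
    for name, conn in tenants.items():
        try:
            conn.execute(ddl)
            conn.commit()
        except sqlite3.Error as exc:
            conn.rollback()
            failures[name] = str(exc)
    return failures


def make_demo_tenants() -> dict:
    """Build three tenant databases, one of which has drifted."""
    tenants = {}
    for name in ("acme", "globex", "initech"):
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
        tenants[name] = conn
    # Simulated drift: this tenant already has the column we want to add.
    tenants["globex"].execute("ALTER TABLE orders ADD COLUMN note TEXT")
    return tenants


if __name__ == "__main__":
    failed = migrate_all(
        make_demo_tenants(), "ALTER TABLE orders ADD COLUMN note TEXT"
    )
    for tenant, error in failed.items():
        print(f"{tenant}: {error}")
```

With thousands of tenants the list of failures is the part that needs care: each entry is a schema that is now out of step with the application version.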

Also, if we look to the NoSQL world, one of the often-touted benefits there is that the systems are schema-less.  These aren't always chosen because people don't need a schema, but because they need to, as an example, store different types of data per customer.  The paper in my original mail goes some way to exploring ways to provide a solution whilst retaining the benefits that an RDBMS provides.  It may not be 'the answer' but I certainly feel it warrants some interest.


Thank you,

Emrul

On Wed, Jun 29, 2011 at 9:37 AM, Radosław Smogura <rsmogura@softperience.eu> wrote:
> On Tue, 28 Jun 2011 17:04:54 -0600, Rob Sargent wrote:
>> On 06/28/2011 04:52 PM, Greg Smith wrote:
>>> On 06/28/2011 05:45 PM, Rob Sargent wrote:
>>>> I think Greg might be forgetting that some of us don't always get to
>>>> choose what we work on.  I was in a shop that decided to go with
>>>> multi-tenancy for reason both technical and um, er envious.
>>>
>>> There are certainly successful deployments of multi-tenant PostgreSQL
>>> out there, ones that make sense.  What I was trying to communicate is
>>> that the particular variation proposed by this academic paper doesn't
>>> seem the right direction for PostgreSQL development to head in to me.
>>> This project is stubborn about resolving the problems people actually
>>> have, and the ones the paper tries to solve are not the ones I've seen
>>> in my own experiments in multi-tenant deployments.
>>
>> Yes, your point is well taken here, and that wasn't even hinted at in my
>> previous (top! oops) post.  My point was that hacks in the field (i.e.
>> me) will have to do multi-tenancy on postgres and though this
>> implementation may not become the answer, any leg up would be appreciated.
>
> I think this may be quite interesting solution. Actually I created such approach, for many reasons, but it's hard-coded, I mean in any place when query is executed I add this "tenancy" id, I called it differently, and it works perfectly.
>
> But such feature will not grow quite fast until PostgreSQL "ecosystem" will not grow, for example I see problems with Java + Hibernate + Caching, when "tenancy" id will be hidden, actually You may query for two "different" objects with same id, if we will allow dynamic tanacy switch (should be done, as You will loose connection pool benefits).
>
> Regards,
> Radek
