Re: How PostgreSQL handles multiple DDBB instances?

Поиск
Список
Период
Сортировка
От Craig James
Тема Re: How PostgreSQL handles multiple DDBB instances?
Дата
Msg-id 465CD47C.7050301@emolecules.com
обсуждение исходный текст
Ответ на Re: How PostgreSQL handles multiple DDBB instances?  (Jeff Davis <pgsql@j-davis.com>)
Список pgsql-performance
On Fri, 2007-05-25 at 20:16 +0200, Arnau wrote:
>    The point I'm worried is performance. Do you think the performance
> would be better executing exactly the same queries only adding an extra
> column to all the tables e.g. customer_id, than open a connection to the
> only one customers DB and execute the query there?

There is no simple answer to this question; it depends too much on your data.  In many cases, adding a customer_id to
everytable, and perhaps also per-customer views (per Jeff's suggestion), can work really well. 

However, performance is not the only consideration, or even the main consideration.  We operate with about 150 separate
databases. In our cases, administration issues and software design outweighed performance issues. 

For example, with separate databases, security is simpler, *and* it's easy to convince the customer that their data is
protected. Creating views only helps for read-only access.  When the customer wants to modify their data, how will you
keepthem from accessing and overwriting one another's data?  Even with views, can you convince the customer you've done
itright?  With separate databases, you use the built-in security of Postgres, and don't have to duplicate it in your
schemaand apps. 

With separate databases, it's really easy to discard a customer.  This can be particularly important for a big customer
withmillions of linked records.  In a database-for-everyone design, you'll have lots of foreign keys, indexes, etc.
thatmake deleting a whole customer a REALLY big job.  Contrast that with just discarding a whole database, which
typicallytakes a couple seconds. 

But even more important (to us) is the simplicity of the applications and management.  It's far more than just an extra
"... and customer = xyz" added to every query.  Throwing the customers together means every application has to
understandsecurity, and many operations that would be simple become horribly tangled.  Want to back up a customer's
data? You can't use pg_dump, you have to write your own dump app.  Want to restore a customer's data?  Same.  Want to
doa big update?  Your whole database is affected and probably needs to be vacuum/analyzed.  On and on, at every turn,
managementand applications are more complex. 

If you have hundreds of separate databases, it's also easy to scale: Just buy more servers, and move some of the
databases. With a single monster database, as load increases, you may hit the wall sooner or later. 

Postgres is really good at maintaining many separate databases.  Why do it yourself?

There are indeed performance issues, but even that's not black and white.  Depending on the specifics of your queries
andthe load on your servers, you may get better performance from a single monster database, or from hundreds of
separatedatabases. 

So, your question has no simple answer.  You should indeed evaluate the performance, but other issues may dominate your
decision.

Craig



В списке pgsql-performance по дате отправления:

Предыдущее
От: "Tyrrill, Ed"
Дата:
Сообщение: Very slow left outer join
Следующее
От: Michael Glaesemann
Дата:
Сообщение: Re: Very slow left outer join