Обсуждение: Experience with large number of tables in single PostgreSQL instance

Поиск
Список
Период
Сортировка

Experience with large number of tables in single PostgreSQL instance

От
Vedran Krivokuca
Дата:
Hello all,


We are exploring possible strategies on deploying PostgreSQL with
application which will store fairly much data (current implementation
stores around 345 GB of data, and it will be subject of up to 10 times
more data stored in the database).

Now, not to go into too much details to avoid bothering you, the key
fact is that expected data  sets needed at any moment for manipulation
at any given point in time (let's call them operational data sets)
will be around 1/10.000th of whole data set. There can be up to 100
concurrent different data sets operations at the same time.
Operational data sets are strictly defined in terms of what they
contain, but they could contain any piece of complete data set, so we
have 2 obvious strategies here to go with:

1) we can go with different instances of PostgreSQL service, let's say
(for pure theory) 10 of them on the same HA cluster setup. Every
instance would hold let's say 1/10th of that big recordset, and around
3.000 database tables (this apparently shouldn't be of any problem to
PostgreSQL, see below).

2) we can go with single instance of PostgreSQL service which would
then contain 30.000 database tables.

So, consider this purely theoretical discussion - does anyone here
have experience of running PostgreSQL service with large number of
database tables (couple of thousands up to couple of tens of
thousands)?

This entry in Wiki indicates PostgreSQL in general doesn't have
problem with "thousands of tables", how about tens of thousands?
http://postgres.cz/wiki/PostgreSQL_SQL_Tricks#Faster_table.27s_list


Best regards,
V.

--
  Pozdrav/Greetings,
  Vedran Krivokuća
  Disclaimer: This message may contain information.


Re: Experience with large number of tables in single PostgreSQL instance

От
Vedran Krivokuca
Дата:
On Wed, Apr 10, 2013 at 9:07 AM, Vedran Krivokuca <vkrivokuca@gmail.com> wrote:
> 1) we can go with different instances of PostgreSQL service, let's say
> (for pure theory) 10 of them on the same HA cluster setup. Every
> instance would hold let's say 1/10th of that big recordset, and around
> 3.000 database tables (this apparently shouldn't be of any problem to
> PostgreSQL, see below).

I am talking shit here, obviously, excuse me. We would go with, for
example, 10 or 100 databases in the same PostgreSQL instance.
Buzzwording got better of me. :) The rest of my initial e-mail still
applies, I will cherish any input of experience on this subject.


V.

--
  Pozdrav/Greetings,
  Vedran Krivokuća
  Disclaimer: This message may contain information.


Re: Experience with large number of tables in single PostgreSQL instance

От
Vasilis Ventirozos
Дата:



On Wed, Apr 10, 2013 at 10:10 AM, Vedran Krivokuca <vkrivokuca@gmail.com> wrote:
On Wed, Apr 10, 2013 at 9:07 AM, Vedran Krivokuca <vkrivokuca@gmail.com> wrote:
> 1) we can go with different instances of PostgreSQL service, let's say
> (for pure theory) 10 of them on the same HA cluster setup. Every
> instance would hold let's say 1/10th of that big recordset, and around
> 3.000 database tables (this apparently shouldn't be of any problem to
> PostgreSQL, see below).

I am talking shit here, obviously, excuse me. We would go with, for
example, 10 or 100 databases in the same PostgreSQL instance.
Buzzwording got better of me. :) The rest of my initial e-mail still
applies, I will cherish any input of experience on this subject.


V.

--
  Pozdrav/Greetings,
  Vedran Krivokuća
  Disclaimer: This message may contain information.


--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

I think that a limitation would be on the OS (filesystem and kernel) because each table 
is a file (or more) under a directory (per database). I haven't got experience on how
modern Linux system cope with that but i think it would be something easy to test.
Database wise you shouldn't have a problem.

Re: Experience with large number of tables in single PostgreSQL instance

От
Dale Betts
Дата:
I'd agree, certainly in my experiences.

You need to ensure OS parameters such as the max open files (fs.file-max if we're talking Linux) is set appropriately. Baring in mind each user will have an open file on each underlying datafile for the databases they're connected to.

Dale
.hss1 {font-family:Verdana; font-weight:bold; font-size:10pt; color:154584 ; margin-bottom: 0px} .hss2 {font-family:Verdana; font-size:8pt; color:EE3150 ; margin-bottom:20px} .hss3 {font-family:Verdana; font-size:7pt; color:154584 ; margin-bottom: 0px} .hss4 {font-family:Verdana; font-size:7pt; color:gray ; margin-top: 20px}
Dale Betts
Technical Specialist
T 01623 48 98 22
F 01623 48 98 20
Healthcare Software Systems
3rd Floor i2 Mansfield
Hamilton Court
Oakham Business Park
Mansfield
NG18 5FB
THE INFORMATION CONTAINED IN THIS MESSAGE IS CONFIDENTIAL and is intended for the addressee only. If you are not the addressee, any disclosure, reproduction, copying, distribution, or other dissemination or use of this communication is strictly prohibited. Any views or opinions presented are those of the sender and do not necessarily represent Healthcare Software Systems.

HSS Ltd is a company registered in England and Wales with company number 2112370

On 10/04/2013 08:24, Vasilis Ventirozos wrote:



On Wed, Apr 10, 2013 at 10:10 AM, Vedran Krivokuca <vkrivokuca@gmail.com> wrote:
On Wed, Apr 10, 2013 at 9:07 AM, Vedran Krivokuca <vkrivokuca@gmail.com> wrote:
> 1) we can go with different instances of PostgreSQL service, let's say
> (for pure theory) 10 of them on the same HA cluster setup. Every
> instance would hold let's say 1/10th of that big recordset, and around
> 3.000 database tables (this apparently shouldn't be of any problem to
> PostgreSQL, see below).

I am talking shit here, obviously, excuse me. We would go with, for
example, 10 or 100 databases in the same PostgreSQL instance.
Buzzwording got better of me. :) The rest of my initial e-mail still
applies, I will cherish any input of experience on this subject.


V.

--
  Pozdrav/Greetings,
  Vedran Krivokuća
  Disclaimer: This message may contain information.


--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

I think that a limitation would be on the OS (filesystem and kernel) because each table 
is a file (or more) under a directory (per database). I haven't got experience on how 

modern Linux system cope with that but i think it would be something easy to test.
Database wise you shouldn't have a problem.


Вложения

Re: Experience with large number of tables in single PostgreSQL instance

От
Vedran Krivokuca
Дата:
On Wed, Apr 10, 2013 at 10:43 AM, Dale Betts <dale.betts@hssnet.com> wrote:
>
> I'd agree, certainly in my experiences.
>
> You need to ensure OS parameters such as the max open files (fs.file-max if we're talking
> Linux) is set appropriately. Baring in mind each user will have an open file on each
> underlying datafile for the databases they're connected to.

Ok, to reply to myself and other interested parties, further insights
are welcome!

I've been digging today for information on this subject and:

http://postgresql.1045698.n5.nabble.com/GENERAL-Maximum-number-of-tables-per-database-and-slowness-td1853836.html
Basically saying: any number of tables is acceptable (well in range of
couple of tens of thousands we might end up with), but have in mind
pg_catalog might seriously grow with such number of tables.
Huge pg_catalog tends to slow down query planner. This thread is from
2005., but I doubt those fact changed substionally with time.

I'll probably do some testing, but I can hardly replicate production
volume of data. Seems that due to predictable nature of the queries we
are running against this system we'll go with multiple databases, each
containing up to couple thousand tables at most. Still weighting pros
and cons of each approach.



--
  Pozdrav/Greetings,
  Vedran Krivokuća
  Disclaimer: This message may contain information.


Re: Experience with large number of tables in single PostgreSQL instance

От
Prashanth Ranjalkar
Дата:
Every table or index is created in the form of OS files therefore max open files need to be set appropriately in order to achieve the larger table count. There would be no limitation for creation of tables in PostgreSQL and performance would be the major criteria as catalogs get overburdened. 

 
Thanks & Regards,
 
Prashanth Ranjalkar
Database Consultant & Architect
Skype:prashanth.ranjalkar


On Wed, Apr 10, 2013 at 2:13 PM, Dale Betts <dale.betts@hssnet.com> wrote:
I'd agree, certainly in my experiences.

You need to ensure OS parameters such as the max open files (fs.file-max if we're talking Linux) is set appropriately. Baring in mind each user will have an open file on each underlying datafile for the databases they're connected to.

Dale
Dale Betts
Technical Specialist
T 01623 48 98 22
F 01623 48 98 20
Healthcare Software Systems
3rd Floor i2 Mansfield
Hamilton Court
Oakham Business Park
Mansfield
NG18 5FB
THE INFORMATION CONTAINED IN THIS MESSAGE IS CONFIDENTIAL and is intended for the addressee only. If you are not the addressee, any disclosure, reproduction, copying, distribution, or other dissemination or use of this communication is strictly prohibited. Any views or opinions presented are those of the sender and do not necessarily represent Healthcare Software Systems.

HSS Ltd is a company registered in England and Wales with company number 2112370

On 10/04/2013 08:24, Vasilis Ventirozos wrote:



On Wed, Apr 10, 2013 at 10:10 AM, Vedran Krivokuca <vkrivokuca@gmail.com> wrote:
On Wed, Apr 10, 2013 at 9:07 AM, Vedran Krivokuca <vkrivokuca@gmail.com> wrote:
> 1) we can go with different instances of PostgreSQL service, let's say
> (for pure theory) 10 of them on the same HA cluster setup. Every
> instance would hold let's say 1/10th of that big recordset, and around
> 3.000 database tables (this apparently shouldn't be of any problem to
> PostgreSQL, see below).

I am talking shit here, obviously, excuse me. We would go with, for
example, 10 or 100 databases in the same PostgreSQL instance.
Buzzwording got better of me. :) The rest of my initial e-mail still
applies, I will cherish any input of experience on this subject.


V.

--
  Pozdrav/Greetings,
  Vedran Krivokuća
  Disclaimer: This message may contain information.


--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

I think that a limitation would be on the OS (filesystem and kernel) because each table 
is a file (or more) under a directory (per database). I haven't got experience on how 

modern Linux system cope with that but i think it would be something easy to test.
Database wise you shouldn't have a problem.



Вложения

Re: Experience with large number of tables in single PostgreSQL instance

От
Marc Mamin
Дата:
> 
> 2) we can go with single instance of PostgreSQL service which would
> then contain 30.000 database tables.
> 
> So, consider this purely theoretical discussion - does anyone here have
> experience of running PostgreSQL service with large number of database
> tables (couple of thousands up to couple of tens of thousands)?


Hello,

We are handling data for several customers, each of them having its own schema.
This allows us to easily bulk move data on other servers when place gets rare.
These are comparable to data warehouse data and are used by a reporting application.
A majority of the tables are seldom or never used which may moderate possible issues with very large pg_catalog.

We are overall impressed how well Postgres handle this :-)
On the other hand we've spent a *LOT* of work in table and query design and we do have some problems with the parser
performance:

We do log all durations greater than 1 seconds. 1 % of these are "parse" statements
about 90 % of these long parse statements are below 10 seconds.

Here the figure of our server hosting the most tables.
I expect them to be somewhat above the maximum that we should allow.

DB size :3500 GB

pg_catalog size: 2GB

select count(*) from pg_tables;
120'884

select count(*) from pg_indexes;
219'082

select count(*) from pg_attribute;
2'779'199

Server: 48 GB RAM & 12 cores


regards,

Marc Mamin