Re: Large number of partitions of a table

From: Ketan Popat
Subject: Re: Large number of partitions of a table
Date:
Msg-id: CAKx+mvaDeH1DYJDwHbei70ZHeYXOnKdNCtJFX7Q+wQxa9TNivQ@mail.gmail.com
In response to: Re: Large number of partitions of a table  (Ron <ronljohnsonjr@gmail.com>)
Responses: Re: Large number of partitions of a table  (Victor Sudakov <vas@sibptus.ru>)
List: pgsql-admin
Hi Victor,

Thanks for starting a thread on a large number of partitions with PostgreSQL.

If you are okay with it, I would like to start by understanding the problem rather than the solution.
What I have understood so far is that you have a multi-tenant application and are using PostgreSQL 13.5 as its data store.

A few questions:
  • How many tenants are you going to start with? What do your 3-5 year projections look like in terms of number of tenants?
  • What is the per-tenant data volume, both average and maximum?
  • What is your table size range going to be over 3-5 years?
    • How many small (< 100 GB), medium (100-300 GB), and large (> 300 GB) tables?
  • What is the maximum table size going to be?
  • Have you considered HASH partitioning on your tenant id instead of LIST partitioning? (See the sketch after this list.)
  • What is your query SLA going to be (average and P99)?
  • Are you using the community edition of PostgreSQL or Aurora?
  • So far my understanding is that there are no cross-tenant queries from end-user/OLTP traffic; feel free to correct me, as this is a key design consideration.
  • Do you have a separate data lake/data mart for cross-tenant internal/analytical needs?
  • What does the breakup of reads vs. writes look like?
    • Within writes, can you share the breakup of inserts, updates, and deletes?
    • How complex would your SELECT queries be, in terms of the number of tables joined, the working data set (rows and volume), and data aggregation?
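To make the HASH-vs-LIST question above concrete, here is a minimal sketch of hash partitioning on the tenant key; the tenant_data/tenant_id names and the modulus of 64 are placeholders I picked for illustration, not anything from your schema:

-- A fixed, modest number of HASH partitions instead of one LIST
-- partition per tenant; many tenants share each partition.
CREATE TABLE tenant_data (
    tenant_id  bigint NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now(),
    payload    jsonb
) PARTITION BY HASH (tenant_id);

-- Create the 64 hash partitions (tenant_data_p00 .. tenant_data_p63).
DO $$
BEGIN
    FOR r IN 0..63 LOOP
        EXECUTE format(
            'CREATE TABLE tenant_data_p%s PARTITION OF tenant_data
                 FOR VALUES WITH (MODULUS 64, REMAINDER %s)',
            lpad(r::text, 2, '0'), r);
    END LOOP;
END $$;

The trade-off: with HASH the partition count stays fixed as the tenant count grows, while LIST keeps each tenant physically separate but ties the partition count to the number of tenants.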
Once we have the above input, we can have a better, data-point-oriented discussion around (a) partition pruning for selects, updates, and deletes, and the plan-time overhead; (b) data clustering aspects; (c) vacuum aspects of medium and large tables; and (d) depending on the scale of your operation, leveraging multiple schemas and/or databases to limit the number of partitions per table, provided your application can redirect a tenant to the appropriate schema/database.
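As a concrete starting point for (a), pruning and plan-time overhead can be observed directly with EXPLAIN; this sketch reuses the hypothetical tenant_data table from above:

-- Plan-time pruning: the plan should touch a single partition, and
-- the reported "Planning Time" grows with the number of partitions.
EXPLAIN (SUMMARY)
SELECT count(*) FROM tenant_data WHERE tenant_id = 42;

-- Run-time pruning with a generic (parameterised) plan, PostgreSQL 12+:
SET plan_cache_mode = force_generic_plan;
PREPARE tenant_count(bigint) AS
    SELECT count(*) FROM tenant_data WHERE tenant_id = $1;
EXPLAIN (ANALYZE, SUMMARY) EXECUTE tenant_count(42);
-- "Subplans Removed: N" on the Append node confirms run-time pruning.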

Thanks,
Ketan

On Tue, Jan 18, 2022 at 5:49 PM Ron <ronljohnsonjr@gmail.com> wrote:
On 1/18/22 6:09 PM, Mladen Gogala wrote:
> On 1/18/22 09:47, Ron wrote:
>> EXPLAIN plans are going to be hilariously gigantic, which means that
>> query planning would take a loooong time. And the query planner (in v12,
>> at least) can generate some pretty bad plans in partitioned tables; I bet
>> there are edge cases in the QP code that don't work well with 10000
>> partitions and 50000 indices.
>
> These things sometimes happen. On the oracle-l, there is a thread in
> progress about a generated SQL with 130,000 lines which fails parsing
> because parsing exhausts the available memory.
>

But there wasn't any programming.  The company saved so much money!!!

> In such cases, I am less inclined to look for a fault with the database
> software and more inclined to inflict an unusual and cruel punishment on
> whoever has designed the data model.
>

OOP was definitely involved, in some way or form...

--
Angular momentum makes the world go 'round.

