Re: Partitioning options

Поиск
Список
Период
Сортировка
От Alec Lazarescu
Тема Re: Partitioning options
Дата
Msg-id CAE+E=SS+43jDTqkknXqhZN6tjA1s6-q-_V6fuWFrwLrRHA1CPg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Partitioning options  (Justin <zzzzz.graf@gmail.com>)
Ответы Re: Partitioning options
Список pgsql-general
"Would probably look at a nested partitioning"

I'm not the original poster, but I have a schema with nested
(composite) partitions and I do run into some significant
inefficiencies compared to flat partitions in various schema metadata
operations (queries to get the list of tables, creating foreign keys,
etc.) in tables with 1,000+ total partitions.

One example:
https://www.postgresql.org/message-id/CAE%2BE%3DSQacy6t_3XzCWnY1eiRcNWfz4pp02FER0N7mU_F%2Bo8G_Q%40mail.gmail.com

Alec

On Sun, Feb 11, 2024 at 8:25 AM Justin <zzzzz.graf@gmail.com> wrote:
>
> Hi Marc,
>
> Nested partitioning still allows for simple data deletion by dropping the table that falls in that date range.
>
> Probably thinking of partitioning by multicolomn rules which is very complex  to set up
>
> On Fri, Feb 9, 2024, 10:29 AM Marc Millas <marc.millas@mokadb.com> wrote:
>>
>>
>>
>>
>> On Thu, Feb 8, 2024 at 10:25 PM Justin <zzzzz.graf@gmail.com> wrote:
>>>
>>> Hi Sud,
>>>
>>> Would not look at HASH partitioning as it is very expensive to add or subtract the number of partitions.
>>>
>>> Would probably look at a nested partitioning using  customer ID using range or list of IDs then  by transaction
date, Its easy to add  partitions and balance the partitions segments. 
>>
>>
>>  I'll not do that because, then, when getting rid of obsolete data, you must delete a huge number of records, and
vacuumeach partition. 
>> if partitioning by date, you will ease greatly the cleaning, by just getting rid of obsolete partitions which is
quitespeedy.( no delete, no vacuum, no index updates, ...) 
>> Marc
>>
>>>
>>> Keep in mind that SELECT queries being used on the partition must  use the partitioning KEY in the WHERE clause of
thequery or performance will suffer. 
>>>
>>> Suggest doing a query analysis before deploying partition to confirm the queries WHERE clauses matched the planned
partitionrule.  I suggest that 80% of the queries of the executed queries must match the partition rule if not don't
deploypartitioning or change  all the queries in the application to match the partition rule 
>>>
>>>
>>> On Thu, Feb 8, 2024 at 3:51 PM Greg Sabino Mullane <htamfids@gmail.com> wrote:
>>>>>
>>>>> Out of curiosity, As OP mentioned that there will be Joins and also filters on column Customer_id column , so why
don'tyou think that subpartition by customer_id will be a good option? I understand List subpartition may not be an
optionconsidering the new customer_ids gets added slowly in the future(and default list may not be allowed) and also OP
mentioned,there is skewed distribution of data for customer_id column. However what is the problem if OP will opt for
HASHsubpartition on customer_id in this situation? 
>>>>
>>>>
>>>> It doesn't really gain you much, given you would be hashing it, the customers are unevenly distributed, and OP
talkedabout filtering on the customer_id column. A hash partition would just be a lot more work and complexity for us
humansand for Postgres. Partitioning for the sake of partitioning is not a good thing. Yes, smaller tables are better,
butthey have to be smaller targeted tables. 
>>>>
>>>> sud wrote:
>>>>
>>>>> 130GB of storage space as we verified using the "pg_relation_size" function, for a sample data set.
>>>>
>>>>
>>>> You might also want to closely examine your schema. At that scale, every byte saved per row can add up.
>>>>
>>>> Cheers,
>>>> Greg
>>>>



В списке pgsql-general по дате отправления:

Предыдущее
От: "David G. Johnston"
Дата:
Сообщение: Re: Users and object privileges maintenance
Следующее
От: Darryl Green
Дата:
Сообщение: Partitioning, Identity and Uniqueness (given pg 16 changes)