Re: Should we warn against using too many partitions?

From Amit Langote
Subject Re: Should we warn against using too many partitions?
Msg-id CA+HiwqEPczRV_CaQnLXQda2-=T8WQ=kOrdsJDeO3H0xZsuqpTw@mail.gmail.com
In response to Re: Should we warn against using too many partitions?  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: Should we warn against using too many partitions?  (David Rowley <david.rowley@2ndquadrant.com>)
List pgsql-hackers
Hi,

Thanks for the updated patches.

On Fri, Jun 7, 2019 at 2:34 PM David Rowley
<david.rowley@2ndquadrant.com> wrote:
> Anyway comments welcome.  If I had a few more minutes to spare I'd
> have wrapped OLTP in <acronym> tags, but out of time for now.

Some rewording suggestions.

1.

+    ...    Removal of unwanted data is also a factor to consider when
+    planning your partitioning strategy as an entire partition can be removed
+    fairly quickly.  However, if data that you want to keep exists in that
+    partition then that means having to resort to using
+    <command>DELETE</command> instead of removing the partition.

Not sure if the 2nd sentence is necessary; perhaps it should be
rewritten in a way that helps readers design their partitioning to
benefit from this.

Maybe:

...    Removal of unwanted data is also a factor to consider when
planning your partitioning strategy as an entire partition can be
removed fairly quickly, especially if the partition keys are chosen
such that all data that can be deleted together are grouped into
separate partitions.
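
To make the point concrete, here is a minimal sketch of why grouping
matters: if each partition holds data that expires together, retention
reduces to dropping whole partitions.  The table name "measurement",
the monthly naming scheme, and the helper function are hypothetical
and only for illustration.

```python
from datetime import date


def partitions_to_drop(partitions, cutoff):
    """Return DROP TABLE statements for partitions entirely older than cutoff.

    `partitions` maps a partition name to its (lower, upper) date bounds.
    The upper bound is exclusive, as in FOR VALUES FROM ... TO ...
    """
    return [
        f"DROP TABLE {name};"
        for name, (_lower, upper) in sorted(partitions.items())
        if upper <= cutoff
    ]


# Hypothetical monthly partitions of a "measurement" table.
parts = {
    "measurement_y2019m01": (date(2019, 1, 1), date(2019, 2, 1)),
    "measurement_y2019m02": (date(2019, 2, 1), date(2019, 3, 1)),
    "measurement_y2019m03": (date(2019, 3, 1), date(2019, 4, 1)),
}

for stmt in partitions_to_drop(parts, date(2019, 3, 1)):
    print(stmt)
```

A partition that straddles the cutoff is not returned, which is exactly
the case where one would have to fall back to DELETE.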

2.

+    ... For example, if you choose to have one partition
+    per customer and you currently have a small number of large customers,
+    what will the implications be if in several years you obtain a large
+    number of small customers.

The sentence could be rewritten a bit.  Maybe as:

... For example, choosing a design with one partition per customer,
because you currently have a small number of large customers, will not
scale well several years down the line when you might have a large
number of small customers.

Btw, doesn't it suffice here to say "large number of customers"
instead of "large number of small customers"?

3.

+    ... In this case, it may be better to choose to
+    partition by <literal>RANGE</literal> and choose a reasonable number of
+    partitions

Maybe:

... and choose a reasonable number of partitions, each containing the
data of a fixed number of customers.
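
As a rough sketch of what "a fixed number of customers per partition"
could look like, the following generates range partition definitions
over an integer customer id.  The parent table name "orders", the
customer id key, and the helper itself are assumptions for
illustration; the parent is assumed to have been created with
PARTITION BY RANGE on that key.

```python
def range_partition_ddl(parent, max_customer_id, customers_per_partition):
    """Build CREATE TABLE ... PARTITION OF statements covering ids 1..max_customer_id.

    Assumes `parent` was created with PARTITION BY RANGE (customer_id).
    """
    stmts = []
    lower = 1
    while lower <= max_customer_id:
        upper = lower + customers_per_partition  # range upper bound is exclusive
        stmts.append(
            f"CREATE TABLE {parent}_{lower}_{upper - 1} PARTITION OF {parent} "
            f"FOR VALUES FROM ({lower}) TO ({upper});"
        )
        lower = upper
    return stmts


for stmt in range_partition_ddl("orders", 3000, 1000):
    print(stmt)
```

With this layout the number of partitions grows with the number of
customers divided by the chosen group size, rather than one-for-one
with the customer count.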

4.

+    ...  It also
+    may be undesirable to have a large number of partitions as each partition
+    requires metadata about the partition to be stored in each session that
+    touches it.  If each session touches a large number of partitions over a
+    period of time then the memory consumption for this may become
+    significant.

It might be a good idea to reorder the sentences here to put the
problem first and the cause later.  Maybe like this:

Another reason to be concerned about having a large number of
partitions is that the server's memory consumption may grow
significantly over a period of time, especially if many sessions touch
large numbers of partitions.  That's because each partition requires
its own metadata that must be loaded into the local memory of each
session that touches it.
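
A back-of-the-envelope calculation shows how this multiplies out.  The
per-partition figure below is a placeholder chosen for illustration,
not a measured PostgreSQL number, and the function is hypothetical.

```python
def estimated_metadata_bytes(sessions, partitions_touched, bytes_per_partition):
    """Worst case: every session has loaded metadata for every partition it touched."""
    return sessions * partitions_touched * bytes_per_partition


# Placeholder inputs: 100 sessions, 1000 partitions touched by each,
# and an assumed (not measured) 10 kB of metadata per partition.
total = estimated_metadata_bytes(100, 1000, 10 * 1024)
print(f"{total / 1024 ** 2:.0f} MiB")
```

The point is only that the total scales with sessions times partitions
touched, so either factor growing large can make the sum significant.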

5.

+    With data warehouse type workloads it can make sense to use a larger
+    number of partitions than with an OLTP type workload.

Is there a comma missing between "With data warehouse type workloads"
and the rest of the sentence?

Thanks,
Amit


