Re: Creating partitions automatically at least on HASH?

Поиск
Список
Период
Сортировка
От Rafia Sabih
Тема Re: Creating partitions automatically at least on HASH?
Дата
Msg-id CA+FpmFdaLu87AekTMEHP5-V9iMg1WJLD3_-sGpgcUjK7wzeU5w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Creating partitions automatically at least on HASH?  (Fabien COELHO <coelho@cri.ensmp.fr>)
Ответы Re: Creating partitions automatically at least on HASH?  (Amit Langote <amitlangote09@gmail.com>)
Список pgsql-hackers
On Mon, 26 Aug 2019 at 19:46, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

Hello Rafia,

>>    CREATE TABLE Stuff (...)
>>      PARTITION BY [HASH | RANGE | LIST] (…)
>>        DO NONE -- this is the default
>>        DO [IMMEDIATE|DEFERRED] USING (…)
>>
>> Where the USING part would be generic keword value pairs, eg:
>>
>> For HASH: (MODULUS 8) and/or (NPARTS 10)
>>
>> For RANGE: (START '1970-01-01', STOP '2020-01-01', INCREMENT '1 year')
>>      and/or (START 1970, STOP 2020, NPARTS 50)
>>
>> And possibly for LIST: (IN (…), IN (…), …), or possibly some other
>> keyword.
>>
>> The "DEFERRED" could be used as an open syntax for dynamic partitioning,
>> if later someone would feel like doing it.
>>
> ISTM that "USING" is better than "WITH" because WITH is already used
>> specifically for HASH and other optional stuff in CREATE TABLE.
>>
>> The text constant would be interpreted depending on the partitioning
>> expression/column type.
>>
>> Any opinion about the overall approach?

> I happen to start a similar discussion [1] being unaware of this one
> and there Ashutosh Sharma talked about interval partitioning in Oracle.
> Looking
> closely it looks like we can have this automatic partitioning more
> convenient by having something similar. Basically, it is creating
> partitions on demand or lazy partitioning.

Yep, the "what" of dynamic partitioning is more or less straightforward,
along the line you are describing.

For me there are really two questions:

  - having a extendable syntax, hence the mail I sent, which would cover
    both automatic static & dynamic partitioning and their parameters,
    given that we already have manual static, automatic static should
    be pretty easy.

  - implementing the stuff, with limited performance impact if possible
    for the dynamic case, which is non trivial.

> To explain a bit more, let's take range partition for example, first
> parent table is created and it's interval and start and end values are
> specified and it creates only the parent table just like it works today.

> Now, if there comes a insertion that does not belong to the existing (or
> any, in the case of first insertion) partition(s), then the
> corresponding partition is created,

Yep. Now, you also have to deal with race conditions issues, i.e. two
parallel session inserting tuples that must create the same partition, and
probably you would like to avoid a deadlock.

Hmmm, that shouldn't be very hard. Postgres handles many such things and I think mostly by a mutex guarded shared memory structure. E.g. we can have a shared memory structure associated with the parent table holding the information of all the available partitions, and keep this structure guarded by mutex. Anytime a new partition has to be created the relevant information is first entered in this structure before actually creating it.

> I think it is extensible to other partitioning schemes as well. Also it
> is likely to have a positive impact on the queries, because there will
> be required partitions only and would not require to educate
> planner/executor about many empty partitions.

Yep, but it creates other problems to solve…

Isn't it always the case. :)

--
Regards,
Rafia Sabih

В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Tom Turelinckx"
Дата:
Сообщение: Re: "ago" times on buildfarm status page
Следующее
От: Alexander Kukushkin
Дата:
Сообщение: Re: Statement timeout in pg_rewind