Обсуждение: Question on partitioning

Поиск
Список
Период
Сортировка

Question on partitioning

От
veem v
Дата:
Hello All,
In postgresql, Is it possible to partition an existing nonpartitioned table having data already residing in it and indexes and constraints defined in it, without the need of manually moving the data around, to make it faster? Similarly merging multiple partitions to one partition or splitting a single partition into multiple partitions?
Appreciate any reference doc or sample syntax for the same.

Thanks and Regards
Veem

Re: Question on partitioning

От
Ron Johnson
Дата:
On Sun, Feb 4, 2024 at 4:40 PM veem v <veema0000@gmail.com> wrote:
Hello All,
In postgresql, Is it possible to partition an existing nonpartitioned table having data already residing in it and indexes and constraints defined in it, without the need of manually moving the data around, to make it faster? Similarly merging multiple partitions to one partition or splitting a single partition into multiple partitions?
 
That depends on how you partition the table, and what the PK is.  Works great when the PK is constantly increasing (think "sequence" or "timestamp", and you partition by PK ranges; not so great when you want to partition by a different column.

Re: Question on partitioning

От
Laurenz Albe
Дата:
On Mon, 2024-02-05 at 03:09 +0530, veem v wrote:
> In postgresql, Is it possible to partition an existing nonpartitioned table having data
> already residing in it and indexes and constraints defined in it, without the need of
> manually moving the data around, to make it faster? Similarly merging multiple partitions
> to one partition or splitting a single partition into multiple partitions?

There is no way to do that.

Yours,
Laurenz Albe



Re: Question on partitioning

От
veem v
Дата:
On Mon, 5 Feb 2024 at 17:52, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Mon, 2024-02-05 at 03:09 +0530, veem v wrote:
> In postgresql, Is it possible to partition an existing nonpartitioned table having data
> already residing in it and indexes and constraints defined in it, without the need of
> manually moving the data around, to make it faster? Similarly merging multiple partitions
> to one partition or splitting a single partition into multiple partitions?

There is no way to do that.

Yours,
Laurenz Albe

Thank you very much Laurenz.

Actually in other databases (for example like in Oracle) there exists sql syntax to split one partition into multiple and merge multiple partitions back to one. So I was hoping there may be some way to do it in postgres. Anyway, thanks for clarifying my doubt on this.

Which means for any such operation we need to create a new partition table with that structure and load that with the data from the existing partition table. Also for making or converting a non partition table to a partitioned one , we have to first create the blank partition table structure as per our need and then pump the data from the non partition table to the newly created partitioned table, which means we need to take some downtime to switch from non partitioned table to partitioned one. Please correct if wrong.

Additionally I see a lot of other restrictions like 
1)When creating indexes on the partition table "concurrently" keywords are not allowed.
2)While creating foreign key ,  it does not allow a "not valid" clause if the table is partitioned.
3) While creating indexes on this table or running any ALTER command, the SELECT queries running from other processes run longer. Does it take any lock while doing DDL on the base table in postgres?


On Mon, 5 Feb 2024 at 17:52, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Mon, 2024-02-05 at 03:09 +0530, veem v wrote:
> In postgresql, Is it possible to partition an existing nonpartitioned table having data
> already residing in it and indexes and constraints defined in it, without the need of
> manually moving the data around, to make it faster? Similarly merging multiple partitions
> to one partition or splitting a single partition into multiple partitions?

There is no way to do that.

Yours,
Laurenz Albe

Re: Question on partitioning

От
Laurenz Albe
Дата:
On Tue, 2024-02-06 at 00:26 +0530, veem v wrote:
> On Mon, 5 Feb 2024 at 17:52, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> > On Mon, 2024-02-05 at 03:09 +0530, veem v wrote:
> > > In postgresql, Is it possible to partition an existing nonpartitioned table having data
> > > already residing in it and indexes and constraints defined in it, without the need of
> > > manually moving the data around, to make it faster?
> >
> > There is no way to do that.
>
> Which means for any such operation we need to create a new partition table with that
> structure and load that with the data from the existing partition table.

You could use logical replication to keep the downtime short.

> Additionally I see a lot of other restrictions like 
> 1)When creating indexes on the partition table "concurrently" keywords are not allowed.

That is no problem.
You create the index on ONLY the partitioned table,
then you create indexes CONCURRENTLY on the partitions,
then you attach these indexes as partitions to the index you created in the beginning.
As soon as the last index is attached, the index on the partitioned table becomes valid.

> 2)While creating foreign key ,  it does not allow a "not valid" clause if the table
>   is partitioned.

Create the foreign key on the partitions; the effect is the same.

> 3) While creating indexes on this table or running any ALTER command, the SELECT
>    queries running from other processes run longer. Does it take any lock while
>    doing DDL on the base table in postgres?

I am not sure what the "base table" is.  Most DDL always locks the table in
ACCESS EXCLUSIVE mode; that is not specific to partitioned tables.

Yours,
Laurenz Albe



Re: Question on partitioning

От
veem v
Дата:
Thank you Laurenz. Got it.

So basically , you mean to say any DDL on a table won't allow the table to be read by other processes. I was under the assumption that it should allow the read queries to move ahead at least. I must be wrong here. Thanks for correcting me.

On Tue, 6 Feb 2024 at 15:46, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Tue, 2024-02-06 at 00:26 +0530, veem v wrote:
> On Mon, 5 Feb 2024 at 17:52, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> > On Mon, 2024-02-05 at 03:09 +0530, veem v wrote:
> > > In postgresql, Is it possible to partition an existing nonpartitioned table having data
> > > already residing in it and indexes and constraints defined in it, without the need of
> > > manually moving the data around, to make it faster?
> >
> > There is no way to do that.
>
> Which means for any such operation we need to create a new partition table with that
> structure and load that with the data from the existing partition table.

You could use logical replication to keep the downtime short.

> Additionally I see a lot of other restrictions like 
> 1)When creating indexes on the partition table "concurrently" keywords are not allowed.

That is no problem.
You create the index on ONLY the partitioned table,
then you create indexes CONCURRENTLY on the partitions,
then you attach these indexes as partitions to the index you created in the beginning.
As soon as the last index is attached, the index on the partitioned table becomes valid.

> 2)While creating foreign key ,  it does not allow a "not valid" clause if the table
>   is partitioned.

Create the foreign key on the partitions; the effect is the same.

> 3) While creating indexes on this table or running any ALTER command, the SELECT
>    queries running from other processes run longer. Does it take any lock while
>    doing DDL on the base table in postgres?

I am not sure what the "base table" is.  Most DDL always locks the table in
ACCESS EXCLUSIVE mode; that is not specific to partitioned tables.

Yours,
Laurenz Albe

Re: Question on partitioning

От
Ron Johnson
Дата:
On Tue, Feb 6, 2024 at 2:40 PM veem v <veema0000@gmail.com> wrote:
Thank you Laurenz. Got it.

So basically , you mean to say any DDL on a table won't allow the table to be read by other processes. I was under the assumption that it should allow the read queries to move ahead at least. I must be wrong here. Thanks for correcting me.

That word "any" in "any DDL" is quite a big word.  It's certainly not going to allow reads while you're adding a Foreign Key.

And the whole purpose of adding the CONCURRENTLY key word to CREATE INDEX is because regular CREATE INDEX statements block everyone else.