Обсуждение: Partitioned tables and [un]loggedness

Поиск
Список
Период
Сортировка

Partitioned tables and [un]loggedness

От
Michael Paquier
Дата:
Hi all,

As a recent poke on a thread of 2019 has reminded me, the current
situation of partitioned tables with unlogged is kind of weird:
https://www.postgresql.org/message-id/15954-b61523bed4b110c4%40postgresql.org

To sum up the situation:
- ALTER TABLE .. SET LOGGED/UNLOGGED does not affect partitioned
tables.
- New partitions don't inherit the loggedness of the partitioned
table.

One of the things that could be done is to forbid the use of UNLOGGED
in partitioned tables, but I'm wondering if we could be smarter than
that to provide a more natural user experience.  I've been
experiencing with that and finished with the POC attached, that uses
the following properties:
- Support ALTER TABLE .. SET LOGGED/UNLOGGED for partitioned tables,
where the command only works on partitioned tables so that's only a
catalog switch.
- CREATE TABLE PARTITION OF would make a new partition inherit the
logged ness of the parent if UNLOGGED is not directly specified.

There are some open questions that need attention:
- How about ONLY?  Would it make sense to support it so as ALTER TABLE
ONLY on a partitioned table does not touch any of its partitions?
Would not specifying ONLY mean that the loggedness of the partitioned
table and all its partitions is changed?  That would mean a burst in
WAL when switching to LOGGED, which was a concern when this feature
was first implemented even for a single table, so for potentially
hundreds of them, that would really hurt if a DBA is not careful.
- CREATE TABLE does not have a LOGGED keyword, hence it is not
possible through the parser to force a partition to have a permanent
persistence even if its partitioned table uses UNLOGGED.

Like tablespaces or even recently access methods, the heuristics can
be fun to define depending on the impact of the table rewrites.  The
attached has the code and regression tests to make the rewrites
cheaper, which is to make new partitions inherit the loggedness of the
parent while altering a parent's property leaves the partitions
untouched.  With  the lack of a LOGGED keyword to force a partition to
be permanent when its partitioned table is unlogged, that's a bit
funny but feel free to comment.

Thanks,
--
Michael

Вложения

Re: Partitioned tables and [un]loggedness

От
Nathan Bossart
Дата:
On Wed, Apr 24, 2024 at 04:17:44PM +0900, Michael Paquier wrote:
> - Support ALTER TABLE .. SET LOGGED/UNLOGGED for partitioned tables,
> where the command only works on partitioned tables so that's only a
> catalog switch.

I'm not following what this means.  Does SET [UN]LOGGED on a partitioned
table recurse to its partitions?  Does this mean that you cannot changed
whether a single partition is [UN]LOGGED?  How does this work with
sub-partitioning?

> - CREATE TABLE PARTITION OF would make a new partition inherit the
> logged ness of the parent if UNLOGGED is not directly specified.

This one seems generally reasonable to me, provided it's properly noted in
the docs.

> - How about ONLY?  Would it make sense to support it so as ALTER TABLE
> ONLY on a partitioned table does not touch any of its partitions?
> Would not specifying ONLY mean that the loggedness of the partitioned
> table and all its partitions is changed?  That would mean a burst in
> WAL when switching to LOGGED, which was a concern when this feature
> was first implemented even for a single table, so for potentially
> hundreds of them, that would really hurt if a DBA is not careful.

I guess ONLY could be a way of changing the default for new partitions
without changing whether existing ones were logged.  I'm not tremendously
concerned about the burst-of-WAL problem.  Or, at least, I'm not any more
concerned about it for partitioned tables than I am for regular tables.  I
do wonder if we can improve the performance of setting tables LOGGED, but
that's for a separate thread...

> - CREATE TABLE does not have a LOGGED keyword, hence it is not
> possible through the parser to force a partition to have a permanent
> persistence even if its partitioned table uses UNLOGGED.

Could we add LOGGED for CREATE TABLE?

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: Partitioned tables and [un]loggedness

От
"David G. Johnston"
Дата:
On Wed, Apr 24, 2024 at 1:26 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Wed, Apr 24, 2024 at 04:17:44PM +0900, Michael Paquier wrote:
> - Support ALTER TABLE .. SET LOGGED/UNLOGGED for partitioned tables,
> where the command only works on partitioned tables so that's only a
> catalog switch.

I'm not following what this means.  Does SET [UN]LOGGED on a partitioned
table recurse to its partitions?  Does this mean that you cannot changed
whether a single partition is [UN]LOGGED?  How does this work with
sub-partitioning?

> - CREATE TABLE PARTITION OF would make a new partition inherit the
> logged ness of the parent if UNLOGGED is not directly specified.

This one seems generally reasonable to me, provided it's properly noted in
the docs.

I don't see this being desirable at this point.  And especially not by itself.  It is an error to not specify TEMP on the partition create table command when the parent is temporary, and that one is a no-brainer for having the persistence mode of the child be derived from the parent.  The current policy of requiring the persistence of the child to be explicitly specified seems perfectly reasonable.  Or, put differently, the specific current persistence of the parent doesn't get copied or even considered when creating children.

In any case we aren't going to be able to do exactly what it means by marking a partitioned table unlogged - namely that we execute the truncate command on it after a crash.  Forcing the implementation here just so that our existing decision to ignore unlogged on the parent table is, IMO, letting optics corrupt good design.

I do agree with having an in-core way for the DBA to communicate that they wish for partitions to be created with a known persistence unless the create table command specifies something different.  The way I would approach this is to add something like the following to the syntax/system:

CREATE [ persistence_mode ] TABLE ...
...
WITH (partition_default_persistence = { logged, unlogged, temporary }) -- storage_parameter, logged is effectively the default

This method is more explicit and has zero implications for existing backups or upgrading.


> - How about ONLY?  Would it make sense to support it so as ALTER TABLE
> ONLY on a partitioned table does not touch any of its partitions?

I'd leave it to the community to develop and maintain scripts that iterate over the partition hierarchy and toggle the persistence mode between logged and unlogged on the individual partitions using whatever throttling and batching strategy each individual user requires for their situation.


> - CREATE TABLE does not have a LOGGED keyword, hence it is not
> possible through the parser to force a partition to have a permanent
> persistence even if its partitioned table uses UNLOGGED.

Could we add LOGGED for CREATE TABLE?

 
I do agree with adding LOGGED to the list of optional persistence_mode specifiers, possibly regardless of whether any of this happens.  Seems desirable to give our default mode a name.

David J.

 

Re: Partitioned tables and [un]loggedness

От
Michael Paquier
Дата:
On Wed, Apr 24, 2024 at 03:26:40PM -0500, Nathan Bossart wrote:
> On Wed, Apr 24, 2024 at 04:17:44PM +0900, Michael Paquier wrote:
> > - Support ALTER TABLE .. SET LOGGED/UNLOGGED for partitioned tables,
> > where the command only works on partitioned tables so that's only a
> > catalog switch.
>
> I'm not following what this means.  Does SET [UN]LOGGED on a partitioned
> table recurse to its partitions?  Does this mean that you cannot changed
> whether a single partition is [UN]LOGGED?  How does this work with
> sub-partitioning?

The POC implements ALTER TABLE .. SET LOGGED/UNLOGGED so as the change
only reflects on the relation defined on the query.  In short, if
specifying a partitioned table as relation, only its relpersistence is
changed in the patch.  If sub-partitioning is involved, the POC
behaves the same way, relpersistence for partitioned table A1 attached
to partitioned table A does not change under ALTER TABLE A.

Recursing to all the partitions and partitioned tables attached to a
partitioned table A when using an ALTER TABLE A, when ONLY is not
specified, is OK by me if that's the consensus folks prefer.  I just
wanted to gather some opinions first about what people find the most
natural.

>> - CREATE TABLE PARTITION OF would make a new partition inherit the
>> logged ness of the parent if UNLOGGED is not directly specified.
>
> This one seems generally reasonable to me, provided it's properly noted in
> the docs.

Noted.  That's a second piece.  This part felt natural to me, but it
would not fly far without a LOGGED keyword and a way to attach a new
"undefined" RELPERSISTENCE_ in gram.y's OptTemp, likely a '\0'.
That's going to require some tweaks for CTAS as these cannot be
partitioned, but that should be a few lines to fall back to a
permanent if the query is parsed with the new "undefined".

>> - How about ONLY?  Would it make sense to support it so as ALTER TABLE
>> ONLY on a partitioned table does not touch any of its partitions?
>> Would not specifying ONLY mean that the loggedness of the partitioned
>> table and all its partitions is changed?  That would mean a burst in
>> WAL when switching to LOGGED, which was a concern when this feature
>> was first implemented even for a single table, so for potentially
>> hundreds of them, that would really hurt if a DBA is not careful.
>
> I guess ONLY could be a way of changing the default for new partitions
> without changing whether existing ones were logged.  I'm not tremendously
> concerned about the burst-of-WAL problem.  Or, at least, I'm not any more
> concerned about it for partitioned tables than I am for regular tables.  I
> do wonder if we can improve the performance of setting tables LOGGED, but
> that's for a separate thread...

Yeah.  A burst of WAL caused by a switch to LOGGED for a few thousand
partitions would never be fun, perhaps I'm just worrying to much as
that should not be a regular operation.

>> - CREATE TABLE does not have a LOGGED keyword, hence it is not
>> possible through the parser to force a partition to have a permanent
>> persistence even if its partitioned table uses UNLOGGED.
>
> Could we add LOGGED for CREATE TABLE?

I don't see why not if people agree with it, that's a patch of its own
that would help greatly in making the [un]logged attribute be
inherited for new partitions, because it is them possible to force a
persistence to be permanent as much as unlogged at query level, easing
the manipulation of partition trees.
--
Michael

Вложения

Re: Partitioned tables and [un]loggedness

От
Michael Paquier
Дата:
On Wed, Apr 24, 2024 at 03:36:35PM -0700, David G. Johnston wrote:
> On Wed, Apr 24, 2024 at 1:26 PM Nathan Bossart <nathandbossart@gmail.com>
> wrote:
>> On Wed, Apr 24, 2024 at 04:17:44PM +0900, Michael Paquier wrote:
>>> - CREATE TABLE PARTITION OF would make a new partition inherit the
>>> logged ness of the parent if UNLOGGED is not directly specified.
>>
>> This one seems generally reasonable to me, provided it's properly noted in
>> the docs.
>
> I don't see this being desirable at this point.  And especially not by
> itself.  It is an error to not specify TEMP on the partition create table
> command when the parent is temporary, and that one is a no-brainer for
> having the persistence mode of the child be derived from the parent.  The
> current policy of requiring the persistence of the child to be explicitly
> specified seems perfectly reasonable.  Or, put differently, the specific
> current persistence of the parent doesn't get copied or even considered
> when creating children.

I disagree here, actually.  Temporary tables are a different beast
because they require automated cleanup which would include interacting
with the partitionining information if temp and non-temp relations are
mixed.  That's why the current restrictions are in place: you should
be able to mix them.  Mixing unlogged and logged is OK, they retain a
state in their catalogs.

> In any case we aren't going to be able to do exactly what it means by
> marking a partitioned table unlogged - namely that we execute the truncate
> command on it after a crash.  Forcing the implementation here just so that
> our existing decision to ignore unlogged on the parent table is, IMO,
> letting optics corrupt good design.

It depends on retention policies, for one.  I could imagine an
application where partitioning is used based on a key where we
classify records based on their persistency, and one does not care
about a portion of them, still would like some retention as long as
the application is healthy.

> I do agree with having an in-core way for the DBA to communicate that they
> wish for partitions to be created with a known persistence unless the
> create table command specifies something different.  The way I would
> approach this is to add something like the following to the syntax/system:
>
> CREATE [ persistence_mode ] TABLE ...
> ...
> WITH (partition_default_persistence = { logged, unlogged, temporary }) --
> storage_parameter, logged is effectively the default

While we have keywords to drive that at query level for TEMP and
UNLOGGED?  Not sure to be on board with this idea.  pg_dump may need
some changes to reflect the loggedness across the partitions, now that
I think about it even if there should be an ATTACH once the table is
created to link it to its partitioned table.  There should be no
rewrites at restore.

> I do agree with adding LOGGED to the list of optional persistence_mode
> specifiers, possibly regardless of whether any of this happens.  Seems
> desirable to give our default mode a name.

Yeah, at least it looks like Nathan and you are OK with this addition.
--
Michael

Вложения

Re: Partitioned tables and [un]loggedness

От
"David G. Johnston"
Дата:
On Wed, Apr 24, 2024 at 4:35 PM Michael Paquier <michael@paquier.xyz> wrote:

I disagree here, actually.  Temporary tables are a different beast
because they require automated cleanup which would include interacting
with the partitionining information if temp and non-temp relations are
mixed.  That's why the current restrictions are in place: you should
[ not ] ? 
be able to mix them.

My point is that if you feel that treating logged as a copy-able property is OK then doing the following should also just work:

postgres=# create temp table parentt ( id integer ) partition by range (id);
CREATE TABLE
postgres=# create table child10t partition of parentt for values from (0) to (9);
ERROR:  cannot create a permanent relation as partition of temporary relation "parentt"

i.e., child10t should be created as a temporary partition under parentt.

David J.

Re: Partitioned tables and [un]loggedness

От
Michael Paquier
Дата:
On Thu, Apr 25, 2024 at 08:35:32AM +0900, Michael Paquier wrote:
> That's why the current restrictions are in place: you should
> be able to mix them.

Correction due to a ENOCOFFEE: you should *not* be able to mix them.
--
Michael

Вложения

Re: Partitioned tables and [un]loggedness

От
Michael Paquier
Дата:
On Wed, Apr 24, 2024 at 04:43:58PM -0700, David G. Johnston wrote:
> My point is that if you feel that treating logged as a copy-able property
> is OK then doing the following should also just work:
>
> postgres=# create temp table parentt ( id integer ) partition by range (id);
> CREATE TABLE
> postgres=# create table child10t partition of parentt for values from (0)
> to (9);
> ERROR:  cannot create a permanent relation as partition of temporary
> relation "parentt"
>
> i.e., child10t should be created as a temporary partition under parentt.

Ah, indeed, I've missed your point here.  Lifting the error and
inheriting temporary in this case would make sense.
--
Michael

Вложения

Re: Partitioned tables and [un]loggedness

От
Michael Paquier
Дата:
On Thu, Apr 25, 2024 at 08:55:27AM +0900, Michael Paquier wrote:
> On Wed, Apr 24, 2024 at 04:43:58PM -0700, David G. Johnston wrote:
>> My point is that if you feel that treating logged as a copy-able property
>> is OK then doing the following should also just work:
>>
>> postgres=# create temp table parentt ( id integer ) partition by range (id);
>> CREATE TABLE
>> postgres=# create table child10t partition of parentt for values from (0)
>> to (9);
>> ERROR:  cannot create a permanent relation as partition of temporary
>> relation "parentt"
>>
>> i.e., child10t should be created as a temporary partition under parentt.
>
> Ah, indeed, I've missed your point here.  Lifting the error and
> inheriting temporary in this case would make sense.

The case of a temporary persistence is actually *very* tricky.  The
namespace, where the relation is created, is guessed and locked with
permission checks done based on the RangeVar when the CreateStmt is
transformed, which is before we try to look at its inheritance tree to
find its partitioned parent.  So we would somewhat need to shortcut
the existing RangeVar lookup and include the parents in the loop to
find out the correct namespace.  And this is much earlier than now.
The code complexity is not trivial, so I am getting cold feet when
trying to support this case in a robust fashion.  For now, I have
discarded this case and focused on the main problem with SET LOGGED
and UNLOGGED.

Switching between logged <-> unlogged does not have such
complications, because the namespace where the relation is created is
going to be the same.  So we won't lock or perform permission checks
on an incorrect namespace.

The addition of LOGGED makes the logic deciding how the loggedness of
a partition table based on its partitioned table or the query quite
easy to follow, but this needs some safety nets in the sequence, view
and CTAS code paths to handle with the case where the query specifies
no relpersistence.

I have also looked at support for ONLY, and I've been surprised that
it is not that complicated.  tablecmds.c has a ATSimpleRecursion()
that is smart enough to do an inheritance tree lookup and apply the
rewrites where they should happen in the step 3 of ALTER TABLE, while
handling ONLY on its own.  The relpersistence of partitioned tables is
updated in step 2, with the catalog changes.

Attached is a new patch series:
- 0001 refactors some code around ATPrepChangePersistence() that I
found confusing after applying the operation to partitioned tables.
- 0002 adds support for a LOGGED keyword.
- 0003 expands ALTER TABLE SET [UN]LOGGED to partitioned tables,
without recursion to partitions.
- 0004 adds the recursion logic, expanding regression tests to show
the difference.

0003 and 0004 should be merged together, I think.  Still, splitting
them makes reviews a bit easier.
--
Michael

Вложения