Обсуждение: RE: Patch for migration of the pg_commit_ts directory
Hi, Per my understanding, track_commit_timestamp must be on the new instance, otherwise it would be removed when new instance starts. Is it correct? If so, should we detect the difference and do some warnings/errors here? Or it is just the user's responsibility? Best regards, Hayato Kuroda FUJITSU LIMITED
Hi,
(Sorry but I cannot find your name)
> Yes, track_commit_timestamp must be installed in the new instance.
> This is only the responsibility of an experienced user.
> pg_upgrage should allow you to save pg_commit_ts if this data exists at the time of migration.
> Warnings are not needed, the loss of this data is not critical in most cases.
> They were lost with each migration if users did not manually migrate them.
So, your policy is that commit_ts is not the user data thus it is OK to drop during
the upgrade, is it correct? I want to know other's opinion around here.
Note that if we want to check, it can be done in check_control_data().
Regarding the patch:
```
-
+ cluster->controldata.chkpnt_oldstCommitTsxid = 0;
+ cluster->controldata.chkpnt_newstCommitTsxid = 0;
```
Other attributes are not initialized, you can follow.
```
+ if (old_cluster.controldata.chkpnt_oldstCommitTsxid > 0)
+ copy_subdir_files("pg_commit_ts", "pg_commit_ts");
```
Indent should be fixed. Please run pgindent.
```
- old_cluster.controldata.chkpnt_nxtxid,
- old_cluster.controldata.chkpnt_nxtxid,
+ old_cluster.controldata.chkpnt_oldstCommitTsxid == 0 ? old_cluster.controldata.chkpnt_nxtxid :
old_cluster.controldata.chkpnt_oldstCommitTsxid,
+ old_cluster.controldata.chkpnt_newstCommitTsxid == 0 ? old_cluster.controldata.chkpnt_nxtxid :
old_cluster.controldata.chkpnt_newstCommitTsxid,
```
To confirm, is there a possibility that only either of oldest/newest CommitTs exists?
Best regards,
Hayato Kuroda
FUJITSU LIMITED
On Thu, Oct 2, 2025 at 12:10 PM Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > > Yes, track_commit_timestamp must be installed in the new instance. > > This is only the responsibility of an experienced user. > > pg_upgrage should allow you to save pg_commit_ts if this data exists at the time of migration. > > Warnings are not needed, the loss of this data is not critical in most cases. > > They were lost with each migration if users did not manually migrate them. > > So, your policy is that commit_ts is not the user data thus it is OK to drop during > the upgrade, is it correct? > IIUC, the proposal is that if GUC track_commit_timestamp is enabled on the new instance then we should copy it, otherwise, we can drop copying it. Is my understanding correct? I think we can follow what is done check_new_cluster_replication_slots() for the case when track_commit_timestamp is not set on the new server. When we try to copy slots and the wal_level on the new server is minimal, we error out, so shouldn't we do the same here and error_out if track_commit_timestamp is not enabled and we have some valid commit_ts data to copy? -- With Regards, Amit Kapila.
On Thu, 2 Oct 2025 at 11:49, Amit Kapila <amit.kapila16@gmail.com> wrote:
When we try to
copy slots and the wal_level on the new server is minimal, we error
out, so shouldn't we do the same here and error_out if
track_commit_timestamp is not enabled and we have some valid commit_ts
data to copy?
+1 Sounds reasonable to me. It's better to give an explicit error;
otherwise, we would remove data that the user clearly wants to migrate
to the new cluster. And deleting data just because the user forgot to
specify one parameter in a new cluster looks like a bad joke to me.
otherwise, we would remove data that the user clearly wants to migrate
to the new cluster. And deleting data just because the user forgot to
specify one parameter in a new cluster looks like a bad joke to me.
Best regards,
Maxim Orlov.
On Mon, Oct 6, 2025 at 07:31:52PM +0500, ls7777 wrote: > Hi, > I'll try to return it to the original thread. > > I absolutely agree with you. Data loss during migration is an unacceptable > situation. And why it's still happening, I don't understand. But issuing > messages will require translating it into other languages and complicate the > patch. I'm not a programmer to do it beautifully. And not a beautiful patch is > unlikely to be accepted. > I will do a parameter check and an ERROR message. You don't need to worry about translations. Each language team will make sure your English message is translated. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Do not let urgent matters crowd out time for investment in the future.
Hi, Thanks for updating the patch. Regarding the check_track_commit_timestamp_parameter(), I'm not sure the function is needed. Since pg_controldata outputs whether the commit_ts module is enabled or not [1], can we obtain there instead? I imagined like we can add a new field at ControlData, it could be obtained in get_control_data(), values for instances could be compared in check_control_data(). Another possibility is that oldestCommitTsXid/newestCommitTsXid can be always valid if track_commit_timestamp is on. In this case the attribute is not needed anymore. It is enough to check these values. Can you check them? Regarding the test, I feel it is unnecessary long. I feel it is enough to ensure one or two transactions have the same timestamp on the both nodes, pgbench is not needed. Current test may not be able to finish on the slow machine. [1] ``` $ pg_controldata data/ | grep "track_" track_commit_timestamp setting: off ``` Best regards, Hayato Kuroda FUJITSU LIMITED
Hi,
> At the time check_control_data is launched for the new cluster, the control file
> does not contain information about the set track_commit_timestamp=on. It is
> installed only in postgresql.conf.
You did consider the case that track_commit_timestamp is set to on but the instance
has not started before the pg_upgrade, right? Valid point.
> The check_track_commit_timestamp_parameter function is only needed for the new
> cluster. And you can not run it for the old cluster, but use data from the
> old_cluster.controldata.chkpnt_newstCommitTsxid control file. But this will be
> less clear than using the check_track_commit_timestamp_parameter function.
Sorry, I'm not very clear this part. Can you describe the point bit more?
Other comments:
01. get_control_data
```
+ if (GET_MAJOR_VERSION(cluster->major_version) < 905)
+ {
+ cluster->controldata.chkpnt_oldstCommitTsxid = 0;
+ cluster->controldata.chkpnt_newstCommitTsxid = 0;
+ }
```
IIUC, checksum_version is checked like this because got_data_checksum_version
must be also set.
Since we do not have the boolean for commit_ts, not sure it is needed.
old_cluster and new_cluster are the global variable and they are initialized with
zero.
02. check_track_commit_timestamp_parameter
```
+ conn = connectToServer(cluster, "template1");
+ res = executeQueryOrDie(conn, "SELECT count(*) AS is_set "
+ "FROM pg_settings WHERE name = 'track_commit_timestamp' and setting = 'on'");
+ is_set = PQfnumber(res, "is_set");
+ cluster->track_commit_timestamp_on = atoi(PQgetvalue(res, 0, is_set)) == 1;
```
The SQL looks strange for me. Isn't it enough to directly obtain "setting" attribute
and check it is "on" or not? Also, conn and res can be the local variable of the
else part.
03. ClusterInfo
```
+ bool track_commit_timestamp_on; /* track_commit_timestamp for
+ * cluster */
```
The indent is broken, please run pgindent.
04. test
Could you please update meson.build? Otherwise the added test could not be run for meson build.
Best regards,
Hayato Kuroda
FUJITSU LIMITED
Hi,
Thanks for updating the patch and sorry for the late reply. Here are my comments.
01.
```
+ prep_status("Checking for pg_commit_ts");
```
I think we must clarify which node is being checked. Something like:
Checking for new cluster configuration for commit timestamp
02.
```
}
-
/* we have the result of cmd in "output". so parse it line by line now */
```
This change is not needed.
03.
```
+ /*
+ * Copy pg_commit_ts
+ */
```
I feel it can be removed or have more meanings. Something lile:
Copy old commit_timestamp data to new, if available.
04.
Regarding the test,
05.
```
+sub command_output
```
Can run_command() be used instead of defining new function?
06.
```
+$old->command_ok([ 'pgbench', '-i', '-s', '1' ], 'init pgbench');
+$old->command_ok([ 'pgbench', '-t', '1', '-j', '1' ], 'pgbench it');
```
I think no need to use pgbench anymore. Can we define tables and insert tuples
by myself?
07.
```
+command_fails(
+ [
+ 'pg_upgrade', '--no-sync',
+ '-d', $old->data_dir,
+ '-D', $new->data_dir,
+ '-b', $old->config_data('--bindir'),
+ '-B', $new->config_data('--bindir'),
+ '-s', $new->host,
+ '-p', $old->port,
+ '-P', $new->port,
+ $mode
+ ],
+ 'run of pg_upgrade fails with mismatch parameter track_commit_timestamp');
```
According to other test files, we do not use the shorten option.
Also, please verify the actual output by command_ok_or_fails_like() or command_checks_all().
08.
```
+sub xact_commit_ts
```
Not sure, but this function is introduced because we have 100 transactions. Can we omit this now?
09.
```
+# The new cluster should contain timestamps created during the pg_upgrade and
+# some more created by the pgbench.
+#
+print "\nCommit timestamp only in new cluster:\n";
+for my $line (@b) {
+ print "$line\n" unless $h1{$line};
+}
```
I don't think this is needed because the output is not verified.
Best regards,
Hayato Kuroda
FUJITSU LIMITED
Looks good to me. As I see it, the patch is almost ready, so I decided
to create an CF entry [0]. Feel free to edit it as you see fit.
Hi, Thanks for updating the patch. ``` +my ($xid,$ts) = $resold =~ /\s*(\d+)\s*\|(.*)/; ``` Per my understanding $ts is not used here. Can you remove it? Apart from above, I found some cosmetic issues. Please see attached my fix which can be applied atop HEAD. Can you check and include if it is OK? Best regards, Hayato Kuroda FUJITSU LIMITED