Обсуждение: RE: Patch for migration of the pg_commit_ts directory

Поиск
Список
Период
Сортировка

RE: Patch for migration of the pg_commit_ts directory

От
"Hayato Kuroda (Fujitsu)"
Дата:
Hi,

Per my understanding, track_commit_timestamp must be on the new instance,
otherwise it would be removed when new instance starts. Is it correct?
If so, should we detect the difference and do some warnings/errors here?
Or it is just the user's responsibility?

Best regards,
Hayato Kuroda
FUJITSU LIMITED


RE: Patch for migration of the pg_commit_ts directory

От
"Hayato Kuroda (Fujitsu)"
Дата:
Hi,
(Sorry but I cannot find your name)

> Yes, track_commit_timestamp must be installed in the new instance.
> This is only the responsibility of an experienced user.
> pg_upgrage should allow you to save pg_commit_ts if this data exists at the time of migration.
> Warnings are not needed, the loss of this data is not critical in most cases.
> They were lost with each migration if users did not manually migrate them.

So, your policy is that commit_ts is not the user data thus it is OK to drop during
the upgrade, is it correct? I want to know other's opinion around here.
Note that if we want to check, it can be done in check_control_data().

Regarding the patch:
```
-
+    cluster->controldata.chkpnt_oldstCommitTsxid = 0;
+    cluster->controldata.chkpnt_newstCommitTsxid = 0;
```

Other attributes are not initialized, you can follow.

```
+    if (old_cluster.controldata.chkpnt_oldstCommitTsxid > 0)
+                       copy_subdir_files("pg_commit_ts", "pg_commit_ts");
```

Indent should be fixed. Please run pgindent.

```
-              old_cluster.controldata.chkpnt_nxtxid,
-              old_cluster.controldata.chkpnt_nxtxid,
+              old_cluster.controldata.chkpnt_oldstCommitTsxid == 0 ? old_cluster.controldata.chkpnt_nxtxid :
old_cluster.controldata.chkpnt_oldstCommitTsxid,
+              old_cluster.controldata.chkpnt_newstCommitTsxid == 0 ? old_cluster.controldata.chkpnt_nxtxid :
old_cluster.controldata.chkpnt_newstCommitTsxid,
```

To confirm, is there a possibility that only either of oldest/newest CommitTs exists?

Best regards,
Hayato Kuroda
FUJITSU LIMITED


Re: Patch for migration of the pg_commit_ts directory

От
Amit Kapila
Дата:
On Thu, Oct 2, 2025 at 12:10 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> > Yes, track_commit_timestamp must be installed in the new instance.
> > This is only the responsibility of an experienced user.
> > pg_upgrage should allow you to save pg_commit_ts if this data exists at the time of migration.
> > Warnings are not needed, the loss of this data is not critical in most cases.
> > They were lost with each migration if users did not manually migrate them.
>
> So, your policy is that commit_ts is not the user data thus it is OK to drop during
> the upgrade, is it correct?
>

IIUC, the proposal is that if GUC track_commit_timestamp is enabled on
the new instance then we should copy it, otherwise, we can drop
copying it. Is my understanding correct? I think we can follow what is
done check_new_cluster_replication_slots() for the case when
track_commit_timestamp is not set on the new server. When we try to
copy slots and the wal_level on the new server is minimal, we error
out, so shouldn't we do the same here and error_out if
track_commit_timestamp is not enabled and we have some valid commit_ts
data to copy?

--
With Regards,
Amit Kapila.



Re: Patch for migration of the pg_commit_ts directory

От
Maxim Orlov
Дата:

On Thu, 2 Oct 2025 at 11:49, Amit Kapila <amit.kapila16@gmail.com> wrote:
 When we try to
copy slots and the wal_level on the new server is minimal, we error
out, so shouldn't we do the same here and error_out if
track_commit_timestamp is not enabled and we have some valid commit_ts
data to copy?

+1 Sounds reasonable to me. It's better to give an explicit error;
otherwise, we would remove data that the user clearly wants to migrate
to the new cluster. And deleting data just because the user forgot to
specify one parameter in a new cluster looks like a bad joke to me.   
 
--
Best regards,
Maxim Orlov.

Re: Patch for migration of the pg_commit_ts directory

От
Bruce Momjian
Дата:
On Mon, Oct  6, 2025 at 07:31:52PM +0500, ls7777 wrote:
> Hi,
> I'll try to return it to the original thread.
>  
> I absolutely agree with you. Data loss during migration is an unacceptable
> situation. And why it's still happening, I don't understand. But issuing
> messages will require translating it into other languages and complicate the
> patch.  I'm not a programmer to do it beautifully. And not a beautiful patch is
> unlikely to be accepted. 
> I will do a parameter check and an ERROR message.

You don't need to worry about translations.  Each language team will
make sure your English message is translated.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.



RE: Patch for migration of the pg_commit_ts directory

От
"Hayato Kuroda (Fujitsu)"
Дата:
Hi,

Thanks for updating the patch.

Regarding the check_track_commit_timestamp_parameter(), I'm not sure the function
is needed. Since pg_controldata outputs whether the commit_ts module is enabled
or not [1], can we obtain there instead?
I imagined like we can add a new field at ControlData, it could be obtained in get_control_data(),
values for instances could be compared in check_control_data().

Another possibility is that oldestCommitTsXid/newestCommitTsXid can be always
valid if track_commit_timestamp is on. In this case the attribute is not needed
anymore. It is enough to check these values. Can you check them?

Regarding the test, I feel it is unnecessary long. I feel it is enough to ensure
one or two transactions have the same timestamp on the both nodes, pgbench is not
needed. Current test may not be able to finish on the slow machine.

[1]
```
$ pg_controldata data/ | grep "track_"
track_commit_timestamp setting:       off
```

Best regards,
Hayato Kuroda
FUJITSU LIMITED


RE: Patch for migration of the pg_commit_ts directory

От
"Hayato Kuroda (Fujitsu)"
Дата:
Hi,

> At the time check_control_data is launched for the new cluster, the control file
> does not contain information about the set track_commit_timestamp=on. It is
> installed only in postgresql.conf.

You did consider the case that track_commit_timestamp is set to on but the instance
has not started before the pg_upgrade, right? Valid point.

> The check_track_commit_timestamp_parameter function is only needed for the new
> cluster. And you can not run it for the old cluster, but use data from the
> old_cluster.controldata.chkpnt_newstCommitTsxid control file. But this will be
> less clear than using the check_track_commit_timestamp_parameter function.

Sorry, I'm not very clear this part. Can you describe the point bit more?

Other comments:

01. get_control_data

```
+    if (GET_MAJOR_VERSION(cluster->major_version) < 905)
+    {
+        cluster->controldata.chkpnt_oldstCommitTsxid = 0;
+        cluster->controldata.chkpnt_newstCommitTsxid = 0;
+    }
```

IIUC, checksum_version is checked like this because got_data_checksum_version
must be also set.
Since we do not have the boolean for commit_ts, not sure it is needed.
old_cluster and new_cluster are the global variable and they are initialized with
zero.

02. check_track_commit_timestamp_parameter

```
+        conn = connectToServer(cluster, "template1");
+        res = executeQueryOrDie(conn, "SELECT count(*) AS is_set "
+                                "FROM pg_settings WHERE name = 'track_commit_timestamp' and setting = 'on'");
+        is_set = PQfnumber(res, "is_set");
+        cluster->track_commit_timestamp_on = atoi(PQgetvalue(res, 0, is_set)) == 1;
```

The SQL looks strange for me. Isn't it enough to directly obtain "setting" attribute
and check it is "on" or not? Also, conn and res can be the local variable of the
else part.

03. ClusterInfo

```
+    bool        track_commit_timestamp_on;    /* track_commit_timestamp for
+                                         *    cluster */
```

The indent is broken, please run pgindent.

04. test

Could you please update meson.build? Otherwise the added test could not be run for meson build.

Best regards,
Hayato Kuroda
FUJITSU LIMITED
 

RE: Patch for migration of the pg_commit_ts directory

От
"Hayato Kuroda (Fujitsu)"
Дата:
Hi,

Thanks for updating the patch and sorry for the late reply. Here are my comments.

01.
```
+    prep_status("Checking for pg_commit_ts");
```

I think we must clarify which node is being checked. Something like:
Checking for new cluster configuration for commit timestamp

02.
```
    }
-
     /* we have the result of cmd in "output". so parse it line by line now */
```

This change is not needed.

03.
```
+    /*
+     * Copy pg_commit_ts
+     */
```

I feel it can be removed or have more meanings. Something lile:
Copy old commit_timestamp data to new, if available.

04.

Regarding the test, 


05.
```
+sub command_output
```

Can run_command() be used instead of defining new function?

06.
```
+$old->command_ok([ 'pgbench', '-i', '-s', '1' ], 'init pgbench');
+$old->command_ok([ 'pgbench', '-t', '1', '-j', '1' ], 'pgbench it');
```

I think no need to use pgbench anymore. Can we define tables and insert tuples
by myself?

07.
```
+command_fails(
+    [
+        'pg_upgrade', '--no-sync',
+        '-d', $old->data_dir,
+        '-D', $new->data_dir,
+        '-b', $old->config_data('--bindir'),
+        '-B', $new->config_data('--bindir'),
+        '-s', $new->host,
+        '-p', $old->port,
+        '-P', $new->port,
+        $mode
+    ],
+    'run of pg_upgrade fails with mismatch parameter track_commit_timestamp');
```

According to other test files, we do not use the shorten option.
Also, please verify the actual output by command_ok_or_fails_like() or command_checks_all().

08.
```
+sub xact_commit_ts
```

Not sure, but this function is introduced because we have 100 transactions. Can we omit this now?

09.
```
+# The new cluster should contain timestamps created during the pg_upgrade and
+# some more created by the pgbench.
+#
+print "\nCommit timestamp only in new cluster:\n";
+for my $line (@b) {
+    print "$line\n" unless $h1{$line};
+}
```

I don't think this is needed because the output is not verified.

Best regards,
Hayato Kuroda
FUJITSU LIMITED


Re: Patch for migration of the pg_commit_ts directory

От
Maxim Orlov
Дата:
Looks good to me. As I see it, the patch is almost ready, so I decided 
to create an CF entry [0]. Feel free to edit it as you see fit.

And your mail also includes attachment (unknown_filename), interfering with the CF-bot.

RE: Patch for migration of the pg_commit_ts directory

От
"Hayato Kuroda (Fujitsu)"
Дата:
Hi,

Thanks for updating the patch.

```
+my ($xid,$ts) = $resold =~ /\s*(\d+)\s*\|(.*)/;
```

Per my understanding $ts is not used here. Can you remove it?

Apart from above, I found some cosmetic issues. Please see attached
my fix which can be applied atop HEAD. Can you check and include if
it is OK?

Best regards,
Hayato Kuroda
FUJITSU LIMITED
 

Вложения