Thread: pg_upgrade: optimize replication slot caught-up check


pg_upgrade: optimize replication slot caught-up check

From
Masahiko Sawada
Date:
Hi all,

Commit 29d0a77fa6 improved pg_upgrade to allow migrating logical
slots. Currently, to check if the slots are ready to be migrated, we
call binary_upgrade_logical_slot_has_caught_up() for every single
slot. This checks if there are any unconsumed WAL records. However, we
noticed a performance issue. If there are many slots (e.g., 100 or
more) or if there is a WAL backlog, checking all slots one by one
takes a long time.

Here are some test results from my environment:
With an empty cluster: 1.55s
With 200 slots and 30MB backlog: 15.51s

Commit 6d3d2e8e5 introduced parallel checks per database, but a single
job might still have to check too many slots, causing delays.

Since binary_upgrade_logical_slot_has_caught_up() essentially checks
if any decodable record exists in the database, IIUC it is not
necessary to check every slot. We can optimize this by checking only
the slot with the minimum confirmed_flush_lsn. If that slot is caught
up, we can assume others are too. The attached patch implements this
optimization. With the patch, the test with 200 slots finished in
2.512s. The execution time is now stable regardless of the number of
slots.
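
To illustrate the idea, here is a minimal C sketch (hypothetical names and a stand-in LSN type, not the patch's actual code): among all slots of a database, only the one with the minimum confirmed_flush_lsn needs the caught-up check, since any WAL record decodable for a slot that is further ahead is also decodable for that one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's 64-bit WAL position type. */
typedef uint64_t XLogRecPtr;

/*
 * Return the index of the slot with the minimum confirmed_flush_lsn.
 * Only this slot needs the caught-up check; if it has caught up, all
 * other slots have too.
 */
size_t
min_confirmed_flush_slot(const XLogRecPtr *confirmed_flush, size_t nslots)
{
    size_t      min_idx = 0;

    for (size_t i = 1; i < nslots; i++)
        if (confirmed_flush[i] < confirmed_flush[min_idx])
            min_idx = i;
    return min_idx;
}
```

This replaces N per-slot WAL scans with a single one, which matches the runtime becoming independent of the slot count.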

One thing to note is that DecodeTXNNeedSkip() also considers
replication origin filters. Theoretically, a plugin could filter out
specific origins, which might lead to different results. However, this
is a very rare case. Even if it happens, it would just result in a
false positive (the upgrade fails safely), so the impact is minimal.
Therefore, the patch simplifies the check to be per-database instead
of per-slot.

Feedback is very welcome.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments

Re: pg_upgrade: optimize replication slot caught-up check

From
Chao Li
Date:

> On Jan 6, 2026, at 02:02, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Hi all,
>
> Commit 29d0a77fa6 improved pg_upgrade to allow migrating logical
> slots. Currently, to check if the slots are ready to be migrated, we
> call binary_upgrade_logical_slot_has_caught_up() for every single
> slot. This checks if there are any unconsumed WAL records. However, we
> noticed a performance issue. If there are many slots (e.g., 100 or
> more) or if there is a WAL backlog, checking all slots one by one
> takes a long time.
>
> Here are some test results from my environment:
> With an empty cluster: 1.55s
> With 200 slots and 30MB backlog: 15.51s
>
> Commit 6d3d2e8e5 introduced parallel checks per database, but a single
> job might still have to check too many slots, causing delays.
>
> Since binary_upgrade_logical_slot_has_caught_up() essentially checks
> if any decodable record exists in the database, IIUC it is not
> necessary to check every slot. We can optimize this by checking only
> the slot with the minimum confirmed_flush_lsn. If that slot is caught
> up, we can assume others are too. The attached patch implements this
> optimization. With the patch, the test with 200 slots finished in
> 2.512s. The execution time is now stable regardless of the number of
> slots.
>
> One thing to note is that DecodeTXNNeedSkip() also considers
> replication origin filters. Theoretically, a plugin could filter out
> specific origins, which might lead to different results. However, this
> is a very rare case. Even if it happens, it would just result in a
> false positive (the upgrade fails safely), so the impact is minimal.
> Therefore, the patch simplifies the check to be per-database instead
> of per-slot.
>
> Feedback is very welcome.
>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com
> <v1-0001-pg_upgrade-Optimize-replication-slot-caught-up-ch.patch>

Hi Masahiko-san,

Basically, the optimization idea makes sense to me. Reducing the check from O(slots × WAL) to O(databases × WAL)
sounds like a win. Just a few small comments:

1
```
+static void process_old_cluter_logical_slot_catchup_infos(DbInfo *dbinfo, PGresult *res, void *arg);
```

Typo in the function name: cluter->cluster

2
```
-        logical_slot_infos_query = get_old_cluster_logical_slot_infos_query();
+        const char *slot_info_query = "SELECT slot_name, plugin, two_phase, failover, "
```

logical_slot_infos_query is no longer used and should be removed.

3
```
+                "  ORDER BY confirmed_flush_lsn ASC "
```

I am wondering whether confirmed_flush_lsn can be NULL. If it can, we may want to add "NULLS LAST".

4
```
+    Assert(num_slots == dbinfo->slot_arr.nslots);
+
+    for (int i = 0; i < num_slots; i++)
+    {
+        char       *slotname;
+        bool        caught_up;
+
+        slotname = PQgetvalue(res, i, PQfnumber(res, "slot_name"));
+        caught_up = (strcmp(PQgetvalue(res, i, PQfnumber(res, "caught_up")), "t") == 0);
+
+        for (int slotnum = 0; slotnum < dbinfo->slot_arr.nslots; slotnum++)
+        {
```

As num_slots == dbinfo->slot_arr.nslots, this is still an O(num_slots^2) check. Given that this patch's goal is to
eliminate the burden of a large number of slots, maybe we can build a hash table to make the matching faster.
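
A sketch of the hash-table idea (hypothetical types; the real patch would presumably use PostgreSQL's simplehash or dynahash rather than this hand-rolled table): build a table keyed by slot name once, so each result row is matched in O(1) instead of scanning slot_arr.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TBL_SIZE 512            /* power of two, well above the slot count */

typedef struct SlotHashEntry
{
    const char *name;           /* NULL means empty bucket */
    int         slot_index;     /* index into the slot array */
} SlotHashEntry;

static unsigned
hash_name(const char *s)
{
    unsigned    h = 5381;       /* djb2 string hash */

    while (*s)
        h = h * 33 + (unsigned char) *s++;
    return h & (TBL_SIZE - 1);
}

void
slot_hash_insert(SlotHashEntry *tbl, const char *name, int slot_index)
{
    unsigned    i = hash_name(name);

    while (tbl[i].name != NULL)             /* linear probing */
        i = (i + 1) & (TBL_SIZE - 1);
    tbl[i].name = name;
    tbl[i].slot_index = slot_index;
}

/* Return the slot index for "name", or -1 if not present. */
int
slot_hash_lookup(const SlotHashEntry *tbl, const char *name)
{
    unsigned    i = hash_name(name);

    while (tbl[i].name != NULL)
    {
        if (strcmp(tbl[i].name, name) == 0)
            return tbl[i].slot_index;
        i = (i + 1) & (TBL_SIZE - 1);
    }
    return -1;
}
```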

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Re: pg_upgrade: optimize replication slot caught-up check

From
Masahiko Sawada
Date:
On Mon, Jan 5, 2026 at 6:55 PM Chao Li <li.evan.chao@gmail.com> wrote:
>
>
>
> > On Jan 6, 2026, at 02:02, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Hi all,
> >
> > Commit 29d0a77fa6 improved pg_upgrade to allow migrating logical
> > slots. Currently, to check if the slots are ready to be migrated, we
> > call binary_upgrade_logical_slot_has_caught_up() for every single
> > slot. This checks if there are any unconsumed WAL records. However, we
> > noticed a performance issue. If there are many slots (e.g., 100 or
> > more) or if there is a WAL backlog, checking all slots one by one
> > takes a long time.
> >
> > Here are some test results from my environment:
> > With an empty cluster: 1.55s
> > With 200 slots and 30MB backlog: 15.51s
> >
> > Commit 6d3d2e8e5 introduced parallel checks per database, but a single
> > job might still have to check too many slots, causing delays.
> >
> > Since binary_upgrade_logical_slot_has_caught_up() essentially checks
> > if any decodable record exists in the database, IIUC it is not
> > necessary to check every slot. We can optimize this by checking only
> > the slot with the minimum confirmed_flush_lsn. If that slot is caught
> > up, we can assume others are too. The attached patch implements this
> > optimization. With the patch, the test with 200 slots finished in
> > 2.512s. The execution time is now stable regardless of the number of
> > slots.
> >
> > One thing to note is that DecodeTXNNeedSkip() also considers
> > replication origin filters. Theoretically, a plugin could filter out
> > specific origins, which might lead to different results. However, this
> > is a very rare case. Even if it happens, it would just result in a
> > false positive (the upgrade fails safely), so the impact is minimal.
> > Therefore, the patch simplifies the check to be per-database instead
> > of per-slot.
> >
> > Feedback is very welcome.
> >
> > Regards,
> >
> > --
> > Masahiko Sawada
> > Amazon Web Services: https://aws.amazon.com
> > <v1-0001-pg_upgrade-Optimize-replication-slot-caught-up-ch.patch>
>
> Hi Masahiko-san,
>
> Basically I think the optimization idea makes sense to me. Reducing the check from O(slots × WAL) to O(databases ×
> WAL) sounds a win. Just a few small comments:

Thank you for reviewing the patch!

>
> 1
> ```
> +static void process_old_cluter_logical_slot_catchup_infos(DbInfo *dbinfo, PGresult *res, void *arg);
> ```
>
> Typo in the function name: cluter->cluster
>
> 2
> ```
> -               logical_slot_infos_query = get_old_cluster_logical_slot_infos_query();
> +               const char *slot_info_query = "SELECT slot_name, plugin, two_phase, failover, "
> ```
>
> logical_slot_infos_query is no longer used, should be removed.

Fixed.

>
> 3
> ```
> +                               "  ORDER BY confirmed_flush_lsn ASC "
> ```
>
> I am thinking, if confirmed_flush_lsn can be NULL? If that’s true, we may want to add “NULLS LAST”.

Given that the query is not executed during a live check, I think confirmed_flush_lsn cannot be NULL. temporary
should also never be true, but the check is kept for consistency with slot_info_query.

>
> 4
> ```
> +       Assert(num_slots == dbinfo->slot_arr.nslots);
> +
> +       for (int i = 0; i < num_slots; i++)
> +       {
> +               char       *slotname;
> +               bool            caught_up;
> +
> +               slotname = PQgetvalue(res, i, PQfnumber(res, "slot_name"));
> +               caught_up = (strcmp(PQgetvalue(res, i, PQfnumber(res, "caught_up")), "t") == 0);
> +
> +               for (int slotnum = 0; slotnum < dbinfo->slot_arr.nslots; slotnum++)
> +               {
> ```
>
> As num_slots == dbinfo->slot_arr.nslots, this is still a O(num_slots^2) check. Given this patch's goal is to
> eliminate the burden from large number of slots, maybe we can build a hash table to make the check faster.
>

Alternatively, if we sort both query results in the same order, we can simply update the slots sequentially.
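
The sequential-update idea can be sketched like this (hypothetical names and types standing in for pg_upgrade's structures): if both queries ORDER BY slot_name, row i of the catch-up result corresponds to slot i in the slot array, so one linear pass suffices.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct SlotInfo
{
    const char *slot_name;
    bool        caught_up;
} SlotInfo;

/*
 * Apply per-row caught_up results to the slots, assuming both arrays are
 * sorted identically by slot name.  Returns false if the orders differ,
 * in which case the caller must fall back to explicit matching.
 */
bool
apply_caught_up_sequential(SlotInfo *slots, const char *const *res_names,
                           const bool *res_caught_up, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (strcmp(slots[i].slot_name, res_names[i]) != 0)
            return false;
        slots[i].caught_up = res_caught_up[i];
    }
    return true;
}
```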

One problem I can see with this approach is that we could wrongly report
slots as invalid in invalid_logical_slots.txt. Even if the slot with
the oldest confirmed_flush_lsn has unconsumed decodable records, other
slots might have already caught up. I'll think about how to deal
with this problem.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: pg_upgrade: optimize replication slot caught-up check

From
Masahiko Sawada
Date:
On Tue, Jan 6, 2026 at 12:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Jan 5, 2026 at 6:55 PM Chao Li <li.evan.chao@gmail.com> wrote:
> >
> >
> >
> > > On Jan 6, 2026, at 02:02, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > Hi all,
> > >
> > > Commit 29d0a77fa6 improved pg_upgrade to allow migrating logical
> > > slots. Currently, to check if the slots are ready to be migrated, we
> > > call binary_upgrade_logical_slot_has_caught_up() for every single
> > > slot. This checks if there are any unconsumed WAL records. However, we
> > > noticed a performance issue. If there are many slots (e.g., 100 or
> > > more) or if there is a WAL backlog, checking all slots one by one
> > > takes a long time.
> > >
> > > Here are some test results from my environment:
> > > With an empty cluster: 1.55s
> > > With 200 slots and 30MB backlog: 15.51s
> > >
> > > Commit 6d3d2e8e5 introduced parallel checks per database, but a single
> > > job might still have to check too many slots, causing delays.
> > >
> > > Since binary_upgrade_logical_slot_has_caught_up() essentially checks
> > > if any decodable record exists in the database, IIUC it is not
> > > necessary to check every slot. We can optimize this by checking only
> > > the slot with the minimum confirmed_flush_lsn. If that slot is caught
> > > up, we can assume others are too. The attached patch implements this
> > > optimization. With the patch, the test with 200 slots finished in
> > > 2.512s. The execution time is now stable regardless of the number of
> > > slots.
> > >
> > > One thing to note is that DecodeTXNNeedSkip() also considers
> > > replication origin filters. Theoretically, a plugin could filter out
> > > specific origins, which might lead to different results. However, this
> > > is a very rare case. Even if it happens, it would just result in a
> > > false positive (the upgrade fails safely), so the impact is minimal.
> > > Therefore, the patch simplifies the check to be per-database instead
> > > of per-slot.
> > >
> > > Feedback is very welcome.
> > >
> > > Regards,
> > >
> > > --
> > > Masahiko Sawada
> > > Amazon Web Services: https://aws.amazon.com
> > > <v1-0001-pg_upgrade-Optimize-replication-slot-caught-up-ch.patch>
> >
> > Hi Masahiko-san,
> >
> > Basically I think the optimization idea makes sense to me. Reducing the check from O(slots × WAL) to O(databases ×
> > WAL) sounds a win. Just a few small comments:
>
> Thank you for reviewing the patch!
>
> >
> > 1
> > ```
> > +static void process_old_cluter_logical_slot_catchup_infos(DbInfo *dbinfo, PGresult *res, void *arg);
> > ```
> >
> > Typo in the function name: cluter->cluster
> >
> > 2
> > ```
> > -               logical_slot_infos_query = get_old_cluster_logical_slot_infos_query();
> > +               const char *slot_info_query = "SELECT slot_name, plugin, two_phase, failover, "
> > ```
> >
> > logical_slot_infos_query is no longer used, should be removed.
>
> Fixed.
>
> >
> > 3
> > ```
> > +                               "  ORDER BY confirmed_flush_lsn ASC "
> > ```
> >
> > I am thinking, if confirmed_flush_lsn can be NULL? If that’s true, we may want to add “NULLS LAST”.
>
> Given that the query is not executed when live-check, I think
> confirmed_flush cannot be NULL. temporary should also not be true but
> it's for consistency with slot_info_query.
>
> >
> > 4
> > ```
> > +       Assert(num_slots == dbinfo->slot_arr.nslots);
> > +
> > +       for (int i = 0; i < num_slots; i++)
> > +       {
> > +               char       *slotname;
> > +               bool            caught_up;
> > +
> > +               slotname = PQgetvalue(res, i, PQfnumber(res, "slot_name"));
> > +               caught_up = (strcmp(PQgetvalue(res, i, PQfnumber(res, "caught_up")), "t") == 0);
> > +
> > +               for (int slotnum = 0; slotnum < dbinfo->slot_arr.nslots; slotnum++)
> > +               {
> > ```
> >
> > As num_slots == dbinfo->slot_arr.nslots, this is still a O(num_slots^2) check. Given this patch's goal is to
> > eliminate the burden from large number of slots, maybe we can build a hash table to make the check faster.
> >
>
> Or if we sort both queries in the same order we can simply update the
> slots sequentially.
>
> One problem I can see in this approach is that we could wrongly report
> the invalid slots in invalid_logical_slots.txt. Even if the slot with
> the oldest confirmed_flush_lsn has unconsumed decodable records, other
> slots might have already been caught up. I'll think of how to deal
> with this problem.

I came up with a simple idea: instead of
binary_upgrade_logical_slot_has_caught_up() returning true/false, we
can have it return the last decodable WAL record's LSN, and consider a
logical slot as caught up if its confirmed_flush_lsn has already passed
that LSN. This way, we still scan WAL at most once per database while
reporting the same contents as today. The function now needs to scan
the entire WAL backlog, but it's still much better than the current
method.
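
In C terms, the per-slot classification then reduces to an LSN comparison. This is a sketch with a stand-in LSN type and hypothetical function name, not the actual backend code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/*
 * A slot is caught up iff its confirmed_flush_lsn has passed the LSN of
 * the last decodable WAL record (InvalidXLogRecPtr when no decodable
 * record exists, in which case every slot is caught up).
 */
bool
slot_is_caught_up(XLogRecPtr confirmed_flush_lsn,
                  XLogRecPtr last_decodable_lsn)
{
    if (last_decodable_lsn == InvalidXLogRecPtr)
        return true;
    return confirmed_flush_lsn >= last_decodable_lsn;
}
```

One WAL scan per database produces last_decodable_lsn, and every slot is classified against it, so the per-slot report can stay as precise as today.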

I've implemented this idea in the v2 patch with some updates:

- incorporated the review comments.
- added logic to support PG18 or older.
- added regression tests.

Feedback is very welcome.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments

Re: pg_upgrade: optimize replication slot caught-up check

From
Masahiko Sawada
Date:
On Wed, Jan 7, 2026 at 12:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Jan 6, 2026 at 12:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Jan 5, 2026 at 6:55 PM Chao Li <li.evan.chao@gmail.com> wrote:
> > >
> > >
> > >
> > > > On Jan 6, 2026, at 02:02, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > Hi all,
> > > >
> > > > Commit 29d0a77fa6 improved pg_upgrade to allow migrating logical
> > > > slots. Currently, to check if the slots are ready to be migrated, we
> > > > call binary_upgrade_logical_slot_has_caught_up() for every single
> > > > slot. This checks if there are any unconsumed WAL records. However, we
> > > > noticed a performance issue. If there are many slots (e.g., 100 or
> > > > more) or if there is a WAL backlog, checking all slots one by one
> > > > takes a long time.
> > > >
> > > > Here are some test results from my environment:
> > > > With an empty cluster: 1.55s
> > > > With 200 slots and 30MB backlog: 15.51s
> > > >
> > > > Commit 6d3d2e8e5 introduced parallel checks per database, but a single
> > > > job might still have to check too many slots, causing delays.
> > > >
> > > > Since binary_upgrade_logical_slot_has_caught_up() essentially checks
> > > > if any decodable record exists in the database, IIUC it is not
> > > > necessary to check every slot. We can optimize this by checking only
> > > > the slot with the minimum confirmed_flush_lsn. If that slot is caught
> > > > up, we can assume others are too. The attached patch implements this
> > > > optimization. With the patch, the test with 200 slots finished in
> > > > 2.512s. The execution time is now stable regardless of the number of
> > > > slots.
> > > >
> > > > One thing to note is that DecodeTXNNeedSkip() also considers
> > > > replication origin filters. Theoretically, a plugin could filter out
> > > > specific origins, which might lead to different results. However, this
> > > > is a very rare case. Even if it happens, it would just result in a
> > > > false positive (the upgrade fails safely), so the impact is minimal.
> > > > Therefore, the patch simplifies the check to be per-database instead
> > > > of per-slot.
> > > >
> > > > Feedback is very welcome.
> > > >
> > > > Regards,
> > > >
> > > > --
> > > > Masahiko Sawada
> > > > Amazon Web Services: https://aws.amazon.com
> > > > <v1-0001-pg_upgrade-Optimize-replication-slot-caught-up-ch.patch>
> > >
> > > Hi Masahiko-san,
> > >
> > > Basically I think the optimization idea makes sense to me. Reducing the check from O(slots × WAL) to O(databases
> > > × WAL) sounds a win. Just a few small comments:
> >
> > Thank you for reviewing the patch!
> >
> > >
> > > 1
> > > ```
> > > +static void process_old_cluter_logical_slot_catchup_infos(DbInfo *dbinfo, PGresult *res, void *arg);
> > > ```
> > >
> > > Typo in the function name: cluter->cluster
> > >
> > > 2
> > > ```
> > > -               logical_slot_infos_query = get_old_cluster_logical_slot_infos_query();
> > > +               const char *slot_info_query = "SELECT slot_name, plugin, two_phase, failover, "
> > > ```
> > >
> > > logical_slot_infos_query is no longer used, should be removed.
> >
> > Fixed.
> >
> > >
> > > 3
> > > ```
> > > +                               "  ORDER BY confirmed_flush_lsn ASC "
> > > ```
> > >
> > > I am thinking, if confirmed_flush_lsn can be NULL? If that’s true, we may want to add “NULLS LAST”.
> >
> > Given that the query is not executed when live-check, I think
> > confirmed_flush cannot be NULL. temporary should also not be true but
> > it's for consistency with slot_info_query.
> >
> > >
> > > 4
> > > ```
> > > +       Assert(num_slots == dbinfo->slot_arr.nslots);
> > > +
> > > +       for (int i = 0; i < num_slots; i++)
> > > +       {
> > > +               char       *slotname;
> > > +               bool            caught_up;
> > > +
> > > +               slotname = PQgetvalue(res, i, PQfnumber(res, "slot_name"));
> > > +               caught_up = (strcmp(PQgetvalue(res, i, PQfnumber(res, "caught_up")), "t") == 0);
> > > +
> > > +               for (int slotnum = 0; slotnum < dbinfo->slot_arr.nslots; slotnum++)
> > > +               {
> > > ```
> > >
> > > As num_slots == dbinfo->slot_arr.nslots, this is still a O(num_slots^2) check. Given this patch's goal is to
> > > eliminate the burden from large number of slots, maybe we can build a hash table to make the check faster.
> > >
> >
> > Or if we sort both queries in the same order we can simply update the
> > slots sequentially.
> >
> > One problem I can see in this approach is that we could wrongly report
> > the invalid slots in invalid_logical_slots.txt. Even if the slot with
> > the oldest confirmed_flush_lsn has unconsumed decodable records, other
> > slots might have already been caught up. I'll think of how to deal
> > with this problem.
>
> I came up with a simple idea; instead of
> binary_upgrade_logical_slot_has_caught_up() returning true/false, we
> can have it return the last decodable WAL record's LSN, and consider
> logical slots as caught-up if its confirmed_flush_lsn already passed
> that LSN. This way, we can keep scanning at most one logical slot per
> database and reporting the same contents as today. The function now
> needs to scan all WAL backlogs but it's still much better than the
> current method.
>
> I've implemented this idea in the v2 patch with some updates:
>
> - incorporated the review comments.
> - added logic to support PG18 or older.
> - added regression tests.
>
> Feedback is very welcome.
>

I've fixed the test failures reported by CI.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments

Re: pg_upgrade: optimize replication slot caught-up check

From
Chao Li
Date:

> On Jan 8, 2026, at 04:18, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Jan 6, 2026 at 12:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>> On Mon, Jan 5, 2026 at 6:55 PM Chao Li <li.evan.chao@gmail.com> wrote:
>>>
>>>
>>>
>>>> On Jan 6, 2026, at 02:02, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> Commit 29d0a77fa6 improved pg_upgrade to allow migrating logical
>>>> slots. Currently, to check if the slots are ready to be migrated, we
>>>> call binary_upgrade_logical_slot_has_caught_up() for every single
>>>> slot. This checks if there are any unconsumed WAL records. However, we
>>>> noticed a performance issue. If there are many slots (e.g., 100 or
>>>> more) or if there is a WAL backlog, checking all slots one by one
>>>> takes a long time.
>>>>
>>>> Here are some test results from my environment:
>>>> With an empty cluster: 1.55s
>>>> With 200 slots and 30MB backlog: 15.51s
>>>>
>>>> Commit 6d3d2e8e5 introduced parallel checks per database, but a single
>>>> job might still have to check too many slots, causing delays.
>>>>
>>>> Since binary_upgrade_logical_slot_has_caught_up() essentially checks
>>>> if any decodable record exists in the database, IIUC it is not
>>>> necessary to check every slot. We can optimize this by checking only
>>>> the slot with the minimum confirmed_flush_lsn. If that slot is caught
>>>> up, we can assume others are too. The attached patch implements this
>>>> optimization. With the patch, the test with 200 slots finished in
>>>> 2.512s. The execution time is now stable regardless of the number of
>>>> slots.
>>>>
>>>> One thing to note is that DecodeTXNNeedSkip() also considers
>>>> replication origin filters. Theoretically, a plugin could filter out
>>>> specific origins, which might lead to different results. However, this
>>>> is a very rare case. Even if it happens, it would just result in a
>>>> false positive (the upgrade fails safely), so the impact is minimal.
>>>> Therefore, the patch simplifies the check to be per-database instead
>>>> of per-slot.
>>>>
>>>> Feedback is very welcome.
>>>>
>>>> Regards,
>>>>
>>>> --
>>>> Masahiko Sawada
>>>> Amazon Web Services: https://aws.amazon.com
>>>> <v1-0001-pg_upgrade-Optimize-replication-slot-caught-up-ch.patch>
>>>
>>> Hi Masahiko-san,
>>>
>>> Basically I think the optimization idea makes sense to me. Reducing the check from O(slots × WAL) to O(databases ×
>>> WAL) sounds a win. Just a few small comments:
>>
>> Thank you for reviewing the patch!
>>
>>>
>>> 1
>>> ```
>>> +static void process_old_cluter_logical_slot_catchup_infos(DbInfo *dbinfo, PGresult *res, void *arg);
>>> ```
>>>
>>> Typo in the function name: cluter->cluster
>>>
>>> 2
>>> ```
>>> -               logical_slot_infos_query = get_old_cluster_logical_slot_infos_query();
>>> +               const char *slot_info_query = "SELECT slot_name, plugin, two_phase, failover, "
>>> ```
>>>
>>> logical_slot_infos_query is no longer used, should be removed.
>>
>> Fixed.
>>
>>>
>>> 3
>>> ```
>>> +                               "  ORDER BY confirmed_flush_lsn ASC "
>>> ```
>>>
>>> I am thinking, if confirmed_flush_lsn can be NULL? If that’s true, we may want to add “NULLS LAST”.
>>
>> Given that the query is not executed when live-check, I think
>> confirmed_flush cannot be NULL. temporary should also not be true but
>> it's for consistency with slot_info_query.
>>
>>>
>>> 4
>>> ```
>>> +       Assert(num_slots == dbinfo->slot_arr.nslots);
>>> +
>>> +       for (int i = 0; i < num_slots; i++)
>>> +       {
>>> +               char       *slotname;
>>> +               bool            caught_up;
>>> +
>>> +               slotname = PQgetvalue(res, i, PQfnumber(res, "slot_name"));
>>> +               caught_up = (strcmp(PQgetvalue(res, i, PQfnumber(res, "caught_up")), "t") == 0);
>>> +
>>> +               for (int slotnum = 0; slotnum < dbinfo->slot_arr.nslots; slotnum++)
>>> +               {
>>> ```
>>>
>>> As num_slots == dbinfo->slot_arr.nslots, this is still a O(num_slots^2) check. Given this patch's goal is to
>>> eliminate the burden from large number of slots, maybe we can build a hash table to make the check faster.
>>>
>>
>> Or if we sort both queries in the same order we can simply update the
>> slots sequentially.
>>
>> One problem I can see in this approach is that we could wrongly report
>> the invalid slots in invalid_logical_slots.txt. Even if the slot with
>> the oldest confirmed_flush_lsn has unconsumed decodable records, other
>> slots might have already been caught up. I'll think of how to deal
>> with this problem.
>
> I came up with a simple idea; instead of
> binary_upgrade_logical_slot_has_caught_up() returning true/false, we
> can have it return the last decodable WAL record's LSN, and consider
> logical slots as caught-up if its confirmed_flush_lsn already passed
> that LSN. This way, we can keep scanning at most one logical slot per
> database and reporting the same contents as today. The function now
> needs to scan all WAL backlogs but it's still much better than the
> current method.
>

Sounds better, but I need more time to look into the details. I will spend time on this tomorrow.

> I've implemented this idea in the v2 patch with some updates:
>
> - incorporated the review comments.
> - added logic to support PG18 or older.
> - added regression tests.
>
> Feedback is very welcome.
>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com
> <v2-0001-pg_upgrade-Optimize-replication-slot-caught-up-ch.patch>

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Re: pg_upgrade: optimize replication slot caught-up check

From
Chao Li
Date:

> On Jan 8, 2026, at 05:08, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jan 7, 2026 at 12:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>> On Tue, Jan 6, 2026 at 12:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>>
>>> On Mon, Jan 5, 2026 at 6:55 PM Chao Li <li.evan.chao@gmail.com> wrote:
>>>>
>>>>
>>>>
>>>>> On Jan 6, 2026, at 02:02, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> Commit 29d0a77fa6 improved pg_upgrade to allow migrating logical
>>>>> slots. Currently, to check if the slots are ready to be migrated, we
>>>>> call binary_upgrade_logical_slot_has_caught_up() for every single
>>>>> slot. This checks if there are any unconsumed WAL records. However, we
>>>>> noticed a performance issue. If there are many slots (e.g., 100 or
>>>>> more) or if there is a WAL backlog, checking all slots one by one
>>>>> takes a long time.
>>>>>
>>>>> Here are some test results from my environment:
>>>>> With an empty cluster: 1.55s
>>>>> With 200 slots and 30MB backlog: 15.51s
>>>>>
>>>>> Commit 6d3d2e8e5 introduced parallel checks per database, but a single
>>>>> job might still have to check too many slots, causing delays.
>>>>>
>>>>> Since binary_upgrade_logical_slot_has_caught_up() essentially checks
>>>>> if any decodable record exists in the database, IIUC it is not
>>>>> necessary to check every slot. We can optimize this by checking only
>>>>> the slot with the minimum confirmed_flush_lsn. If that slot is caught
>>>>> up, we can assume others are too. The attached patch implements this
>>>>> optimization. With the patch, the test with 200 slots finished in
>>>>> 2.512s. The execution time is now stable regardless of the number of
>>>>> slots.
>>>>>
>>>>> One thing to note is that DecodeTXNNeedSkip() also considers
>>>>> replication origin filters. Theoretically, a plugin could filter out
>>>>> specific origins, which might lead to different results. However, this
>>>>> is a very rare case. Even if it happens, it would just result in a
>>>>> false positive (the upgrade fails safely), so the impact is minimal.
>>>>> Therefore, the patch simplifies the check to be per-database instead
>>>>> of per-slot.
>>>>>
>>>>> Feedback is very welcome.
>>>>>
>>>>> Regards,
>>>>>
>>>>> --
>>>>> Masahiko Sawada
>>>>> Amazon Web Services: https://aws.amazon.com
>>>>> <v1-0001-pg_upgrade-Optimize-replication-slot-caught-up-ch.patch>
>>>>
>>>> Hi Masahiko-san,
>>>>
>>>> Basically I think the optimization idea makes sense to me. Reducing the check from O(slots × WAL) to O(databases ×
>>>> WAL) sounds a win. Just a few small comments:
>>>
>>> Thank you for reviewing the patch!
>>>
>>>>
>>>> 1
>>>> ```
>>>> +static void process_old_cluter_logical_slot_catchup_infos(DbInfo *dbinfo, PGresult *res, void *arg);
>>>> ```
>>>>
>>>> Typo in the function name: cluter->cluster
>>>>
>>>> 2
>>>> ```
>>>> -               logical_slot_infos_query = get_old_cluster_logical_slot_infos_query();
>>>> +               const char *slot_info_query = "SELECT slot_name, plugin, two_phase, failover, "
>>>> ```
>>>>
>>>> logical_slot_infos_query is no longer used, should be removed.
>>>
>>> Fixed.
>>>
>>>>
>>>> 3
>>>> ```
>>>> +                               "  ORDER BY confirmed_flush_lsn ASC "
>>>> ```
>>>>
>>>> I am thinking, if confirmed_flush_lsn can be NULL? If that’s true, we may want to add “NULLS LAST”.
>>>
>>> Given that the query is not executed when live-check, I think
>>> confirmed_flush cannot be NULL. temporary should also not be true but
>>> it's for consistency with slot_info_query.
>>>
>>>>
>>>> 4
>>>> ```
>>>> +       Assert(num_slots == dbinfo->slot_arr.nslots);
>>>> +
>>>> +       for (int i = 0; i < num_slots; i++)
>>>> +       {
>>>> +               char       *slotname;
>>>> +               bool            caught_up;
>>>> +
>>>> +               slotname = PQgetvalue(res, i, PQfnumber(res, "slot_name"));
>>>> +               caught_up = (strcmp(PQgetvalue(res, i, PQfnumber(res, "caught_up")), "t") == 0);
>>>> +
>>>> +               for (int slotnum = 0; slotnum < dbinfo->slot_arr.nslots; slotnum++)
>>>> +               {
>>>> ```
>>>>
>>>> As num_slots == dbinfo->slot_arr.nslots, this is still a O(num_slots^2) check. Given this patch's goal is to
>>>> eliminate the burden from large number of slots, maybe we can build a hash table to make the check faster.
>>>>
>>>
>>> Or if we sort both queries in the same order we can simply update the
>>> slots sequentially.
>>>
>>> One problem I can see in this approach is that we could wrongly report
>>> the invalid slots in invalid_logical_slots.txt. Even if the slot with
>>> the oldest confirmed_flush_lsn has unconsumed decodable records, other
>>> slots might have already been caught up. I'll think of how to deal
>>> with this problem.
>>
>> I came up with a simple idea; instead of
>> binary_upgrade_logical_slot_has_caught_up() returning true/false, we
>> can have it return the last decodable WAL record's LSN, and consider
>> logical slots as caught-up if its confirmed_flush_lsn already passed
>> that LSN. This way, we can keep scanning at most one logical slot per
>> database and reporting the same contents as today. The function now
>> needs to scan all WAL backlogs but it's still much better than the
>> current method.
>>
>> I've implemented this idea in the v2 patch with some updates:
>>
>> - incorporated the review comments.
>> - added logic to support PG18 or older.
>> - added regression tests.
>>
>> Feedback is very welcome.
>>
>
> I've fixed the test failures reported by CI.
>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com
> <v3-0001-pg_upgrade-Optimize-replication-slot-caught-up-ch.patch>

I just looked into v3. Basically, it now does a shared WAL scan to find the newest decodable LSN and uses that to compare with all slots’ confirmed_flush_lsn, which significantly reduces WAL scan effort when there are many slots.

One thing I'm thinking about is that if all slots are far behind, the shared scan may still take a long time. Before this change, a scan could stop as soon as it found a pending WAL. So after the change, when there are only a few slots and they are far behind, the scan might end up doing more work than before.

As a possible optimization, maybe we could also pass the newest confirmed_flush_lsn to the scan. Once it finds a decodable WAL record newer than that confirmed_flush_lsn, we already know all slots are behind, so the scan could stop at that point.
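To make the proposed cutoff concrete, here is a toy Python model (not the actual PostgreSQL C code; the record representation and the function name are invented for illustration). WAL is modeled as an ordered list of (lsn, decodable) pairs, and the scan stops as soon as a decodable record passes the newest confirmed_flush_lsn among the slots:

```python
def last_decodable_lsn(wal_records, scan_cutoff_lsn):
    """Scan (lsn, decodable) pairs in order, tracking the newest decodable
    LSN seen so far. Stop early once a decodable record passes
    scan_cutoff_lsn (the newest confirmed_flush_lsn among the slots):
    at that point every slot is known to have unconsumed WAL."""
    last_pending = None
    for lsn, decodable in wal_records:
        if not decodable:
            continue
        last_pending = lsn
        if lsn > scan_cutoff_lsn:
            break  # all slots are behind; no need to scan further
    return last_pending
```

For example, with decodable records at LSNs 10, 30, and 40 and a cutoff of 25, the scan returns 30 without ever visiting 40.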

WRT the code change, I got a few comments:

1
···
+ * otherwise false. If last_pending_wal_p is set by the caller, it continues
+ * scanning WAL even after detecting a decodable WAL record, in order to
+ * get the last decodable WAL record's LSN.
  */
 bool
-LogicalReplicationSlotHasPendingWal(XLogRecPtr end_of_wal)
+LogicalReplicationSlotHasPendingWal(XLogRecPtr end_of_wal,
+                                    XLogRecPtr *last_pending_wal_p)
 {
     bool        has_pending_wal = false;

     Assert(MyReplicationSlot);

+    if (last_pending_wal_p != NULL)
+        *last_pending_wal_p = InvalidXLogRecPtr;
···

The header comment seems to conflict with the code. last_pending_wal_p is unconditionally set to InvalidXLogRecPtr, so whatever a caller sets is overwritten. I think you meant to say “if last_pending_wal_p is not NULL”.

2
```
@@ -286,9 +287,9 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
 {
     Name        slot_name;
     XLogRecPtr    end_of_wal;
-    bool        found_pending_wal;
+    XLogRecPtr    last_pending_wal;
```

The function header comment still says “returns true if …”, that should be updated.

Also, with the change, the function name becomes misleading: “has” implies a boolean result, but now it will return the newest decodable WAL when not caught up. The function name doesn’t reflect the actual behavior. It looks like the function is only used by pg_upgrade, so maybe rename it.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







Re: pg_upgrade: optimize replication slot caught-up check

From
Masahiko Sawada
Date:
On Thu, Jan 8, 2026 at 6:59 PM Chao Li <li.evan.chao@gmail.com> wrote:
>
>
>
> I just looked into v3. Basically, it now does a shared WAL scan to find the newest decodable LSN and uses that to compare with all slots’ confirmed_flush_lsn, which significantly reduces WAL scan effort when there are many slots.

Thank you for reviewing the patch!

>
> One thing I'm thinking about is that if all slots are far behind, the shared scan may still take a long time. Before this change, a scan could stop as soon as it found a pending WAL. So after the change, when there are only a few slots and they are far behind, the scan might end up doing more work than before.

That's a valid concern.

> As a possible optimization, maybe we could also pass the newest confirmed_flush_lsn to the scan. Once it finds a decodable WAL record newer than that confirmed_flush_lsn, we already know all slots are behind, so the scan could stop at that point.

Sounds like a reasonable idea. I'll give it a try and see how it's worthwhile.

>
> WRT the code change, I got a few comments:
>
> 1
> ···
> + * otherwise false. If last_pending_wal_p is set by the caller, it continues
> + * scanning WAL even after detecting a decodable WAL record, in order to
> + * get the last decodable WAL record's LSN.
>   */
>  bool
> -LogicalReplicationSlotHasPendingWal(XLogRecPtr end_of_wal)
> +LogicalReplicationSlotHasPendingWal(XLogRecPtr end_of_wal,
> +                                                                       XLogRecPtr *last_pending_wal_p)
>  {
>         bool            has_pending_wal = false;
>
>         Assert(MyReplicationSlot);
>
> +       if (last_pending_wal_p != NULL)
> +               *last_pending_wal_p = InvalidXLogRecPtr;
> ···
>
> The header comment seems to conflict with the code. last_pending_wal_p is unconditionally set to InvalidXLogRecPtr, so whatever a caller sets is overwritten. I think you meant to say “if last_pending_wal_p is not NULL”.

Agreed.

>
> 2
> ```
> @@ -286,9 +287,9 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
>  {
>         Name            slot_name;
>         XLogRecPtr      end_of_wal;
> -       bool            found_pending_wal;
> +       XLogRecPtr      last_pending_wal;
> ```
>
> The function header comment still says “returns true if …”, that should be updated.
>
> Also, with the change, the function name becomes misleading: “has” implies a boolean result, but now it will return the newest decodable WAL when not caught up. The function name doesn’t reflect the actual behavior. It looks like the function is only used by pg_upgrade, so maybe rename it.

Agreed, I'll incorporate the comment in the next version patch.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: pg_upgrade: optimize replication slot caught-up check

From
Masahiko Sawada
Date:
On Fri, Jan 9, 2026 at 10:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Jan 8, 2026 at 6:59 PM Chao Li <li.evan.chao@gmail.com> wrote:
> >
> >
> >
> > I just looked into v3. Basically, it now does a shared WAL scan to find the newest decodable LSN and uses that to compare with all slots’ confirmed_flush_lsn, which significantly reduces WAL scan effort when there are many slots.
>
> Thank you for reviewing the patch!
>
> >
> > One thing I'm thinking about is that if all slots are far behind, the shared scan may still take a long time. Before this change, a scan could stop as soon as it found a pending WAL. So after the change, when there are only a few slots and they are far behind, the scan might end up doing more work than before.
>
> That's a valid concern.
>
> > As a possible optimization, maybe we could also pass the newest confirmed_flush_lsn to the scan. Once it finds a decodable WAL record newer than that confirmed_flush_lsn, we already know all slots are behind, so the scan could stop at that point.
>
> Sounds like a reasonable idea. I'll give it a try and see how it's worthwhile.
>
> >
> > WRT the code change, I got a few comments:
> >
> > 1
> > ···
> > + * otherwise false. If last_pending_wal_p is set by the caller, it continues
> > + * scanning WAL even after detecting a decodable WAL record, in order to
> > + * get the last decodable WAL record's LSN.
> >   */
> >  bool
> > -LogicalReplicationSlotHasPendingWal(XLogRecPtr end_of_wal)
> > +LogicalReplicationSlotHasPendingWal(XLogRecPtr end_of_wal,
> > +                                                                       XLogRecPtr *last_pending_wal_p)
> >  {
> >         bool            has_pending_wal = false;
> >
> >         Assert(MyReplicationSlot);
> >
> > +       if (last_pending_wal_p != NULL)
> > +               *last_pending_wal_p = InvalidXLogRecPtr;
> > ···
> >
> > The header comment seems to conflict with the code. last_pending_wal_p is unconditionally set to InvalidXLogRecPtr, so whatever a caller sets is overwritten. I think you meant to say “if last_pending_wal_p is not NULL”.
>
> Agreed.
>
> >
> > 2
> > ```
> > @@ -286,9 +287,9 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
> >  {
> >         Name            slot_name;
> >         XLogRecPtr      end_of_wal;
> > -       bool            found_pending_wal;
> > +       XLogRecPtr      last_pending_wal;
> > ```
> >
> > The function header comment still says “returns true if …”, that should be updated.
> >
> > Also, with the change, the function name becomes misleading: “has” implies a boolean result, but now it will return the newest decodable WAL when not caught up. The function name doesn’t reflect the actual behavior. It looks like the function is only used by pg_upgrade, so maybe rename it.
>
> Agreed, I'll incorporate the comment in the next version patch.
>

I've attached the updated patch that addressed all review comments.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments

Re: pg_upgrade: optimize replication slot caught-up check

From
Chao Li
Date:

> On Jan 10, 2026, at 05:32, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Jan 9, 2026 at 10:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>> On Thu, Jan 8, 2026 at 6:59 PM Chao Li <li.evan.chao@gmail.com> wrote:
>>>
>>>
>>>
>>> I just looked into v3. Basically, it now does a shared WAL scan to find the newest decodable LSN and uses that to compare with all slots’ confirmed_flush_lsn, which significantly reduces WAL scan effort when there are many slots.
>>
>> Thank you for reviewing the patch!
>>
>>>
>>> One thing I'm thinking about is that if all slots are far behind, the shared scan may still take a long time. Before this change, a scan could stop as soon as it found a pending WAL. So after the change, when there are only a few slots and they are far behind, the scan might end up doing more work than before.
>>
>> That's a valid concern.
>>
>>> As a possible optimization, maybe we could also pass the newest confirmed_flush_lsn to the scan. Once it finds a decodable WAL record newer than that confirmed_flush_lsn, we already know all slots are behind, so the scan could stop at that point.
>>
>> Sounds like a reasonable idea. I'll give it a try and see how it's worthwhile.
>>
>>>
>>> WRT the code change, I got a few comments:
>>>
>>> 1
>>> ···
>>> + * otherwise false. If last_pending_wal_p is set by the caller, it continues
>>> + * scanning WAL even after detecting a decodable WAL record, in order to
>>> + * get the last decodable WAL record's LSN.
>>>  */
>>> bool
>>> -LogicalReplicationSlotHasPendingWal(XLogRecPtr end_of_wal)
>>> +LogicalReplicationSlotHasPendingWal(XLogRecPtr end_of_wal,
>>> +                                                                       XLogRecPtr *last_pending_wal_p)
>>> {
>>>        bool            has_pending_wal = false;
>>>
>>>        Assert(MyReplicationSlot);
>>>
>>> +       if (last_pending_wal_p != NULL)
>>> +               *last_pending_wal_p = InvalidXLogRecPtr;
>>> ···
>>>
>>> The header comment seems to conflict with the code. last_pending_wal_p is unconditionally set to InvalidXLogRecPtr, so whatever a caller sets is overwritten. I think you meant to say “if last_pending_wal_p is not NULL”.
>>
>> Agreed.
>>
>>>
>>> 2
>>> ```
>>> @@ -286,9 +287,9 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
>>> {
>>>        Name            slot_name;
>>>        XLogRecPtr      end_of_wal;
>>> -       bool            found_pending_wal;
>>> +       XLogRecPtr      last_pending_wal;
>>> ```
>>>
>>> The function header comment still says “returns true if …”, that should be updated.
>>>
>>> Also, with the change, the function name becomes misleading: “has” implies a boolean result, but now it will return the newest decodable WAL when not caught up. The function name doesn’t reflect the actual behavior. It looks like the function is only used by pg_upgrade, so maybe rename it.
>>
>> Agreed, I'll incorporate the comment in the next version patch.
>>
>
> I've attached the updated patch that addressed all review comments.
>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com
> <v4-0001-pg_upgrade-Optimize-replication-slot-caught-up-ch.patch>

A few comments on v4:

1
```
-     * slot is considered caught up is done by an upgrade function. This
-     * regards the slot as caught up if we don't find any decodable changes.
-     * See binary_upgrade_logical_slot_has_caught_up().
+     * slot is considered caught up is done by an upgrade function, unless the
+     * caller sets skip_caught_up_check. This regards the slot as caught up if
+     * we don't find any decodable changes. See
+     * binary_upgrade_logical_slot_has_caught_up().
```

binary_upgrade_logical_slot_has_caught_up has been renamed, so this comment needs to be updated.

2
```
+                    "temporary IS FALSE "
+                    "ORDER BY 1;",
+                    (skip_caught_up_check || user_opts.live_check) ? "FALSE" :
                     "(CASE WHEN invalidation_reason IS NOT NULL THEN FALSE "
                     "ELSE (SELECT pg_catalog.binary_upgrade_logical_slot_has_caught_up(slot_name)) "
                     "END)");
```

pg_catalog.binary_upgrade_logical_slot_has_caught_up has been renamed and it takes two parameters now.


3
```
+                if (last_pending_wal > scan_cutoff_lsn)
+                    break;
```

In LogicalReplicationSlotHasPendingWal() we break early when last_pending_wal > scan_cutoff_lsn, and later the SQL check is “confirmed_flush_lsn > last_pending_wal”. So there is an edge case where last_pending_wal == scan_cutoff_lsn and confirmed_flush_lsn == last_pending_wal; then neither the early break nor the caught-up condition happens.

So, I think we should use “>=” for both checks.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







Re: pg_upgrade: optimize replication slot caught-up check

From
Masahiko Sawada
Date:
On Sun, Jan 11, 2026 at 7:31 PM Chao Li <li.evan.chao@gmail.com> wrote:
>
>
>
> > On Jan 10, 2026, at 05:32, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Jan 9, 2026 at 10:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >>
> >> On Thu, Jan 8, 2026 at 6:59 PM Chao Li <li.evan.chao@gmail.com> wrote:
> >>>
> >>>
> >>>
> >>> I just looked into v3. Basically, it now does a shared WAL scan to find the newest decodable LSN and uses that to compare with all slots’ confirmed_flush_lsn, which significantly reduces WAL scan effort when there are many slots.
> >>
> >> Thank you for reviewing the patch!
> >>
> >>>
> >>> One thing I'm thinking about is that if all slots are far behind, the shared scan may still take a long time. Before this change, a scan could stop as soon as it found a pending WAL. So after the change, when there are only a few slots and they are far behind, the scan might end up doing more work than before.
> >>
> >> That's a valid concern.
> >>
> >>> As a possible optimization, maybe we could also pass the newest confirmed_flush_lsn to the scan. Once it finds a decodable WAL record newer than that confirmed_flush_lsn, we already know all slots are behind, so the scan could stop at that point.
> >>
> >> Sounds like a reasonable idea. I'll give it a try and see how it's worthwhile.
> >>
> >>>
> >>> WRT the code change, I got a few comments:
> >>>
> >>> 1
> >>> ···
> >>> + * otherwise false. If last_pending_wal_p is set by the caller, it continues
> >>> + * scanning WAL even after detecting a decodable WAL record, in order to
> >>> + * get the last decodable WAL record's LSN.
> >>>  */
> >>> bool
> >>> -LogicalReplicationSlotHasPendingWal(XLogRecPtr end_of_wal)
> >>> +LogicalReplicationSlotHasPendingWal(XLogRecPtr end_of_wal,
> >>> +                                                                       XLogRecPtr *last_pending_wal_p)
> >>> {
> >>>        bool            has_pending_wal = false;
> >>>
> >>>        Assert(MyReplicationSlot);
> >>>
> >>> +       if (last_pending_wal_p != NULL)
> >>> +               *last_pending_wal_p = InvalidXLogRecPtr;
> >>> ···
> >>>
> >>> The header comment seems to conflict with the code. last_pending_wal_p is unconditionally set to InvalidXLogRecPtr, so whatever a caller sets is overwritten. I think you meant to say “if last_pending_wal_p is not NULL”.
> >>
> >> Agreed.
> >>
> >>>
> >>> 2
> >>> ```
> >>> @@ -286,9 +287,9 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
> >>> {
> >>>        Name            slot_name;
> >>>        XLogRecPtr      end_of_wal;
> >>> -       bool            found_pending_wal;
> >>> +       XLogRecPtr      last_pending_wal;
> >>> ```
> >>>
> >>> The function header comment still says “returns true if …”, that should be updated.
> >>>
> >>> Also, with the change, the function name becomes misleading: “has” implies a boolean result, but now it will return the newest decodable WAL when not caught up. The function name doesn’t reflect the actual behavior. It looks like the function is only used by pg_upgrade, so maybe rename it.
> >>
> >> Agreed, I'll incorporate the comment in the next version patch.
> >>
> >
> > I've attached the updated patch that addressed all review comments.
> >
> > Regards,
> >
> > --
> > Masahiko Sawada
> > Amazon Web Services: https://aws.amazon.com
> > <v4-0001-pg_upgrade-Optimize-replication-slot-caught-up-ch.patch>
>
> A few comments on v4:
>
> 1
> ```
> -        * slot is considered caught up is done by an upgrade function. This
> -        * regards the slot as caught up if we don't find any decodable changes.
> -        * See binary_upgrade_logical_slot_has_caught_up().
> +        * slot is considered caught up is done by an upgrade function, unless the
> +        * caller sets skip_caught_up_check. This regards the slot as caught up if
> +        * we don't find any decodable changes. See
> +        * binary_upgrade_logical_slot_has_caught_up().
> ```
>
> binary_upgrade_logical_slot_has_caught_up has been renamed, so this comment needs to be updated.
>
> 2
> ```
> +                                       "temporary IS FALSE "
> +                                       "ORDER BY 1;",
> +                                       (skip_caught_up_check || user_opts.live_check) ? "FALSE" :
>                                         "(CASE WHEN invalidation_reason IS NOT NULL THEN FALSE "
>                                         "ELSE (SELECT
pg_catalog.binary_upgrade_logical_slot_has_caught_up(slot_name))" 
>                                         "END)");
> ```
>
> pg_catalog.binary_upgrade_logical_slot_has_caught_up has been renamed and it takes two parameters now.

binary_upgrade_logical_slot_has_caught_up() is still used when the old
cluster is PG18 or older, so we cannot use
binary_upgrade_check_logical_slot_pending_wal() here.

>
>
> 3
> ```
> +                               if (last_pending_wal > scan_cutoff_lsn)
> +                                       break;
> ```
>
> In LogicalReplicationSlotHasPendingWal() we break early when last_pending_wal > scan_cutoff_lsn, and later the SQL check is “confirmed_flush_lsn > last_pending_wal”. So there is an edge case where last_pending_wal == scan_cutoff_lsn and confirmed_flush_lsn == last_pending_wal; then neither the early break nor the caught-up condition happens.
>
> So, I think we should use “>=” for both checks.

Good catch, but I think we should use ">=" only in
LogicalReplicationSlotHasPendingWal(). We can terminate the scan early
if we find any decodable WAL record whose LSN is >= confirmed_flush_lsn. If
scan_cutoff (which refers to a slot's confirmed_flush_lsn) ==
last_pending_wal, the slot should not ignore the last_pending_wal.
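This asymmetry can be sketched as a toy model (plain integers stand in for LSNs; the two function names are invented for illustration, not the actual C code):

```python
def scan_should_stop(decodable_lsn, scan_cutoff_lsn):
    # Early-break condition in the WAL scan: a decodable record at or beyond
    # the newest confirmed_flush_lsn proves every slot still has pending
    # WAL, so ">=" is safe here.
    return decodable_lsn >= scan_cutoff_lsn

def slot_caught_up(confirmed_flush_lsn, last_pending_wal):
    # Per-slot check: a decodable record at exactly confirmed_flush_lsn is
    # still unconsumed, so this comparison must stay strict.
    return confirmed_flush_lsn > last_pending_wal
```

In the edge case raised in the review (all three LSNs equal), the scan now terminates early while the slot is still correctly reported as having pending WAL.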

I've attached the updated patch.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments

Re: pg_upgrade: optimize replication slot caught-up check

From
Chao Li
Date:

> On Jan 15, 2026, at 01:54, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Sun, Jan 11, 2026 at 7:31 PM Chao Li <li.evan.chao@gmail.com> wrote:
>>
>>
>>
>>> On Jan 10, 2026, at 05:32, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>>
>>> On Fri, Jan 9, 2026 at 10:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>>>
>>>> On Thu, Jan 8, 2026 at 6:59 PM Chao Li <li.evan.chao@gmail.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> I just looked into v3. Basically, it now does a shared WAL scan to find the newest decodable LSN and uses that to compare with all slots’ confirmed_flush_lsn, which significantly reduces WAL scan effort when there are many slots.
>>>>
>>>> Thank you for reviewing the patch!
>>>>
>>>>>
>>>>> One thing I'm thinking about is that if all slots are far behind, the shared scan may still take a long time. Before this change, a scan could stop as soon as it found a pending WAL. So after the change, when there are only a few slots and they are far behind, the scan might end up doing more work than before.
>>>>
>>>> That's a valid concern.
>>>>
>>>>> As a possible optimization, maybe we could also pass the newest confirmed_flush_lsn to the scan. Once it finds a decodable WAL record newer than that confirmed_flush_lsn, we already know all slots are behind, so the scan could stop at that point.
>>>>
>>>> Sounds like a reasonable idea. I'll give it a try and see how it's worthwhile.
>>>>
>>>>>
>>>>> WRT the code change, I got a few comments:
>>>>>
>>>>> 1
>>>>> ···
>>>>> + * otherwise false. If last_pending_wal_p is set by the caller, it continues
>>>>> + * scanning WAL even after detecting a decodable WAL record, in order to
>>>>> + * get the last decodable WAL record's LSN.
>>>>> */
>>>>> bool
>>>>> -LogicalReplicationSlotHasPendingWal(XLogRecPtr end_of_wal)
>>>>> +LogicalReplicationSlotHasPendingWal(XLogRecPtr end_of_wal,
>>>>> +                                                                       XLogRecPtr *last_pending_wal_p)
>>>>> {
>>>>>       bool            has_pending_wal = false;
>>>>>
>>>>>       Assert(MyReplicationSlot);
>>>>>
>>>>> +       if (last_pending_wal_p != NULL)
>>>>> +               *last_pending_wal_p = InvalidXLogRecPtr;
>>>>> ···
>>>>>
>>>>> The header comment seems to conflict with the code. last_pending_wal_p is unconditionally set to InvalidXLogRecPtr, so whatever a caller sets is overwritten. I think you meant to say “if last_pending_wal_p is not NULL”.
>>>>
>>>> Agreed.
>>>>
>>>>>
>>>>> 2
>>>>> ```
>>>>> @@ -286,9 +287,9 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
>>>>> {
>>>>>       Name            slot_name;
>>>>>       XLogRecPtr      end_of_wal;
>>>>> -       bool            found_pending_wal;
>>>>> +       XLogRecPtr      last_pending_wal;
>>>>> ```
>>>>>
>>>>> The function header comment still says “returns true if …”, that should be updated.
>>>>>
>>>>> Also, with the change, the function name becomes misleading: “has” implies a boolean result, but now it will return the newest decodable WAL when not caught up. The function name doesn’t reflect the actual behavior. It looks like the function is only used by pg_upgrade, so maybe rename it.
>>>>
>>>> Agreed, I'll incorporate the comment in the next version patch.
>>>>
>>>
>>> I've attached the updated patch that addressed all review comments.
>>>
>>> Regards,
>>>
>>> --
>>> Masahiko Sawada
>>> Amazon Web Services: https://aws.amazon.com
>>> <v4-0001-pg_upgrade-Optimize-replication-slot-caught-up-ch.patch>
>>
>> A few comments on v4:
>>
>> 1
>> ```
>> -        * slot is considered caught up is done by an upgrade function. This
>> -        * regards the slot as caught up if we don't find any decodable changes.
>> -        * See binary_upgrade_logical_slot_has_caught_up().
>> +        * slot is considered caught up is done by an upgrade function, unless the
>> +        * caller sets skip_caught_up_check. This regards the slot as caught up if
>> +        * we don't find any decodable changes. See
>> +        * binary_upgrade_logical_slot_has_caught_up().
>> ```
>>
>> binary_upgrade_logical_slot_has_caught_up has been renamed, so this comment needs to be updated.
>>
>> 2
>> ```
>> +                                       "temporary IS FALSE "
>> +                                       "ORDER BY 1;",
>> +                                       (skip_caught_up_check || user_opts.live_check) ? "FALSE" :
>>                                        "(CASE WHEN invalidation_reason IS NOT NULL THEN FALSE "
>>                                        "ELSE (SELECT
pg_catalog.binary_upgrade_logical_slot_has_caught_up(slot_name))" 
>>                                        "END)");
>> ```
>>
>> pg_catalog.binary_upgrade_logical_slot_has_caught_up has been renamed and it takes two parameters now.
>
> binary_upgrade_logical_slot_has_caught_up() is still used when the old
> cluster is PG18 or older, so we cannot use
> binary_upgrade_check_logical_slot_pending_wal() here.
>
>>
>>
>> 3
>> ```
>> +                               if (last_pending_wal > scan_cutoff_lsn)
>> +                                       break;
>> ```
>>
>> In LogicalReplicationSlotHasPendingWal() we break early when last_pending_wal > scan_cutoff_lsn, and later the SQL check is “confirmed_flush_lsn > last_pending_wal”. So there is an edge case where last_pending_wal == scan_cutoff_lsn and confirmed_flush_lsn == last_pending_wal; then neither the early break nor the caught-up condition happens.
>>
>> So, I think we should use “>=” for both checks.
>
> Good catch, but I think we should use ">=" only in
> LogicalReplicationSlotHasPendingWal(). We can terminate the scan early
> if we find any decodable WAL whose LSN is >= confirmed_flush_lsn. If
> scan_cutoff (refers to a slot's confirmed_flush_lsn) ==
> last_pending_wal, the slot should not ignore the last_pending_wal.
>
> I've attached the updated patch.
>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com
> <v5-0001-pg_upgrade-Optimize-replication-slot-caught-up-ch.patch>

Hi Masahiko-san,

One typo, otherwise V5 looks good to me.

```
+         * using another query, it not during a live_check.
```
it not -> if not

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







Re: pg_upgrade: optimize replication slot caught-up check

From
shveta malik
Date:
On Wed, Jan 14, 2026 at 11:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I've attached the updated patch.
>

Thank you for the patch. I like the idea of the optimization. A few initial comments:

1)
+ * The query returns the slot names and their caught-up status in
+ * the same order as the results collected by
+ * get_old_cluster_logical_slot_infos(). If this query is changed,

I could not find the function get_old_cluster_logical_slot_infos(), do
you mean get_old_cluster_logical_slot_infos_query()?

2)
"  WHERE database = current_database() AND "
"    slot_type = 'logical' AND "

Is there a reason why database = current_database() is placed before
slot_type = 'logical'? I am not sure how the PostgreSQL optimizer and
executor will order these predicates, but from the first look,
slot_type = 'logical' appears cheaper and could be placed first,
consistent with the ordering used at other places.

3)
Shouldn’t we add a sanity check inside
get_old_cluster_logical_slot_infos_query() to ensure that when
skip_caught_up_check is true, we are on PostgreSQL 18 or lower? This
would make the function safer for future use if it's called elsewhere.
I understand the caller already performs a similar check, but I think
it's more appropriate here since we call
binary_upgrade_logical_slot_has_caught_up() from inside, which doesn’t
even exist on newer versions.

4)
+# Check the file content. While both test_slot1 and test_slot2 should
be reporting
+# that they have unconsumed WAL records, test_slot3 should not be reported as
+# it has caught up.

Can you please elaborate the reason behind test_slot3 not being
reported? Also mention in the comment if possible.

thanks
Shveta



Re: pg_upgrade: optimize replication slot caught-up check

From
shveta malik
Date:
On Tue, Jan 20, 2026 at 12:08 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Wed, Jan 14, 2026 at 11:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached the updated patch.
> >
>
> Thank you for the patch. I like the idea of the optimization. A few initial comments:
>
> 1)
> + * The query returns the slot names and their caught-up status in
> + * the same order as the results collected by
> + * get_old_cluster_logical_slot_infos(). If this query is changed,
>
> I could not find the function get_old_cluster_logical_slot_infos(), do
> you mean get_old_cluster_logical_slot_infos_query()?
>
> 2)
> "  WHERE database = current_database() AND "
> "    slot_type = 'logical' AND "
>
> Is there a reason why database = current_database() is placed before
> slot_type = 'logical'? I am not sure how the PostgreSQL optimizer and
> executor will order these predicates, but from the first look,
> slot_type = 'logical' appears cheaper and could be placed first,
> consistent with the ordering used at other places.
>
> 3)
> Shouldn’t we add a sanity check inside
> get_old_cluster_logical_slot_infos_query() to ensure that when
> skip_caught_up_check is true, we are on PostgreSQL 18 or lower? This
> would make the function safer for future use if it's called elsewhere.
> I understand the caller already performs a similar check, but I think
> it's more appropriate here since we call
> binary_upgrade_logical_slot_has_caught_up() from inside, which doesn’t
> even exist on newer versions.
>

A correction: the case I stated here is when 'skip_caught_up_check' is false.

> 4)
> +# Check the file content. While both test_slot1 and test_slot2 should
> be reporting
> +# that they have unconsumed WAL records, test_slot3 should not be reported as
> +# it has caught up.
>
> Can you please elaborate the reason behind test_slot3 not being
> reported? Also mention in the comment if possible.
>
> thanks
> Shveta



Re: pg_upgrade: optimize replication slot caught-up check

From
Masahiko Sawada
Date:
On Mon, Jan 19, 2026 at 10:38 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Wed, Jan 14, 2026 at 11:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached the updated patch.
> >
>
> Thank you for the patch. I like the idea of the optimization. A few initial comments:

Thank you for reviewing the patch!

>
> 1)
> + * The query returns the slot names and their caught-up status in
> + * the same order as the results collected by
> + * get_old_cluster_logical_slot_infos(). If this query is changed,
>
> I could not find the function get_old_cluster_logical_slot_infos(), do
> you mean get_old_cluster_logical_slot_infos_query()?

It seems an oversight in commit 6d3d2e8e541f0. I think it should be
get_db_rel_and_slot_infos().

>
> 2)
> "  WHERE database = current_database() AND "
> "    slot_type = 'logical' AND "
>
> Is there a reason why database = current_database() is placed before
> slot_type = 'logical'? I am not sure how the PostgreSQL optimizer and
> executor will order these predicates, but from the first look,
> slot_type = 'logical' appears cheaper and could be placed first,
> consistent with the ordering used at other places.

Changed.

>
> 3)
> Shouldn’t we add a sanity check inside
> get_old_cluster_logical_slot_infos_query() to ensure that when
> skip_caught_up_check is true, we are on PostgreSQL 18 or lower? This
> would make the function safer for future use if it's called elsewhere.
> I understand the caller already performs a similar check, but I think
> it's more appropriate here since we call
> binary_upgrade_logical_slot_has_caught_up() from inside, which doesn’t
> even exist on newer versions.

What kind of sanity check did you mean? We can have a check with
pg_fatal() but it seems almost the same to me even if pg_upgrade fails
with an error due to missing
binary_upgrade_logical_slot_has_caught_up().

>
> 4)
> +# Check the file content. While both test_slot1 and test_slot2 should
> be reporting
> +# that they have unconsumed WAL records, test_slot3 should not be reported as
> +# it has caught up.
>
> Can you please elaborate the reason behind test_slot3 not being
> reported? Also mention in the comment if possible.

We advance test_slot3 to the current WAL LSN before executing
pg_upgrade, so the test_slot3 should have consumed all pending WALs.
Please refer to the following changes:

 # Preparations for the subsequent test:
-# 1. Generate extra WAL records. At this point neither test_slot1 nor
-#   test_slot2 has consumed them.
+# 1. Generate extra WAL records. At this point none of the slots has consumed them.
 #
 # 2. Advance the slot test_slot2 up to the current WAL location, but test_slot1
 #   still has unconsumed WAL records.
 #
 # 3. Emit a non-transactional message. This will cause test_slot2 to detect the
 #   unconsumed WAL record.
+#
+# 4. Advance the slot test_slot3 up to the current WAL location.
 $oldpub->start;
 $oldpub->safe_psql(
    'postgres', qq[
        CREATE TABLE tbl AS SELECT generate_series(1, 10) AS a;
        SELECT pg_replication_slot_advance('test_slot2', pg_current_wal_lsn());
-       SELECT count(*) FROM pg_logical_emit_message('false',
'prefix', 'This is a non-transactional message');
+       SELECT count(*) FROM pg_logical_emit_message('false',
'prefix', 'This is a non-transactional message', true);
+       SELECT pg_replication_slot_advance('test_slot3', pg_current_wal_lsn());

I believe that the following new comment explains the reason well:

+# Check the file content. While both test_slot1 and test_slot2 should
be reporting
+# that they have unconsumed WAL records, test_slot3 should not be reported as
+# it has caught up.

I've attached the updated patch.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments

Re: pg_upgrade: optimize replication slot caught-up check

From
shveta malik
Date:
On Thu, Jan 22, 2026 at 5:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Jan 19, 2026 at 10:38 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Wed, Jan 14, 2026 at 11:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > I've attached the updated patch.
> > >
> >
> > Thank You for the patch. I like the idea of optimization. Few initial comments:
>
> Thank you for reviewing the patch!
>
> >
> > 1)
> > + * The query returns the slot names and their caught-up status in
> > + * the same order as the results collected by
> > + * get_old_cluster_logical_slot_infos(). If this query is changed,
> >
> > I could not find the function get_old_cluster_logical_slot_infos(), do
> > you mean get_old_cluster_logical_slot_infos_query()?
>
> It seems an oversight in commit 6d3d2e8e541f0. I think it should be
> get_db_rel_and_slot_infos().
>
> >
> > 2)
> > "  WHERE database = current_database() AND "
> > "    slot_type = 'logical' AND "
> >
> > Is there a reason why database = current_database() is placed before
> > slot_type = 'logical'? I am not sure how the PostgreSQL optimizer and
> > executor will order these predicates, but from the first look,
> > slot_type = 'logical' appears cheaper and could be placed first,
> > consistent with the ordering used at other places.
>
> Changed.
>
> >
> > 3)
> > Shouldn’t we add a sanity check inside
> > get_old_cluster_logical_slot_infos_query() to ensure that when
> > skip_caught_up_check is true, we are on PostgreSQL 18 or lower? This
> > would make the function safer for future use if it's called elsewhere.
> > I understand the caller already performs a similar check, but I think
> > it's more appropriate here since we call
> > binary_upgrade_logical_slot_has_caught_up() from inside, which doesn’t
> > even exist on newer versions.
>
> What kind of sanity check did you mean? We can have a check with
> pg_fatal() but it seems almost the same to me even if pg_upgrade fails
> with an error due to missing
> binary_upgrade_logical_slot_has_caught_up().

I was referring to a development-level sanity check, something like:

/* skip_caught_up_check is required iff PG19 or newer */
Assert((GET_MAJOR_VERSION(cluster->major_version) >= 1900) ==
   skip_caught_up_check);

But performing this check requires access to the cluster version (or
cluster information), which this function currently does not have.
Given that, do you think it would make sense to pass the cluster as an
argument to this function in order to perform the sanity check here?

> >
> > 4)
> > +# Check the file content. While both test_slot1 and test_slot2 should
> > be reporting
> > +# that they have unconsumed WAL records, test_slot3 should not be reported as
> > +# it has caught up.
> >
> > Can you please elaborate the reason behind test_slot3 not being
> > reported? Also mention in the comment if possible.
>
> We advance test_slot3 to the current WAL LSN before executing
> pg_upgrade, so the test_slot3 should have consumed all pending WALs.
> Please refer to the following changes:

I understand the test, and the comments are clear to me. I also
understand that only test_slot3 is expected to be in the caught-up
state. My questions were specifically about the following points:
1)  Why do we expect 'slot3 caught-up' not to be mentioned in the LOG?
Is it simply because there is no corresponding logging in the code, or
is this behavior related to some aspect of your fix that I may have
missed?
2) In general, we do not validate the absence of LOG messages in
tests. Why is this considered a special case where such a check is
appropriate?

>  # Preparations for the subsequent test:
> -# 1. Generate extra WAL records. At this point neither test_slot1 nor
> -#   test_slot2 has consumed them.
> +# 1. Generate extra WAL records. At this point none of the slots has consumed them.
>  #
>  # 2. Advance the slot test_slot2 up to the current WAL location, but test_slot1
>  #   still has unconsumed WAL records.
>  #
>  # 3. Emit a non-transactional message. This will cause test_slot2 to detect the
>  #   unconsumed WAL record.
> +#
> +# 4. Advance the slot test_slot3 up to the current WAL location.
>  $oldpub->start;
>  $oldpub->safe_psql(
>     'postgres', qq[
>         CREATE TABLE tbl AS SELECT generate_series(1, 10) AS a;
>         SELECT pg_replication_slot_advance('test_slot2', pg_current_wal_lsn());
> -       SELECT count(*) FROM pg_logical_emit_message('false',
> 'prefix', 'This is a non-transactional message');
> +       SELECT count(*) FROM pg_logical_emit_message('false',
> 'prefix', 'This is a non-transactional message', true);
> +       SELECT pg_replication_slot_advance('test_slot3', pg_current_wal_lsn());
>
> I believe that the following new comment explains the reason well:
>
> +# Check the file content. While both test_slot1 and test_slot2 should
> be reporting
> +# that they have unconsumed WAL records, test_slot3 should not be reported as
> +# it has caught up.
>
> I've attached the updated patch.
>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com



Re: pg_upgrade: optimize replication slot caught-up check

From
Masahiko Sawada
Date:
On Wed, Jan 21, 2026 at 7:49 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Thu, Jan 22, 2026 at 5:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Jan 19, 2026 at 10:38 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Wed, Jan 14, 2026 at 11:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > I've attached the updated patch.
> > > >
> > >
> > > Thank You for the patch. I like the idea of optimization. Few initial comments:
> >
> > Thank you for reviewing the patch!
> >
> > >
> > > 1)
> > > + * The query returns the slot names and their caught-up status in
> > > + * the same order as the results collected by
> > > + * get_old_cluster_logical_slot_infos(). If this query is changed,
> > >
> > > I could not find the function get_old_cluster_logical_slot_infos(), do
> > > you mean get_old_cluster_logical_slot_infos_query()?
> >
> > It seems an oversight in commit 6d3d2e8e541f0. I think it should be
> > get_db_rel_and_slot_infos().
> >
> > >
> > > 2)
> > > "  WHERE database = current_database() AND "
> > > "    slot_type = 'logical' AND "
> > >
> > > Is there a reason why database = current_database() is placed before
> > > slot_type = 'logical'? I am not sure how the PostgreSQL optimizer and
> > > executor will order these predicates, but from the first look,
> > > slot_type = 'logical' appears cheaper and could be placed first,
> > > consistent with the ordering used at other places.
> >
> > Changed.
> >
> > >
> > > 3)
> > > Shouldn’t we add a sanity check inside
> > > get_old_cluster_logical_slot_infos_query() to ensure that when
> > > skip_caught_up_check is true, we are on PostgreSQL 18 or lower? This
> > > would make the function safer for future use if it's called elsewhere.
> > > I understand the caller already performs a similar check, but I think
> > > it's more appropriate here since we call
> > > binary_upgrade_logical_slot_has_caught_up() from inside, which doesn’t
> > > even exist on newer versions.
> >
> > What kind of sanity check did you mean? We can have a check with
> > pg_fatal() but it seems almost the same to me even if pg_upgrade fails
> > with an error due to missing
> > binary_upgrade_logical_slot_has_caught_up().
>
> I was referring to a development-level sanity check, something like:
>
> /* skip_caught_up_check is required iff PG19 or newer */
> Assert((GET_MAJOR_VERSION(cluster->major_version) >= 1900) ==
>    skip_caught_up_check);
>
> But performing this check requires access to the cluster version (or
> cluster information), which this function currently does not have.
> Given that, do you think it would make sense to pass the cluster as an
> argument to this function in order to perform the sanity check here?

Hmm, I think it's better not to have the same check in multiple
places, but it might make sense to have
get_old_cluster_logical_slot_infos_query() decide whether to use the
fast method. I've updated the patch accordingly, please review it.

>
> > >
> > > 4)
> > > +# Check the file content. While both test_slot1 and test_slot2 should
> > > be reporting
> > > +# that they have unconsumed WAL records, test_slot3 should not be reported as
> > > +# it has caught up.
> > >
> > > Can you please elaborate the reason behind test_slot3 not being
> > > reported? Also mention in the comment if possible.
> >
> > We advance test_slot3 to the current WAL LSN before executing
> > pg_upgrade, so the test_slot3 should have consumed all pending WALs.
> > Please refer to the following changes:
>
> I understand the test, and the comments are clear to me. I also
> understand that only test_slot3 is expected to be in the caught-up
> state. My questions were specifically about the following points:
> 1)  Why do we expect 'slot3 caught-up' not to be mentioned in the LOG?
> Is it simply because there is no corresponding logging in the code, or
> is this behavior related to some aspect of your fix that I may have
> missed?
>
> 2) In general, we do not validate the absence of LOG messages in
> tests. Why is this considered a special case where such a check is
> appropriate?

What LOG do you refer to? In these tests, we check the
invalid_logical_slots.txt file where pg_upgrade reports only invalid
slots (in terms of pg_upgrade). For test_slot3, it should not be
mentioned in that file as it has caught up. Given that the file has
only invalid slots, checking the absence of test_slot3 in the file
makes sense to me.


Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments

Re: pg_upgrade: optimize replication slot caught-up check

From
Chao Li
Date:

> On Jan 23, 2026, at 04:33, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jan 21, 2026 at 7:49 PM shveta malik <shveta.malik@gmail.com> wrote:
>>
>> On Thu, Jan 22, 2026 at 5:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>>
>>> On Mon, Jan 19, 2026 at 10:38 PM shveta malik <shveta.malik@gmail.com> wrote:
>>>>
>>>> On Wed, Jan 14, 2026 at 11:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>>>>
>>>>> I've attached the updated patch.
>>>>>
>>>>
>>>> Thank You for the patch. I like the idea of optimization. Few initial comments:
>>>
>>> Thank you for reviewing the patch!
>>>
>>>>
>>>> 1)
>>>> + * The query returns the slot names and their caught-up status in
>>>> + * the same order as the results collected by
>>>> + * get_old_cluster_logical_slot_infos(). If this query is changed,
>>>>
>>>> I could not find the function get_old_cluster_logical_slot_infos(), do
>>>> you mean get_old_cluster_logical_slot_infos_query()?
>>>
>>> It seems an oversight in commit 6d3d2e8e541f0. I think it should be
>>> get_db_rel_and_slot_infos().
>>>
>>>>
>>>> 2)
>>>> "  WHERE database = current_database() AND "
>>>> "    slot_type = 'logical' AND "
>>>>
>>>> Is there a reason why database = current_database() is placed before
>>>> slot_type = 'logical'? I am not sure how the PostgreSQL optimizer and
>>>> executor will order these predicates, but from the first look,
>>>> slot_type = 'logical' appears cheaper and could be placed first,
>>>> consistent with the ordering used at other places.
>>>
>>> Changed.
>>>
>>>>
>>>> 3)
>>>> Shouldn’t we add a sanity check inside
>>>> get_old_cluster_logical_slot_infos_query() to ensure that when
>>>> skip_caught_up_check is true, we are on PostgreSQL 18 or lower? This
>>>> would make the function safer for future use if it's called elsewhere.
>>>> I understand the caller already performs a similar check, but I think
>>>> it's more appropriate here since we call
>>>> binary_upgrade_logical_slot_has_caught_up() from inside, which doesn’t
>>>> even exist on newer versions.
>>>
>>> What kind of sanity check did you mean? We can have a check with
>>> pg_fatal() but it seems almost the same to me even if pg_upgrade fails
>>> with an error due to missing
>>> binary_upgrade_logical_slot_has_caught_up().
>>
>> I was referring to a development-level sanity check, something like:
>>
>> /* skip_caught_up_check is required iff PG19 or newer */
>> Assert((GET_MAJOR_VERSION(cluster->major_version) >= 1900) ==
>>   skip_caught_up_check);
>>
>> But performing this check requires access to the cluster version (or
>> cluster information), which this function currently does not have.
>> Given that, do you think it would make sense to pass the cluster as an
>> argument to this function in order to perform the sanity check here?
>
> Hmm, I think it's better not to have the same check in multiple
> places, but it might make sense to have
> get_old_cluster_logical_slot_infos_query() decide whether to use the
> fast method. I've updated the patch accordingly, please review it.
>
>>
>>>>
>>>> 4)
>>>> +# Check the file content. While both test_slot1 and test_slot2 should
>>>> be reporting
>>>> +# that they have unconsumed WAL records, test_slot3 should not be reported as
>>>> +# it has caught up.
>>>>
>>>> Can you please elaborate the reason behind test_slot3 not being
>>>> reported? Also mention in the comment if possible.
>>>
>>> We advance test_slot3 to the current WAL LSN before executing
>>> pg_upgrade, so the test_slot3 should have consumed all pending WALs.
>>> Please refer to the following changes:
>>
>> I understand the test, and the comments are clear to me. I also
>> understand that only test_slot3 is expected to be in the caught-up
>> state. My questions were specifically about the following points:
>> 1)  Why do we expect 'slot3 caught-up' not to be mentioned in the LOG?
>> Is it simply because there is no corresponding logging in the code, or
>> is this behavior related to some aspect of your fix that I may have
>> missed?
>>
>> 2) In general, we do not validate the absence of LOG messages in
>> tests. Why is this considered a special case where such a check is
>> appropriate?
>
> What LOG do you refer to? In these tests, we check the
> invalid_logical_slots.txt file where pg_upgrade reports only invalid
> slots (in terms of pg_upgrade). For test_slot3, it should not be
> mentioned in that file as it has caught up. Given that the file has
> only invalid slots, checking the absence of test_slot3 in the file
> makes sense to me.
>
>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com
> <v7-0001-pg_upgrade-Optimize-replication-slot-caught-up-ch.patch>

I just went through v7, and overall it looks good to me.

Only nitpick is:
```
+ *
+ * use_fast_caught_up_check is set to true on return if available in the given
+ * cluster.
```

Strictly speaking, use_fast_caught_up_check is a pointer, so it cannot itself be set; it is *use_fast_caught_up_check that is set to true. So, please add a star sign in front of use_fast_caught_up_check.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







Re: pg_upgrade: optimize replication slot caught-up check

From
shveta malik
Date:
On Fri, Jan 23, 2026 at 2:04 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jan 21, 2026 at 7:49 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Thu, Jan 22, 2026 at 5:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Mon, Jan 19, 2026 at 10:38 PM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > On Wed, Jan 14, 2026 at 11:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > I've attached the updated patch.
> > > > >
> > > >
> > > > Thank You for the patch. I like the idea of optimization. Few initial comments:
> > >
> > > Thank you for reviewing the patch!
> > >
> > > >
> > > > 1)
> > > > + * The query returns the slot names and their caught-up status in
> > > > + * the same order as the results collected by
> > > > + * get_old_cluster_logical_slot_infos(). If this query is changed,
> > > >
> > > > I could not find the function get_old_cluster_logical_slot_infos(), do
> > > > you mean get_old_cluster_logical_slot_infos_query()?
> > >
> > > It seems an oversight in commit 6d3d2e8e541f0. I think it should be
> > > get_db_rel_and_slot_infos().
> > >
> > > >
> > > > 2)
> > > > "  WHERE database = current_database() AND "
> > > > "    slot_type = 'logical' AND "
> > > >
> > > > Is there a reason why database = current_database() is placed before
> > > > slot_type = 'logical'? I am not sure how the PostgreSQL optimizer and
> > > > executor will order these predicates, but from the first look,
> > > > slot_type = 'logical' appears cheaper and could be placed first,
> > > > consistent with the ordering used at other places.
> > >
> > > Changed.
> > >
> > > >
> > > > 3)
> > > > Shouldn’t we add a sanity check inside
> > > > get_old_cluster_logical_slot_infos_query() to ensure that when
> > > > skip_caught_up_check is true, we are on PostgreSQL 18 or lower? This
> > > > would make the function safer for future use if it's called elsewhere.
> > > > I understand the caller already performs a similar check, but I think
> > > > it's more appropriate here since we call
> > > > binary_upgrade_logical_slot_has_caught_up() from inside, which doesn’t
> > > > even exist on newer versions.
> > >
> > > What kind of sanity check did you mean? We can have a check with
> > > pg_fatal() but it seems almost the same to me even if pg_upgrade fails
> > > with an error due to missing
> > > binary_upgrade_logical_slot_has_caught_up().
> >
> > I was referring to a development-level sanity check, something like:
> >
> > /* skip_caught_up_check is required iff PG19 or newer */
> > Assert((GET_MAJOR_VERSION(cluster->major_version) >= 1900) ==
> >    skip_caught_up_check);
> >
> > But performing this check requires access to the cluster version (or
> > cluster information), which this function currently does not have.
> > Given that, do you think it would make sense to pass the cluster as an
> > argument to this function in order to perform the sanity check here?
>
> Hmm, I think it's better not to have the same check in multiple
> places, but it might make sense to have
> get_old_cluster_logical_slot_infos_query() decide whether to use the
> fast method. I've updated the patch accordingly, please review it.
>

Okay, looks good. Just one minor thing:

+ * Note that binary_upgrade_logical_slot_has_caught_up() is available only
+ * PG18 or older. For PG19 or newer *use_fast_caught_up_check should be
+ * set true, and use binary_upgrade_check_logical_slot_pending_wal()
+ * instead in the separate query (see slot_caught_up_info_query).

Shall we tweak it slightly:

 * Note that binary_upgrade_logical_slot_has_caught_up() is available
 * only in PG18 and earlier. For PG19 and later, set *use_fast_caught_up_check
 * to true and use binary_upgrade_check_logical_slot_pending_wal() instead,
 * in a separate query (see slot_caught_up_info_query in
get_db_rel_and_slot_infos()).


> >
> > > >
> > > > 4)
> > > > +# Check the file content. While both test_slot1 and test_slot2 should
> > > > be reporting
> > > > +# that they have unconsumed WAL records, test_slot3 should not be reported as
> > > > +# it has caught up.
> > > >
> > > > Can you please elaborate the reason behind test_slot3 not being
> > > > reported? Also mention in the comment if possible.
> > >
> > > We advance test_slot3 to the current WAL LSN before executing
> > > pg_upgrade, so the test_slot3 should have consumed all pending WALs.
> > > Please refer to the following changes:
> >
> > I understand the test, and the comments are clear to me. I also
> > understand that only test_slot3 is expected to be in the caught-up
> > state. My questions were specifically about the following points:
> > 1)  Why do we expect 'slot3 caught-up' not to be mentioned in the LOG?
> > Is it simply because there is no corresponding logging in the code, or
> > is this behavior related to some aspect of your fix that I may have
> > missed?
> >
> > 2) In general, we do not validate the absence of LOG messages in
> > tests. Why is this considered a special case where such a check is
> > appropriate?
>
> What LOG do you refer to? In these tests, we check the
> invalid_logical_slots.txt file where pg_upgrade reports only invalid
> slots (in terms of pg_upgrade). For test_slot3, it should not be
> mentioned in that file as it has caught up. Given that the file has
> only invalid slots, checking the absence of test_slot3 in the file
> makes sense to me.
>

Okay, I get the intent now. Thanks!

Other than the comment suggestion above, the patch LGTM.

thanks
Shveta



Re: pg_upgrade: optimize replication slot caught-up check

From
Masahiko Sawada
Date:
On Thu, Jan 22, 2026 at 11:03 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Fri, Jan 23, 2026 at 2:04 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Jan 21, 2026 at 7:49 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Thu, Jan 22, 2026 at 5:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Mon, Jan 19, 2026 at 10:38 PM shveta malik <shveta.malik@gmail.com> wrote:
> > > > >
> > > > > On Wed, Jan 14, 2026 at 11:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > >
> > > > > > I've attached the updated patch.
> > > > > >
> > > > >
> > > > > Thank You for the patch. I like the idea of optimization. Few initial comments:
> > > >
> > > > Thank you for reviewing the patch!
> > > >
> > > > >
> > > > > 1)
> > > > > + * The query returns the slot names and their caught-up status in
> > > > > + * the same order as the results collected by
> > > > > + * get_old_cluster_logical_slot_infos(). If this query is changed,
> > > > >
> > > > > I could not find the function get_old_cluster_logical_slot_infos(), do
> > > > > you mean get_old_cluster_logical_slot_infos_query()?
> > > >
> > > > It seems an oversight in commit 6d3d2e8e541f0. I think it should be
> > > > get_db_rel_and_slot_infos().
> > > >
> > > > >
> > > > > 2)
> > > > > "  WHERE database = current_database() AND "
> > > > > "    slot_type = 'logical' AND "
> > > > >
> > > > > Is there a reason why database = current_database() is placed before
> > > > > slot_type = 'logical'? I am not sure how the PostgreSQL optimizer and
> > > > > executor will order these predicates, but from the first look,
> > > > > slot_type = 'logical' appears cheaper and could be placed first,
> > > > > consistent with the ordering used at other places.
> > > >
> > > > Changed.
> > > >
> > > > >
> > > > > 3)
> > > > > Shouldn’t we add a sanity check inside
> > > > > get_old_cluster_logical_slot_infos_query() to ensure that when
> > > > > skip_caught_up_check is true, we are on PostgreSQL 18 or lower? This
> > > > > would make the function safer for future use if it's called elsewhere.
> > > > > I understand the caller already performs a similar check, but I think
> > > > > it's more appropriate here since we call
> > > > > binary_upgrade_logical_slot_has_caught_up() from inside, which doesn’t
> > > > > even exist on newer versions.
> > > >
> > > > What kind of sanity check did you mean? We can have a check with
> > > > pg_fatal() but it seems almost the same to me even if pg_upgrade fails
> > > > with an error due to missing
> > > > binary_upgrade_logical_slot_has_caught_up().
> > >
> > > I was referring to a development-level sanity check, something like:
> > >
> > > /* skip_caught_up_check is required iff PG19 or newer */
> > > Assert((GET_MAJOR_VERSION(cluster->major_version) >= 1900) ==
> > >    skip_caught_up_check);
> > >
> > > But performing this check requires access to the cluster version (or
> > > cluster information), which this function currently does not have.
> > > Given that, do you think it would make sense to pass the cluster as an
> > > argument to this function in order to perform the sanity check here?
> >
> > Hmm, I think it's better not to have the same check in multiple
> > places, but it might make sense to have
> > get_old_cluster_logical_slot_infos_query() decide whether to use the
> > fast method. I've updated the patch accordingly, please review it.
> >
>
> Okay, looks good. Just one minor thing:
>
> + * Note that binary_upgrade_logical_slot_has_caught_up() is available only
> + * PG18 or older. For PG19 or newer *use_fast_caught_up_check should be
> + * set true, and use binary_upgrade_check_logical_slot_pending_wal()
> + * instead in the separate query (see slot_caught_up_info_query).
>
> Shall we tweak it slightly:
>
>  * Note that binary_upgrade_logical_slot_has_caught_up() is available
>  * only in PG18 and earlier. For PG19 and later, set *use_fast_caught_up_check
>  * to true and use binary_upgrade_check_logical_slot_pending_wal() instead,
>  * in a separate query (see slot_caught_up_info_query in
> get_db_rel_and_slot_infos()).
>
>
> > >
> > > > >
> > > > > 4)
> > > > > +# Check the file content. While both test_slot1 and test_slot2 should
> > > > > be reporting
> > > > > +# that they have unconsumed WAL records, test_slot3 should not be reported as
> > > > > +# it has caught up.
> > > > >
> > > > > Can you please elaborate the reason behind test_slot3 not being
> > > > > reported? Also mention in the comment if possible.
> > > >
> > > > We advance test_slot3 to the current WAL LSN before executing
> > > > pg_upgrade, so the test_slot3 should have consumed all pending WALs.
> > > > Please refer to the following changes:
> > >
> > > I understand the test, and the comments are clear to me. I also
> > > understand that only test_slot3 is expected to be in the caught-up
> > > state. My questions were specifically about the following points:
> > > 1)  Why do we expect 'slot3 caught-up' not to be mentioned in the LOG?
> > > Is it simply because there is no corresponding logging in the code, or
> > > is this behavior related to some aspect of your fix that I may have
> > > missed?
> > >
> > > 2) In general, we do not validate the absence of LOG messages in
> > > tests. Why is this considered a special case where such a check is
> > > appropriate?
> >
> > What LOG do you refer to? In these tests, we check the
> > invalid_logical_slots.txt file where pg_upgrade reports only invalid
> > slots (in terms of pg_upgrade). For test_slot3, it should not be
> > mentioned in that file as it has caught up. Given that the file has
> > only invalid slots, checking the absence of test_slot3 in the file
> > makes sense to me.
> >
>
> Okay, I get the intent now. Thanks!
>
> Other than the comment suggestion above, the patch LGTM.
>

Thank you for reviewing the patch!

I'm going to push the patch after incorporating these review comments,
barring any objections.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: pg_upgrade: optimize replication slot caught-up check

From
Amit Kapila
Date:
On Fri, Jan 23, 2026 at 2:04 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>

I haven't reviewed v7 in detail but while glancing, I noticed a few
minor comments:

1.
+ * Returns the last decodable WAL record's LSN if found, otherwise
+ * returns InvalidXLogRecPtr.
  */
-bool
-LogicalReplicationSlotHasPendingWal(XLogRecPtr end_of_wal)
+XLogRecPtr
+LogicalReplicationSlotHasPendingWal(XLogRecPtr end_of_wal,
+ XLogRecPtr scan_cutoff_lsn)

The function name suggests that it will return boolean (due to 'Has'
in its name) but after this change that is not true.

2.
We
+ * also use the maximum confirmed_flush_lsn as an early scan
+ * cutoff; finding a decodable WAL record beyond this point
+ * implies that no slot has caught up.
+ *

In this comment, it is not clear if the maximum confirmed_flush_lsn is
among all logical slots (of current database) or what?

--
With Regards,
Amit Kapila.



Re: pg_upgrade: optimize replication slot caught-up check

From
Masahiko Sawada
Date:
On Tue, Jan 27, 2026 at 3:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Jan 23, 2026 at 2:04 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
>
> I haven't reviewed v7 in detail but while glancing, I noticed a few
> minor comments:
>
> 1.
> + * Returns the last decodable WAL record's LSN if found, otherwise
> + * returns InvalidXLogRecPtr.
>   */
> -bool
> -LogicalReplicationSlotHasPendingWal(XLogRecPtr end_of_wal)
> +XLogRecPtr
> +LogicalReplicationSlotHasPendingWal(XLogRecPtr end_of_wal,
> + XLogRecPtr scan_cutoff_lsn)
>
> The function name suggests that it will return boolean (due to 'Has'
> in its name) but after this change that is not true.
>
> 2.
> We
> + * also use the maximum confirmed_flush_lsn as an early scan
> + * cutoff; finding a decodable WAL record beyond this point
> + * implies that no slot has caught up.
> + *
>
> In this comment, it is not clear if the maximum confirmed_flush_lsn is
> among all logical slots (of current database) or what?
>

Thank you for the comments!

I've incorporated the comments I got so far.

Amit, are you planning to review the patch in detail? If so, I want to
incorporate comments before the push.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments

Re: pg_upgrade: optimize replication slot caught-up check

From
Masahiko Sawada
Date:
On Tue, Jan 27, 2026 at 12:04 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Jan 27, 2026 at 3:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Jan 23, 2026 at 2:04 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> >
> > I haven't reviewed v7 in detail but while glancing, I noticed a few
> > minor comments:
> >
> > 1.
> > + * Returns the last decodable WAL record's LSN if found, otherwise
> > + * returns InvalidXLogRecPtr.
> >   */
> > -bool
> > -LogicalReplicationSlotHasPendingWal(XLogRecPtr end_of_wal)
> > +XLogRecPtr
> > +LogicalReplicationSlotHasPendingWal(XLogRecPtr end_of_wal,
> > + XLogRecPtr scan_cutoff_lsn)
> >
> > The function name suggests that it will return boolean (due to 'Has'
> > in its name) but after this change that is not true.
> >
> > 2.
> > We
> > + * also use the maximum confirmed_flush_lsn as an early scan
> > + * cutoff; finding a decodable WAL record beyond this point
> > + * implies that no slot has caught up.
> > + *
> >
> > In this comment, it is not clear if the maximum confirmed_flush_lsn is
> > among all logical slots (of current database) or what?
> >
>
> Thank you for the comments!
>
> I've incorporated these comments I got so far.
>
> Amit, are you planning to review the patch in detail? If so, I want to
> incorporate comments before the push.
>

I missed fixing one place. Attached the new version.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments

Re: pg_upgrade: optimize replication slot caught-up check

From
Amit Kapila
Date:
On Wed, Jan 28, 2026 at 2:06 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I missed fixing one place. Attached the new version.
>

One question/comment on following change:
+ bool use_fast_caught_up_check;
+
+ logical_slot_infos_query = get_old_cluster_logical_slot_infos_query(cluster,
+ &use_fast_caught_up_check);
+
  upgrade_task_add_step(task,
    logical_slot_infos_query,
    process_old_cluster_logical_slot_infos,
    true, NULL);
+
+ /*
+ * Check whether slots have consumed all WAL records efficiently by
+ * using another query, if not during a live_check.
+ */
+ if (use_fast_caught_up_check && !user_opts.live_check)
+ {

Won't this lead to two steps to set caught_up for slots in PG19 and
following versions? If so, is it possible to use just one step even
for PG19 and following versions?

--
With Regards,
Amit Kapila.



Re: pg_upgrade: optimize replication slot caught-up check

From
Masahiko Sawada
Date:
On Wed, Jan 28, 2026 at 10:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jan 28, 2026 at 2:06 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I missed fixing one place. Attached the new version.
> >
>
> One question/comment on following change:
> + bool use_fast_caught_up_check;
> +
> + logical_slot_infos_query = get_old_cluster_logical_slot_infos_query(cluster,
> + &use_fast_caught_up_check);
> +
>   upgrade_task_add_step(task,
>     logical_slot_infos_query,
>     process_old_cluster_logical_slot_infos,
>     true, NULL);
> +
> + /*
> + * Check whether slots have consumed all WAL records efficiently by
> + * using another query, if not during a live_check.
> + */
> + if (use_fast_caught_up_check && !user_opts.live_check)
> + {
>
> Won't this lead to two steps to set caught_up for slots in PG19 and
> following versions? If so, is it possible to use just one step even
> for PG19 and following versions?

Yes, it seems like a good simplification. I've updated the patch accordingly.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments

Re: pg_upgrade: optimize replication slot caught-up check

From
shveta malik
Date:
On Fri, Jan 30, 2026 at 2:15 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jan 28, 2026 at 10:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Jan 28, 2026 at 2:06 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > I missed fixing one place. Attached the new version.
> > >
> >
> > One question/comment on following change:
> > + bool use_fast_caught_up_check;
> > +
> > + logical_slot_infos_query = get_old_cluster_logical_slot_infos_query(cluster,
> > + &use_fast_caught_up_check);
> > +
> >   upgrade_task_add_step(task,
> >     logical_slot_infos_query,
> >     process_old_cluster_logical_slot_infos,
> >     true, NULL);
> > +
> > + /*
> > + * Check whether slots have consumed all WAL records efficiently by
> > + * using another query, if not during a live_check.
> > + */
> > + if (use_fast_caught_up_check && !user_opts.live_check)
> > + {
> >
> > Won't this lead to two steps to set caught_up for slots in PG19 and
> > following versions? If so, is it possible to use just one step even
> > for PG19 and following versions?
>
> Yes, it seems like a good simplification. I've updated the patch accordingly.
>

At first glance it looks like a simplification, but on closer look, it
actually makes the code harder to follow and more prone to errors if
someone modifies it in the future.

The check_caught_up CTE is appended only when both conditions are met:

+ if (use_fast_caught_up_check && !user_opts.live_check)
+ appendPQExpBuffer(&query,
+   "WITH check_caught_up AS ( "

But the reference to check_caught_up in the FROM clause is appended
when only one condition is met:

+ appendPQExpBuffer(&query,
+   "invalidation_reason IS NOT NULL as invalid "
+   "FROM pg_catalog.pg_replication_slots %s"
+   "WHERE slot_type = 'logical' AND "
+   "database = current_database() AND "
+   "temporary IS FALSE "
+   "ORDER BY 1;",
+   use_fast_caught_up_check ? ", check_caught_up " : "");

IIUC, if use_fast_caught_up_check is true while user_opts.live_check
is also true, the query will reference check_caught_up without
defining it, resulting in an incorrectly constructed query. Or am I
missing something here?
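To make the hazard concrete, here is a reduced sketch (function name and query text are hypothetical, not the actual patch code) of the arrangement that avoids it: compute the combined condition once and use the same flag for both the CTE definition and its reference, so the query can never mention check_caught_up without defining it:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical sketch: one flag guards both the CTE definition and
 * its reference, so the two can never get out of sync.
 */
static void
build_slot_infos_query(char *buf, size_t buflen,
					   bool use_fast_caught_up_check, bool live_check)
{
	bool		fast = use_fast_caught_up_check && !live_check;
	size_t		off = 0;

	/* Define the CTE only when the fast check will actually be used */
	if (fast)
		off += snprintf(buf + off, buflen - off,
						"WITH check_caught_up AS (SELECT ...) ");

	/* ... and reference it under exactly the same condition */
	snprintf(buf + off, buflen - off,
			 "SELECT slot_name FROM pg_catalog.pg_replication_slots%s "
			 "WHERE slot_type = 'logical'",
			 fast ? ", check_caught_up" : "");
}
```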

thanks
Shveta



Re: pg_upgrade: optimize replication slot caught-up check

From
Amit Kapila
Date:
On Fri, Jan 30, 2026 at 9:45 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Fri, Jan 30, 2026 at 2:15 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Yes, it seems like a good simplification. I've updated the patch accordingly.
> >
>
> At first glance it looks like a simplification, but on closer look, it
> actually makes the code harder to follow and more prone to errors if
> someone modifies it in the future.
>

I think that is primarily because of the way code is arranged by the
patch. I think it would be better to construct a complete query
separately for fast and non-fast checks. There will be some repeated
parts but the chances of mistakes will be less and it would be easier
to follow.
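For example, roughly like this (a sketch only; query text abbreviated and hypothetical): each variant is returned as one complete statement, so neither branch can pick up a fragment appended under a different condition:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical sketch: build each query variant whole instead of
 * conditionally appending fragments. Column lists are abbreviated.
 */
static const char *
get_slot_infos_query(bool fast_check)
{
	if (fast_check)
		return "WITH check_caught_up AS (SELECT ...) "
			"SELECT slot_name, caught_up "
			"FROM pg_catalog.pg_replication_slots, check_caught_up "
			"WHERE slot_type = 'logical' AND temporary IS FALSE";

	/* non-fast path: per-slot upgrade function */
	return "SELECT slot_name, "
		"binary_upgrade_logical_slot_has_caught_up(slot_name) AS caught_up "
		"FROM pg_catalog.pg_replication_slots "
		"WHERE slot_type = 'logical' AND temporary IS FALSE";
}
```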

One minor point:
* Fetch the logical replication slot information. The check whether the
- * slot is considered caught up is done by an upgrade function. This
- * regards the slot as caught up if we don't find any decodable changes.
- * See binary_upgrade_logical_slot_has_caught_up().
+ * slot is considered caught up is done by an upgrade function, unless the
+ * fast check is available on the cluster.

Isn't the caught up check done by an upgrade function both for fast
and non-fast cases? If so, this comment needs to be improved to make
it clear.

--
With Regards,
Amit Kapila.



Re: pg_upgrade: optimize replication slot caught-up check

From
shveta malik
Date:
On Fri, Jan 30, 2026 at 10:01 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Jan 30, 2026 at 9:45 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Fri, Jan 30, 2026 at 2:15 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > Yes, it seems like a good simplification. I've updated the patch accordingly.
> > >
> >
> > At first glance it looks like a simplification, but on closer look, it
> > actually makes the code harder to follow and more prone to errors if
> > someone modifies it in the future.
> >
>
> I think that is primarily because of the way code is arranged by the
> patch. I think it would be better to construct a complete query
> separately for fast and non-fast checks. There will be some repeated
> parts but the chances of mistakes will be less and it would be easier
> to follow.
>

+1

> One minor point:
> * Fetch the logical replication slot information. The check whether the
> - * slot is considered caught up is done by an upgrade function. This
> - * regards the slot as caught up if we don't find any decodable changes.
> - * See binary_upgrade_logical_slot_has_caught_up().
> + * slot is considered caught up is done by an upgrade function, unless the
> + * fast check is available on the cluster.
>
> Isn't the caught up check done by an upgrade function both for fast
> and non-fast cases? If so, this comment needs to be improved to make
> it clear.
>
> --
> With Regards,
> Amit Kapila.



Re: pg_upgrade: optimize replication slot caught-up check

From
Masahiko Sawada
Date:
On Thu, Jan 29, 2026 at 8:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Jan 30, 2026 at 9:45 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Fri, Jan 30, 2026 at 2:15 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > Yes, it seems like a good simplification. I've updated the patch accordingly.
> > >
> >
> > At first glance it looks like a simplification, but on closer look, it
> > actually makes the code harder to follow and more prone to errors if
> > someone modifies it in the future.
> >
>
> I think that is primarily because of the way code is arranged by the
> patch. I think it would be better to construct a complete query
> separately for fast and non-fast checks. There will be some repeated
> parts but the chances of mistakes will be less and it would be easier
> to follow.

Agreed. I've updated that function accordingly.

>
> One minor point:
> * Fetch the logical replication slot information. The check whether the
> - * slot is considered caught up is done by an upgrade function. This
> - * regards the slot as caught up if we don't find any decodable changes.
> - * See binary_upgrade_logical_slot_has_caught_up().
> + * slot is considered caught up is done by an upgrade function, unless the
> + * fast check is available on the cluster.
>
> Isn't the caught up check done by an upgrade function both for fast
> and non-fast cases? If so, this comment needs to be improved to make
> it clear.

Fixed.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments

Re: pg_upgrade: optimize replication slot caught-up check

From
shveta malik
Date:
On Fri, Jan 30, 2026 at 1:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Jan 29, 2026 at 8:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Jan 30, 2026 at 9:45 AM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > At first glance it looks like a simplification, but on closer look, it
> > > actually makes the code harder to follow and more prone to errors if
> > > someone modifies it in the future.
> > >
> >
> > I think that is primarily because of the way code is arranged by the
> > patch. I think it would be better to construct a complete query
> > separately for fast and non-fast checks. There will be some repeated
> > parts but the chances of mistakes will be less and it would be easier
> > to follow.
>
> Agreed. I've updated that function accordingly.

LGTM.

> >
> > One minor point:
> > * Fetch the logical replication slot information. The check whether the
> > - * slot is considered caught up is done by an upgrade function. This
> > - * regards the slot as caught up if we don't find any decodable changes.
> > - * See binary_upgrade_logical_slot_has_caught_up().
> > + * slot is considered caught up is done by an upgrade function, unless the
> > + * fast check is available on the cluster.
> >
> > Isn't the caught up check done by an upgrade function both for fast
> > and non-fast cases? If so, this comment needs to be improved to make
> > it clear.
>
> Fixed.
>

thanks
Shveta



Re: pg_upgrade: optimize replication slot caught-up check

From
Masahiko Sawada
Date:
On Fri, Jan 30, 2026 at 6:09 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Fri, Jan 30, 2026 at 1:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Jan 29, 2026 at 8:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Fri, Jan 30, 2026 at 9:45 AM shveta malik <shveta.malik@gmail.com> wrote:
> > > > At first glance it looks like a simplification, but on closer look, it
> > > > actually makes the code harder to follow and more prone to errors if
> > > > someone modifies it in the future.
> > > >
> > >
> > > I think that is primarily because of the way code is arranged by the
> > > patch. I think it would be better to construct a complete query
> > > separately for fast and non-fast checks. There will be some repeated
> > > parts but the chances of mistakes will be less and it would be easier
> > > to follow.
> >
> > Agreed. I've updated that function accordingly.
>
> LGTM.

Thank you for reviewing the patch!

I'm going to push the patch unless there are further review comments.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: pg_upgrade: optimize replication slot caught-up check

From
Masahiko Sawada
Date:
On Wed, Feb 4, 2026 at 9:41 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Jan 30, 2026 at 6:09 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Fri, Jan 30, 2026 at 1:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, Jan 29, 2026 at 8:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Fri, Jan 30, 2026 at 9:45 AM shveta malik <shveta.malik@gmail.com> wrote:
> > > > > At first glance it looks like a simplification, but on closer look, it
> > > > > actually makes the code harder to follow and more prone to errors if
> > > > > someone modifies it in the future.
> > > > >
> > > >
> > > > I think that is primarily because of the way code is arranged by the
> > > > patch. I think it would be better to construct a complete query
> > > > separately for fast and non-fast checks. There will be some repeated
> > > > parts but the chances of mistakes will be less and it would be easier
> > > > to follow.
> > >
> > > Agreed. I've updated that function accordingly.
> >
> > LGTM.
>
> Thank you for reviewing the patch!
>
> I'm going to push the patch unless there are further review comments.
>

Pushed.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com