Обсуждение: BUG #18089: Orphaned Rows During PostgreSQL Data Migration

Поиск
Список
Период
Сортировка

BUG #18089: Orphaned Rows During PostgreSQL Data Migration

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      18089
Logged by:          Root Cause
Email address:      rootcause000@gmail.com
PostgreSQL version: Unsupported/Unknown
Operating system:   Microsoft Windows Server 2019 Standard
Description:

Version - PostgreSQL 10.21, compiled by Visual C++ build 1800, 64-bit

Platform - Windows

When migrating data from PostgreSQL 10.21 to 14.7, during the dump process,
we encountered an issue with orphaned rows. Upon checking the source
database, we confirmed that the table has all the necessary constraints in
place. However, it appears that orphaned rows are still present.

GenDataDB=> \d+ nf_intz;
                                       Table "public.nf_intz"
      Column       |  Type  | Collation | Nullable |      Default      |
Storage  | Stats target | Description
-------------------+--------+-----------+----------+-------------------+----------+--------------+-------------
     intz_id       | bigint |           | not null |                   |
plain    |              |
Indexes:
    "nf_intz_pk" PRIMARY KEY, btree (intz_id)
    "nf_intz_fk1_idx" btree (intz_id)
Foreign-key constraints:
    "nf_intz_fk1" FOREIGN KEY (intz_id) REFERENCES id_genz(uq_id) ON DELETE
CASCADE


GenDataDB=> \d+ id_genz;
                                  Table "public.id_genz"
 Column |  Type  | Collation | Nullable | Default | Storage | Stats target |
Description
--------+--------+-----------+----------+---------+---------+--------------+-------------
  uq_id | bigint |           | not null |         | plain   |
|
Indexes:
    "id_genz_pk" PRIMARY KEY, btree (uq_id)
Referenced by:
    TABLE "nf_intz" CONSTRAINT "nf_intz_fk1" FOREIGN KEY (intz_id)
REFERENCES id_genz(uq_id) ON DELETE CASCADE


GenDataDB=> select intz_id from nf_intz where intz_id not in (select uq_id
from id_genz);
  intz_id
-----------
  5000001
  5000002
  5000003
  5000004
(4 rows)


GenDataDB=> reindex table nf_intz;
REINDEX
GenDataDB=> reindex table id_genz;
REINDEX
GenDataDB=> select intz_id from nf_intz where intz_id not in (select uq_id
from id_genz);
  intz_id
-----------
  5000001
  5000002
  5000003
  5000004
(4 rows)


GenDataDB=>

For now, we have proceeded with the data migration by adding the missing
entries to the parent table. Nevertheless, please advise if there is still a
possibility of encountering orphaned rows even when constraints are in
place. Is there a way to prevent such occurrences in the future?


Re: BUG #18089: Orphaned Rows During PostgreSQL Data Migration

От
David Rowley
Дата:
On Wed, 6 Sept 2023 at 02:02, PG Bug reporting form
<noreply@postgresql.org> wrote:
> When migrating data from PostgreSQL 10.21 to 14.7, during the dump process,
> we encountered an issue with orphaned rows. Upon checking the source
> database, we confirmed that the table has all the necessary constraints in
> place. However, it appears that orphaned rows are still present.


> GenDataDB=> reindex table nf_intz;
> REINDEX
> GenDataDB=> reindex table id_genz;
> REINDEX
> GenDataDB=> select intz_id from nf_intz where intz_id not in (select uq_id
> from id_genz);
>   intz_id
> -----------
>   5000001
>   5000002
>   5000003
>   5000004
> (4 rows)

If this is the same system as bug report #18084 [1], then it's
probable that it's caused by index corruption.  If the foreign key is
set up to cascade deletes then if the referencing table's index
that'll be selected by the planner to perform the cascade delete is
corrupted, having missing records, then the referencing records won't
be deleted as the index scan may not find the records to delete
despite them existing in the table's heap.

> For now, we have proceeded with the data migration by adding the missing
> entries to the parent table. Nevertheless, please advise if there is still a
> possibility of encountering orphaned rows even when constraints are in
> place. Is there a way to prevent such occurrences in the future?

The same advice I gave in [2] applies here too.  I'd say your key
learnings here should be to upgrade when minor versions are released
and follow the advice given in the release notes.  The fact that
you're running 10.21 gives some indication to us that this wasn't a
priority. Version 10 went EOL 10 months ago and 10.21 is 16 months old
now.

I suggest you go off and read over the release notes starting at 10.1,
or whichever the first version of 10.x you used and try to determine
which bugs that have been fixed may apply to you.

David

[1] https://www.postgresql.org/message-id/18084-f96f8e0e21fc3bea%40postgresql.org
[2] https://www.postgresql.org/message-id/CAApHDvq2-Qiy5RFGznYimAG1kKxcqb6uqm8-3qkznyxUxpa8Fg@mail.gmail.com