Discussion: Assistance Needed: Issue with pg_upgrade and --link option

Assistance Needed: Issue with pg_upgrade and --link option

From
Pradeep Kumar
Date:
Dear Postgres Hackers,

I hope this email finds you well. I am currently facing an issue while performing an upgrade using the pg_upgrade utility with the --link option. I was under the impression that the --link option would create hard links between the old and new cluster's data files, but it appears that the entire old cluster data was copied to the new cluster, resulting in a significant increase in the new cluster's size.

Here are the details of my scenario:
- PostgreSQL version: [Old Version: Postgres 11.4  | New Version: Postgres 14.0]
- Command used for pg_upgrade: [~/pg_upgrade_testing/postgres_14/bin/pg_upgrade -b ~/pg_upgrade_testing/postgres_11.4/bin -B ~/pg_upgrade_testing/postgres_14/bin -d ~/pg_upgrade_testing/postgres_11.4/replica_db2 -D ~/pg_upgrade_testing/postgres_14/new_pg -r -k]
- Paths to the old and new data directories: [~/pg_upgrade_testing/postgres_11.4/replica_db2] [~/pg_upgrade_testing/postgres_14/new_pg]
- OS information: [Ubuntu 22.04.2 linux]

However, after executing the pg_upgrade command with the --link option, I observed that the size of the new cluster is much larger than expected. I expected the --link option to create hard links instead of duplicating the data files.

I am seeking assistance to understand the following:
1. Is my understanding of the --link option correct?
2. Is there any additional configuration or step required to properly utilize the --link option?
3. Are there any limitations or considerations specific to my PostgreSQL version or file system that I should be aware of?

Any guidance, clarification, or troubleshooting steps you can provide would be greatly appreciated. I want to ensure that I am utilizing the --link option correctly and optimize the upgrade process.

Best regards,
Pradeep Kumar
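
One way to check whether `--link` took effect is to look at the link count of the files in the new data directory: a hard-linked file has a link count of at least 2 and shares its inode with the corresponding file in the old cluster. The following is a minimal Python sketch, not part of pg_upgrade; the function names `hard_link_summary` and `same_inode` are made up for illustration.

```python
import os
import stat


def hard_link_summary(data_dir):
    """Return (linked, unlinked) counts of regular files under data_dir.

    A file counts as "linked" when its inode has more than one name
    (st_nlink > 1), which is what a hard link from another directory produces.
    """
    linked = unlinked = 0
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            st = os.lstat(os.path.join(root, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets and other special files
            if st.st_nlink > 1:
                linked += 1
            else:
                unlinked += 1
    return linked, unlinked


def same_inode(path_a, path_b):
    """True if both paths refer to the same file on disk (same device and inode)."""
    return os.path.samefile(path_a, path_b)
```

If link mode worked, running `hard_link_summary` on the new data directory would be expected to report the user relation files as linked, while files created fresh for the new cluster (such as its catalogs) would show up as unlinked.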

Re: Assistance Needed: Issue with pg_upgrade and --link option

From
Laurenz Albe
Date:
On Wed, 2023-06-28 at 11:49 +0530, Pradeep Kumar wrote:
> I was under the impression that the --link option would create hard links between the
> old and new cluster's data files, but it appears that the entire old cluster data was
> copied to the new cluster, resulting in a significant increase in the new cluster's size.

Please provide some numbers, ideally

  du -sk <old_data_directory> <new_data_directory>

Yours,
Laurenz Albe



Re: Assistance Needed: Issue with pg_upgrade and --link option

From
Peter Eisentraut
Date:
On 28.06.23 08:24, Laurenz Albe wrote:
> On Wed, 2023-06-28 at 11:49 +0530, Pradeep Kumar wrote:
>> I was under the impression that the --link option would create hard links between the
>> old and new cluster's data files, but it appears that the entire old cluster data was
>> copied to the new cluster, resulting in a significant increase in the new cluster's size.
> 
> Please provide some numbers, ideally
> 
>    du -sk <old_data_directory> <new_data_directory>

I don't think you can observe the effects of the --link option this way.
It would just give you the full size count for both directories, even
though they point to the same underlying inodes.

To see the effect, you could perhaps use `df` to see how much overall 
disk space the upgrade step eats up.
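
The `df` approach amounts to recording the file system's used space before and after the upgrade and comparing the two values. As a rough, Unix-only illustration using Python's `os.statvfs` (the helper name `used_kib` is hypothetical, and on a busy file system unrelated activity will also move the numbers):

```python
import os


def used_kib(path):
    """Used space in KiB on the file system containing path.

    Roughly corresponds to the "Used" column of df for that file system.
    """
    vfs = os.statvfs(path)
    used_blocks = vfs.f_blocks - vfs.f_bfree
    return used_blocks * vfs.f_frsize // 1024
```

Calling `used_kib` on the new data directory's parent before and after running pg_upgrade and subtracting the two results gives an estimate of the space the upgrade consumed. With hard links in place, that difference should be far smaller than the size of the old cluster.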




Re: Assistance Needed: Issue with pg_upgrade and --link option

From
Pradeep Kumar
Date:
Sure,
du -sk ~/pradeep_test/pg_upgrade_testing/postgres_11.4/master ~/pradeep_test/pg_upgrade_testing/postgres_14/new_pg
11224524 /home/test/pradeep_test/pg_upgrade_testing/postgres_11.4/master
41952 /home/test/pradeep_test/pg_upgrade_testing/postgres_14/new_pg

On Wed, Jun 28, 2023 at 11:54 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Wed, 2023-06-28 at 11:49 +0530, Pradeep Kumar wrote:
> I was under the impression that the --link option would create hard links between the
> old and new cluster's data files, but it appears that the entire old cluster data was
> copied to the new cluster, resulting in a significant increase in the new cluster's size.

Please provide some numbers, ideally

  du -sk <old_data_directory> <new_data_directory>

Yours,
Laurenz Albe

Re: Assistance Needed: Issue with pg_upgrade and --link option

From
Pradeep Kumar
Date:
These are my numbers.
 df  ~/pradeep_test/pg_upgrade_testing/postgres_11.4/master ~/pradeep_test/pg_upgrade_testing/postgres_14/new_pg
Filesystem                  1K-blocks      Used Available Use% Mounted on
/dev/mapper/nvme0n1p4_crypt 375161856 102253040 270335920  28% /home
/dev/mapper/nvme0n1p4_crypt 375161856 102253040 270335920  28% /home

On Wed, Jun 28, 2023 at 3:14 PM Peter Eisentraut <peter@eisentraut.org> wrote:
On 28.06.23 08:24, Laurenz Albe wrote:
> On Wed, 2023-06-28 at 11:49 +0530, Pradeep Kumar wrote:
>> I was under the impression that the --link option would create hard links between the
>> old and new cluster's data files, but it appears that the entire old cluster data was
>> copied to the new cluster, resulting in a significant increase in the new cluster's size.
>
> Please provide some numbers, ideally
>
>    du -sk <old_data_directory> <new_data_directory>

I don't think you can observe the effects of the --link option this way.
It would just give you the full size count for both directories, even
though they point to the same underlying inodes.

To see the effect, you could perhaps use `df` to see how much overall
disk space the upgrade step eats up.

Re: Assistance Needed: Issue with pg_upgrade and --link option

From
Laurenz Albe
Date:
On Wed, 2023-06-28 at 15:40 +0530, Pradeep Kumar wrote:
> > > I was under the impression that the --link option would create hard links between the
> > > old and new cluster's data files, but it appears that the entire old cluster data was
> > > copied to the new cluster, resulting in a significant increase in the new cluster's size.
> >
> > Please provide some numbers, ideally
> >
> >   du -sk <old_data_directory> <new_data_directory>
>
> du -sk ~/pradeep_test/pg_upgrade_testing/postgres_11.4/master ~/pradeep_test/pg_upgrade_testing/postgres_14/new_pg
> 11224524 /home/test/pradeep_test/pg_upgrade_testing/postgres_11.4/master
> 41952 /home/test/pradeep_test/pg_upgrade_testing/postgres_14/new_pg

That looks fine.  The files exist only once, and the 41MB that only exist in
the new data directory are catalog data and other stuff that is different
on the new cluster.

Yours,
Laurenz Albe



Re: Assistance Needed: Issue with pg_upgrade and --link option

From
Peter Eisentraut
Date:
On 28.06.23 12:46, Laurenz Albe wrote:
> On Wed, 2023-06-28 at 15:40 +0530, Pradeep Kumar wrote:
>>>> I was under the impression that the --link option would create hard links between the
>>>> old and new cluster's data files, but it appears that the entire old cluster data was
>>>> copied to the new cluster, resulting in a significant increase in the new cluster's size.
>>>
>>> Please provide some numbers, ideally
>>>
>>>    du -sk <old_data_directory> <new_data_directory>
>>
>> du -sk ~/pradeep_test/pg_upgrade_testing/postgres_11.4/master ~/pradeep_test/pg_upgrade_testing/postgres_14/new_pg
>> 11224524 /home/test/pradeep_test/pg_upgrade_testing/postgres_11.4/master
>> 41952 /home/test/pradeep_test/pg_upgrade_testing/postgres_14/new_pg
> 
> That looks fine.  The files exist only once, and the 41MB that only exist in
> the new data directory are catalog data and other stuff that is different
> on the new cluster.

Interesting, so it actually does count files with multiple hardlinks 
only once.
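
The behaviour observed in this thread, where a single `du` run charges each hard-linked file to only one of its arguments, can be imitated by remembering which (device, inode) pairs have already been counted. The sketch below is an illustration of that idea in Python, not du's actual implementation; `du_like` is a made-up name, and as a simplifying assumption it sums apparent file sizes (`st_size`) of regular files only, so its totals will not match real `du` output.

```python
import os
import stat


def du_like(*directories):
    """Per-directory totals in bytes, counting each inode once across all arguments.

    Assumption: only regular files are counted, by apparent size (st_size),
    whereas du counts allocated blocks and includes directories.
    """
    seen = set()  # (device, inode) pairs already counted
    totals = {}
    for directory in directories:
        total = 0
        for root, _dirs, files in os.walk(directory):
            for name in files:
                st = os.lstat(os.path.join(root, name))
                if not stat.S_ISREG(st.st_mode):
                    continue
                key = (st.st_dev, st.st_ino)
                if key in seen:
                    continue  # another name for a file already counted
                seen.add(key)
                total += st.st_size
        totals[directory] = total
    return totals
```

With the old directory listed first, the linked files are charged to the old cluster and only the files unique to the new cluster are charged to it, which is consistent with the 11224524 versus 41952 figures quoted above. Calling the function on each directory separately reports the full size for both.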