Обсуждение: 010_pg_basebackup.pl vs multiple filesystems

Поиск
Список
Период
Сортировка

010_pg_basebackup.pl vs multiple filesystems

От
Andres Freund
Дата:
Hi,

While working on [1] I encountered the issue that, on github-actions,
010_pg_basebackup.pl fails on windows.

The reason for that is that github actions uses two drives, with TMP/TEMP
located on c:, the tested code on d:.  This causes the following code to fail:

  # Create a temporary directory in the system location.
  my $sys_tempdir = PostgreSQL::Test::Utils::tempdir_short;

  # On Windows use the short location to avoid path length issues.
  # Elsewhere use $tempdir to avoid file system boundary issues with moving.
  my $tmploc = $windows_os ? $sys_tempdir : $tempdir;

  rename("$pgdata/pg_replslot", "$tmploc/pg_replslot")
    or BAIL_OUT "could not move $pgdata/pg_replslot";
  dir_symlink("$tmploc/pg_replslot", "$pgdata/pg_replslot")
    or BAIL_OUT "could not symlink to $pgdata/pg_replslot";

It's not possible to move (vs copy) a file between two filesystems, on most
operating systems. Which thus leads to:

[23:02:03.114](0.288s) Bail out!  could not move
D:\a\winpgbuild\winpgbuild\postgresql-17beta2\build/testrun/pg_basebackup/010_pg_basebackup\data/t_010_pg_basebackup_main_data/pgdata/pg_replslot


This logic was added in

commit e213de8e785aac4e2ebc44282b8dc0fcc74834e8
Author: Andrew Dunstan <andrew@dunslane.net>
Date:   2023-07-08 11:21:58 -0400

    Use shorter location for pg_replslot in pg_basebackup test

and revised in

commit e9f15bc9db7564a29460d089c0917590bc13fffc
Author: Andrew Dunstan <andrew@dunslane.net>
Date:   2023-07-08 12:34:25 -0400

    Fix tmpdir issues with commit e213de8e78

The latter deals with precisely this issue, for !windows.


I've temporarily worked around this by setting TMP/TEMP to something else, but
ISTM we need a better solution.

Greetings,

Andres Freund

[1] https://www.postgresql.org/message-id/CA%2BOCxowQhMHFNRLTsXNuJpC96KRtSPHYKJuOS%3Db-Zrwmy-P4-g%40mail.gmail.com
[2] https://postgr.es/m/666ac55b-3400-fb2c-2cea-0281bf36a53c%40dunslane.net



Re: 010_pg_basebackup.pl vs multiple filesystems

От
Andrew Dunstan
Дата:
On 2024-07-07 Su 3:02 AM, Andres Freund wrote:
> Hi,
>
> While working on [1] I encountered the issue that, on github-actions,
> 010_pg_basebackup.pl fails on windows.
>
> The reason for that is that github actions uses two drives, with TMP/TEMP
> located on c:, the tested code on d:.  This causes the following code to fail:
>
>    # Create a temporary directory in the system location.
>    my $sys_tempdir = PostgreSQL::Test::Utils::tempdir_short;
>
>    # On Windows use the short location to avoid path length issues.
>    # Elsewhere use $tempdir to avoid file system boundary issues with moving.
>    my $tmploc = $windows_os ? $sys_tempdir : $tempdir;
>
>    rename("$pgdata/pg_replslot", "$tmploc/pg_replslot")
>      or BAIL_OUT "could not move $pgdata/pg_replslot";
>    dir_symlink("$tmploc/pg_replslot", "$pgdata/pg_replslot")
>      or BAIL_OUT "could not symlink to $pgdata/pg_replslot";
>
> It's not possible to move (vs copy) a file between two filesystems, on most
> operating systems. Which thus leads to:
>
> [23:02:03.114](0.288s) Bail out!  could not move
D:\a\winpgbuild\winpgbuild\postgresql-17beta2\build/testrun/pg_basebackup/010_pg_basebackup\data/t_010_pg_basebackup_main_data/pgdata/pg_replslot
>
>
> This logic was added in
>
> commit e213de8e785aac4e2ebc44282b8dc0fcc74834e8
> Author: Andrew Dunstan <andrew@dunslane.net>
> Date:   2023-07-08 11:21:58 -0400
>
>      Use shorter location for pg_replslot in pg_basebackup test
>
> and revised in
>
> commit e9f15bc9db7564a29460d089c0917590bc13fffc
> Author: Andrew Dunstan <andrew@dunslane.net>
> Date:   2023-07-08 12:34:25 -0400
>
>      Fix tmpdir issues with commit e213de8e78
>
> The latter deals with precisely this issue, for !windows.
>
>
> I've temporarily worked around this by setting TMP/TEMP to something else, but
> ISTM we need a better solution.


I'll be happy to hear of one. I agree it's a mess.  Maybe we could test 
that the temp directory is on the same device on Windows and skip the 
test if not? You could still get the test to run by setting TMPDIR 
and/or friends.


cheers


andrew




--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: 010_pg_basebackup.pl vs multiple filesystems

От
Andrew Dunstan
Дата:
On 2024-07-07 Su 7:28 AM, Andrew Dunstan wrote:
>
> On 2024-07-07 Su 3:02 AM, Andres Freund wrote:
>> Hi,
>>
>> While working on [1] I encountered the issue that, on github-actions,
>> 010_pg_basebackup.pl fails on windows.
>>
>> The reason for that is that github actions uses two drives, with 
>> TMP/TEMP
>> located on c:, the tested code on d:.  This causes the following code 
>> to fail:
>>
>>    # Create a temporary directory in the system location.
>>    my $sys_tempdir = PostgreSQL::Test::Utils::tempdir_short;
>>
>>    # On Windows use the short location to avoid path length issues.
>>    # Elsewhere use $tempdir to avoid file system boundary issues with 
>> moving.
>>    my $tmploc = $windows_os ? $sys_tempdir : $tempdir;
>>
>>    rename("$pgdata/pg_replslot", "$tmploc/pg_replslot")
>>      or BAIL_OUT "could not move $pgdata/pg_replslot";
>>    dir_symlink("$tmploc/pg_replslot", "$pgdata/pg_replslot")
>>      or BAIL_OUT "could not symlink to $pgdata/pg_replslot";
>>
>> It's not possible to move (vs copy) a file between two filesystems, 
>> on most
>> operating systems. Which thus leads to:
>>
>> [23:02:03.114](0.288s) Bail out!  could not move 
>>
D:\a\winpgbuild\winpgbuild\postgresql-17beta2\build/testrun/pg_basebackup/010_pg_basebackup\data/t_010_pg_basebackup_main_data/pgdata/pg_replslot
>>
>>
>> This logic was added in
>>
>> commit e213de8e785aac4e2ebc44282b8dc0fcc74834e8
>> Author: Andrew Dunstan <andrew@dunslane.net>
>> Date:   2023-07-08 11:21:58 -0400
>>
>>      Use shorter location for pg_replslot in pg_basebackup test
>>
>> and revised in
>>
>> commit e9f15bc9db7564a29460d089c0917590bc13fffc
>> Author: Andrew Dunstan <andrew@dunslane.net>
>> Date:   2023-07-08 12:34:25 -0400
>>
>>      Fix tmpdir issues with commit e213de8e78
>>
>> The latter deals with precisely this issue, for !windows.
>>
>>
>> I've temporarily worked around this by setting TMP/TEMP to something 
>> else, but
>> ISTM we need a better solution.
>
>
> I'll be happy to hear of one. I agree it's a mess.  Maybe we could 
> test that the temp directory is on the same device on Windows and skip 
> the test if not? You could still get the test to run by setting TMPDIR 
> and/or friends.
>
>
>

Maybe we should just not try to rename the directory. Looking at the 
test I'm pretty sure the directory should be empty. Instead of trying to 
move it, let's just remove it, and recreate it in the tmp location.

Something like


diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl 
b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index 489dde4adf..c0c334c6fc 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -363,8 +363,8 @@ my $sys_tempdir = 
PostgreSQL::Test::Utils::tempdir_short;
  # Elsewhere use $tempdir to avoid file system boundary issues with moving.
  my $tmploc = $windows_os ? $sys_tempdir : $tempdir;

-rename("$pgdata/pg_replslot", "$tmploc/pg_replslot")
-  or BAIL_OUT "could not move $pgdata/pg_replslot";
+rmtree("$pgdata/pg_replslot");
+mkdir ("$tmploc/pg_replslot", 0700);
  dir_symlink("$tmploc/pg_replslot", "$pgdata/pg_replslot")
    or BAIL_OUT "could not symlink to $pgdata/pg_replslot";



cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: 010_pg_basebackup.pl vs multiple filesystems

От
Andres Freund
Дата:
Hi,

On 2024-07-07 09:10:48 -0400, Andrew Dunstan wrote:
> On 2024-07-07 Su 7:28 AM, Andrew Dunstan wrote:
> > I'll be happy to hear of one. I agree it's a mess.  Maybe we could test
> > that the temp directory is on the same device on Windows and skip the
> > test if not? You could still get the test to run by setting TMPDIR
> > and/or friends.

> Maybe we should just not try to rename the directory. Looking at the test
> I'm pretty sure the directory should be empty. Instead of trying to move it,
> let's just remove it, and recreate it in the tmp location.

Good catch, yes, that'd be much better!


> diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
> b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
> index 489dde4adf..c0c334c6fc 100644
> --- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
> +++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
> @@ -363,8 +363,8 @@ my $sys_tempdir =
> PostgreSQL::Test::Utils::tempdir_short;
>  # Elsewhere use $tempdir to avoid file system boundary issues with moving.
>  my $tmploc = $windows_os ? $sys_tempdir : $tempdir;

The comment would need a bit of editing, I guess. I think we should consider
just getting rid of the os-dependant switch now, it shouldn't be needed
anymore?


> -rename("$pgdata/pg_replslot", "$tmploc/pg_replslot")
> -  or BAIL_OUT "could not move $pgdata/pg_replslot";
> +rmtree("$pgdata/pg_replslot");

Should this perhaps be an rmdir, to ensure that we're not removing something
we don't want (e.g. somebody adding an earlier test for slots that then gets
broken by the rmtree)?

Greetings,

Andres Freund



Re: 010_pg_basebackup.pl vs multiple filesystems

От
Robert Haas
Дата:
On Sun, Jul 7, 2024 at 3:02 AM Andres Freund <andres@anarazel.de> wrote:
> While working on [1] I encountered the issue that, on github-actions,
> 010_pg_basebackup.pl fails on windows.
>
> The reason for that is that github actions uses two drives, with TMP/TEMP
> located on c:, the tested code on d:.  This causes the following code to fail:
>
>   # Create a temporary directory in the system location.
>   my $sys_tempdir = PostgreSQL::Test::Utils::tempdir_short;

Whatever we end up doing about this, it would be a good idea to check
the other places that use tempdir_short and see if they also need
adjustment.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: 010_pg_basebackup.pl vs multiple filesystems

От
Andrew Dunstan
Дата:
On 2024-07-08 Mo 4:45 PM, Robert Haas wrote:
> On Sun, Jul 7, 2024 at 3:02 AM Andres Freund <andres@anarazel.de> wrote:
>> While working on [1] I encountered the issue that, on github-actions,
>> 010_pg_basebackup.pl fails on windows.
>>
>> The reason for that is that github actions uses two drives, with TMP/TEMP
>> located on c:, the tested code on d:.  This causes the following code to fail:
>>
>>    # Create a temporary directory in the system location.
>>    my $sys_tempdir = PostgreSQL::Test::Utils::tempdir_short;
> Whatever we end up doing about this, it would be a good idea to check
> the other places that use tempdir_short and see if they also need
> adjustment.


I don't think it's a problem. There are lots of systems that have 
tempdir on a different device. That's why we previously saw lots of 
errors from this code, resulting in the present incomplete workaround. 
The fact that we haven't seen such errors from other tests means we 
should be ok.


cheers


andrew

-- 

Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: 010_pg_basebackup.pl vs multiple filesystems

От
Andres Freund
Дата:
Hi,

On 2024-07-08 16:45:42 -0400, Robert Haas wrote:
> On Sun, Jul 7, 2024 at 3:02 AM Andres Freund <andres@anarazel.de> wrote:
> > While working on [1] I encountered the issue that, on github-actions,
> > 010_pg_basebackup.pl fails on windows.
> >
> > The reason for that is that github actions uses two drives, with TMP/TEMP
> > located on c:, the tested code on d:.  This causes the following code to fail:
> >
> >   # Create a temporary directory in the system location.
> >   my $sys_tempdir = PostgreSQL::Test::Utils::tempdir_short;
> 
> Whatever we end up doing about this, it would be a good idea to check
> the other places that use tempdir_short and see if they also need
> adjustment.

FWIW, this was the only test that failed due to TMP being on a separate
partition.

I couldn't quite enable all tests (PG_TEST_EXTRA including libpq_encryption
fails due to some gssapi issue, kerberos fails due to paths being wrong / not
set up for windows, load_balance because I haven't set up hostnames), but I
don't think any of the issues around those tests are related.

I did also grep for other uses of tmpdir_short, they all seem to be
differently motivated.

I also looked for uses of rename() in tests. While
src/bin/pg_verifybackup/t/007_wal.pl does move stuff around, it afaict is
always going to be on the same filesystem.

Greetings,

Andres Freund



Re: 010_pg_basebackup.pl vs multiple filesystems

От
Andrew Dunstan
Дата:
On 2024-07-08 Mo 4:31 PM, Andres Freund wrote:
> Hi,
>
> On 2024-07-07 09:10:48 -0400, Andrew Dunstan wrote:
>> On 2024-07-07 Su 7:28 AM, Andrew Dunstan wrote:
>>> I'll be happy to hear of one. I agree it's a mess.  Maybe we could test
>>> that the temp directory is on the same device on Windows and skip the
>>> test if not? You could still get the test to run by setting TMPDIR
>>> and/or friends.
>> Maybe we should just not try to rename the directory. Looking at the test
>> I'm pretty sure the directory should be empty. Instead of trying to move it,
>> let's just remove it, and recreate it in the tmp location.
> Good catch, yes, that'd be much better!
>
>
>> diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
>> b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
>> index 489dde4adf..c0c334c6fc 100644
>> --- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
>> +++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
>> @@ -363,8 +363,8 @@ my $sys_tempdir =
>> PostgreSQL::Test::Utils::tempdir_short;
>>   # Elsewhere use $tempdir to avoid file system boundary issues with moving.
>>   my $tmploc = $windows_os ? $sys_tempdir : $tempdir;
> The comment would need a bit of editing, I guess. I think we should consider
> just getting rid of the os-dependant switch now, it shouldn't be needed
> anymore?
>
>
>> -rename("$pgdata/pg_replslot", "$tmploc/pg_replslot")
>> -  or BAIL_OUT "could not move $pgdata/pg_replslot";
>> +rmtree("$pgdata/pg_replslot");
> Should this perhaps be an rmdir, to ensure that we're not removing something
> we don't want (e.g. somebody adding an earlier test for slots that then gets
> broken by the rmtree)?
>


OK, done like this.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: 010_pg_basebackup.pl vs multiple filesystems

От
Andres Freund
Дата:
On 2024-07-08 17:45:16 -0400, Andrew Dunstan wrote:
> OK, done like this.

Thanks!