Обсуждение: drop tablespace failed when location contains .. on win32

Поиск
Список
Период
Сортировка

drop tablespace failed when location contains .. on win32

От
"wangsh.fnst@fujitsu.com"
Дата:
Hi,

I find a problem related to tablespace on win32(server2019).

> postgres=# create tablespace tbs location 'C:\Users\postgres\postgres_install\aa\..\aa';
> CREATE TABLESPACE
> postgres=# create table tbl(col int) tablespace tbs;
> ERROR:  could not stat directory "pg_tblspc/16384/PG_15_202109061/12754": Invalid argument
> postgres=# drop tablespace tbs;
> WARNING:  could not open directory "pg_tblspc/16384/PG_15_202109061": No such file or directory
> ERROR:  could not stat file "pg_tblspc/16384": Invalid argument

I find that canonicalize_path() only remove the trailing '..',  in this case,  '..' is not removed , and
pgsymlink succeed.

But, in fact, if I double click the dir (%PGDATA%\pg_tblspac\16387),  the error message is shown:
> The filename, directory name, or volume label syntax is incorrect.

Since the pgsymlink() seems right and I'm not sure I can change the action of canonicalize_path,
so I want to add a error check(patch is attached).

Any comment ?

Regards,
Shenhao Wang



Вложения

Re: drop tablespace failed when location contains .. on win32

От
Andrew Dunstan
Дата:
On 9/8/21 6:16 AM, wangsh.fnst@fujitsu.com wrote:
> Hi,
>
> I find a problem related to tablespace on win32(server2019).
>
>> postgres=# create tablespace tbs location 'C:\Users\postgres\postgres_install\aa\..\aa';
>> CREATE TABLESPACE
>> postgres=# create table tbl(col int) tablespace tbs;
>> ERROR:  could not stat directory "pg_tblspc/16384/PG_15_202109061/12754": Invalid argument
>> postgres=# drop tablespace tbs;
>> WARNING:  could not open directory "pg_tblspc/16384/PG_15_202109061": No such file or directory
>> ERROR:  could not stat file "pg_tblspc/16384": Invalid argument
> I find that canonicalize_path() only remove the trailing '..',  in this case,  '..' is not removed , and 
> pgsymlink succeed.


That seems like a bug. It's not very canonical :-)


cheers


andrew

-- 

Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: drop tablespace failed when location contains .. on win32

От
Michael Paquier
Дата:
On Wed, Sep 08, 2021 at 08:54:25AM -0400, Andrew Dunstan wrote:
> That seems like a bug. It's not very canonical :-)

Yes, better to fix that.  I fear that more places are impacted than
just the tablespace code paths.
--
Michael

Вложения

RE: drop tablespace failed when location contains .. on win32

От
"wangsh.fnst@fujitsu.com"
Дата:
Hi,

> -----Original Message-----
> From: Michael Paquier <michael@paquier.xyz>
>
> On Wed, Sep 08, 2021 at 08:54:25AM -0400, Andrew Dunstan wrote:
> > That seems like a bug. It's not very canonical :-)
>
> Yes, better to fix that.  I fear that more places are impacted than
> just the tablespace code paths.
> --
> Michael

Do you mean changing the action of canonicalize_path(), like remove all the (..) ?

I'm willing to fix this problem.

Regards
Shenhao Wang



Re: drop tablespace failed when location contains .. on win32

От
Michael Paquier
Дата:
On Thu, Sep 09, 2021 at 02:35:52AM +0000, wangsh.fnst@fujitsu.com wrote:
> Do you mean changing the action of canonicalize_path(), like remove all the (..) ?
>
> I'm willing to fix this problem.

Looking at canonicalize_path(), we have already some logic around
pending_strips to remove paths when we find a "/.." in the path, so
that's a matter of adjusting this area to trim properly the previous
directory.

On *nix platforms, we don't apply this much caution either, say a
simple /tmp/path/../path/ results in this same path used in the link
from pg_tblspc.  But we are speaking about Windows here, and junction
points.

Based on the lack of complains over the years, that does not seem
really worth backpatching.  Just my 2c on this point.
--
Michael

Вложения

Re: drop tablespace failed when location contains .. on win32

От
Kyotaro Horiguchi
Дата:
At Thu, 9 Sep 2021 12:44:45 +0900, Michael Paquier <michael@paquier.xyz> wrote in 
> On Thu, Sep 09, 2021 at 02:35:52AM +0000, wangsh.fnst@fujitsu.com wrote:
> > Do you mean changing the action of canonicalize_path(), like remove all the (..) ?
> > 
> > I'm willing to fix this problem.
> 
> Looking at canonicalize_path(), we have already some logic around
> pending_strips to remove paths when we find a "/.." in the path, so
> that's a matter of adjusting this area to trim properly the previous
> directory.
> 
> On *nix platforms, we don't apply this much caution either, say a
> simple /tmp/path/../path/ results in this same path used in the link
> from pg_tblspc.  But we are speaking about Windows here, and junction
> points.
> 
> Based on the lack of complains over the years, that does not seem
> really worth backpatching.  Just my 2c on this point.

Reading the first complaint, I remember I proposed that as a part of a
larger patch.

https://www.postgresql.org/message-id/20190425.170855.39056106.horiguchi.kyotaro%40lab.ntt.co.jp

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: drop tablespace failed when location contains .. on win32

От
Andrew Dunstan
Дата:
On 9/8/21 11:44 PM, Michael Paquier wrote:
> On Thu, Sep 09, 2021 at 02:35:52AM +0000, wangsh.fnst@fujitsu.com wrote:
>> Do you mean changing the action of canonicalize_path(), like remove all the (..) ?
>>
>> I'm willing to fix this problem.
> Looking at canonicalize_path(), we have already some logic around
> pending_strips to remove paths when we find a "/.." in the path, so
> that's a matter of adjusting this area to trim properly the previous
> directory.
>
> On *nix platforms, we don't apply this much caution either, say a
> simple /tmp/path/../path/ results in this same path used in the link
> from pg_tblspc.  But we are speaking about Windows here, and junction
> points.
>
> Based on the lack of complains over the years, that does not seem
> really worth backpatching.  Just my 2c on this point.



Maybe, although it's arguably a bug.


I think I would say that we should remove any "." or ".." element in the
path except at the beginning, and in the case of ".." also remove the
preceding element, unless someone can convince me that there's a problem
with that.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




RE: drop tablespace failed when location contains .. on win32

От
"wangsh.fnst@fujitsu.com"
Дата:
Hi,

> -----Original Message-----
> From: Andrew Dunstan <andrew@dunslane.net>
> Sent: Thursday, September 9, 2021 8:30 PM

> I think I would say that we should remove any "." or ".." element in the
> path except at the beginning, and in the case of ".." also remove the
> preceding element, unless someone can convince me that there's a problem
> with that.

These WIP patches try to remove all the '.' or '..' in the path except at
the beginning.

0001 is a small fix, because I find that is_absolute_path is not appropriate,
see comment in skip_drive:
> * On Windows, a path may begin with "C:" or "//network/".

But this modification will lead to a regress test failure on Windows:
> -- Will fail with bad path
> CREATE TABLESPACE regress_badspace LOCATION '/no/such/location';
> -ERROR:  directory "/no/such/location" does not exist
> +ERROR:  tablespace location must be an absolute path

Do you think this modification is necessary ?

Rest of the modification is in 0002. I think this patch need more test and review.

0003 is a test extension for me to check the action of canonicalize_path.
Do you think is necessary to add this test extension(and some test scripts) to master ?
If necessary,  maybe I can use the taptest to test the action of canonicalize_path
in Linux and Windows.


I will add this to next commitfest after further test .

Regards.
Shenhao Wang

Вложения

Re: drop tablespace failed when location contains .. on win32

От
Michael Paquier
Дата:
On Sun, Sep 12, 2021 at 07:33:23AM +0000, wangsh.fnst@fujitsu.com wrote:
> 0001 is a small fix, because I find that is_absolute_path is not appropriate,
> see comment in skip_drive:
> > * On Windows, a path may begin with "C:" or "//network/".

 #define is_absolute_path(filename) \
 ( \
-       IS_DIR_SEP((filename)[0]) || \
+       (IS_DIR_SEP((filename)[0]) && IS_DIR_SEP((filename)[1])) || \
        (isalpha((unsigned char) ((filename)[0])) && (filename)[1] == ':' && \
         IS_DIR_SEP((filename)[2])) \
With this change you would consider a path beginning with "/foo/.." as
not being an absolute path, but that's not correct.  Or am I missing
 something obvious?

> 0003 is a test extension for me to check the action of canonicalize_path.
> Do you think is necessary to add this test extension(and some test scripts) to master ?
> If necessary,  maybe I can use the taptest to test the action of canonicalize_path
> in Linux and Windows.

I am not sure that this is worth the cycles spent on, so I would
discard it.  This will help a lot in reviewing this patch, for sure.
And you could add some regression tests to show how much testing you
have done, for both WIN32 and non-WIN32.  I do that from time to time,
and was actually thinking to test this API this way.
--
Michael

Вложения

Re: drop tablespace failed when location contains .. on win32

От
Kyotaro Horiguchi
Дата:
At Mon, 13 Sep 2021 16:06:52 +0900, Michael Paquier <michael@paquier.xyz> wrote in 
> On Sun, Sep 12, 2021 at 07:33:23AM +0000, wangsh.fnst@fujitsu.com wrote:
> > 0001 is a small fix, because I find that is_absolute_path is not appropriate, 
> > see comment in skip_drive:
> > > * On Windows, a path may begin with "C:" or "//network/".
> 
>  #define is_absolute_path(filename) \
>  ( \
> -       IS_DIR_SEP((filename)[0]) || \
> +       (IS_DIR_SEP((filename)[0]) && IS_DIR_SEP((filename)[1])) || \
>         (isalpha((unsigned char) ((filename)[0])) && (filename)[1] == ':' && \
>          IS_DIR_SEP((filename)[2])) \
> With this change you would consider a path beginning with "/foo/.." as
> not being an absolute path, but that's not correct.  Or am I missing
>  something obvious?

Mmm. I haven't thought that so seriously, but '/hoge/foo/bar' doesn't
seem to be an absolute path on Windows since it lacks
"<dirver-letter>:" or "//hostname" part.  If we're on drive D:,
"/Program\ Files" doesn't mean "C:\Program\ Files" but "D:\Program\
Files".

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



RE: drop tablespace failed when location contains .. on win32

От
"wangsh.fnst@fujitsu.com"
Дата:
Hi,

> -----Original Message-----
> From: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
> Sent: Monday, September 13, 2021 4:36 PM
> To: michael@paquier.xyz

> Mmm. I haven't thought that so seriously, but '/hoge/foo/bar' doesn't
> seem to be an absolute path on Windows since it lacks
> "<dirver-letter>:" or "//hostname" part.  If we're on drive D:,
> "/Program\ Files" doesn't mean "C:\Program\ Files" but "D:\Program\
> Files".

I don't know this.  After some test, I think it's better to consider '/hoge/foo/bar'
as a absolute path.

0001 and 0002 are the are the bugfix patches.
0003 is the test patch what I have tested on Linux and Windows.

Waiting for some comment.
Add to the commitfest: https://commitfest.postgresql.org/35/3331/

Regards,
Shenhao Wang


Вложения

Re: drop tablespace failed when location contains .. on win32

От
Tom Lane
Дата:
"wangsh.fnst@fujitsu.com" <wangsh.fnst@fujitsu.com> writes:
> I don't know this.  After some test, I think it's better to consider '/hoge/foo/bar'
> as a absolute path.

Agreed.  I think we are considering "absolute path" here as a
syntactic concept; Windows' weird rules about drive letters
don't really matter for the purposes of path canonicalization.

> 0001 and 0002 are the are the bugfix patches.
> 0003 is the test patch what I have tested on Linux and Windows.
> Waiting for some comment.

I tried to read 0001 but really couldn't make sense of the logic
at all, because it's seriously underdocumented.  At minimum you
need an API spec comment for canonicalize_path_sub, explaining
what it's supposed to do and why.  This is a significant rewrite
of what was already tricky code, so we can't skimp on
documentation.  I'd put some effort into choosing more descriptive
names, too ("sub" doesn't mean much, especially here where it's
not clear if it means "subroutine" or "path component").

I did notice that you dropped the separate step to collapse
adjacent separators (i.e, reduce "foo//bar" to "foo/bar"), which
seems like probably a bad idea.  I think such cases might confuse
canonicalize_path_sub, and even if it manages to do the right
thing, that requirement will complicate its invariants won't it?

Another thing I happened to notice is that join_path_components
is going out of its way to not generate "foo/./bar", but if
we are fixing canonicalize_path to be able to delete the "./",
that seems like a waste of code now.

I am not entirely convinced that 0002 isn't re-introducing the
security hole that the existing code seeks to plug.  That one
is going to require more justification.

I concur with the upthread comments that there's little chance
we'll commit 0003 as-is; the code-to-benefit ratio is too high.
Instead, you might consider adding test_canonicalize_path in
src/test/regress/regress.c, and then adding a smaller number
of regression test cases using that.

            regards, tom lane



Re: drop tablespace failed when location contains .. on win32

От
Michael Paquier
Дата:
On Wed, Nov 10, 2021 at 05:43:31PM -0500, Tom Lane wrote:
> Another thing I happened to notice is that join_path_components
> is going out of its way to not generate "foo/./bar", but if
> we are fixing canonicalize_path to be able to delete the "./",
> that seems like a waste of code now.
>
> I am not entirely convinced that 0002 isn't re-introducing the
> security hole that the existing code seeks to plug.  That one
> is going to require more justification.

At the same time, do we have any need for doing 0002 at all if
we do 0001?  The paths are canonicalized before checking them in
path_contains_parent_reference().

> I concur with the upthread comments that there's little chance
> we'll commit 0003 as-is; the code-to-benefit ratio is too high.
> Instead, you might consider adding test_canonicalize_path in
> src/test/regress/regress.c, and then adding a smaller number
> of regression test cases using that.

Sounds like a good idea to me.  I would move these in misc.source for
anything that require an absolute path.

0001 is indeed in need of more comments and documentation so as one
does not get lost if reading through this code in the future.  Changes
in trim_directory(), for example, should explain what is returned and
why.

+       isabs = is_absolute_path(path);
+       tmppath = strdup(path);
If possible, it would be nice to cut any need for malloc() allocations
in this code.
--
Michael

Вложения

Re: drop tablespace failed when location contains .. on win32

От
Julien Rouhaud
Дата:
Hi,

This patch is a wanted bugfix and has been waiting for an update for 2 months.

Do you plan to send a new version soon?



RE: drop tablespace failed when location contains .. on win32

От
"wangsh.fnst@fujitsu.com"
Дата:
Hi

> This patch is a wanted bugfix and has been waiting for an update for 2 months.
>
> Do you plan to send a new version soon?

Yes, I will send a new version before next weekend

Regards

Shenhao Wang



Re: drop tablespace failed when location contains .. on win32

От
Michael Paquier
Дата:
On Tue, Jan 18, 2022 at 01:08:01AM +0000, wangsh.fnst@fujitsu.com wrote:
> Yes, I will send a new version before next weekend

Thanks!
--
Michael

Вложения

RE: drop tablespace failed when location contains .. on win32

От
"wangsh.fnst@fujitsu.com"
Дата:
Hi,

The new version is attached.

Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I tried to read 0001 but really couldn't make sense of the logic
> at all, because it's seriously underdocumented.  At minimum you
> need an API spec comment for canonicalize_path_sub, explaining
> what it's supposed to do and why.

I have added some comments, but I'm not sure these comments are enough
or easy understand.

> I did notice that you dropped the separate step to collapse
> adjacent separators (i.e, reduce "foo//bar" to "foo/bar"), which
> seems like probably a bad idea.

Add these sources back.

Michael Paquier <michael@paquier.xyz> wrote:
> for example, should explain what is returned and
> why.
> +       isabs = is_absolute_path(path);
> +       tmppath = strdup(path);
> If possible, it would be nice to cut any need for malloc() allocations
> in this code.

Thank you for advice. In this version, I do not use the malloc().

> > I concur with the upthread comments that there's little chance
> > we'll commit 0003 as-is; the code-to-benefit ratio is too high.
> > Instead, you might consider adding test_canonicalize_path in
> > src/test/regress/regress.c, and then adding a smaller number
> > of regression test cases using that.
>
> Sounds like a good idea to me.  I would move these in misc.source for
> anything that require an absolute path.

I'm not fully understand this. So, I do not change the test patch.

Regards,
Shenhao Wang

Вложения

Re: drop tablespace failed when location contains .. on win32

От
Michael Paquier
Дата:
On Mon, Jan 24, 2022 at 11:21:12AM +0000, wangsh.fnst@fujitsu.com wrote:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I concur with the upthread comments that there's little chance
>>> we'll commit 0003 as-is; the code-to-benefit ratio is too high.
>>> Instead, you might consider adding test_canonicalize_path in
>>> src/test/regress/regress.c, and then adding a smaller number
>>> of regression test cases using that.
>>
>> Sounds like a good idea to me.  I would move these in misc.source for
>> anything that require an absolute path.
>
> I'm not fully understand this. So, I do not change the test patch.

In order to make the tests cheap, there is no need to have a separate
module in src/test/modules/ as your patch 0002 is doing.  Instead, you
should move the C code of your SQL function test_canonicalize_path()
to src/test/regress/regress.c, then add some tests in
src/test/regress/sql/, with a SQL function created in the test script
that feeds from what would be added to regress.so.

Please note that my previous comment has become incorrect as of
dc9c3b0, that has removed the concept of input/output files in the
regression tests, but you can do the same with a \getenv to get access
to absolute paths for the tests.  There are many examples in the tree
for that, one is copy.sql.
--
Michael

Вложения

Re: drop tablespace failed when location contains .. on win32

От
Tom Lane
Дата:
Michael Paquier <michael@paquier.xyz> writes:
> In order to make the tests cheap, there is no need to have a separate
> module in src/test/modules/ as your patch 0002 is doing.  Instead, you
> should move the C code of your SQL function test_canonicalize_path()
> to src/test/regress/regress.c, then add some tests in
> src/test/regress/sql/, with a SQL function created in the test script
> that feeds from what would be added to regress.so.

Here's a revised patch version that does it like that.  I also
reviewed and simplified the canonicalize_path logic.  I think
this is committable.

(I suspect that adminpack's checks for unsafe file names could
now be simplified substantially, because many of the corner cases
it worries about are no longer possible, as evidenced by the change
in error message there.  I've not pursued that, however.)

            regards, tom lane

diff --git a/contrib/adminpack/expected/adminpack.out b/contrib/adminpack/expected/adminpack.out
index edf3ebfcba..76aafe6316 100644
--- a/contrib/adminpack/expected/adminpack.out
+++ b/contrib/adminpack/expected/adminpack.out
@@ -51,7 +51,7 @@ SELECT pg_file_write(current_setting('data_directory') || '/test_file4', 'test4'
 (1 row)

 SELECT pg_file_write(current_setting('data_directory') || '/../test_file4', 'test4', false);
-ERROR:  reference to parent directory ("..") not allowed
+ERROR:  absolute path not allowed
 RESET ROLE;
 REVOKE EXECUTE ON FUNCTION pg_file_write(text,text,bool) FROM regress_user1;
 REVOKE pg_read_all_settings FROM regress_user1;
diff --git a/src/port/path.c b/src/port/path.c
index 69bb8fe40b..e8b0364b13 100644
--- a/src/port/path.c
+++ b/src/port/path.c
@@ -46,8 +46,9 @@

 static void make_relative_path(char *ret_path, const char *target_path,
                                const char *bin_path, const char *my_exec_path);
-static void trim_directory(char *path);
+static char *trim_directory(char *path);
 static void trim_trailing_separator(char *path);
+static char *append_subdir_to_path(char *path, char *subdir);


 /*
@@ -222,13 +223,9 @@ join_path_components(char *ret_path,
         strlcpy(ret_path, head, MAXPGPATH);

     /*
-     * Remove any leading "." in the tail component.
-     *
-     * Note: we used to try to remove ".." as well, but that's tricky to get
-     * right; now we just leave it to be done by canonicalize_path() later.
+     * We used to try to simplify some cases involving "." and "..", but now
+     * we just leave that to be done by canonicalize_path() later.
      */
-    while (tail[0] == '.' && IS_DIR_SEP(tail[1]))
-        tail += 2;

     if (*tail)
     {
@@ -241,14 +238,27 @@ join_path_components(char *ret_path,
 }


+/* State machine states for canonicalize_path */
+typedef enum
+{
+    ABSOLUTE_PATH_INIT,            /* Just past the leading '/' (and Windows
+                                 * drive name if any) of an absolute path */
+    ABSOLUTE_WITH_N_DEPTH,        /* We collected 'pathdepth' directories in an
+                                 * absolute path */
+    RELATIVE_PATH_INIT,            /* At start of a relative path */
+    RELATIVE_WITH_N_DEPTH,        /* We collected 'pathdepth' directories in a
+                                 * relative path */
+    RELATIVE_WITH_PARENT_REF    /* Relative path containing only double-dots */
+} canonicalize_state;
+
 /*
  *    Clean up path by:
  *        o  make Win32 path use Unix slashes
  *        o  remove trailing quote on Win32
  *        o  remove trailing slash
- *        o  remove duplicate adjacent separators
- *        o  remove trailing '.'
- *        o  process trailing '..' ourselves
+ *        o  remove duplicate (adjacent) separators
+ *        o  remove '.' (unless path reduces to only '.')
+ *        o  process '..' ourselves
  */
 void
 canonicalize_path(char *path)
@@ -256,8 +266,11 @@ canonicalize_path(char *path)
     char       *p,
                *to_p;
     char       *spath;
+    char       *parsed;
+    char       *unparse;
     bool        was_sep = false;
-    int            pending_strips;
+    canonicalize_state state;
+    int            pathdepth = 0;    /* counts collected regular directory names */

 #ifdef WIN32

@@ -308,60 +321,172 @@ canonicalize_path(char *path)
     *to_p = '\0';

     /*
-     * Remove any trailing uses of "." and process ".." ourselves
+     * Remove any uses of "." and process ".." ourselves
      *
      * Note that "/../.." should reduce to just "/", while "../.." has to be
-     * kept as-is.  In the latter case we put back mistakenly trimmed ".."
-     * components below.  Also note that we want a Windows drive spec to be
-     * visible to trim_directory(), but it's not part of the logic that's
-     * looking at the name components; hence distinction between path and
-     * spath.
+     * kept as-is.  Also note that we want a Windows drive spec to be visible
+     * to trim_directory(), but it's not part of the logic that's looking at
+     * the name components; hence distinction between path and spath.
+     *
+     * This loop overwrites the path in-place.  This is safe since we'll never
+     * make the path longer.  "unparse" points to where we are reading the
+     * path, "parse" to where we are writing.
      */
     spath = skip_drive(path);
-    pending_strips = 0;
-    for (;;)
+    if (*spath == '\0')
+        return;                    /* empty path is returned as-is */
+
+    if (IS_DIR_SEP(*spath))
     {
-        int            len = strlen(spath);
+        state = ABSOLUTE_PATH_INIT;
+        /* Skip the leading slash for absolute path */
+        parsed = unparse = (spath + 1);
+    }
+    else
+    {
+        state = RELATIVE_PATH_INIT;
+        parsed = unparse = spath;
+    }

-        if (len >= 2 && strcmp(spath + len - 2, "/.") == 0)
-            trim_directory(path);
-        else if (strcmp(spath, ".") == 0)
+    while (*unparse != '\0')
+    {
+        char       *unparse_next;
+        bool        is_double_dot;
+
+        /* Split off this dir name, and set unparse_next to the next one */
+        unparse_next = unparse;
+        while (*unparse_next && !IS_DIR_SEP(*unparse_next))
+            unparse_next++;
+        if (*unparse_next != '\0')
+            *unparse_next++ = '\0';
+
+        /* Identify type of this dir name */
+        if (strcmp(unparse, ".") == 0)
         {
-            /* Want to leave "." alone, but "./.." has to become ".." */
-            if (pending_strips > 0)
-                *spath = '\0';
-            break;
+            /* We can ignore "." components in all cases */
+            unparse = unparse_next;
+            continue;
         }
-        else if ((len >= 3 && strcmp(spath + len - 3, "/..") == 0) ||
-                 strcmp(spath, "..") == 0)
+
+        if (strcmp(unparse, "..") == 0)
+            is_double_dot = true;
+        else
         {
-            trim_directory(path);
-            pending_strips++;
+            /* adjacent separators were eliminated above */
+            Assert(*unparse != '\0');
+            is_double_dot = false;
         }
-        else if (pending_strips > 0 && *spath != '\0')
+
+        switch (state)
         {
-            /* trim a regular directory name canceled by ".." */
-            trim_directory(path);
-            pending_strips--;
-            /* foo/.. should become ".", not empty */
-            if (*spath == '\0')
-                strcpy(spath, ".");
+            case ABSOLUTE_PATH_INIT:
+                /* We can ignore ".." immediately after / */
+                if (!is_double_dot)
+                {
+                    /* Append first dir name (we already have leading slash) */
+                    parsed = append_subdir_to_path(parsed, unparse);
+                    state = ABSOLUTE_WITH_N_DEPTH;
+                    pathdepth++;
+                }
+                break;
+            case ABSOLUTE_WITH_N_DEPTH:
+                if (is_double_dot)
+                {
+                    /* trim_directory won't remove the leading slash. */
+                    *parsed = '\0';
+                    parsed = trim_directory(path);
+                    if (--pathdepth == 0)
+                        state = ABSOLUTE_PATH_INIT;
+                }
+                else
+                {
+                    /* Append normal dir */
+                    *parsed++ = '/';
+                    parsed = append_subdir_to_path(parsed, unparse);
+                    pathdepth++;
+                }
+                break;
+            case RELATIVE_PATH_INIT:
+                if (is_double_dot)
+                {
+                    /* Append irreducible double-dot (..) */
+                    parsed = append_subdir_to_path(parsed, unparse);
+                    state = RELATIVE_WITH_PARENT_REF;
+                }
+                else
+                {
+                    /* Append normal dir */
+                    parsed = append_subdir_to_path(parsed, unparse);
+                    state = RELATIVE_WITH_N_DEPTH;
+                    pathdepth++;
+                }
+                break;
+            case RELATIVE_WITH_N_DEPTH:
+                if (is_double_dot)
+                {
+                    /* Remove last parsed dir */
+                    *parsed = '\0';
+                    parsed = trim_directory(path);
+                    if (--pathdepth == 0)
+                    {
+                        /*
+                         * If the output path is now empty, we're back to the
+                         * INIT state.  However, we could have processed a
+                         * path like "../dir/.." and now be down to "..", in
+                         * which case enter the correct state for that.
+                         */
+                        if (parsed == spath)
+                            state = RELATIVE_PATH_INIT;
+                        else
+                            state = RELATIVE_WITH_PARENT_REF;
+                    }
+                }
+                else
+                {
+                    /* Append normal dir */
+                    *parsed++ = '/';
+                    parsed = append_subdir_to_path(parsed, unparse);
+                    pathdepth++;
+                }
+                break;
+            case RELATIVE_WITH_PARENT_REF:
+                if (is_double_dot)
+                {
+                    /* Append next irreducible double-dot (..) */
+                    *parsed++ = '/';
+                    parsed = append_subdir_to_path(parsed, unparse);
+                }
+                else
+                {
+                    /* Append normal dir */
+                    *parsed++ = '/';
+                    parsed = append_subdir_to_path(parsed, unparse);
+
+                    /*
+                     * We can now start counting normal dirs.  But if later
+                     * double-dots make us remove this dir again, we'd better
+                     * revert to RELATIVE_WITH_PARENT_REF not INIT state.
+                     */
+                    state = RELATIVE_WITH_N_DEPTH;
+                    pathdepth = 1;
+                }
+                break;
         }
-        else
-            break;
-    }

-    if (pending_strips > 0)
-    {
-        /*
-         * We could only get here if path is now totally empty (other than a
-         * possible drive specifier on Windows). We have to put back one or
-         * more ".."'s that we took off.
-         */
-        while (--pending_strips > 0)
-            strcat(path, "../");
-        strcat(path, "..");
+        unparse = unparse_next;
     }
+
+    /*
+     * If our output path is empty at this point, insert ".".  We don't want
+     * to do this any earlier because it'd result in an extra dot in corner
+     * cases such as "../dir/..".  Since we rejected the wholly-empty-path
+     * case above, there is certainly room.
+     */
+    if (parsed == spath)
+        *parsed++ = '.';
+
+    /* And finally, ensure the output path is nul-terminated. */
+    *parsed = '\0';
 }

 /*
@@ -865,8 +990,11 @@ get_parent_directory(char *path)
  *    Trim trailing directory from path, that is, remove any trailing slashes,
  *    the last pathname component, and the slash just ahead of it --- but never
  *    remove a leading slash.
+ *
+ * For the convenience of canonicalize_path, the path's new end location
+ * is returned.
  */
-static void
+static char *
 trim_directory(char *path)
 {
     char       *p;
@@ -874,7 +1002,7 @@ trim_directory(char *path)
     path = skip_drive(path);

     if (path[0] == '\0')
-        return;
+        return path;

     /* back up over trailing slash(es) */
     for (p = path + strlen(path) - 1; IS_DIR_SEP(*p) && p > path; p--)
@@ -889,6 +1017,7 @@ trim_directory(char *path)
     if (p == path && IS_DIR_SEP(*p))
         p++;
     *p = '\0';
+    return p;
 }


@@ -908,3 +1037,25 @@ trim_trailing_separator(char *path)
         for (p--; p > path && IS_DIR_SEP(*p); p--)
             *p = '\0';
 }
+
+/*
+ *    append_subdir_to_path
+ *
+ * Append the currently-considered subdirectory name to the output
+ * path in canonicalize_path.  Return the new end location of the
+ * output path.
+ *
+ * Since canonicalize_path updates the path in-place, we must use
+ * memmove not memcpy, and we don't yet terminate the path with '\0'.
+ */
+static char *
+append_subdir_to_path(char *path, char *subdir)
+{
+    size_t        len = strlen(subdir);
+
+    /* No need to copy data if path and subdir are the same. */
+    if (path != subdir)
+        memmove(path, subdir, len);
+
+    return path + len;
+}
diff --git a/src/test/regress/expected/create_function_1.out b/src/test/regress/expected/create_function_1.out
index 5345ed0840..6141b7060b 100644
--- a/src/test/regress/expected/create_function_1.out
+++ b/src/test/regress/expected/create_function_1.out
@@ -28,3 +28,7 @@ CREATE FUNCTION int44out(city_budget)
    AS :'regresslib'
    LANGUAGE C STRICT IMMUTABLE;
 NOTICE:  argument type city_budget is only a shell
+CREATE FUNCTION test_canonicalize_path(text)
+   RETURNS text
+   AS :'regresslib'
+   LANGUAGE C STRICT IMMUTABLE;
diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index e2c77d0ac8..cce92b014f 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -133,6 +133,129 @@ ERROR:  function num_nulls() does not exist
 LINE 1: SELECT num_nulls();
                ^
 HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
+--
+-- canonicalize_path()
+--
+SELECT * FROM test_canonicalize_path('/');
+ test_canonicalize_path
+------------------------
+ /
+(1 row)
+
+SELECT * FROM test_canonicalize_path('/./abc/def/');
+ test_canonicalize_path
+------------------------
+ /abc/def
+(1 row)
+
+SELECT * FROM test_canonicalize_path('/./../abc/def');
+ test_canonicalize_path
+------------------------
+ /abc/def
+(1 row)
+
+SELECT * FROM test_canonicalize_path('/./../../abc/def/');
+ test_canonicalize_path
+------------------------
+ /abc/def
+(1 row)
+
+SELECT * FROM test_canonicalize_path('/abc/.././def/ghi');
+ test_canonicalize_path
+------------------------
+ /def/ghi
+(1 row)
+
+SELECT * FROM test_canonicalize_path('/abc/./../def/ghi//');
+ test_canonicalize_path
+------------------------
+ /def/ghi
+(1 row)
+
+SELECT * FROM test_canonicalize_path('/abc/def/../../../../ghi/jkl');
+ test_canonicalize_path
+------------------------
+ /ghi/jkl
+(1 row)
+
+SELECT * FROM test_canonicalize_path('.');
+ test_canonicalize_path
+------------------------
+ .
+(1 row)
+
+SELECT * FROM test_canonicalize_path('./');
+ test_canonicalize_path
+------------------------
+ .
+(1 row)
+
+SELECT * FROM test_canonicalize_path('./abc/..');
+ test_canonicalize_path
+------------------------
+ .
+(1 row)
+
+SELECT * FROM test_canonicalize_path('abc/../');
+ test_canonicalize_path
+------------------------
+ .
+(1 row)
+
+SELECT * FROM test_canonicalize_path('abc/../def');
+ test_canonicalize_path
+------------------------
+ def
+(1 row)
+
+SELECT * FROM test_canonicalize_path('..');
+ test_canonicalize_path
+------------------------
+ ..
+(1 row)
+
+SELECT * FROM test_canonicalize_path('../abc/def');
+ test_canonicalize_path
+------------------------
+ ../abc/def
+(1 row)
+
+SELECT * FROM test_canonicalize_path('../abc/..');
+ test_canonicalize_path
+------------------------
+ ..
+(1 row)
+
+SELECT * FROM test_canonicalize_path('../abc/../def');
+ test_canonicalize_path
+------------------------
+ ../def
+(1 row)
+
+SELECT * FROM test_canonicalize_path('../abc/../../def/ghi');
+ test_canonicalize_path
+------------------------
+ ../../def/ghi
+(1 row)
+
+SELECT * FROM test_canonicalize_path('./abc/./def/.');
+ test_canonicalize_path
+------------------------
+ abc/def
+(1 row)
+
+SELECT * FROM test_canonicalize_path('./abc/././def/.');
+ test_canonicalize_path
+------------------------
+ abc/def
+(1 row)
+
+SELECT * FROM test_canonicalize_path('./abc/./def/.././ghi/../../../jkl/mno');
+ test_canonicalize_path
+------------------------
+ ../jkl/mno
+(1 row)
+
 --
 -- pg_log_backend_memory_contexts()
 --
diff --git a/src/test/regress/regress.c b/src/test/regress/regress.c
index eefbd0f0df..0802fb9136 100644
--- a/src/test/regress/regress.c
+++ b/src/test/regress/regress.c
@@ -531,6 +531,16 @@ int44out(PG_FUNCTION_ARGS)
     PG_RETURN_CSTRING(result);
 }

+PG_FUNCTION_INFO_V1(test_canonicalize_path);
+Datum
+test_canonicalize_path(PG_FUNCTION_ARGS)
+{
+    char       *path = text_to_cstring(PG_GETARG_TEXT_PP(0));
+
+    canonicalize_path(path);
+    PG_RETURN_TEXT_P(cstring_to_text(path));
+}
+
 PG_FUNCTION_INFO_V1(make_tuple_indirect);
 Datum
 make_tuple_indirect(PG_FUNCTION_ARGS)
diff --git a/src/test/regress/sql/create_function_1.sql b/src/test/regress/sql/create_function_1.sql
index 4170b16fe6..34cc7c6efc 100644
--- a/src/test/regress/sql/create_function_1.sql
+++ b/src/test/regress/sql/create_function_1.sql
@@ -29,3 +29,8 @@ CREATE FUNCTION int44out(city_budget)
    RETURNS cstring
    AS :'regresslib'
    LANGUAGE C STRICT IMMUTABLE;
+
+CREATE FUNCTION test_canonicalize_path(text)
+   RETURNS text
+   AS :'regresslib'
+   LANGUAGE C STRICT IMMUTABLE;
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index 1159f6b585..448114808d 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -30,6 +30,31 @@ SELECT num_nulls(VARIADIC '{}'::int[]);
 SELECT num_nonnulls();
 SELECT num_nulls();

+--
+-- canonicalize_path()
+--
+
+SELECT * FROM test_canonicalize_path('/');
+SELECT * FROM test_canonicalize_path('/./abc/def/');
+SELECT * FROM test_canonicalize_path('/./../abc/def');
+SELECT * FROM test_canonicalize_path('/./../../abc/def/');
+SELECT * FROM test_canonicalize_path('/abc/.././def/ghi');
+SELECT * FROM test_canonicalize_path('/abc/./../def/ghi//');
+SELECT * FROM test_canonicalize_path('/abc/def/../../../../ghi/jkl');
+SELECT * FROM test_canonicalize_path('.');
+SELECT * FROM test_canonicalize_path('./');
+SELECT * FROM test_canonicalize_path('./abc/..');
+SELECT * FROM test_canonicalize_path('abc/../');
+SELECT * FROM test_canonicalize_path('abc/../def');
+SELECT * FROM test_canonicalize_path('..');
+SELECT * FROM test_canonicalize_path('../abc/def');
+SELECT * FROM test_canonicalize_path('../abc/..');
+SELECT * FROM test_canonicalize_path('../abc/../def');
+SELECT * FROM test_canonicalize_path('../abc/../../def/ghi');
+SELECT * FROM test_canonicalize_path('./abc/./def/.');
+SELECT * FROM test_canonicalize_path('./abc/././def/.');
+SELECT * FROM test_canonicalize_path('./abc/./def/.././ghi/../../../jkl/mno');
+
 --
 -- pg_log_backend_memory_contexts()
 --

Re: drop tablespace failed when location contains .. on win32

От
Michael Paquier
Дата:
On Sun, Jan 30, 2022 at 04:50:03PM -0500, Tom Lane wrote:
> Here's a revised patch version that does it like that.  I also
> reviewed and simplified the canonicalize_path logic.  I think
> this is committable.

Thanks for the updated version.  The range of the tests looks fine
enough, and the CF bot does not complain.  The code is
straight-forward and pretty clear in terms of the handling of ".",
".." and the N-depth handling necessary.

Should we have tests for WIN32 (aka for driver letters and "//")?
This could be split into its own separate test file to limit the
damage with the alternate outputs, and the original complain was from
there.

> (I suspect that adminpack's checks for unsafe file names could
> now be simplified substantially, because many of the corner cases
> it worries about are no longer possible, as evidenced by the change
> in error message there.  I've not pursued that, however.)

Fine by me to let this part for later.
--
Michael

Вложения

Re: drop tablespace failed when location contains .. on win32

От
Tom Lane
Дата:
Michael Paquier <michael@paquier.xyz> writes:
> Should we have tests for WIN32 (aka for driver letters and "//")?
> This could be split into its own separate test file to limit the
> damage with the alternate outputs, and the original complain was from
> there.

I thought about it and concluded that the value couldn't justify
the pain-in-the-neck factor of adding a platform-specific variant
result file.  skip_drive() is pretty simple and decoupled from what
we're trying to test here, plus it hasn't changed in decades and
is unlikely to do so in future.

            regards, tom lane



Re: drop tablespace failed when location contains .. on win32

От
Andrew Dunstan
Дата:
On 1/30/22 16:50, Tom Lane wrote:
> Michael Paquier <michael@paquier.xyz> writes:
>> In order to make the tests cheap, there is no need to have a separate
>> module in src/test/modules/ as your patch 0002 is doing.  Instead, you
>> should move the C code of your SQL function test_canonicalize_path()
>> to src/test/regress/regress.c, then add some tests in
>> src/test/regress/sql/, with a SQL function created in the test script
>> that feeds from what would be added to regress.so.
> Here's a revised patch version that does it like that.  I also
> reviewed and simplified the canonicalize_path logic.  I think
> this is committable.


LGTM


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: drop tablespace failed when location contains .. on win32

От
Tom Lane
Дата:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 1/30/22 16:50, Tom Lane wrote:
>> Here's a revised patch version that does it like that.  I also
>> reviewed and simplified the canonicalize_path logic.  I think
>> this is committable.

> LGTM

Pushed, thanks for looking.

I think I'm also going to have a look at simplifying some of the
dependent code, just because it feels weird to leave that unfinished.
In particular, Shenhao-san suggested upthread that we could remove
path_contains_parent_reference().  I complained about that at the
time, but I hadn't quite absorbed the fact that an absolute path
is now *guaranteed* not to have any ".." after canonicalize_path.
So the existing calls in adminpack.c and genfile.c are certainly
dead code.  We probably want to keep path_contains_parent_reference()
in case some extension is using it, but seeing that its API spec
already requires the input to be canonicalized, it could be simplified
to just check for ".." at the start.

            regards, tom lane