Обсуждение: [PATCH] COPY vs \copy HINT

Поиск
Список
Период
Сортировка

[PATCH] COPY vs \copy HINT

От
Craig Ringer
Дата:
Hi all

I see this sort of question quite a bit:


where the user wonders why

COPY gemeenten 
  FROM 'D:\CBS_woningcijfers_2014.csv'
  DELIMITER ';' CSV

fails with 

ERROR:  could not open file "D:\CBS_woningcijfers_2014.csv" for reading: No such file or directory'

and as usual, it's because the path is on their local host not the Pg server.

I think we should emit a HINT here, something like:

ERROR:  could not open file "D:\CBS_woningcijfers_2014.csv" for reading: No such file or directory'
HINT:  Paths for COPY are on the PostgreSQL server, not the client. You may want psql's \copy or a driver COPY ... FROM STDIN wrapper


as a usability improvement. Trivial patch attached. 

I'm not sure how to avoid the need to translate the string multiple times, or if the tooling will automatically flatten them. I haven't bothered with the stat() failure or isdir cases, since they seem less likely to be the cause of users being confused between client and server.

Sample output:

postgres=# COPY x FROM '/tmp/somepath';
ERROR:  could not open file "/tmp/somepath" for reading: No such file or directory
HINT:  Paths for COPY are on the PostgreSQL server, not the client. You may want psql's \copy or a driver COPY ... FROM STDIN wrapper

postgres=# COPY x TO '/root/nopermissions';
ERROR:  could not open file "/root/nopermissions" for writing: Permission denied
HINT:  Paths for COPY are on the PostgreSQL server, not the client. You may want psql's \copy or a driver COPY ... FROM STDIN wrapper

postgres=# COPY x TO 'relpath';
ERROR:  relative path not allowed for COPY to file
HINT:  Paths for COPY are on the PostgreSQL server, not the client. You may want psql's \copy or a driver COPY ... FROM STDIN wrapper



--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Вложения

Re: [PATCH] COPY vs \copy HINT

От
Christoph Berg
Дата:
Re: Craig Ringer 2016-08-12 <CAMsr+YEqtD97qPEzQDqrCt5QiqPbWP_X4hmvy2pQzWC0GWiyPA@mail.gmail.com>
> I think we should emit a HINT here, something like:
>
> ERROR:  could not open file "D:\CBS_woningcijfers_2014.csv" for reading: No
> such file or directory'
> HINT:  Paths for COPY are on the PostgreSQL server, not the client. You may
> want psql's \copy or a driver COPY ... FROM STDIN wrapper

+1 on the idea.

> postgres=# COPY x TO '/root/nopermissions';
> ERROR:  could not open file "/root/nopermissions" for writing: Permission
> denied
> HINT:  Paths for COPY are on the PostgreSQL server, not the client. You may
> want psql's \copy or a driver COPY ... FROM STDIN wrapper

TO STDOUT.

Also, I vaguely get what you wanted to say with "a driver ...
wrapper", but it's pretty nonsensical if one doesn't know about the
protocol details. I don't have a better suggestion now, but I think it
needs rephrasing.

Christoph

Re: [PATCH] COPY vs \copy HINT

От
Craig Ringer
Дата:
On 12 August 2016 at 16:34, Christoph Berg <myon@debian.org> wrote:
 
> postgres=# COPY x TO '/root/nopermissions';
> ERROR:  could not open file "/root/nopermissions" for writing: Permission
> denied
> HINT:  Paths for COPY are on the PostgreSQL server, not the client. You may
> want psql's \copy or a driver COPY ... FROM STDIN wrapper

TO STDOUT.

Ahem, whoops. It's the simple things sometimes...

Attached amended.
 
Also, I vaguely get what you wanted to say with "a driver ...
wrapper", but it's pretty nonsensical if one doesn't know about the
protocol details. I don't have a better suggestion now, but I think it
needs rephrasing.

I don't like it either, but didn't come up with anything better. The problem is that every driver calls it something different.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Вложения

Re: [PATCH] COPY vs \copy HINT

От
Tom Lane
Дата:
Craig Ringer <craig@2ndquadrant.com> writes:
> On 12 August 2016 at 16:34, Christoph Berg <myon@debian.org> wrote:
>> Also, I vaguely get what you wanted to say with "a driver ...
>> wrapper", but it's pretty nonsensical if one doesn't know about the
>> protocol details. I don't have a better suggestion now, but I think it
>> needs rephrasing.

> I don't like it either, but didn't come up with anything better. The
> problem is that every driver calls it something different.

A few thoughts on this patch:

1. I don't really think the HINT is appropriate for the not-absolute-path
case.

2. I don't think it's appropriate for all possible cases of AllocateFile
failure either, eg surely not for EACCES or similar situations where we
did find a file.  Maybe print it only for ENOENT?  (See for example
report_newlocale_failure() for technique.)

3. As for the wording, maybe you could do it like this:

HINT:  COPY copies to[from] a file on the PostgreSQL server, not on the
client.  You may want a client-side facility such as psql's \copy.

That avoids trying to invent a name for other implementations.
        regards, tom lane



Re: [PATCH] COPY vs \copy HINT

От
Craig Ringer
Дата:
On 2 September 2016 at 04:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Craig Ringer <craig@2ndquadrant.com> writes:
>> On 12 August 2016 at 16:34, Christoph Berg <myon@debian.org> wrote:
>>> Also, I vaguely get what you wanted to say with "a driver ...
>>> wrapper", but it's pretty nonsensical if one doesn't know about the
>>> protocol details. I don't have a better suggestion now, but I think it
>>> needs rephrasing.
>
>> I don't like it either, but didn't come up with anything better. The
>> problem is that every driver calls it something different.
>
> A few thoughts on this patch:

Thanks for the review. I'll leave the first change pending your
comments, the others are made, though I'm not completely sure we
should restrict it to ENOENT.

> 1. I don't really think the HINT is appropriate for the not-absolute-path
> case.

Why? If the user runs

# COPY sometable FROM 'localfile.csv' WITH (FORMAT CSV);
ERROR:  could not open file "localfile.csv" for reading: No such file
or directory

they're going to be just as confused by this error as by:

# COPY batch_demo FROM '/tmp/localfile.csv' WITH (FORMAT CSV);
ERROR:  could not open file "/tmp/localfile.csv" for reading: No such
file or directory

so I don't really see the rationale for this change.

> 2. I don't think it's appropriate for all possible cases of AllocateFile
> failure either, eg surely not for EACCES or similar situations where we
> did find a file.  Maybe print it only for ENOENT?  (See for example
> report_newlocale_failure() for technique.)

I thought about that but figured it didn't really matter too much,
when thinking about examples like

# COPY batch_demo FROM '/root/secret.csv' WITH (FORMAT CSV);
ERROR:  could not open file "/root/secret.csv" for reading: Permission denied

or whatever, where the user doesn't understand why they can't read the
file given that their local client has permission to do so.

I don't feel strongly about this and think that the error on ENOENT is
by far the most important, so I'll adjust it per your recommendation.

> 3. As for the wording, maybe you could do it like this:
>
> HINT:  COPY copies to[from] a file on the PostgreSQL server, not on the
> client.  You may want a client-side facility such as psql's \copy.
>
> That avoids trying to invent a name for other implementations.

I like that wording a lot more, thanks. Adopted.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Вложения

Re: [PATCH] COPY vs \copy HINT

От
Christoph Berg
Дата:
Re: Craig Ringer 2016-09-02 <CAMsr+YFr6Sk=bbU2yCORN7z9foHF0cqx29vk5B49DswQ6EkVxg@mail.gmail.com>
> I thought about that but figured it didn't really matter too much,
> when thinking about examples like
> 
> # COPY batch_demo FROM '/root/secret.csv' WITH (FORMAT CSV);
> ERROR:  could not open file "/root/secret.csv" for reading: Permission denied
> 
> or whatever, where the user doesn't understand why they can't read the
> file given that their local client has permission to do so.
> 
> I don't feel strongly about this and think that the error on ENOENT is
> by far the most important, so I'll adjust it per your recommendation.

Couldn't you just add EACCESS to the check as well?

> > 3. As for the wording, maybe you could do it like this:
> >
> > HINT:  COPY copies to[from] a file on the PostgreSQL server, not on the
> > client.  You may want a client-side facility such as psql's \copy.
> >
> > That avoids trying to invent a name for other implementations.
> 
> I like that wording a lot more, thanks. Adopted.

Same here, thanks!

Christoph



Re: [PATCH] COPY vs \copy HINT

От
Craig Ringer
Дата:
On 2 September 2016 at 17:05, Christoph Berg <myon@debian.org> wrote:
> Re: Craig Ringer 2016-09-02 <CAMsr+YFr6Sk=bbU2yCORN7z9foHF0cqx29vk5B49DswQ6EkVxg@mail.gmail.com>
>> I thought about that but figured it didn't really matter too much,
>> when thinking about examples like
>>
>> # COPY batch_demo FROM '/root/secret.csv' WITH (FORMAT CSV);
>> ERROR:  could not open file "/root/secret.csv" for reading: Permission denied
>>
>> or whatever, where the user doesn't understand why they can't read the
>> file given that their local client has permission to do so.
>>
>> I don't feel strongly about this and think that the error on ENOENT is
>> by far the most important, so I'll adjust it per your recommendation.
>
> Couldn't you just add EACCESS to the check as well?

Yeah, probably. Those are really the only two failures I think it's
worth reporting for.


-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: [PATCH] COPY vs \copy HINT

От
Tom Lane
Дата:
Craig Ringer <craig@2ndquadrant.com> writes:
> On 2 September 2016 at 04:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> 1. I don't really think the HINT is appropriate for the not-absolute-path
>> case.

> Why? If the user runs
> # COPY sometable FROM 'localfile.csv' WITH (FORMAT CSV);
> ERROR:  could not open file "localfile.csv" for reading: No such file
> or directory
> they're going to be just as confused by this error as by:
> # COPY batch_demo FROM '/tmp/localfile.csv' WITH (FORMAT CSV);
> ERROR:  could not open file "/tmp/localfile.csv" for reading: No such
> file or directory
> so I don't really see the rationale for this change.

I think you misinterpreted what I was complaining about; it's not the
FROM case but the TO case, specifically this addition:
            if (!is_absolute_path(filename))                ereport(ERROR,
(errcode(ERRCODE_INVALID_NAME),
-                      errmsg("relative path not allowed for COPY to file")));
+                         errmsg("relative path not allowed for COPY to file"),
+                         errhint("COPY copies to a file on the PostgreSQL server, not on the client. "
+                                 "You may want a client-side facility such as psql's \\copy.")));

The reason I'm not happy about that is that the hint is completely
unrelated to the error condition, and I think that such a hint is likely
to be more confusing than helpful.  What's likely to happen if we leave
it alone is that someone who's confused about client vs. server will do

=# COPY sometable TO 'localfile.csv' WITH (FORMAT CSV);
ERROR:  relative path not allowed for COPY to file

[ user mutters something about hopeless pedants and writes an absolute
path: ]

=# COPY sometable TO '/home/joe/localfile.csv' WITH (FORMAT CSV);
ERROR:  could not open file "/home/joe/localfile.csv" for writing: No such file or directory

At this point, it's appropriate to give the HINT, but I think doing it
for the first case is just confusing.  There's an opportunity cost to
hints like this, which is that if they are in fact wildly unrelated
to the real problem, they are more likely to send the user down a blind
alley than lead his thoughts to the right answer.  What you want to do
is essentially cutting one step out of the process by jumping to the
conclusion that if the user gave a relative path, it's because he's
confused about client vs. server, and I don't see how that follows.

>> 2. I don't think it's appropriate for all possible cases of AllocateFile
>> failure either, eg surely not for EACCES or similar situations where we
>> did find a file.  Maybe print it only for ENOENT?

> I thought about that but figured it didn't really matter too much,
> when thinking about examples like
> # COPY batch_demo FROM '/root/secret.csv' WITH (FORMAT CSV);
> ERROR:  could not open file "/root/secret.csv" for reading: Permission denied
> or whatever, where the user doesn't understand why they can't read the
> file given that their local client has permission to do so.

Meh.  Hinting in this case could be helpful only if the user in fact had
identical directory structures on client and server and a data file in the
right place on both.  That doesn't seem terribly likely, and again I argue
that the downsides of giving an irrelevant hint outweigh the occasional
benefit when it happens to be right.  I mean, taking this position just
one step further, we should annotate every COPY data error with this same
hint, arguing that maybe the user has the right data on one machine and
slightly wrong data on the other machine.  Once in a blue moon that would
indeed be the answer, but most of the time it would be wrong, and users
would soon start getting annoyed with the hint --- maybe to the point
where they automatically ignore that hint even when it's right.

In short, I like hints as long as they're carefully targeted, but if
they're not they are likely to be seen as a net negative by users.
I think giving this hint for the ENOENT case is probably fine, but
I'm much less convinced about other cases.

One additional case that I could believe is useful is whatever error
code Windows gives when you specify a drive letter that's not in use.
(Maybe that's ENOENT but I bet not.)
        regards, tom lane



Re: [PATCH] COPY vs \copy HINT

От
Tom Lane
Дата:
I wrote:
> Craig Ringer <craig@2ndquadrant.com> writes:
>> I thought about that but figured it didn't really matter too much,
>> when thinking about examples like
>> # COPY batch_demo FROM '/root/secret.csv' WITH (FORMAT CSV);
>> ERROR:  could not open file "/root/secret.csv" for reading: Permission denied
>> or whatever, where the user doesn't understand why they can't read the
>> file given that their local client has permission to do so.

> Meh.  Hinting in this case could be helpful only if the user in fact had
> identical directory structures on client and server and a data file in the
> right place on both.

So my consciousness was raised just now by an example of exactly this
scenario over in pgsql-novice.  What I forgot was that the client may
in fact be on the same machine as the server, in which case EACCES
is pretty much exactly what you'd expect.  So we probably do want to
hint for that case, but the hint wording I previously suggested no
longer seems like le mot juste ... it needs to cover the idea that
the client and server are different processes on the same machine.

I don't suppose there's any easy way for COPY to distinguish local
from remote connections --- if it could, that might influence both
the errnos to hint for and the phrasing of the hint.
        regards, tom lane



Re: [PATCH] COPY vs \copy HINT

От
Craig Ringer
Дата:
On 4 September 2016 at 23:33, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> So my consciousness was raised just now by an example of exactly this
> scenario over in pgsql-novice.  What I forgot was that the client may
> in fact be on the same machine as the server, in which case EACCES
> is pretty much exactly what you'd expect.

Yep. Also in cases with common paths, like /root or whatever.

> So we probably do want to
> hint for that case, but the hint wording I previously suggested no
> longer seems like le mot juste ... it needs to cover the idea that
> the client and server are different processes on the same machine.

Yeah, may be. *We* know that in this case "client" and "server" still
applies since client and server can be on the same host, but the
people who needs this hint may not understand that. Though there's
only so much we can do or should try to do in a HINT.

I think the most important bit is pointing them at \copy, so it's still useful.

To cover the same-host case we could try something like:

   COPY runs on the PostgreSQL server, using the PostgreSQL server's
directories and permissions, it doesn't run on the client.

... but I think that's actually less helpful for the users who'll need this.

We could say "COPY runs as the PostgreSQL server" but that's the kind
of hint that mostly helps people who already understand it and don't
actually the hint.

(BTW, whoever came up with "EACCES" needs to go spend time with the
creat() system call somewhere dark and smelly).

> I don't suppose there's any easy way for COPY to distinguish local
> from remote connections

Not that I see, since "local" can be unix socket or tcp to localhost.
Not cleanly anyway.

I don't think it matters. Many hosts have enough paths in common that
in practice the hint on EACCES will be useful anyway. It'd be nice to
work in something about running with the permissions of the PostgreSQL
server, but I don't see a way to do that without making it all more
complex.

I don't think it's worth tons of time anyway. This will be better than
what we have, lets do it.


I'm fairly happy with the wording you came up with:

    "COPY copies to a file on the PostgreSQL server, not on the client. "
    "You may want a client-side facility such as psql's \\copy."

and am inclined to suggest going ahead with the existing wording. I
agree that removing the part for "relative path not allowed for COPY
to file" is reasonable, so I've attached an update that does so and
warns on EACCES too.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Вложения

Re: [PATCH] COPY vs \copy HINT

От
Craig Ringer
Дата:
On 5 September 2016 at 09:05, Craig Ringer <craig@2ndquadrant.com> wrote:

>I've attached an update that does so and
> warns on EACCES too.

... this time, with required parens.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Вложения

Re: [PATCH] COPY vs \copy HINT

От
Christoph Berg
Дата:
Re: Craig Ringer 2016-09-05 <CAMsr+YHP=hTUKpuHK4LOAmWE_JEe-A281QL0uni_gw-V7jQv2w@mail.gmail.com>
> To cover the same-host case we could try something like:
> 
>    COPY runs on the PostgreSQL server, using the PostgreSQL server's
> directories and permissions, it doesn't run on the client.
> 
> ... but I think that's actually less helpful for the users who'll need this.

The part about the server permissions might be useful to hint at.

> > I don't suppose there's any easy way for COPY to distinguish local
> > from remote connections
> 
> Not that I see, since "local" can be unix socket or tcp to localhost.
> Not cleanly anyway.
> 
> I don't think it matters. Many hosts have enough paths in common that
> in practice the hint on EACCES will be useful anyway. It'd be nice to
> work in something about running with the permissions of the PostgreSQL
> server, but I don't see a way to do that without making it all more
> complex.
> 
> I don't think it's worth tons of time anyway. This will be better than
> what we have, lets do it.

It's probably not helpful trying to distinguish local and remote
connections, the set of possible problems is diverse and this is just
one of them.

> I'm fairly happy with the wording you came up with:
> 
>     "COPY copies to a file on the PostgreSQL server, not on the client. "
>     "You may want a client-side facility such as psql's \\copy."

What about
    "COPY TO instructs the PostgreSQL server to write to a file on the server. "    "You may want a client-side
facilitysuch as psql's \\copy."
 
    "COPY FROM instructs the PostgreSQL server to read from a file on the server. "    "You may want a client-side
facilitysuch as psql's \\copy."
 

This stresses more explicitly that the file is opened by the server
process. (I'm not particularly happy that it says "server" in there
twice, but couldn't think of anything else.)

Christoph



Re: [PATCH] COPY vs \copy HINT

От
Craig Ringer
Дата:
On 5 September 2016 at 16:32, Christoph Berg <myon@debian.org> wrote:

> The part about the server permissions might be useful to hint at.

> What about
>
>      "COPY TO instructs the PostgreSQL server to write to a file on the server. "
>      "You may want a client-side facility such as psql's \\copy."
>
>      "COPY FROM instructs the PostgreSQL server to read from a file on the server. "
>      "You may want a client-side facility such as psql's \\copy."
>
> This stresses more explicitly that the file is opened by the server
> process. (I'm not particularly happy that it says "server" in there
> twice, but couldn't think of anything else.)

Tom, any preference here?

I'm probably inclined to go for your original wording and accept that
it's just too hard to hint at the client/server process split in a
single short message.

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: [PATCH] COPY vs \copy HINT

От
Tom Lane
Дата:
Craig Ringer <craig@2ndquadrant.com> writes:
> Tom, any preference here?
> I'm probably inclined to go for your original wording and accept that
> it's just too hard to hint at the client/server process split in a
> single short message.

I think my original wording is pretty hopeless for the single-machine
case: "COPY copies to a file on the PostgreSQL server, not on the client".
If the server and client are the same machine, that's content-free.

Christoph's idea isn't bad.  We could tweak it to:
COPY TO instructs the PostgreSQL server process to write a file.
COPY FROM instructs the PostgreSQL server process to read a file.

This seems to cover both the wrong-machine and wrong-process-identity
cases, and as a bonus it might help if you've had a brain fade about
which COPY direction is write vs. read.

(I think we're all in agreement that the second sentence of the hint
is fine, so leaving that out of it.)

Comments?
        regards, tom lane



Re: [PATCH] COPY vs \copy HINT

От
Christoph Berg
Дата:
Re: Tom Lane 2016-09-06 <17637.1473192771@sss.pgh.pa.us>
> Christoph's idea isn't bad.  We could tweak it to:
>
>     COPY TO instructs the PostgreSQL server process to write a file.
>
>     COPY FROM instructs the PostgreSQL server process to read a file.
>
> This seems to cover both the wrong-machine and wrong-process-identity
> cases, and as a bonus it might help if you've had a brain fade about
> which COPY direction is write vs. read.
>
> (I think we're all in agreement that the second sentence of the hint
> is fine, so leaving that out of it.)
>
> Comments?

I like your new version, it's crisp and transports the right message.

Christoph

Re: [PATCH] COPY vs \copy HINT

От
Craig Ringer
Дата:
On 7 September 2016 at 04:19, Christoph Berg <myon@debian.org> wrote:

> I like your new version, it's crisp and transports the right message.

OK, updated with Tom's tweaked version of Christoph's wording per
discussion. Thanks all.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Вложения

Re: [PATCH] COPY vs \copy HINT

От
Tom Lane
Дата:
Craig Ringer <craig@2ndquadrant.com> writes:
> On 7 September 2016 at 04:19, Christoph Berg <myon@debian.org> wrote:
>> I like your new version, it's crisp and transports the right message.

> OK, updated with Tom's tweaked version of Christoph's wording per
> discussion. Thanks all.

Pushed with minor stylistic adjustment (narrowing the scope of
save_errno).
        regards, tom lane