Обсуждение: \dn [PATTERN] handling not quite right...
I haven't looked in great detail into why this is happpening, but it
seems as though processNamePattern() doesn't handle ?'s correctly in
the negative lookahead context correctly.
1) \dn [pattern] strips ?'s and replaces them with periods. This may
be intentional (as the comment in describe.c suggests, converting input
from shell-style wildcards gets converted into regexp notation), but is
quite annoying. Ex:
test=# \dn foo(?!_log|_shadow)
********* QUERY **********
SELECT n.nspname AS "Name",
u.usename AS "Owner"
FROM pg_catalog.pg_namespace n LEFT JOIN pg_catalog.pg_user u
ON n.nspowner=u.usesysid
WHERE (n.nspname NOT LIKE 'pg\\_temp\\_%' OR
n.nspname = (pg_catalog.current_schemas(true))[1])
AND n.nspname ~ '^foo(.!_log|_shadow)$'
ORDER BY 1;
**************************
Which is incorrect, IMHO. Instead the last bit of the query should be:
AND n.nspname ~ '^foo(?!_log|_shadow)$'
2) This brings up a large deficiency with the way that \d? [pattern]
handling is done in psql(1). It'd be slick if there was a way to have
psql's pattern routine look at the first non-whitespace character or
two to change change the structure of the query. Something like \dn
!.*_shadow% would change the RE operator from ~ to !~ and \dn %bar%
would translate to LIKE('bar%'). Doing the regexp equiv of
!LIKE('%_shadow') isn't trivial because '^.*(?!_shadow)$' doesn't
return the expected result for various reasons. Oh! This'd be a "gun
pointed at foot" feature, but having the first character being an =
would, without escaping, drop the remainder of the input directly into
the query (ex: \dn =nspname != (LIKE('%_log') OR LIKE('%_shadow'))).
Maybe a psql(1) variable that changes the behavior of the pattern
queries from using an RE to a LIKE statement could also be a
possibility. The more I think about this, a leading pipe could be used
to pipe the output to a utility, so that \dn | egrep -v '(log|shadow)
would work and would be the easiest solution.
Maybe a better "bug report" would be, what's the suggested way of doing:
n.nspname !~ '_(log|shadow)$'?
from a list pattern?
-sc
--
Sean Chittenden
Sean Chittenden <sean@chittenden.org> writes:
> I haven't looked in great detail into why this is happpening, but it
> seems as though processNamePattern() doesn't handle ?'s correctly in
> the negative lookahead context correctly.
Negative lookahead context!? You are several sigmas beyond the subset
of regex functionality that \d and friends are intended to support.
Given that we're defining * and ? as shell-style wildcards, it's not
going to be feasible to handle very much of ordinary regex usage let
alone advanced.
> The more I think about this, a leading pipe could be used
> to pipe the output to a utility, so that \dn | egrep -v '(log|shadow)
> would work and would be the easiest solution.
This on the other hand seems more like a potentially useful feature,
although I'm unclear on what you expect to get sent through the pipe
exactly --- you want column headers for instance? What if you're using
a nondefault display layout?
regards, tom lane
>> I haven't looked in great detail into why this is happpening, but it
>> seems as though processNamePattern() doesn't handle ?'s correctly in
>> the negative lookahead context correctly.
>
> Negative lookahead context!? You are several sigmas beyond the subset
> of regex functionality that \d and friends are intended to support.
> Given that we're defining * and ? as shell-style wildcards, it's not
> going to be feasible to handle very much of ordinary regex usage let
> alone advanced.
I was worried you'd say as much. I'm in a situation where I've got a
few hundred schemas floating around now and about half of them end with
_log or _shadow and I was surprised at how non-trivial it was to filter
out the _log or _shadow schemas with \dn. I tried thinking up the psql
equiv of tcsh's fignore but had no luck (ex: set fignore = (\~ .o
.bak)).
>> The more I think about this, a leading pipe could be used
>> to pipe the output to a utility, so that \dn | egrep -v '(log|shadow)
>> would work and would be the easiest solution.
>
> This on the other hand seems more like a potentially useful feature,
> although I'm unclear on what you expect to get sent through the pipe
> exactly --- you want column headers for instance? What if you're using
> a nondefault display layout?
Instead of using printf(), fprintf(), fwrite(), or whatever it is that
psql(1) uses internally for displaying result sets, have it use the
following chunk of pseudo code:
if (pipe_symbol_found) {
char buf[8192];
size_t len = 0;
memset(&buf, 0, sizeof(buf));
fh = popen(..., "r+"); /* or setsocketpair() + fork() */
fwrite(formatted_output_buffer, strlen(formatted_output_buffer), 1,
fh);
while((len = read(fileno(fh), buf, sizeof(buf)))) {
fwrite(buf, len, 1, stdout);
}
} else {
/* whatever the current code does */
}
That doesn't take into account the set option that lets you write the
output to a file, but that is trivial to handle. To answer your
question, I'd send _everything_ through the pipe and use the pipe as a
blanket IO filter. I haven't thought about this, but would it be
possible to hand the data off to sh(1) and have it handle the
pipe/redirection foo that way psql doesn't have to have any
pipe/redirection brains? If so, I think that'd be slick since you
could do things like '\dn | tail -n +3 | grep -v blah' to handle your
concern about having the header sent through and a utility not wanting
it.
Too bad tee(1) doesn't support a -p option to have tee(1)'s argument
sent to sh(1) or a pipe instead of a file, then there'd be some real
interesting things that one could script. Ex:
\dn | tee -p 'head -n 3 >> /dev/stdout' | tail -n +3 | egrep -v
'_(log|shadow)$'
Which'd show you the header, but everything after the header would be
sent to egrep(1). I can't understand why win32 users think *NIX's CLI
can be confusing.... *grin*.
Just some thoughts. -sc
--
Sean Chittenden