Re: Dollar in identifiers

From Tom Lane
Subject Re: Dollar in identifiers
Msg-id 29572.998060418@sss.pgh.pa.us
In reply to Re: Dollar in identifiers  (Jan Wieck <JanWieck@Yahoo.com>)
Responses Re: Dollar in identifiers  (Bruce Momjian <pgman@candle.pha.pa.us>)
Re: Dollar in identifiers  (Peter Eisentraut <peter_e@gmx.net>)
Re: Dollar in identifiers  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I've been thinking some more about this dollar-sign business.  There
are a couple of points that haven't been made yet.  If you'll allow
me to recap:

It seems like there are two reasonable paths we could take:

1. Keep $ as an operator character.  If we go this way, I think we
should allow a single $ as an operator name too (by removing $ from
the set of "self" characters in scan.l, so that it lexes as an Op).

2. Make $ an identifier character.  Remove it from the set of allowed
operator characters, and instead allow it as second-or-later character
in identifiers.  (It cannot be allowed as first character, else it's
totally ambiguous whether $12 is meant to be a parameter or identifier.)

Option 2 improves Oracle compatibility, at the price of breaking
backwards compatibility for applications that presently use $ as part
of multi-character operator names.  (But does anyone know of any?)

An important thing to think about here is the effects on lexing of
parameter symbols ($digits).  Option 1 does not complicate parameter
lexing; $digits will still be read as a parameter since it's a longer
token than could be formed by taking the $ as an Op.  However, this
option doesn't make things any better either: in particular, we still
have the lexing ambiguity of multi-character operator vs. parameter.
"x+$12" will be read as x +$ 12, though more likely x + $12 was meant.
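
To make the option-1 behavior concrete, here is a toy longest-match lexer
in Python.  It is only a sketch: the rule set and the names RULES and lex
are made up for illustration and are far simpler than the real scan.l,
but the longest-match rule alone is enough to show the ambiguity.

```python
import re

# Toy rules, NOT the real scan.l.  "$" is an operator character here (option 1).
# The longest match wins; list order only breaks ties between equal lengths.
RULES = [
    ("param", re.compile(r"\$\d+")),
    ("ident", re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
    ("int", re.compile(r"\d+")),
    ("op", re.compile(r"[-+*/<>=~!@#%^&|?$]+")),
]


def lex(text):
    """Split text into (kind, lexeme) pairs using longest-match."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        best = None
        for kind, pattern in RULES:
            m = pattern.match(text, pos)
            if m and (best is None or len(m.group()) > len(best[1])):
                best = (kind, m.group())
        if best is None:
            raise ValueError(f"cannot lex {text[pos]!r} at offset {pos}")
        tokens.append(best)
        pos += len(best[1])
    return tokens


print(lex("x+$12"))
print(lex("x + $12"))
```

Under these assumed rules, "x+$12" comes out as the operator +$ followed
by the integer 12, while "x + $12" comes out as + followed by the
parameter $12.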

With $-as-identifier, it'd no longer be possible for adjacent operators
and parameters to be confused.  Instead we have a new ambiguity with
adjacent parameters and identifiers/keywords.  Presently "select$1from"
is read as SELECT param FROM, but with $-as-identifier it'd be read as
a single identifier.  But the interesting point is that this'd make
parameters work a lot more like identifiers.  People don't expect to
be able to write identifiers adjacent to other identifiers with no
whitespace.  They do expect to be able to write them adjacent to
operators.
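
A rough sketch of the option-2 behavior, again as a toy longest-match
lexer in Python.  The rules and the names RULES and lex are assumptions
for illustration only, not the real scan.l: "$" is dropped from the
operator characters and allowed as a second-or-later identifier character.

```python
import re

# Toy rules, NOT the real scan.l.  "$" is an identifier character here
# (option 2): allowed after the first character, never as the first.
RULES = [
    ("param", re.compile(r"\$\d+")),
    ("ident", re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")),
    ("int", re.compile(r"\d+")),
    ("op", re.compile(r"[-+*/<>=~!@#%^&|?]+")),
]


def lex(text):
    """Split text into (kind, lexeme) pairs using longest-match."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        best = None
        for kind, pattern in RULES:
            m = pattern.match(text, pos)
            if m and (best is None or len(m.group()) > len(best[1])):
                best = (kind, m.group())
        if best is None:
            raise ValueError(f"cannot lex {text[pos]!r} at offset {pos}")
        tokens.append(best)
        pos += len(best[1])
    return tokens


print(lex("x+$12"))
print(lex("select$1from"))
```

With these assumed rules, "x+$12" splits into x, +, and the parameter
$12, while "select$1from" is swallowed whole as one identifier.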

In fact, with $-as-identifier we'd have this useful property: given a
lexically-recognizable identifier, substitution of a parameter token
for the identifier does not require insertion of any whitespace to
keep the parameter lexically recognizable.  Some of you will recall
plpgsql bugs associated with the fact that the current lexer behavior
does not have this property.  (The other direction doesn't work 100%,
for example: "select $1from" is lexable, "select foofrom" isn't.  But
that direction is much less interesting in practice.)
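
The substitution property can be checked mechanically on a toy model.
The sketch below is an illustration under assumed rules, not the real
scan.l; make_lexer and substitution_is_safe are hypothetical helpers,
and the textual replace is deliberately naive.

```python
import re


def make_lexer(dollar_in_identifiers):
    """Build a toy longest-match lexer for option 2 (True) or option 1 (False)."""
    if dollar_in_identifiers:
        ident, op = r"[A-Za-z_][A-Za-z0-9_$]*", r"[-+*/<>=~!@#%^&|?]+"
    else:
        ident, op = r"[A-Za-z_][A-Za-z0-9_]*", r"[-+*/<>=~!@#%^&|?$]+"
    rules = [
        ("param", re.compile(r"\$\d+")),
        ("ident", re.compile(ident)),
        ("int", re.compile(r"\d+")),
        ("op", re.compile(op)),
    ]

    def lex(text):
        tokens = []
        pos = 0
        while pos < len(text):
            if text[pos].isspace():
                pos += 1
                continue
            best = None
            for kind, pattern in rules:
                m = pattern.match(text, pos)
                if m and (best is None or len(m.group()) > len(best[1])):
                    best = (kind, m.group())
            if best is None:
                raise ValueError(f"cannot lex {text[pos]!r} at offset {pos}")
            tokens.append(best)
            pos += len(best[1])
        return tokens

    return lex


def substitution_is_safe(lex, text, name, param):
    """True if swapping identifier `name` for `param` changes only that token."""
    expected = [("param", param) if tok == ("ident", name) else tok
                for tok in lex(text)]
    return lex(text.replace(name, param)) == expected


option1 = make_lexer(dollar_in_identifiers=False)
option2 = make_lexer(dollar_in_identifiers=True)
print(substitution_is_safe(option1, "x+foo", "foo", "$1"))
print(substitution_is_safe(option2, "x+foo", "foo", "$1"))
```

In this model the check fails for option 1, because "x+$1" lexes as x,
+$, 1, and succeeds for option 2.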

In short, $-as-identifier makes the lexer behavior noticeably cleaner
than it is now.

I started out firmly in the "keep $ an operator character" camp.  But
after thinking this through I'm sitting on the fence: both options seem
about equally attractive to me.

Comments?
        regards, tom lane

