Re: Unicode escapes in literals

From Peter Eisentraut
Subject Re: Unicode escapes in literals
Msg-id 4900928B.60300@gmx.net
In reply to Re: Unicode escapes in literals  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Unicode escapes in literals  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Unicode escapes in literals  (Andrew Sullivan <ajs@commandprompt.com>)
List pgsql-hackers
Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
>> SQL has the following escape syntax for it:
>>     U&'special character: \xxxx' [ UESCAPE '\' ]
> 
> Man that's ugly.  Why the ampersand?

Yeah, excellent question.  It seems completely unnecessary, but it is 
surely there in the syntax diagram.
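
For anyone unfamiliar with the syntax, here is a rough Python sketch of how 
the body of such a literal might be decoded.  It is not PostgreSQL code.  
The name decode_unicode_escapes is made up for illustration, and the sketch 
assumes only the four-hex-digit \xxxx form shown above, plus a doubled 
escape character standing for itself, with the escape character 
replaceable as in the UESCAPE clause.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def decode_unicode_escapes(body: str, uescape: str = "\\") -> str:
    """Decode the body of a U&'...' literal (surrounding quotes removed).

    Hypothetical helper: handles <escape>xxxx (four hex digits) and a
    doubled escape character, nothing else.
    """
    if len(uescape) != 1:
        raise ValueError("UESCAPE must be a single character")
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != uescape:
            # Ordinary character, copied through unchanged.
            out.append(ch)
            i += 1
        elif body[i + 1:i + 2] == uescape:
            # Doubled escape character stands for one literal escape character.
            out.append(uescape)
            i += 2
        else:
            # Escape character must be followed by exactly four hex digits.
            digits = body[i + 1:i + 5]
            if len(digits) != 4 or any(c not in HEX_DIGITS for c in digits):
                raise ValueError(f"invalid Unicode escape at offset {i}")
            out.append(chr(int(digits, 16)))
            i += 5
    return "".join(out)


print(decode_unicode_escapes(r"d\0061t\0061"))       # data
print(decode_unicode_escapes("d!0061t!0061", "!"))   # data
```

The second call corresponds to writing UESCAPE '!' after the literal.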

> How do you propose to distinguish
> this from a perfectly legitimate use of the & operator?

Well, technically, there is going to be some conflict, but the practical 
impact should be minimal because:

- No spaces are allowed inside the U&' sequence.  We typically suggest 
spaces around binary operators.

- Naming a column "u" might not be terribly common.

- Binary-and with an undecorated string literal is not very common.

Of course, I have no data for these assertions.  An inquiry on -general 
might give more insight.
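
To make the conflict concrete, here is a small Python sketch of the rule 
in the first point.  It is only an illustration, not the real scanner: 
the pattern and the name starts_unicode_literal are hypothetical, and it 
assumes the scanner would treat U&' as the start of a Unicode literal 
only when the three characters are directly adjacent.

```python
import re

# Hypothetical token rule: U or u, then & and the opening quote, with no
# whitespace anywhere in between.
UNICODE_LITERAL_START = re.compile(r"[Uu]&'")


def starts_unicode_literal(sql: str, pos: int = 0) -> bool:
    """Return True if a U&' literal would begin at offset pos of sql."""
    return UNICODE_LITERAL_START.match(sql, pos) is not None


# Would be read as a Unicode literal under the new rule, where today it
# is column "u", the & operator, and a plain string literal.
print(starts_unicode_literal("u&'abc'"))      # True

# Written with spaces around the operator, it stays a binary-and.
print(starts_unicode_literal("u & 'abc'"))    # False
```

Only queries of the first kind would change meaning, which is why the 
practical impact depends on how often people actually write them.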

>> 2. Convert this syntax to a function call.  But that would then create a 
>> lot of inconsistencies, such as needing functional indexes for matches 
>> against what should really be a literal.
> 
> Uh, why do you think that?  The function could surely be stable, even
> immutable if you grant that a database's encoding can't change.

Yeah, true, that would work.

There are some other disadvantages for making a function call.  You 
couldn't use that kind of literal in any other place where the parser 
calls for a string constant: role names, tablespace locations, 
passwords, copy delimiters, enum values, function body, file names.

There is also a related feature for Unicode escapes in identifiers, and 
it might be good to keep the door open on that.

We could take a dual approach: convert in the scanner when the server 
encoding is UTF8, and pass it on as a function call otherwise.  Surely 
ugly, though.

Or pass it on as a separate token type to the analyze phase, but that is 
a lot more work.


Others: What use cases do you envision, and what requirements would they 
create for this feature?

