Change to make strings SQL-standard

Поиск
Список
Период
Сортировка
От Bruce Momjian
Тема Change to make strings SQL-standard
Дата
Msg-id 200506171458.j5HEwUK14871@candle.pha.pa.us
обсуждение исходный текст
Список pgsql-general
You might know that our current handling of quoted strings is not SQL
compliant.  In fact, we have a TODO item for it:

    * Allow backslash handling in quoted strings to be disabled for
      portability

      The use of C-style backslashes (.e.g. \n, \r) in quoted strings is not
      SQL-spec compliant, so allow such handling to be disabled.  However,
      disabling backslashes could break many third-party applications and
      tools.

This non-standard behavior is probably fine for current PostgreSQL
users, but it makes it difficult for users porting applications written
for other databases.

Attached is a plan to move to standards-compliant strings.  It involves
moving all strings that use escapes to a new E''-type string, and
finally making ordinary strings standards-compliant by treating
backslashes literally.

Comments?

---------------------------------------------------------------------------


                         Making Strings Handling Standard
                         ================================

(Current version at http://candle.pha.pa.us/cgi-bin/pgescape)

Right now PostgreSQL uses '\' in a string for escaping of single quotes,
newlines, octal values, and other special characters.  This conflicts
with the ANSI standard and with most other commercial databases.  It
makes porting queries to PostgreSQL harder.

The current idea is to phase out special backslash handling in normal
query strings, and add a special 'E'-type string that will allow
backslash escapes, e.g.

    'c:\tmp' => 'c:\tmp'

    E'c:\\tmp' => 'c:\tmp'
    E'\007abc' => bell, 'abc'

This would give PostgreSQL standards compliance, and allow escapes to be
easily used in E'' strings.

The basic plan is to encourage all applications to change the use of \'
in strings to '', because that is standards-compliant and works on all
versions of PostgreSQL.  Second, all strings that use backslashes should
use E''-style strings, and once that happens, non-E'' strings will no
longer treat the backslash as a special character.  Basically, all use
of \' are removed, and other uses of backslashes are migrated to E''. In
summary:

    1. Change all \' to SQL-standard ''.
    2. Change strings using \ to E'' strings.
    3. Finally, change non-E strings to treat \ literally.

Problems
--------

Problem:  A routine that escapes quotes in strings as \' instead of ''
would allow malicious code injection, e.g. 'abc\'DROP'.  It should be
'abc''DROP'.

Solution:  Tell users to use only '' to place a quote in a string, or
use E''.

Problem:  Backslashes processed as literals, e.g. '\n' is read as '\' and
'n', or '\\' taken as '\\'.

Solution:  Use E''.

Plan
----

Here is a plan to get us there with minimal disruption for our users.

7.X, 8.0.X - Backpatch E'' as a noop for porting?

8.1 - Add warning if \ appears in a string (off by default), support E''
that never generates a warning.

8.2 - Enable warning by default

8.3 - Have non-'E' strings treat backslashes literally.

Loading dumps:
--------------

7.4 - pg_dump uses '' for literal quotes, so that will load fine.  The
only problem is that backslashes in function bodies will come in as
literal.  Also, if the backslash is inside a quoted string in the quoted
function body, it will be quadrupled, and fixing it might be diffcult.
One possible solution is to backpatch E'' as a no-op and have pg_dump
use it if there is a backslash in the string.

8.0 - Loads fine because it uses $$ for function strings, $$ behavior
does not use backslash escapes.

8.1 - Same.

Misc
----

The COPY format does not change.

Add read-only GUC variable to report if normal strings support escapes,
escape_string_prefixes.

Add read-only GUC variable to report if backslashes are treated literally
in strings, standard_compliant_strings.

PQescapeString() and PQescapeBytea() already uses ''.  However, they
double backslashes.  They should be used only for E'' strings.

Add PQquoteString() for non-E strings to double single quotes.

Use log_min_error_statement=warning to see the statement that generates
the escape warning in the server log file.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

В списке pgsql-general по дате отправления:

Предыдущее
От: Douglas McNaught
Дата:
Сообщение: Re: Multiple COPYs
Следующее
От: Tom Lane
Дата:
Сообщение: Re: query plan in pg7.4 vs 8.0.3