Re: substring start position behavior

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: substring start position behavior
Дата
Msg-id 72911.1709703729@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: substring start position behavior  (Bruce Momjian <bruce@momjian.us>)
Ответы Re: substring start position behavior  (Bruce Momjian <bruce@momjian.us>)
Список pgsql-docs
Bruce Momjian <bruce@momjian.us> writes:
> This web page explains the feature:
>     https://stackoverflow.com/questions/33462061/sql-server-substring-position-negative-value
> but also asks:
>     now the only question that remains is, "why would anyone need it
>     to behave this way?"

Yeah.  I believe our implementation adheres to the SQL spec, which
says this for <character substring function> (in SQL:2021 6.3.2):

    a) If the character encoding form of <character value expression>
    is UTF8, UTF16, or UTF32, then, in the remainder of this General
    Rule, the term “character” shall be taken to mean “unit specified
    by <char length units>”.

    b) Let C be the value of the <character value expression>, let LC
    be the length in characters of C, and let S be the value of the
    <start position>.

    c) If <string length> is specified, then let L be the value of
    <string length> and let E be S+L. Otherwise, let E be the larger
    of LC+1 and S.

    d) If at least one of C, S, and L is the null value, then the
    result of the <character substring function> is the null value.

    e) If E is less than S, then an exception condition is raised:
    data exception — substring error (22011).  [tgl note: given c),
    this happens if and only if a negative <string length> is provided.]

    f) Case:

      i) If S is greater than LC or if E is less than 1 (one), then
      the result of the <character substring function> is the
      zero-length character string.

      ii) Otherwise,

        1) Let S1 be the larger of S and 1 (one). Let E1 be the
        smaller of E and LC+1. Let L1 be E1–S1.

        2) The result of the <character substring function> is a
        character string containing the L1 characters of C starting at
        character number S1 in the same order that the characters
        appear in C.

That's a pretty sterling example of standards-ese that is both
unreadable and devoid of any justification.  But if you trace through
the possible effects of a negative S value, it looks like

  (1) if L >= 0 is specified and S+L (E) is less than one, the result
  is an empty string per rule f)i).

  (2) if L >= 0 is specified and S+L (E) is at least one but less than
  LC+1, then E is the substring end+1 position.

  (3) otherwise, a negative S is disregarded and replaced by 1 so
  far as the substring end calculation is concerned.

  (4) in any case, a negative S is disregarded and replaced by 1 so
  far as the substring start calculation is concerned.

I'm kind of inclined to not document this weirdness.  I especially
don't think it's worth giving an example that neither explains the
"disregarded" bit nor highlights the dependency on L being given.

            regards, tom lane



В списке pgsql-docs по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: substring start position behavior
Следующее
От: PG Doc comments form
Дата:
Сообщение: Non-blocking synchronization in libpq using pipeline mode