Re: Define jsonpath functions as stable

From: Chapman Flack
Subject: Re: Define jsonpath functions as stable
Msg-id: 5D82B3F3.9080808@anastigmatix.net
In reply to: Re: Define jsonpath functions as stable  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: Define jsonpath functions as stable  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
On 09/18/19 17:12, Tom Lane wrote:

> After further reading, it seems like what that text is talking about
> is not actually a regex feature, but an outgrowth of the fact that
> the regex pattern is being expressed as a string literal in a language
> for which XML character entities are a native aspect of the string
> literal syntax.  So it looks to me like the entities get folded to
> raw characters in a string-literal parser before the regex engine
> ever sees them.

Hmm. That occurred to me too, but I thought the explicit mention of
'character reference' in the section specific to regexes[1] might not
mean that. It certainly could have been clearer.

But you seem to have the practical agreement of both BaseX:

let $foo := codepoints-to-string((38,35,120,54,49,59))
return ($foo, matches('a', $foo))
------
&#x61;
false

and the Saxon-based pljava example:

select occurrences_regex('&#x61;', 'a', w3cNewlines => true);
 occurrences_regex
-------------------
                 0
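
For readers without BaseX or pljava at hand, the same behavior can be
sketched in Python. This is only an analogy: Python's re module is not an
XQuery regex engine, and html.unescape is an assumed stand-in for the
XML string-literal layer that folds character references. The variable
names are made up for illustration.

```python
import html
import re

# The six codepoints from the BaseX example spell out a character reference.
pattern = "".join(chr(cp) for cp in (38, 35, 120, 54, 49, 59))
assert pattern == "&#x61;"

# Handed straight to a regex engine, these are six ordinary characters,
# so the pattern does not match the single letter 'a'.
raw_match = re.fullmatch(pattern, "a")

# Only a layer that understands XML references (html.unescape plays that
# role here) turns the reference into 'a' before the engine runs.
folded = html.unescape(pattern)
folded_match = re.fullmatch(folded, "a")

print(raw_match is not None, folded, folded_match is not None)
```

The folding step is separate from, and earlier than, anything the regex
engine does, which is consistent with both results above.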

> As such, I think this doesn't apply to SQL/JSON.  The SQL/JSON spec
> seems to defer to Javascript/ECMAscript for syntax details, and
> in either of those languages you have backslash escape sequences
> for writing weird characters, *not* XML entities.  You certainly
> wouldn't have use of such entities in a native implementation of
> LIKE_REGEX in SQL.

So yeah, that seems to be correct.

The upshot seems to be a two-parter:

1. Whatever string literal syntax is used in front of the regex engine
   had better have some way to represent any character you could want
   to match, and
2. There is only one way to literally match a character that is a regex
   metacharacter, namely, to precede it with a backslash (that the regex
   engine will see; therefore doubled if necessary). Whatever codepoint
   escape form might be available in the string literal syntax does not
   offer another way to do that, because it happens too early, before
   the regex engine can see it.
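
A minimal sketch of point 2, again using Python only as an analogy. One
caveat: Python's re has its own \x escapes, so the sketch deliberately
uses a non-raw string literal, where the escape is resolved by the
string-literal parser before the regex engine sees anything. The
variable names are hypothetical.

```python
import re

# "\x2e" is a string-literal escape: Python turns it into "." before re
# ever sees the pattern, so it is still the match-anything metacharacter.
via_codepoint = "a\x2eb"
assert via_codepoint == "a.b"
codepoint_matches_other = re.fullmatch(via_codepoint, "aXb") is not None

# A backslash that survives to the regex engine (doubled in a normal
# string literal) is what makes the dot literal.
via_backslash = "a\\.b"
backslash_matches_other = re.fullmatch(via_backslash, "aXb") is not None
backslash_matches_dot = re.fullmatch(via_backslash, "a.b") is not None

print(codepoint_matches_other, backslash_matches_other, backslash_matches_dot)
```

The codepoint-escaped dot still matches any character, while the
backslash-escaped dot matches only a literal dot.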

> So now I'm thinking we can just remove the handwaving about entities.
> On the other hand, this points up a large gap in our docs about
> SQL/JSON, which is that nowhere does it even address the question of
> what the string literal syntax is within a path expression.

That does seem like it ought to be covered.

Regards,
-Chap


