Re: Define jsonpath functions as stable

From	Jonathan S. Katz
Subject	Re: Define jsonpath functions as stable
Date	16 September 2019, 20:36:29
Msg-id	1149945c-8a31-ec24-b454-3e410f8a70b6@postgresql.org
In reply to	Re: Define jsonpath functions as stable (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Define jsonpath functions as stable
List	pgsql-hackers

On 9/16/19 11:20 AM, Tom Lane wrote:
> "Jonathan S. Katz" <jkatz@postgresql.org> writes:
>> It sounds like the easiest path to completion without potentially adding
>> futures headaches pushing back the release too far would be that, e.g.
>> these examples:
>
>>     $.** ? (@ like_regex "O(w|v)" pg flag "i")
>>     $.** ? (@ like_regex "O(w|v)" pg)
>
>> If it's using POSIX regexp, I would +1 using "posix" instead of "pg"
>
> I agree that we'd be better off to say "POSIX".  However, having just
> looked through the references Chapman provided, it seems to me that
> the regex language Henry Spencer's library provides is awful darn
> close to what XPath is asking for.  The main thing I see in the XML/XPath
> specs that we don't have is a bunch of character class escapes that are
> specifically tied to Unicode character properties.  We could possibly
> add code to implement those, but I'm not sure how it'd work in non-UTF8
> database encodings.

Maybe we could take a page from the pg_saslprep implementation. In some
cases where the string in question would be rejected under normal
SASLprep[1] considerations (really stringprep[2]), PostgreSQL just lets
the string pass through to the next step, without alteration.

What's implied here is if the string is UTF-8, it goes through SASLprep,
but if not, it is just passed through.
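
To make the pass-through idea concrete, here is a minimal sketch in Python
(not the actual C code in PostgreSQL). The function name
`prep_or_passthrough` is made up for illustration, and NFKC normalization
is only a stand-in for the full SASLprep profile, which also does character
mapping and prohibited-character checks.

```python
import unicodedata


def prep_or_passthrough(raw: bytes) -> bytes:
    """Normalize the input if it is valid UTF-8; otherwise return it unchanged."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: hand the bytes to the next step without alteration.
        return raw
    # Stand-in for SASLprep (assumption): NFKC is only one step of the real profile.
    return unicodedata.normalize("NFKC", text).encode("utf-8")
```

The point of the design is that the non-UTF-8 case is not an error: the
caller always gets something it can keep working with.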

So perhaps the answer is that if we implement XQuery, the escapes for
Unicode character properties are honored only if the encoding is set to
UTF-8, and ignored otherwise. We would have to document that said
escapes only work with the UTF-8 encoding.
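
A rough sketch of what that gating might look like, in Python for brevity.
Everything named here is hypothetical: the function `unhonored_escapes`,
the comparison against the encoding label "UTF8", and the simplified
`\p{...}` matcher exist only to illustrate the rule. What "ignored" should
actually do to the match is left open.

```python
import re

# Simplified matcher for category escapes such as \p{Lu} or \P{IsGreek}.
# Assumption: it does not handle an escaped backslash in front of "p".
PROPERTY_ESCAPE = re.compile(r"\\[pP]\{[^}]*\}")


def unhonored_escapes(pattern, server_encoding):
    """Return the property escapes in `pattern` that would not be honored."""
    if server_encoding.upper() == "UTF8":
        # UTF-8 database: every property escape is honored.
        return []
    # Any other encoding: report the escapes that would be ignored.
    return PROPERTY_ESCAPE.findall(pattern)
```

A list like this could also feed a notice to the user, so that the
documented limitation is visible at the point where it applies.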

>  There may also be subtle differences in the behavior
> of character class escapes that we do have in common, such as "\s" for
> white space; but again I'm not sure that those are any different than
> what you get naturally from encoding or locale variations.
>
> I think we could possibly get away with not having any special marker
> on regexes, but just explaining in the documentation that "features
> so-and-so are not implemented".  Writing that text would require closer
> analysis than I've seen in this thread as to exactly what the differences
> are.

+1, and we would likely need some example strings too that highlight the
difference in how they are processed.
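
One candidate example string, offered as an assumption to verify rather
than a confirmed difference: the XML Schema regex definition that XQuery
builds on gives "\s" as just space, tab, newline, and carriage return,
while a POSIX-style [[:space:]] class in the C locale also includes form
feed and vertical tab. The sketch below only emulates the two definitions
with Python's `re` module; it runs neither the XQuery nor the Spencer
engine, and the pattern and function names are invented for illustration.

```python
import re

# Emulated definitions (assumptions, not output from either real engine).
XQUERY_STYLE_S = re.compile(r"[ \t\n\r]")
POSIX_STYLE_S = re.compile(r"[ \t\n\r\f\v]")


def matches_whitespace(ch):
    """Report whether a single character counts as white space under each definition."""
    return {
        "xquery_style": bool(XQUERY_STYLE_S.fullmatch(ch)),
        "posix_style": bool(POSIX_STYLE_S.fullmatch(ch)),
    }
```

Under these assumed definitions, a form feed is where the two diverge,
while an ordinary space is treated the same by both.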

And again, if we end up updating the behavior in the future, it becomes
a part of our standard deprecation notice at the beginning of the
release notes, though one that could require a lot of explanation.

Jonathan

[1] https://tools.ietf.org/html/rfc4013
[2] https://www.ietf.org/rfc/rfc3454.txt
