Re: Unicode normalization SQL functions

From Peter Eisentraut
Subject Re: Unicode normalization SQL functions
Msg-id bba8933e-44ef-761d-55fb-076f2b6b5650@2ndquadrant.com
In response to Re: Unicode normalization SQL functions  (Andreas Karlsson <andreas@proxel.se>)
Responses Re: Unicode normalization SQL functions  ("Daniel Verite" <daniel@manitou-mail.org>)
List pgsql-hackers
On 2020-02-13 01:23, Andreas Karlsson wrote:
> A potential optimization would be to merge utf8_to_unicode() and
> pg_utf_mblen() into one function in unicode_normalize_func() since
> utf8_to_unicode() already knows length of the character. Probably not
> worth it though.

This would also require untangling the entire encoding API.
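
For illustration only, the merged function suggested in the quoted text might look roughly like the sketch below. The name utf8_decode_with_len() is hypothetical and not part of PostgreSQL; the sketch assumes the input has already been validated as UTF-8 and ignores the encoding-API layering that the reply refers to.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a PostgreSQL function): decode one UTF-8
 * sequence and report its byte length in the same pass, so the caller
 * does not need a second call just to find out how far to advance.
 * Assumes the input is already valid UTF-8.
 */
uint32_t
utf8_decode_with_len(const unsigned char *s, int *len)
{
    if ((s[0] & 0x80) == 0)
    {
        /* 1-byte sequence: 0xxxxxxx */
        *len = 1;
        return s[0];
    }
    else if ((s[0] & 0xe0) == 0xc0)
    {
        /* 2-byte sequence: 110xxxxx 10xxxxxx */
        *len = 2;
        return ((uint32_t) (s[0] & 0x1f) << 6) |
               (uint32_t) (s[1] & 0x3f);
    }
    else if ((s[0] & 0xf0) == 0xe0)
    {
        /* 3-byte sequence: 1110xxxx 10xxxxxx 10xxxxxx */
        *len = 3;
        return ((uint32_t) (s[0] & 0x0f) << 12) |
               ((uint32_t) (s[1] & 0x3f) << 6) |
               (uint32_t) (s[2] & 0x3f);
    }
    else if ((s[0] & 0xf8) == 0xf0)
    {
        /* 4-byte sequence: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        *len = 4;
        return ((uint32_t) (s[0] & 0x07) << 18) |
               ((uint32_t) (s[1] & 0x3f) << 12) |
               ((uint32_t) (s[2] & 0x3f) << 6) |
               (uint32_t) (s[3] & 0x3f);
    }

    /* Not a valid lead byte: return an impossible code point, skip 1 byte. */
    *len = 1;
    return 0xffffffff;
}
```

A caller would then advance with "p += len" after each call instead of asking a separate length function for the same information.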

> It feels a bit wasteful to measure output_size in
> unicode_is_normalized() since unicode_normalize() actually already knows
> the length of the buffer, it just does not return it.

Sure, but really most string APIs work like that.  They surely know the 
string length internally, but afterwards you often have to call strlen() 
again.

> A potential optimization for the normalized case would be to abort the
> quick check on the first maybe and normalize from that point on only. If
> I can find the time I might try this out and benchmark it.

Are you sure this would always be valid?  The fact that this wasn't 
mentioned in UTR #15 makes me suspicious.

> Nitpick: "split/\s*;\s*/, $line" in generate-unicode_normprops_table.pl
> should be "split /\s*;\s*/, $line".

done

> What about using else if in the code below for clarity?
> 
> +        if (check == UNICODE_NORM_QC_NO)
> +            return UNICODE_NORM_QC_NO;
> +        if (check == UNICODE_NORM_QC_MAYBE)
> +            result = UNICODE_NORM_QC_MAYBE;

done
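
For context, the fragment quoted above is the core of a quick-check loop of the kind described in UAX #15. The following is a minimal, self-contained sketch of such a loop using the else-if form. It is not the code from the patch: the enum is defined locally, toy_quickcheck() and toy_lookup() are hypothetical names, and the three-entry property table is a toy stand-in for the generated table, which would be built from DerivedNormalizationProps.txt.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the quick-check result type used in the patch. */
typedef enum
{
    UNICODE_NORM_QC_NO,
    UNICODE_NORM_QC_YES,
    UNICODE_NORM_QC_MAYBE
} UnicodeNormalizationQC;

/* Toy property record: canonical combining class and NFC quick-check value. */
typedef struct
{
    uint32_t    codepoint;
    uint8_t     comb_class;
    UnicodeNormalizationQC qc;
} ToyProps;

/* Illustrative subset only; every code point not listed counts as class 0, YES. */
static const ToyProps toy_table[] = {
    {0x0301, 230, UNICODE_NORM_QC_MAYBE},   /* COMBINING ACUTE ACCENT */
    {0x0323, 220, UNICODE_NORM_QC_MAYBE},   /* COMBINING DOT BELOW */
    {0x0340, 230, UNICODE_NORM_QC_NO},      /* COMBINING GRAVE TONE MARK */
};

static const ToyProps *
toy_lookup(uint32_t cp)
{
    size_t      i;

    for (i = 0; i < sizeof(toy_table) / sizeof(toy_table[0]); i++)
    {
        if (toy_table[i].codepoint == cp)
            return &toy_table[i];
    }
    return NULL;
}

/* Quick check over an array of code points (hypothetical function). */
UnicodeNormalizationQC
toy_quickcheck(const uint32_t *input, size_t n)
{
    uint8_t     last_class = 0;
    UnicodeNormalizationQC result = UNICODE_NORM_QC_YES;
    size_t      i;

    for (i = 0; i < n; i++)
    {
        const ToyProps *props = toy_lookup(input[i]);
        uint8_t     comb_class = props ? props->comb_class : 0;
        UnicodeNormalizationQC check = props ? props->qc : UNICODE_NORM_QC_YES;

        /* Combining marks out of canonical order: definitely not normalized. */
        if (last_class > comb_class && comb_class != 0)
            return UNICODE_NORM_QC_NO;

        if (check == UNICODE_NORM_QC_NO)
            return UNICODE_NORM_QC_NO;
        else if (check == UNICODE_NORM_QC_MAYBE)
            result = UNICODE_NORM_QC_MAYBE;

        last_class = comb_class;
    }
    return result;
}
```

Since the first branch returns, the else-if does not change behavior; it only makes explicit that the two tests are mutually exclusive.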

> Remove extra space in the line below.
> 
> +    else if (quickcheck == UNICODE_NORM_QC_NO )

I didn't find this in my local copy.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
