Re: [HACKERS] ICU integration

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: [HACKERS] ICU integration
Дата
Msg-id CAH2-Wzkt_bf5yBeoNMB95NG40tg_UNNvCpYLzhmAoNijG3LMPA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] ICU integration  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Ответы Re: [HACKERS] ICU integration  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Список pgsql-hackers
On Wed, Feb 15, 2017 at 9:17 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> I have changed it to use ucol_nextSortKeyPart() when appropriate.

I think it would be fine if the second last argument was
"sizeof(Datum)", not "Min(sizeof(Datum), sss->buflen2)" here:

> +               if (GetDatabaseEncoding() == PG_UTF8)
> +               {
> +                   UCharIterator iter;
> +                   uint32_t    state[2];
> +                   UErrorCode  status;
> +
> +                   uiter_setUTF8(&iter, sss->buf1, len);
> +                   state[0] = state[1] = 0;  /* won't need that again */
> +                   status = U_ZERO_ERROR;
> +                   bsize = ucol_nextSortKeyPart(sss->locale->info.icu.ucol,
> +                                                &iter,
> +                                                state,
> +                                                (uint8_t *) sss->buf2,
> +                                                Min(sizeof(Datum), sss->buflen2),
> +                                                &status);
> +                   if (U_FAILURE(status))
> +                       ereport(ERROR,
> +                               (errmsg("sort key generation failed: %s", u_errorName(status))));
> +               }

Does this really need to be in the strxfrm() loop at all? I don't see
why you need to ever retry here.

There is an issue of style here:

> -       /* Just like strcoll(), strxfrm() expects a NUL-terminated string */
>         memcpy(sss->buf1, authoritative_data, len);
> +       /* Just like strcoll(), strxfrm() expects a NUL-terminated string.
> +        * Not necessary for ICU, but doesn't hurt. */

I see that you have preserved strcoll() comparison caching (or should
I say ucol_strcollUTF8() comparison caching?), at the cost of having
to keep around a buffer which we must continue to copy every text
string into within varstr_abbrev_convert(). That was probably the
right trade-off. It doesn't hurt that it required minimal divergence
within new ICU code, too.

-- 
Peter Geoghegan



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: [HACKERS] Patch to implement pg_current_logfile() function
Следующее
От: Justin Workman
Дата:
Сообщение: Re: [HACKERS] Possible issue with expanded object infrastructure on Postgres 9.6.1