Re: [PATCHES] char/varchar locale support

From: Thomas G. Lockhart
Subject: Re: [PATCHES] char/varchar locale support
Msg-id: 355C4095.88DD1D94@alumni.caltech.edu
Responses: Re: [HACKERS] Re: [PATCHES] char/varchar locale support (Oleg Broytmann <phd@comus.ru>)
List: pgsql-hackers
(moved to hackers list)

> I am working on extending locale support for char/varchar types.
> Q1. I touched ...src/include/utils/builtins.h to insert the following
> macros:
> -----
> #ifdef USE_LOCALE
>    #define pgstrcmp(s1,s2,l) strcoll(s1,s2)
> #else
>    #define pgstrcmp(s1,s2,l) strncmp(s1,s2,l)
> #endif
> -----
> Is it the right place? I think so, am I wrong?

Probably the right place. Probably the wrong code; see below...

> Q2. Bartunov told me I should read varlena.c. I read it and found
> that for every strcoll() for both strings there are calls to allocate
> memory (to make them null-terminated). Oleg said I need the same for
> varchar.
> Do I really need to allocate space for varchar? What about char? Is it
> 0-terminated already?

No, neither bpchar nor varchar is guaranteed to be null terminated.
Yes, you will need to allocate (palloc()) local memory for this. Your
pgstrcmp() macros are not equivalent, since strncmp() will stop the
comparison at the specified limit (l), whereas strcoll() ignores the
limit and requires null-terminated strings.
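
To make the difference concrete, here is a minimal sketch that spells out
the two macro branches as functions. The names cmp_limited() and
cmp_collate() are made up for illustration and are not part of the patch
or of the backend. The inputs used with it must already be null
terminated, which is exactly what bpchar and varchar data do not
guarantee.

```c
#include <string.h>

/* Mirrors the non-locale branch of the macro: compares at most l bytes. */
static int
cmp_limited(const char *s1, const char *s2, size_t l)
{
    return strncmp(s1, s2, l);
}

/*
 * Mirrors the locale branch: l is silently dropped, so strcoll() keeps
 * reading each argument until it finds a '\0'.
 */
static int
cmp_collate(const char *s1, const char *s2, size_t l)
{
    (void) l;                   /* the limit has no effect here */
    return strcoll(s1, s2);
}
```

With "abcX" and "abcY" and a limit of 3, cmp_limited() reports the
strings as equal while cmp_collate() does not, because it looks past the
third byte.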

If you look in varlena.c you will find several places with
  #if USE_LOCALE
  ...
  #else
  ...
  #endif

Those blocks will need to be replicated in varchar.c for both bpchar and
varchar support routines.

The first example I looked at in varlena.c looks a bit troublesome :(
In the code snippet below (from text_lt), both input strings are copied
and null terminated using the same length, even though the input lengths
can be different. Looks wrong to me:

    memcpy(a1p, VARDATA(arg1), len);
    *(a1p + len) = '\0';
    memcpy(a2p, VARDATA(arg2), len);
    *(a2p + len) = '\0';

Instead of "len" in each expression it should probably be
  len1 = VARSIZE(arg1)-VARHDRSZ
  len2 = VARSIZE(arg2)-VARHDRSZ
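
A rough sketch of the copy done with per-argument lengths, written so it
can be compiled outside the backend. The counted_text struct is a
hypothetical stand-in for a varlena value (its len field plays the role
of VARSIZE(arg)-VARHDRSZ), and malloc()/free() stand in for
palloc()/pfree(); neither is the actual backend code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a varlena: a byte count plus unterminated data. */
typedef struct
{
    size_t      len;            /* data bytes, like VARSIZE(arg) - VARHDRSZ */
    const char *data;           /* like VARDATA(arg), no trailing '\0' */
} counted_text;

/* Copy one argument using its own length, then null terminate the copy. */
static char *
terminated_copy(const counted_text *t)
{
    char *p = malloc(t->len + 1);

    if (p == NULL)
        abort();                /* palloc() would raise an error instead */
    memcpy(p, t->data, t->len);
    p[t->len] = '\0';
    return p;
}

/* Returns 1 if arg1 sorts before arg2 under the current locale, else 0. */
static int
counted_text_lt(const counted_text *arg1, const counted_text *arg2)
{
    char *a1p = terminated_copy(arg1);  /* copied with len1 */
    char *a2p = terminated_copy(arg2);  /* copied with len2 */
    int   result = strcoll(a1p, a2p) < 0;

    free(a1p);
    free(a2p);
    return result;
}
```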

Another possibility for implementation is to write a string comparison
routine (e.g. varlena_cmp()) which takes two arguments and returns -1,
0, or 1 for less than, equals, and greater than. All of the comparison
routines can call that one (which would have the #if USE_LOCALE), rather
than having USE_LOCALE spread through each comparison routine.
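
Something along these lines is what I have in mind, as a sketch only. The
name varlena_cmp_sketch() and its pointer-plus-length signature are
assumptions made so the example stands alone, and malloc()/free() again
stand in for palloc()/pfree(). Blank padding in bpchar is not handled
here.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Compare two counted (not null-terminated) strings.
 * Returns -1, 0, or 1 for less than, equal, and greater than.
 */
static int
varlena_cmp_sketch(const char *a, size_t alen, const char *b, size_t blen)
{
    int r;

#ifdef USE_LOCALE
    /* strcoll() needs terminated strings, so copy each with its own length. */
    char *ap = malloc(alen + 1);
    char *bp = malloc(blen + 1);

    if (ap == NULL || bp == NULL)
        abort();                /* palloc() would raise an error instead */
    memcpy(ap, a, alen);
    ap[alen] = '\0';
    memcpy(bp, b, blen);
    bp[blen] = '\0';
    r = strcoll(ap, bp);
    free(ap);
    free(bp);
#else
    /* Compare the common prefix; if it matches, the shorter string is less. */
    size_t minlen = (alen < blen) ? alen : blen;

    r = strncmp(a, b, minlen);
    if (r == 0 && alen != blen)
        r = (alen < blen) ? -1 : 1;
#endif

    /* Normalize whatever the library returned to -1, 0, or 1. */
    return (r > 0) - (r < 0);
}
```

Each comparison routine then reduces to a one-line test on the result,
for example "less than" is varlena_cmp_sketch(...) < 0, and USE_LOCALE
appears in only one place.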

                       - Tom
