Re: [HACKERS] multi-byte aware char_length() etc.

From: Thomas G. Lockhart
Subject: Re: [HACKERS] multi-byte aware char_length() etc.
Message-ID: 3510B230.4A815C51@alumni.caltech.edu
In reply to: multi-byte aware char_length() etc.  (t-ishii@sra.co.jp)
Replies: Re: [HACKERS] multi-byte aware char_length() etc.  (t-ishii@sra.co.jp)
List: pgsql-hackers

> I'm planning to modify some string functions so that they would be
> aware of multi-byte strings if compiled with the multi-byte
> capability.  The following are the files I'm going to modify. I would like to
> hear your opinions if you have any.
>
> o character_length()
>
> It seems that the function is implemented as textlen() in
> utils/adt/varlena.c or as varcharlen() in varchar.c. Current
> implementation returns an octet length rather than a char length. So I
> will change them. However, there might be a need for getting an
> octet length in some applications. Maybe this is a good chance to add
> SQL92's octet_length().

Yes.
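
To make the char-length versus octet-length distinction concrete, here is a
minimal sketch. It assumes the string is valid UTF-8, which is purely an
assumption for illustration; the multi-byte build may target other encodings
with different rules. The function names are hypothetical and are not the
actual textlen()/varcharlen() code.

```c
#include <stddef.h>
#include <string.h>

/* Octet length: the number of bytes, which is what strlen() reports. */
size_t octet_length_sketch(const char *s)
{
    return strlen(s);
}

/*
 * Character length, ASSUMING valid UTF-8 input (an assumption made only
 * for this sketch). Every byte that is not a continuation byte
 * (bit pattern 10xxxxxx) starts a new character.
 */
size_t char_length_utf8_sketch(const char *s)
{
    const unsigned char *p = (const unsigned char *) s;
    size_t n = 0;

    for (; *p != '\0'; p++)
    {
        if ((*p & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

For a string holding two three-byte characters, the first function reports 6
and the second reports 2, which is the gap between octet_length() and
character_length().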

> o lower()/upper()
>
> Implemented in oracle_compat.c. One thing I have noticed is that it
> uses toupper()/tolower(). For ASCII, they are fine. But on some
> platforms (I guess SysV) they might have some problems:
>
>         char c; /* c is an 8-bit letter and this platform uses char as
>                    signed char */
>         toupper(c);     /* may cause segfault or any other bad thing */
>
> So I will change it to:
>
>         toupper((unsigned char)c);

I would like to move these routines, as you clean them up, to varlena.c
or whatever Postgres-specific source file is appropriate. Let's leave
oracle_compat.c for non-standard, Oracle-specific functions. Perhaps
eventually we can move any of those which remain to the contrib
directory, assuming that there are good equivalent functions available
in SQL92.
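
The cast proposed in the quoted text can be sketched as a small loop. This is
only an illustration: upper_bytes_sketch is a hypothetical name, not the
routine in oracle_compat.c, and it works byte by byte, so it is only suitable
for single-byte encodings.

```c
#include <ctype.h>

/*
 * Upper-case a NUL-terminated string in place, one byte at a time.
 * Hypothetical helper for illustration only.
 */
void upper_bytes_sketch(char *s)
{
    for (; *s != '\0'; s++)
    {
        /*
         * Cast to unsigned char first: where char is signed, an 8-bit
         * letter is a negative value, and passing a negative value other
         * than EOF to toupper() is undefined behavior.
         */
        *s = (char) toupper((unsigned char) *s);
    }
}
```

In the default "C" locale the 8-bit bytes pass through unchanged; what the
cast buys is that the call is well defined on every platform.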

Sort of annoying having oracle_compat when Oracle doesn't return the
favor by having a "postgres_compat". Well, maybe DataBlades are the same
thing?? :)

> o position()
>
> Implemented as textpos() in varlena.c.
>
> o substring()
>
> Implemented as text_substr() in varlena.c.

These two are OK. I'm not yet clear on where in the parser these varlena
functions are matched up with both text and varchar() types. We may need
to do something different as we keep working on getting the
text/varchar/char behavior improved.
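
For position() and substring(), the multi-byte change amounts to counting in
characters instead of bytes. A rough sketch follows, again assuming valid
UTF-8 only for illustration; the helper names are hypothetical and this is not
the textpos()/text_substr() code. The byte-wise strstr() search is safe here
only because a match of valid UTF-8 always falls on a character boundary;
other multi-byte encodings may not have that property.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to the character with 0-based index n, or to the
 * terminator if the string is shorter. ASSUMES valid UTF-8.
 */
const char *utf8_advance_sketch(const char *s, size_t n)
{
    const unsigned char *p = (const unsigned char *) s;

    while (*p != '\0' && n > 0)
    {
        p++;
        /* skip continuation bytes (10xxxxxx) of the same character */
        while ((*p & 0xC0) == 0x80)
            p++;
        n--;
    }
    return (const char *) p;
}

/* 1-based character position of needle in haystack, or 0 if absent. */
size_t utf8_position_sketch(const char *haystack, const char *needle)
{
    const char *hit = strstr(haystack, needle);
    const unsigned char *p;
    size_t pos = 1;

    if (hit == NULL)
        return 0;
    /* count the characters that start before the match */
    for (p = (const unsigned char *) haystack;
         p < (const unsigned char *) hit; p++)
    {
        if ((*p & 0xC0) != 0x80)
            pos++;
    }
    return pos;
}
```

A character-aware substring() could then call utf8_advance_sketch() once to
find the start and once more to find the end, and copy the bytes in between.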
