Re: daitch_mokotoff module

From Tom Lane
Subject Re: daitch_mokotoff module
Msg-id 3563190.1641227676@sss.pgh.pa.us
In reply to Re: daitch_mokotoff module  (Dag Lem <dag@nimrod.no>)
Responses [PATCH] Run UTF8-dependent tests for citext [Re: daitch_mokotoff module]  (Dag Lem <dag@nimrod.no>)
List pgsql-hackers
Dag Lem <dag@nimrod.no> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> (We do have methods for dealing with non-ASCII test cases, but
>> I can't see that this patch is using any of them.)

> I naively assumed that tests would be run in an UTF8 environment.

Nope, not necessarily.

Our current best practice for this is to separate out encoding-dependent
test cases into their own test script, and guard the script with an
initial test on database encoding.  You can see an example in
src/test/modules/test_regex/sql/test_regex_utf8.sql
and the two associated expected-files.  It's a good idea to also cover
as much as you can with pure-ASCII test cases that will run regardless
of the prevailing encoding.
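
For illustration, a minimal sketch of such a guard at the top of an encoding-dependent test script. It assumes psql's \gset and \if meta-commands; the script in the tree may differ in detail, and the final query is only a placeholder for real non-ASCII test cases.

```sql
/*
 * This script must run in a database with UTF-8 encoding,
 * because other encodings cannot represent all the characters used.
 */

-- Store the result of the encoding check in the psql variable skip_test.
SELECT getdatabaseencoding() <> 'UTF8'
       AS skip_test \gset
\if :skip_test
\quit
\endif

-- Make sure the non-ASCII literals below are interpreted as UTF-8.
SET client_encoding = utf8;

-- Placeholder test case; real non-ASCII cases would go here.
SELECT char_length('çé');
```

With a guard like this, one expected file would hold the full output for a UTF8 database, and the alternative expected file would hold only the output produced before the script quits.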

> Running "ack -l '[\x80-\xff]'" in the contrib/ directory reveals that
> two other modules are using UTF8 characters in tests - citext and
> unaccent.

Yeah, neither of those have been upgraded to said best practice.
(If you feel like doing the legwork to improve that situation,
that'd be great.)

> Looking into the unaccent module, I don't quite understand how it will
> work with various encodings, since it doesn't seem to decode its input -
> will it fail if run under anything but ASCII or UTF8?

Its Makefile seems to be forcing the test database to use UTF8.
I think this is a less-than-best-practice choice, because then
we have zero test coverage for other encodings; but it does
prevent test failures.

            regards, tom lane


