Re: EBCDIC sorting as a use case for ICU rules

From Peter Eisentraut
Subject Re: EBCDIC sorting as a use case for ICU rules
Msg-id 1f20d0d7-6b15-d10f-94f5-77b2e82112b1@enterprisedb.com
In response to EBCDIC sorting as a use case for ICU rules  ("Daniel Verite" <daniel@manitou-mail.org>)
List pgsql-hackers
On 21.06.23 15:28, Daniel Verite wrote:
> A collation like the following seems to work (the rule simply enumerates
> US-ASCII letters in the EBCDIC alphabet order, with adequate quoting)
> 
> CREATE COLLATION ebcdic (provider='icu', locale='und',
> rules=$$&' '<'.'<'<'<'('<'+'<\|<'&'<'!'<'$'<'*'<')'<';'<'-'<'/'<','<'%'<'_'<'>'<'?'<'`'<':'<'#'<'@'<\'<'='<'"'<a<b<c<d<e<f<g<h<i<j<k<l<m<n<o<p<q<r<'~'<s<t<u<v<w<x<y<z<'['<'^'<']'<'{'<A<B<C<D<E<F<G<H<I<'}'<J<K<L<M<N<O<P<Q<R<'\'<S<T<U<V<W<X<Y<Z<0<1<2<3<4<5<6<7<8<9$$);
> 
> This can be useful for people who migrate from mainframes to Postgres
> and need their migration tests to produce the same sorted results as the
> original system.
> Since rules can be defined at the database level with the icu_rules option,
> they don't even need to tweak their queries to add COLLATE clauses,
> which surely is appreciable in that kind of project.
> 
> US-ASCII when sorted in EBCDIC order comes out like this:
> 
> .<(+|&!$*);-/,%_>?`:#@'="abcdefghijklmnopqr~stuvwxyz[^]{ABCDEFGHI}JKLMNOPQR\ST
> UVWXYZ0123456789
> 
> Maybe this example could be added to the documentation except for
> the problem that the rule is very long and dollar-quoting cannot be split
> into several lines. Literals enclosed by single quotes can be split that
> way, but would require escaping the single quotes in the rule, which
> would lead to scary-looking over-quoted contents.

You can use whitespace in the rules.  For example,

CREATE COLLATION ebcdic (provider='icu', locale='und',
rules=$$
& ' ' < '.' < '<' < '(' < '+' < \|
< '&' < '!' < '$' < '*' < ')' < ';'
< '-' < '/' < ',' < '%' < '_' < '>' < '?'
< '`' < ':' < '#' < '@' < \' < '=' < '"'
< a < b < c < d < e < f < g < h < i
< j < k < l < m < n < o < p < q < r
< '~' < s < t < u < v < w < x < y < z
< '[' < '^' < ']'
< '{' < A < B < C < D < E < F < G < H < I
< '}' < J < K < L < M < N < O < P < Q < R
< '\'  < S < T < U < V < W < X < Y < Z
< 0 < 1 < 2 < 3 < 4 < 5 < 6 < 7 < 8 < 9
$$);

(This particular layout is meant to match the rows in
https://en.wikipedia.org/wiki/EBCDIC#Code_page_layout.)
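
A quick way to check that the collation behaves as intended is to sort a handful of characters with it. This is only a sketch: it assumes the ebcdic collation above has already been created on a PostgreSQL version that supports ICU rules (16 or later, built with ICU), and the sample values are made up for illustration.

```sql
-- Assumes the "ebcdic" collation defined above already exists.
-- The sample values below are hypothetical test data.
SELECT c
FROM (VALUES ('1'), ('A'), ('a'), ('~'), ('.')) AS t(c)
ORDER BY c COLLATE ebcdic;
```

If the rule is applied as written, the rows should come back following the order of the rule: '.', then 'a', then '~', then 'A', then '1'.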


