Proposal: CREATE CONVERSION

From Tatsuo Ishii
Subject Proposal: CREATE CONVERSION
Date
Msg-id 20020705.153641.71101525.t-ishii@sra.co.jp
Replies Re: Proposal: CREATE CONVERSION  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Proposal: CREATE CONVERSION  (Bruce Momjian <pgman@candle.pha.pa.us>)
Re: Proposal: CREATE CONVERSION  (Tatsuo Ishii <t-ishii@sra.co.jp>)
List pgsql-hackers
Here is my proposal for a new CREATE CONVERSION command, which makes it
possible to define a new encoding conversion mapping between two
encodings on the fly.

The background:

We are getting more and more encoding conversion tables. So far they
amount to 385352 source lines and over 3MB in compiled form in total.
They are statically linked into the backend. I know this in itself is
not a problem, since modern OSs have smart memory management
capabilities and fetch only the necessary pages from disk. However, I'm
worried about the unbounded growth of these static tables. I think
users won't love a 50MB PostgreSQL backend load module.

The second problem is more serious. The conversion definitions between
certain encodings, such as between Unicode and others, are not well
defined. For example, there are several conversion tables between
Japanese Shift JIS and Unicode. This is because each vendor has its own
"special characters" and defines its table so that the conversion
fits its own purpose.

The solution:

The proposed new CREATE CONVERSION will solve these problems. A
particular conversion table is statically linked into a dynamically
loaded function, and CREATE CONVERSION tells PostgreSQL that if a
conversion from encoding A to encoding B is requested, then function C
should be used. In this way, conversion tables are no longer statically
linked into the backend.

Users could also easily define their own conversion tables that best
fit their purpose. Also, needless to say, people could define new
conversions which PostgreSQL does not support yet.

Syntax proposal:

CREATE CONVERSION <conversion name>
     SOURCE <source encoding name>
     DESTINATION <destination encoding name>
     FROM <conversion function name>
;
DROP CONVERSION <conversion name>;

Example usage:

CREATE OR REPLACE FUNCTION euc_jp_to_utf8(TEXT, TEXT, INTEGER)
     RETURNS INTEGER AS 'euc_jp_to_utf8.so' LANGUAGE 'c';
CREATE CONVERSION euc_jp_to_utf8
     SOURCE EUC_JP DESTINATION UNICODE
     FROM euc_jp_to_utf8;
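
For illustration only, here is a minimal sketch of what the C side of
such a conversion function might look like. The proposal does not spell
out what the three arguments mean, so this assumes they are a source
string, a destination buffer, and the source length in bytes, and that
the return value is the number of bytes written. It converts ISO-8859-1
to UTF-8 rather than EUC_JP, because that mapping needs no lookup
table. The name latin1_to_utf8_sketch is hypothetical, and the
PostgreSQL function-manager glue is left out entirely.

```c
/*
 * Hypothetical conversion routine: ISO-8859-1 (Latin-1) to UTF-8.
 * Assumed argument meaning (not defined in the proposal):
 *   src - source bytes, dst - destination buffer, len - source length.
 * The caller must supply a dst buffer of at least 2 * len + 1 bytes.
 * Returns the number of bytes written, not counting the trailing NUL.
 */
int
latin1_to_utf8_sketch(const unsigned char *src, unsigned char *dst, int len)
{
    int written = 0;

    for (int i = 0; i < len; i++)
    {
        unsigned char c = src[i];

        if (c < 0x80)
        {
            /* ASCII range maps to itself. */
            dst[written++] = c;
        }
        else
        {
            /* U+0080..U+00FF become a two-byte UTF-8 sequence. */
            dst[written++] = (unsigned char) (0xC0 | (c >> 6));
            dst[written++] = (unsigned char) (0x80 | (c & 0x3F));
        }
    }
    dst[written] = '\0';
    return written;
}
```

A real EUC_JP converter would replace the arithmetic in the loop with a
lookup in its own mapping table, which is the part that would live in
the loadable module instead of the backend.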

Implementation:

Implementation would be quite straightforward. Create a new system
table, and have CREATE CONVERSION store its info in it.
pg_find_encoding_converters (utils/mb/mbutils.c) and friends need
to be modified so that they recognize dynamically defined conversions.
Also, psql would need some capability to print conversion definition
info.
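
The lookup described above can be sketched in standalone C. This is
only a toy model: an in-memory array stands in for the proposed system
table, and the names conversion_entry, register_conversion,
drop_conversion, find_conversion, and copy_bytes are all hypothetical,
not existing PostgreSQL functions. Encodings are identified by name
strings purely for simplicity, and the function-pointer signature is an
assumption.

```c
#include <stddef.h>
#include <string.h>

/* Assumed shape of a conversion function; the proposal does not define it. */
typedef int (*conversion_fn) (const unsigned char *src,
                              unsigned char *dst, int len);

/* One row of the hypothetical catalog. */
typedef struct
{
    const char   *name;         /* conversion name */
    const char   *source;       /* source encoding name */
    const char   *destination;  /* destination encoding name */
    conversion_fn fn;           /* function to call for this pair */
} conversion_entry;

#define MAX_CONVERSIONS 64

static conversion_entry conversions[MAX_CONVERSIONS];
static int n_conversions = 0;

/* Trivial stand-in for a loaded conversion function: copies bytes unchanged. */
int
copy_bytes(const unsigned char *src, unsigned char *dst, int len)
{
    memcpy(dst, src, (size_t) len);
    return len;
}

/* Models CREATE CONVERSION: 0 on success, -1 if full or name already used. */
int
register_conversion(const char *name, const char *source,
                    const char *destination, conversion_fn fn)
{
    if (n_conversions >= MAX_CONVERSIONS)
        return -1;
    for (int i = 0; i < n_conversions; i++)
        if (strcmp(conversions[i].name, name) == 0)
            return -1;
    conversions[n_conversions++] =
        (conversion_entry) {name, source, destination, fn};
    return 0;
}

/* Models DROP CONVERSION: 0 on success, -1 if no such conversion. */
int
drop_conversion(const char *name)
{
    for (int i = 0; i < n_conversions; i++)
    {
        if (strcmp(conversions[i].name, name) == 0)
        {
            /* Fill the hole with the last entry; order is not significant. */
            conversions[i] = conversions[--n_conversions];
            return 0;
        }
    }
    return -1;
}

/* Models the modified lookup: the function for source -> destination, or NULL. */
conversion_fn
find_conversion(const char *source, const char *destination)
{
    for (int i = 0; i < n_conversions; i++)
        if (strcmp(conversions[i].source, source) == 0 &&
            strcmp(conversions[i].destination, destination) == 0)
            return conversions[i].fn;
    return NULL;
}
```

In this model, register_conversion("euc_jp_to_utf8", "EUC_JP",
"UNICODE", copy_bytes) plays the role of the CREATE CONVERSION
statement in the example usage, and a later find_conversion("EUC_JP",
"UNICODE") returns the registered function. A NULL result is where a
real backend would have to decide what to do when no conversion is
defined for the pair.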

Comments?
--
Tatsuo Ishii



