Proof of concept COLLATE support with patch
От | Martijn van Oosterhout |
---|---|
Тема | Proof of concept COLLATE support with patch |
Дата | |
Msg-id | 20050902130420.GA15466@svana.org обсуждение исходный текст |
Ответы |
Re: Proof of concept COLLATE support with patch
Re: Proof of concept COLLATE support with patch Re: Proof of concept COLLATE support with patch |
Список | pgsql-hackers |
Supports any glibc platform and possibly Win32. Adds: SELECT ... ORDER BY expr COLLATE 'locale' CREATE INDEX locale_index ON table(expr COLLATE 'locale') Index scan usedwhen COLLATE order permits This is just a proof of concept patch. I didn't send it to -patches because as Tom pointed out, there's no hope of it getting in due to platform dependant behaviour. This patch does not use setlocale and is completely orthoganal to any locale support already in the backend. As it turns out, meaningful locale support only needs a handful of support functions to work. These are listed at the bottom. My patch only uses the first two, but the third will be needed at some stage. The use of the last one depends on how the backend ends up support locales. Both glibc and wine32 have locale sensetive versions of many functions including: toupper_l, tolower_l, strfmon_l, strtoul_l, strtof_l, strftime_l, is*_l A windows function list is at: http://msdn2.microsoft.com/library/wyzd2bce(en-us,vs.80).aspx Patch available here: http://svana.org/kleptog/pgsql/collate1.patch Implementation notes follow and table of functions is at the bottom. I hope this helps whenever someone gets around to full COLLATE support. Have a nice day, Notes: * It works by replacing (expr COLLATE 'locale') with pg_strxfrm(expr, pg_findlocale(locale)) in the parsetree. pg_findlocale returns an opaque pointer to the locale. It is STRICT IMMUTABLE and is optimised away in the final query. pg_strxfrm takes the string and the locale and returns a bytea. bytea comparison uses memcmp so is safe from otherlocale effects in the backend. * Use of COLLATE for an index will probably double the diskspace required for that index due to the strxfrm. * I had to add the functions to pg_proc.h because CREATE FUNCTION couldn't find them. So they have OIDs I made up. Youmay need to initdb, I'm not sure. You can compile pg_xlocale.c as an shared object and load them that way too if you want to avoid the initdb. * Internally they are defined as taking and returning "internal". CREATE FUNCTION doesn't like that so specify opaqueor oid instead. The declarations are: create function pg_findlocale(text) returns oid as 'pg_findlocale' language internal strict immutable; create function pg_strxfrm(text,oid)returns bytea as 'pg_strxfrm' language internal strict immutable; * The clause ORDER BY 1 COLLATE 'en_AU' breaks, it treats the 1 like a constant. I couldn't quickly work out how toreference the columns the right way. Long term that code should be in the sorting code anyway. * The locale needs to be in quotes, otherwise the parser converts it to lower-case. Locale names are case-sensetiveon many systems. * There is a text function strcoll_l for testing collation: create function pg_strcoll_l(text,text,text) returns int4 as 'pg_strcoll_l' language internal strict immutable; * Yes this is the easy way out, implementing the inheritence of the COLLATE attribute will be much more invasive. Thisgives most people what they want though. * Although these functions are documented on Windows, they are not for glibc, so it is an unstable insterface. Function Needed glibc Win32 --------------------------------------------------------------------- Function returing opaque newlocale _create_locale pointer to locale data strxfrm with locale parameter strxfrm_l _strxfrm_l Method finding encoding for nl_langinfo_l ??? locale strcoll with locale parameter strcoll_l _strcoll_l -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
В списке pgsql-hackers по дате отправления: