Обсуждение: create custom collation from case insensitive portuguese

Поиск
Список
Период
Сортировка

create custom collation from case insensitive portuguese

От
Alexandre Riveira
Дата:
Thanks to all the work they have done with this incredible database, 
postgresql.

I've achieved some success in changing collate operating system (linux) 
to generate sort of way of Brazil Portuguese hopes by adding the 
following code in LC_COLLATE

LC_COLLATE
copy "iso14651_t1_ci"
reorder-after <U00A0>
<U0020> <CAP>; <CAP>; <CAP>; <U0020>
reorder-end


And in my tests I could change the behavior of the postgresql UPPER 
changing the file i18n ... etc.

toupper /  (<U0061>, <U0041>);


So far, I understood the mechanism of change collate and reproduce in 
postgresql, and I could not generate a case-insensitive search, I 
believe that would change within the LC_COLLATE variable, but could not 
go any further than that.

Could someone guide me please.

Thanks,

Alexandre Riveira
Brazil


Re: create custom collation from case insensitive portuguese

От
Euler Taveira de Oliveira
Дата:
Alexandre Riveira escreveu:
> I've achieved some success in changing collate operating system (linux)
> to generate sort of way of Brazil Portuguese hopes by adding the
> following code in LC_COLLATE
> 
This was already discussed; search the archives [1] [2].

> So far, I understood the mechanism of change collate and reproduce in
> postgresql, and I could not generate a case-insensitive search, I
> believe that would change within the LC_COLLATE variable, but could not
> go any further than that.
> 
PostgreSQL doesn't support case-insensitive searches specifying the collate
per column yet. Look at [3]. But you could use ILIKE or regular expression to
achieve  that.


[1] http://pgfoundry.org/pipermail/brasil-usuarios/20060330/001667.html
[2] http://www.mail-archive.com/brasil-usuarios@pgfoundry.org/msg00895.html
[3] http://archives.postgresql.org/pgsql-hackers/2010-07/msg00512.php


--  Euler Taveira de Oliveira http://www.timbira.com/


Re: create custom collation from case insensitive portuguese

От
Robert Haas
Дата:
On Tue, Nov 2, 2010 at 8:40 AM, Euler Taveira de Oliveira
<euler@timbira.com> wrote:
> Alexandre Riveira escreveu:
>> I've achieved some success in changing collate operating system (linux)
>> to generate sort of way of Brazil Portuguese hopes by adding the
>> following code in LC_COLLATE
>>
> This was already discussed; search the archives [1] [2].
>
>> So far, I understood the mechanism of change collate and reproduce in
>> postgresql, and I could not generate a case-insensitive search, I
>> believe that would change within the LC_COLLATE variable, but could not
>> go any further than that.
>>
> PostgreSQL doesn't support case-insensitive searches specifying the collate
> per column yet. Look at [3]. But you could use ILIKE or regular expression to
> achieve  that.

Is citext also useful for this?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: create custom collation from case insensitive portuguese

От
Alexandre Riveira
Дата:
Thank you all for your help !

When mentioned in Portuguese case-insensitive in fact we are also 
talking about accent-insensitive

A common example is that the name Jose and José also can be written, 
citext or ilike only not solve the problem

My progress is ...

Edit file /usr/share/i18n/locales/i18n e alter section tolower /, an 
example:

(<U00C9>,<U00E9>) e alter for (<U00C9>,<U0065>)

LOWER reproduce: LOWER("ITAPAGÉ") => "itapage",

Example success:

SELECT * FROM endereco WHERE LOWER(logradouro) LIKE LOWER('itapage%')

this behavior is reproduced in citext (logradouro is column citext)

SELECT * FROM endereco WHERE logradouro = 'itapage'
or
SELECT * FROM endereco WHERE logradouro LIKE 'itapage%'

All examples return the desired value "ITAPAGÉ"
Issue:

SELECT * FROM endereco WHERE logradouro LIKE 'itapage%'  NOT USE INDEX  
I tried   CREATE INDEX cep_ik_logradouro ON cep USING btree (logradouro);   CREATE INDEX like_index ON cep(logradouro
varchar_pattern_ops);  CREATE INDEX cep_ci_index ON cep(lower(logradouro) varchar_pattern_ops);
 

I've had success with using index
SELECT * FROM endereco WHERE LOWER(logradouro) LIKE LOWER('itapage%') 
and CREATE INDEX cep_ci_index ON cep(lower(logradouro) varchar_pattern_ops)
But I want to solve using only citext


Tank's

Alexandre Riveira
Brazil




Robert Haas escreveu:
> On Tue, Nov 2, 2010 at 8:40 AM, Euler Taveira de Oliveira
> <euler@timbira.com> wrote:
>   
>> Alexandre Riveira escreveu:
>>     
>>> I've achieved some success in changing collate operating system (linux)
>>> to generate sort of way of Brazil Portuguese hopes by adding the
>>> following code in LC_COLLATE
>>>
>>>       
>> This was already discussed; search the archives [1] [2].
>>
>>     
>>> So far, I understood the mechanism of change collate and reproduce in
>>> postgresql, and I could not generate a case-insensitive search, I
>>> believe that would change within the LC_COLLATE variable, but could not
>>> go any further than that.
>>>
>>>       
>> PostgreSQL doesn't support case-insensitive searches specifying the collate
>> per column yet. Look at [3]. But you could use ILIKE or regular expression to
>> achieve  that.
>>     
>
> Is citext also useful for this?
>
>   



Re: create custom collation from case insensitive portuguese

От
Dimitri Fontaine
Дата:
Alexandre Riveira <alexandre@objectdata.com.br> writes:
> When mentioned in Portuguese case-insensitive in fact we are also talking
> about accent-insensitive

See unaccent dictionary, but don't use only this one in your text search
configuration, IIRC.
 http://www.postgresql.org/docs/9/static/unaccent.html

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support