Discussion: BUG #14278: Problem searching spanish words with accent mark outside the stem
BUG #14278: Problem searching spanish words with accent mark outside the stem
From
paco@hernandezgomez.com
Date:
The following bug has been logged on the website:

Bug reference:      14278
Logged by:          Paco Hernández
Email address:      paco@hernandezgomez.com
PostgreSQL version: 9.6beta3
Operating system:   Linux
Description:

Dear sirs:

Search without accent mark is not working correctly when the accent mark is
outside the stem of the word.

For example, this matches correctly:

postgres=# select to_tsvector('spanish', 'canción') @@ to_tsquery('spanish', 'cancion');
 ?column?
----------
 t
(1 row)

This works and returns true because the stem of "canción" is "cancion", so
when we search for "cancion" (without accent mark), it matches correctly.

But, when the accent mark is outside the stem, for example in "peluquería",
then it does not work because the stem of "peluquería" is "peluqu", but
to_tsquery('spanish', 'peluqueria') is "peluqueri".

postgres=# select to_tsvector('spanish', 'peluquería') @@ to_tsquery('spanish', 'peluqueria');
 ?column?
----------
 f
(1 row)

This is important because there are many people that don't use the accent
mark at letter "i" in "peluquería" and words like that.

Thank you very much.

Best regards,
Paco Hernández.
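The stems quoted in the report can be inspected directly by calling the stemming dictionary on each spelling. This is a minimal sketch, assuming the built-in spanish_stem dictionary; the values in the comments are the ones stated in the report, not independently verified output.

```sql
-- Ask the Spanish Snowball dictionary for the lexeme of each spelling.
SELECT ts_lexize('spanish_stem', 'canción');     -- report: stem is "cancion"
SELECT ts_lexize('spanish_stem', 'cancion');

-- The accented and unaccented spellings are stemmed differently here.
SELECT ts_lexize('spanish_stem', 'peluquería');  -- report: stem is "peluqu"
SELECT ts_lexize('spanish_stem', 'peluqueria');  -- report: stem is "peluqueri"
```

When the two spellings of a word yield different lexemes, the @@ match between them fails, which is the behaviour described above.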
Re: BUG #14278: Problem searching spanish words with accent mark outside the stem
From
Alvaro Herrera
Date:
paco@hernandezgomez.com wrote:

> Search without accent mark is not working correctly when the accent mark is
> outside the stem of the word.

I think it'd be better to apply unaccent() to both the stored text
before ts_vectorization and to the query terms.  That would reliably
remove all diacritics (eñes too, though I suppose nobody would search
for their ñandúes by writing nandú, so it's not as severe).

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
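The suggestion to apply unaccent() on both sides can be sketched as follows. This assumes the unaccent contrib module is installed on the server; it is an illustration of the idea, not a tested recipe.

```sql
-- Assumption: the unaccent contrib module is available to this database.
CREATE EXTENSION IF NOT EXISTS unaccent;

-- Strip diacritics from the document text and from the query term
-- before either one reaches the Spanish stemmer.
SELECT to_tsvector('spanish', unaccent('peluquería'))
       @@ to_tsquery('spanish', unaccent('peluqueria'));
```

Because both sides are reduced to the same unaccented string before stemming, they should produce the same lexeme and the match should succeed.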
Re: BUG #14278: Problem searching spanish words with accent mark outside the stem
From
Jaime Casanova
Date:
On 7 August 2016 at 23:58, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> paco@hernandezgomez.com wrote:
>
>> Search without accent mark is not working correctly when the accent mark is
>> outside the stem of the word.
>
> I think it'd be better to apply unaccent() to both the stored text
> before ts_vectorization and to the query terms.  That would reliably
> remove all diacritics (eñes too, though I suppose nobody would search
> for their ñandúes by writing nandú, so it's not as severe).
>

The problem is that unaccent() is stable, so it can't be used in an index
expression. The OP would need to create a tsvector field that stores a
preprocessed version of the string (one on which
to_tsvector('spanish', unaccent(...)) has already been executed) and query
over that field.

<cough> or create an immutable version of unaccent() </cough>

--
Jaime Casanova                      www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
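The "immutable version of unaccent()" mentioned above is usually done with a thin wrapper function. The sketch below is hedged: the names f_unaccent, docs and docs_body_fts are hypothetical, and it assumes the unaccent extension is installed in the public schema.

```sql
CREATE EXTENSION IF NOT EXISTS unaccent;

-- Hypothetical wrapper (the name f_unaccent is an assumption).
-- It uses the two-argument form of unaccent() with a schema-qualified
-- dictionary so the result does not depend on search_path.
CREATE FUNCTION f_unaccent(text) RETURNS text
    LANGUAGE sql IMMUTABLE STRICT
AS $$ SELECT public.unaccent('public.unaccent', $1) $$;

-- Hypothetical table used only for illustration.
CREATE TABLE docs (id serial PRIMARY KEY, body text);

-- An immutable wrapper is accepted in an index expression.
CREATE INDEX docs_body_fts ON docs
    USING gin (to_tsvector('spanish', f_unaccent(body)));

-- The query repeats the indexed expression and unaccents the search term too.
SELECT id
FROM docs
WHERE to_tsvector('spanish', f_unaccent(body))
      @@ to_tsquery('spanish', f_unaccent('peluqueria'));
```

Marking the wrapper IMMUTABLE is a promise the server does not check: if the unaccent rules are later changed, an index built on the old rules would need to be rebuilt.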
Re: BUG #14278: Problem searching spanish words with accent mark outside the stem
From
pacohernandezg
Date:
Thank you very much Álvaro.

It is perfect for us, once you apply unaccent.

We have finally altered spanish config with:

alter text search configuration spanish
    alter mapping for hword, hword_part, word
    with unaccent, spanish_stem;

And it is working perfectly.

Thanks again. ;-)

--
View this message in context: http://postgresql.nabble.com/BUG-14278-Problem-searching-spanish-words-with-accent-mark-outside-the-stem-tp5914833p5915353.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.
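A variant of the same fix that leaves the built-in spanish configuration untouched is to copy it first and change the copy. This is a sketch under assumptions: the name spanish_unaccent is hypothetical and the unaccent extension must be installed.

```sql
CREATE EXTENSION IF NOT EXISTS unaccent;

-- Hypothetical configuration name; it starts as a copy of the built-in one.
CREATE TEXT SEARCH CONFIGURATION spanish_unaccent (COPY = spanish);

-- Same mapping as in the message above: unaccent acts as a filtering
-- dictionary and passes the unaccented token on to spanish_stem.
ALTER TEXT SEARCH CONFIGURATION spanish_unaccent
    ALTER MAPPING FOR hword, hword_part, word
    WITH unaccent, spanish_stem;

-- Re-run the failing query from the report against the new configuration.
SELECT to_tsvector('spanish_unaccent', 'peluquería')
       @@ to_tsquery('spanish_unaccent', 'peluqueria');
```

Working on a copy keeps the change explicit in the schema; queries and indexes then have to name spanish_unaccent (or set default_text_search_config) instead of spanish.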