Discussion: Lower or Upper case for F.33. pg_trgm
The following documentation comment has been logged on the website:
Page: https://www.postgresql.org/docs/14/pgtrgm.html
Description:
Hey guys,
I have a question regarding the trigram algorithm and I can not find any information about it in your documentation:
Do you distinguish between lower and uppercase? Or do you consider all words in lowercase?
Happy to get a short feedback from you,
Greetings,
Marc
> On 16 Aug 2022, at 12:17, PG Doc comments form <noreply@postgresql.org> wrote:
> I have a question regarding the trigram algorithm and I can not find any
> information about it in your documentation:
Maybe we should add something about this?
> Do you distinguish between lower and uppercase? Or do you consider all words
> in lowercase?
There is support for compiling pg_trgm case sensitive, but it's by default case
insensitive.
# SELECT word_similarity('word', 'WORD');
 word_similarity
-----------------
               1
(1 row)
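To see where the case folding happens, the trigram sets themselves can be inspected. This is only a sketch using pg_trgm's documented show_trgm() and similarity() functions; it assumes the extension is installed and was built with its default options, and the literals are purely illustrative:

```sql
-- Assumes CREATE EXTENSION pg_trgm; has been run in this database and
-- that pg_trgm was built with its default (case-insensitive) settings.

-- In a default build, both calls should show the same lower-cased trigrams.
SELECT show_trgm('word') AS lower_input,
       show_trgm('WORD') AS upper_input;

-- Identical trigram sets give the maximum similarity score.
SELECT similarity('word', 'WORD');
```

On a build compiled case-sensitively the two trigram sets would differ, so the similarity would no longer be 1.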
> Happy to get a short feedback from you,
I would recommend the pgsql-general mailing list, as that will be a safer way to get
general questions answered.
--
Daniel Gustafsson https://vmware.com/
On 16-08-2022 at 12:36, Daniel Gustafsson wrote:
>> On 16 Aug 2022, at 12:17, PG Doc comments form <noreply@postgresql.org> wrote:
>
>> I have a question regarding the trigram algorithm and I can not find any
>> information about it in your documentation:
>
> Maybe we should add something about this?
Yeah, it's a bit strange that none of the following strings yield any
info on that page: 'case', 'sensitiv', 'upper', 'lower', and that there
is no mention of the ~ versus ~* difference.
Maybe worth to (already in pgtrgm.html) give the simple hint:
~ is case-sensitive
~* is case-insensitive
In any case a link to functions-matching.html seems indicated.
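The operator difference can be checked directly. A minimal sketch, where the literals, the table docs_demo and the index name are all made up for illustration; the first query needs no extension, the second part assumes pg_trgm is installed:

```sql
-- Plain regular-expression operators, independent of pg_trgm:
-- ~  matches case-sensitively, ~* matches case-insensitively.
SELECT 'WORD' ~  'word' AS case_sensitive_match,
       'WORD' ~* 'word' AS case_insensitive_match;

-- Hypothetical table, only to show a trigram index that can serve
-- both operators (assumes CREATE EXTENSION pg_trgm; was run).
CREATE TABLE docs_demo (body text);
CREATE INDEX docs_demo_body_trgm ON docs_demo USING gin (body gin_trgm_ops);
SELECT body FROM docs_demo WHERE body ~* 'word';
```

The first query should return false for ~ and true for ~*, which is the distinction the hint is meant to capture.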
Erik Rijkers
> On 16 Aug 2022, at 12:54, Erik Rijkers <er@xs4all.nl> wrote:
>
> On 16-08-2022 at 12:36, Daniel Gustafsson wrote:
>>> On 16 Aug 2022, at 12:17, PG Doc comments form <noreply@postgresql.org> wrote:
>>> I have a question regarding the trigram algorithm and I can not find any
>>> information about it in your documentation:
>> Maybe we should add something about this?
>
> Yeah, it's a bit strange that none of the following strings yield any info on that page: 'case', 'sensitiv', 'upper', 'lower', and that there is no mention of the ~ versus ~* difference.
>
> Maybe worth to (already in pgtrgm.html) give the simple hint:
> ~ is case-sensitive
> ~* is case-insensitive
>
> In any case a link to functions-matching.html seems indicated.
Yeah, I think there is room for improvements here. Are you up for drafting a
patch for this?
--
Daniel Gustafsson https://vmware.com/
Thanks for your fast response.
Is this a question for me? I am fine with a short hint regarding the default.
A link to another documentation is also fine.
On Tue, 16 Aug 2022 at 13:46, Daniel Gustafsson <daniel@yesql.se> wrote:
> On 16 Aug 2022, at 12:54, Erik Rijkers <er@xs4all.nl> wrote:
>
> On 16-08-2022 at 12:36, Daniel Gustafsson wrote:
>>> On 16 Aug 2022, at 12:17, PG Doc comments form <noreply@postgresql.org> wrote:
>>> I have a question regarding the trigram algorithm and I can not find any
>>> information about it in your documentation:
>> Maybe we should add something about this?
>
> Yeah, it's a bit strange that none of the following strings yield any info on that page: 'case', 'sensitiv', 'upper', 'lower', and that there is no mention of the ~ versus ~* difference.
>
> Maybe worth to (already in pgtrgm.html) give the simple hint:
> ~ is case-sensitive
> ~* is case-insensitive
>
> In any case a link to functions-matching.html seems indicated.
Yeah, I think there is room for improvements here. Are you up for drafting a
patch for this?
--
Daniel Gustafsson https://vmware.com/
On 16-08-2022 at 13:46, Daniel Gustafsson wrote:
>> On 16 Aug 2022, at 12:54, Erik Rijkers <er@xs4all.nl> wrote:
>>
>> On 16-08-2022 at 12:36, Daniel Gustafsson wrote:
>>>> On 16 Aug 2022, at 12:17, PG Doc comments form <noreply@postgresql.org> wrote:
>>>> I have a question regarding the trigram algorithm and I can not find any
>>>> information about it in your documentation:
>>> Maybe we should add something about this?
>>
>> Yeah, it's a bit strange that none of the following strings yield any info on that page: 'case', 'sensitiv', 'upper', 'lower', and that there is no mention of the ~ versus ~* difference.
>>
>> Maybe worth to (already in pgtrgm.html) give the simple hint:
>> ~ is case-sensitive
>> ~* is case-insensitive
>>
>> In any case a link to functions-matching.html seems indicated.
>
> Yeah, I think there is room for improvements here. Are you up for drafting a
> patch for this?
>
How is this?
(bluntly stating 'similarity comparisons are case-insensitive' - although I'm not really sure..)
Erik
> --
> Daniel Gustafsson https://vmware.com/
>
Attachments
Erik Rijkers <er@xs4all.nl> writes:
> (bluntly stating 'similarity comparisons are case-insensitive' -
> although I'm not really sure..)
Perhaps like "similarity comparisons are case-insensitive in a
standard build of pg_trgm", if you want to nod to the existence
of a compile option without going into detail.
regards, tom lane
Sounds good to me.
On Tue, 16 Aug 2022 at 15:53, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Erik Rijkers <er@xs4all.nl> writes:
> (bluntly stating 'similarity comparisons are case-insensitive' -
> although I'm not really sure..)
Perhaps like "similarity comparisons are case-insensitive in a
standard build of pg_trgm", if you want to nod to the existence
of a compile option without going into detail.
regards, tom lane
> On 16 Aug 2022, at 15:53, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Erik Rijkers <er@xs4all.nl> writes:
>> (bluntly stating 'similarity comparisons are case-insensitive' -
>> although I'm not really sure..)
>
> Perhaps like "similarity comparisons are case-insensitive in a
> standard build of pg_trgm", if you want to nod to the existence
> of a compile option without going into detail.
Looking at this, I'm leaning towards paring down the diff posted upthread to
pretty much this; I think that will provide value while avoiding
confusion.
As a related side note, there are four instances of "case insensitive{ly}" in
the docs with all other instances using "case-insensitive{ly}". I'm inclined
to fix those four to use a dash while at it to be consistent across all pages.
--
Daniel Gustafsson https://vmware.com/
Attachments
Daniel Gustafsson <daniel@yesql.se> writes:
> Looking at this, I'm leaning towards paring down the diff posted upthread to
> pretty much this; I think that will provide value while avoiding
> confusion.
WFM.
> As a related side note, there are four instances of "case insensitive{ly}" in
> the docs with all other instances using "case-insensitive{ly}". I'm inclined
> to fix those four to use a dash while at it to be consistent across all pages.
+1
regards, tom lane