Обсуждение: Custom text type for title text

Поиск
Список
Период
Сортировка

Custom text type for title text

От
Daniel Heath
Дата:
Hi All,

I've frequently seen an issue in applications which store titles (eg of books, events, user profiles) where duplicate values are not properly vetted. 

The 'citext' type is helpful here, but I'd be keen to go further. 

I propose a 'titletext' type, which has the following properties when compared for equality:
 * Case insensitivity (like 'citext')
 * Only considers characters in [:alnum:] (that is, ignores spaces, punctuation, etc)

This would be useful for a range of situations where it's important to avoid entering duplicate values.

Given the discussion at https://www.postgresql.org/message-id/CAKFQuwY9u14TqG8Yzj%3DfAB0tydvvtK7ibgFEx3tegbPWsGjJpg%40mail.gmail.com I'd lean towards making this type not automatically coerce to text (to avoid surprising behaviour when comparing text to titletext).

Is a suitable patch likely to be accepted?

Thanks,

Daniel Heath

Re: Custom text type for title text

От
Fabrízio de Royes Mello
Дата:


Em qui, 3 de jan de 2019 às 20:22, Daniel Heath <daniel@heath.cc> escreveu:
Hi All,

I've frequently seen an issue in applications which store titles (eg of books, events, user profiles) where duplicate values are not properly vetted. 

The 'citext' type is helpful here, but I'd be keen to go further. 

I propose a 'titletext' type, which has the following properties when compared for equality:
 * Case insensitivity (like 'citext')
 * Only considers characters in [:alnum:] (that is, ignores spaces, punctuation, etc)

This would be useful for a range of situations where it's important to avoid entering duplicate values.

Given the discussion at https://www.postgresql.org/message-id/CAKFQuwY9u14TqG8Yzj%3DfAB0tydvvtK7ibgFEx3tegbPWsGjJpg%40mail.gmail.com I'd lean towards making this type not automatically coerce to text (to avoid surprising behaviour when comparing text to titletext).

Is a suitable patch likely to be accepted?

You don’t need touch the core to do that. Just implement it as an extension and share throught some channel like pgxn.org.

Note that citext also is an extension and released as a contrib module.

Regards,

--
   Fabrízio de Royes Mello         Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

Re: Custom text type for title text

От
Daniel Heath
Дата:
Would this also be appropriate for inclusion as contrib? I'm unfamiliar with the policy for what is / is not included there.

Thanks,
Daniel Heath


On Fri, Jan 4, 2019, at 9:47 AM, Fabrízio de Royes Mello wrote:


Em qui, 3 de jan de 2019 às 20:22, Daniel Heath <daniel@heath.cc> escreveu:
Hi All,

I've frequently seen an issue in applications which store titles (eg of books, events, user profiles) where duplicate values are not properly vetted. 

The 'citext' type is helpful here, but I'd be keen to go further. 

I propose a 'titletext' type, which has the following properties when compared for equality:
 * Case insensitivity (like 'citext')
 * Only considers characters in [:alnum:] (that is, ignores spaces, punctuation, etc)

This would be useful for a range of situations where it's important to avoid entering duplicate values.

Given the discussion at https://www.postgresql.org/message-id/CAKFQuwY9u14TqG8Yzj%3DfAB0tydvvtK7ibgFEx3tegbPWsGjJpg%40mail.gmail.com I'd lean towards making this type not automatically coerce to text (to avoid surprising behaviour when comparing text to titletext).

Is a suitable patch likely to be accepted?

You don’t need touch the core to do that. Just implement it as an extension and share throught some channel like pgxn.org.

Note that citext also is an extension and released as a contrib module.

Regards,

--
   Fabrízio de Royes Mello         Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

Re: Custom text type for title text

От
Fabrízio de Royes Mello
Дата:


Em qui, 3 de jan de 2019 às 20:53, Daniel Heath <daniel@heath.cc> escreveu:
Would this also be appropriate for inclusion as contrib? I'm unfamiliar with the policy for what is / is not included there.


Please do not top post.

At first I recommend you implement it as an extension (using gitlab, github, bitbucket or something else) and after you have a stable working code maybe you should try to send it as a contrib module and then the community will decide to accept it or not.

PostgreSQL is extensible enough to you provide this piece of work without care with the community decisions. What I mean is you necessarily don’t need to send it as a contrib module, just maintain it as a separate extension project.

Regards,



--
   Fabrízio de Royes Mello         Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

Re: Custom text type for title text

От
Peter Eisentraut
Дата:
On 03/01/2019 23:22, Daniel Heath wrote:
> I propose a 'titletext' type, which has the following properties when
> compared for equality:
>  * Case insensitivity (like 'citext')
>  * Only considers characters in [:alnum:] (that is, ignores spaces,
> punctuation, etc)

My work on insensitive/non-deterministic collations[0] might cover this.

[0]:
https://www.postgresql.org/message-id/flat/1ccc668f-4cbc-0bef-af67-450b47cdfee7%402ndquadrant.com

For example:

CREATE COLLATION yournamehere (provider = icu,
  locale = 'und-u-ks-level2-ka-shifted', deterministic = false);

(Roughly, ks-level2 means ignore case, ka-shifted means ignore punctuation.)

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Custom text type for title text

От
Greg Stark
Дата:
Policy on contrib has shifted over time. But generally we want to encourage a lively ecosystem of extensions maintained outside of the Postgres source tree so we avoid adding things to contrib when there's no particular advantage. 

The most common reason things are added to contrib is when the extension is closely tied to internals and needs to be maintained along with changes to internals. Modules like that are hard to maintain separately. But modules that use documented general extensibility APIs should be able to be stable across versions and live outside contrib.

On Thu 3 Jan 2019, 23:54 Daniel Heath <daniel@heath.cc wrote:
Would this also be appropriate for inclusion as contrib? I'm unfamiliar with the policy for what is / is not included there.

Thanks,
Daniel Heath


On Fri, Jan 4, 2019, at 9:47 AM, Fabrízio de Royes Mello wrote:


Em qui, 3 de jan de 2019 às 20:22, Daniel Heath <daniel@heath.cc> escreveu:
Hi All,

I've frequently seen an issue in applications which store titles (eg of books, events, user profiles) where duplicate values are not properly vetted. 

The 'citext' type is helpful here, but I'd be keen to go further. 

I propose a 'titletext' type, which has the following properties when compared for equality:
 * Case insensitivity (like 'citext')
 * Only considers characters in [:alnum:] (that is, ignores spaces, punctuation, etc)

This would be useful for a range of situations where it's important to avoid entering duplicate values.

Given the discussion at https://www.postgresql.org/message-id/CAKFQuwY9u14TqG8Yzj%3DfAB0tydvvtK7ibgFEx3tegbPWsGjJpg%40mail.gmail.com I'd lean towards making this type not automatically coerce to text (to avoid surprising behaviour when comparing text to titletext).

Is a suitable patch likely to be accepted?

You don’t need touch the core to do that. Just implement it as an extension and share throught some channel like pgxn.org.

Note that citext also is an extension and released as a contrib module.

Regards,

--
   Fabrízio de Royes Mello         Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento