Обсуждение: Re: [SQL] RULE questions.

Поиск
Список
Период
Сортировка

Re: [SQL] RULE questions.

От
"D'Arcy" "J.M." Cain
Дата:
Thus spake Neil Burrows
> First off, is there an easier way to ensure that data is stored in uppercase
> for certain columns (not the whole table).  And if not does anyone have
> comments on performance issues, or ways of stopping users accidentally or
> intentionally inserting lower case data straight into the table rather than
> the view?

This makes me think of two features missing in PostgreSQL that I would
love to see.  I know it's probably to late to think about it now for
6.5 but I wonder what others think about this.

First, as suggested above, how about an option to automatically convert
data to upper case on entry?  I realize that triggers can do the job but
it seems to be needed often enough that putting it into the definition
for the field seems useful.  I guess a lower option would make sense too.

Second, an option to CREATE INDEX to make the index case insensitive.
Other RDBMS systems do this and it is nice not to depend on users being
consistent when entering names.  Consider ("albert", "Daniel", "DENNIS")
which would sort exactly opposite.  Also, in a primary key field (or
unique index) it would be nice if "A" was rejected if "a" already was
in the database.

Thoughts?

Followups to hackers.

--
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 424 2871     (DoD#0082)    (eNTP)   |  what's for dinner.

Re: [HACKERS] Re: [SQL] RULE questions.

От
Tom Lane
Дата:
"D'Arcy" "J.M." Cain <darcy@druid.net> writes:
> Second, an option to CREATE INDEX to make the index case insensitive.

That, at least, we can already do: build the index on lower(field) not
just field.  Or upper(field) if that seems more natural to you.

> Also, in a primary key field (or
> unique index) it would be nice if "A" was rejected if "a" already was
> in the database.

Making either of the above a UNIQUE index should accomplish that.

            regards, tom lane

Re: [HACKERS] Re: [SQL] RULE questions.

От
Hannu Krosing
Дата:
D'Arcy J.M. Cain wrote:
>
> This makes me think of two features missing in PostgreSQL that I would
> love to see.  I know it's probably to late to think about it now for
> 6.5 but I wonder what others think about this.
>
> First, as suggested above, how about an option to automatically convert
> data to upper case on entry?  I realize that triggers can do the job but
> it seems to be needed often enough that putting it into the definition
> for the field seems useful.  I guess a lower option would make sense too.

These could probably be implemened more effectively using rules. Having
the
rules generated automatically for simple cases would of course be nice,
but a warning at least should be given to user about creating the rule,
like it's currently done with primary key.

Or maybe it would be better to support virtual fields, like this :

create table people(
  first_name                 varchar(25),
  last_name                  varchar(25),
  upper_first_name VIRTUAL upper(first_name),
  upper_last_name  VIRTUAL upper(last_name),
  full_name        VIRTUAL (upper_first_name || ' ' || upper_last_name)
primary key
);

and then untangle this in the backend and create required rules and
indexes automatically ?

> Second, an option to CREATE INDEX to make the index case insensitive.

If you have this option on idex, how do you plan to make sure that the
index is actually used ?

It may be better to do it explicitly -

1. create index on upper(field)

2. use where upper(field) = 'MYDATA'

---------------
Hannu

Re: [HACKERS] Re: [SQL] RULE questions.

От
jwieck@debis.com (Jan Wieck)
Дата:
>
> D'Arcy J.M. Cain wrote:
> >
> > This makes me think of two features missing in PostgreSQL that I would
> > love to see.  I know it's probably to late to think about it now for
> > 6.5 but I wonder what others think about this.
> >
> > First, as suggested above, how about an option to automatically convert
> > data to upper case on entry?  I realize that triggers can do the job but
> > it seems to be needed often enough that putting it into the definition
> > for the field seems useful.  I guess a lower option would make sense too.
>
> These could probably be implemened more effectively using rules. Having
> the
> rules generated automatically for simple cases would of course be nice,
> but a warning at least should be given to user about creating the rule,
> like it's currently done with primary key.

    No it can't.

    Such a rule would look like

        CREATE RULE xxx AS ON INSERT TO this_table
            DO INSTEAD INSERT INTO this_table ...

    The   rule   system  will  be  triggerd  on  an  INSERT  INTO
    this_table, rewrite and generate another parsetree that is an
    INSERT  INTO this_table, which is recursively rewritten again
    applying rule xxx...

    That's an endless recursion. A rule can  never  do  the  same
    operation to a table it is fired for.

    The  old  pre-Postgres95 university version (Postgres release
    4.2) had the possibility to define  rules  that  UPDATE  NEW.
    They  where buggy and didn't worked sometimes at all. Instead
    of fixing them, this functionality got removed when  Postgres
    became 95 :-(


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #

Re: [HACKERS] Re: [SQL] RULE questions.

От
"D'Arcy" "J.M." Cain
Дата:
Thus spake Tom Lane
> "D'Arcy" "J.M." Cain <darcy@druid.net> writes:
> > Second, an option to CREATE INDEX to make the index case insensitive.
> 
> That, at least, we can already do: build the index on lower(field) not
> just field.  Or upper(field) if that seems more natural to you.

Almost.  I guess I wasn't completely clear.  Here's an example.

darcy=> create table x (a int, t text);
CREATE
darcy=> create unique index ti on x (lower(t) text_ops);
CREATE
darcy=> insert into x values (1, 'abc');
INSERT 19021 1
darcy=> insert into x values (2, 'ABC');
ERROR:  Cannot insert a duplicate key into a unique index
darcy=> insert into x values (2, 'Def');
INSERT 19023 1
darcy=> select * from x;
a|t  
-+---
1|abc
2|Def
(2 rows)

darcy=> select * from x where t = 'ABC';
a|t
-+-
(0 rows)

Note that it prevented me from adding the upper case dup just fine.  The
last select is the issue.  It's necessary for the user to know how it is
stored before doing the select.  I realize that you can do this.

darcy=>  select * from x where lower(t) = 'abc';

But other systems make this more convenient by just making 'ABC' and 'abc'
equivalent.

Mind you, it may not be possible in our system without creating a new,
case-insensitive type.

> > Also, in a primary key field (or
> > unique index) it would be nice if "A" was rejected if "a" already was
> > in the database.
> 
> Making either of the above a UNIQUE index should accomplish that.

True.  I'm thinking of the situation where you want the primary key to
be case-insensitive.  You can't control that on the auto-generated
unique index so you have to add a second unique index on the same
field.  Again, perhaps a new type is the proper way to handle this.

Speaking of primary keys, there's one more thing needed to make primary
support complete, I think.  Didn't we originally say that a primary
key field was immutable?  We should be able to delete the record but
not change the value of the field in an update.  Would this be hard
to do?

-- 
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 424 2871     (DoD#0082)    (eNTP)   |  what's for dinner.


Re: [HACKERS] Re: [SQL] RULE questions.

От
jwieck@debis.com (Jan Wieck)
Дата:
D'Arcy J.M. Cain wrote:

> But other systems make this more convenient by just making 'ABC' and 'abc'
> equivalent.
>
> Mind you, it may not be possible in our system without creating a new,
> case-insensitive type.

    And  that  wouldn't  be  too  hard. For example, implementing
    citext (case insensitive text) could use text's  input/output
    functions and all the things for lower/upper case conversion,
    concatenation, substring etc (these are SQL language wrappers
    as  we  already have tons of). Only new comparision operators
    have to be built  that  compare  case  insensitive  and  then
    creating  a new operator class for it. All qualifications and
    the sorting in indices, order by, group by are done with  the
    operators defined for the type.

    Also comparision wrappers like to compare text = citext would
    be useful, which simply uses citext_eq().

> > Making either of the above a UNIQUE index should accomplish that.
>
> True.  I'm thinking of the situation where you want the primary key to
> be case-insensitive.  You can't control that on the auto-generated
> unique index so you have to add a second unique index on the same
> field.  Again, perhaps a new type is the proper way to handle this.

    The above citext type would inherit this auto.

>
> Speaking of primary keys, there's one more thing needed to make primary
> support complete, I think.  Didn't we originally say that a primary
> key field was immutable?  We should be able to delete the record but
> not change the value of the field in an update.  Would this be hard
> to do?

    No efford on that.  I'm  planning  to  reincarnate  attribute
    specification  for rules and implement a RAISE statement. The
    attributes (this time it  will  be  multiple)  suppress  rule
    action  completely  if  none  of the attributes appear in the
    queries targetlist (what they must on UPDATE to change).

    So at create table time, a rule like

        CREATE RULE somename AS ON UPDATE TO table
            ATTRIBUTE att1, att2
            WHERE old.att1 != new.att1 OR old.att2 != old.att2
            DO RAISE EXCEPTION 'Primary key of "table" cannot be changed';

    could be installed. As long as nobody specifies the fields of
    the  primary  key in it's update, the rewrite system will not
    add the RAISE query to the querytree list, so no checking  is
    done at all.

    But  as  soon as one of the attributes appears in the UPDATE,
    there will be one extra query RAISE  executed  prior  to  the
    UPDATE  itself  and check that all the new values are the old
    ones. This would have the extra benefit, that the transaction
    would abort BEFORE any changes have been made to the table at
    all (remember that UPDATE  in  Postgres  means  another  heap
    tuple  for  each  touched  row and one more invalid tuple for
    vacuum to throw away and for in-the-middle-aborted updates it
    means  so-far-I-came  more  never  committed heap tuples that
    vacuum has to send to byte-hell).

    This will not appear in v6.5 (hopefully in  v6.6).  But  it's
    IMHO  the  best  solution. With the mentioned RAISE, plus the
    currently discussed deferred queries etc. we would  have  the
    rule  system  ready  to  support  ALL  the  constraint  stuff
    (cascaded delete, foreign key). But the more we use the  rule
    system,  the more important it becomes that we get rid of the
    block limit for tuples.

    I think it would be better  to  spend  your  efford  on  that
    issue.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #

Re: [HACKERS] RULE (and ALTER TABLE) questions

От
Hannu Krosing
Дата:
Jan Wieck wrote:
>
> > These could probably be implemened more effectively using rules. Having
> > the
> > rules generated automatically for simple cases would of course be nice,
> > but a warning at least should be given to user about creating the rule,
> > like it's currently done with primary key.
>
>     No it can't.
>
>     Such a rule would look like
>
>         CREATE RULE xxx AS ON INSERT TO this_table
>             DO INSTEAD INSERT INTO this_table ...
>
>     The   rule   system  will  be  triggerd  on  an  INSERT  INTO
>     this_table, rewrite and generate another parsetree that is an
>     INSERT  INTO this_table, which is recursively rewritten again
>     applying rule xxx...
>
>     That's an endless recursion. A rule can  never  do  the  same
>     operation to a table it is fired for.

But when doing that at the table creation time, then the table can
actually
be defined as a view on storage table and rules for insert update and
delete
be defined for this view that do the actual data manipulation on the
storage table.

Or is the rule system currently not capable for this ?

When some field is changed to UPPER-ONLY status using alter table, the
table
could be renamed to staorage table and all the rules be created ?


And the other question - what is the status of ALTER TABLE commands -
can we add/remove/disable constraints without recreating the table ?

Is constraint and index disabling supported at all ?

-------------------
Hannu

Re: [HACKERS] RULE (and ALTER TABLE) questions

От
jwieck@debis.com (Jan Wieck)
Дата:
> But when doing that at the table creation time, then the table can
> actually
> be defined as a view on storage table and rules for insert update and
> delete
> be defined for this view that do the actual data manipulation on the
> storage table.

    That's  IMHO  a too specific case to do it generally with the
    rule system.  Should be some kind of  constraint  handled  by
    the  parser  in  putting  an  UPPER()  func  node  around the
    targetlist expression.

    There could be more general support implemented,  in  that  a
    user can allways tell that a custom function should be called
    with the result of the TLE-expr before the value  is  dropped
    into the tuple on INSERT/UPDATE.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #