Re: Referential Integrity In PostgreSQL

Поиск
Список
Период
Сортировка
От wieck@debis.com (Jan Wieck)
Тема Re: Referential Integrity In PostgreSQL
Дата
Msg-id m11TLQP-0003kLC@orion.SAPserv.Hamburg.dsh.de
обсуждение исходный текст
Список pgsql-hackers
>
> Hi , Jan
>
> my name is Max .

Hi Max,

>
> I have contributed to SPI interface ,
> that with external Trigger try to make
> a referential integrity.
>
> If I can Help , in something ,
> I'm here .
>

    You're welcome.

    I've  CC'd  the  hackers list because we might get some ideas
    from there too (and to  surface  once  in  a  while  -  Bruce
    already missed me).

    Currently  I'm  very  busy  for  serious work so I don't find
    enough  spare  time  to  start  on  such  a  big  change   to
    PostgreSQL.   But  I'd like to give you an overview of what I
    have in mind so far so you can decide if you're able to help.

    Referential integrity (RI) is based on constraints defined in
    the schema of a database. There are some different  types  of
    constraints:

    1.  Uniqueness constraints.

    2.  Foreign key constraints that ensure that a key value used
        in  an  attribute  exists  in   another   relation.   One
        constraint  must ensure you're unable to INSERT/UPDATE to
        a value that doesn't  exist,  another  one  must  prevent
        DELETE  on  a  referenced  key item or that it is changed
        during UPDATE.

    3.  Cascading deletes that let rows referring to a key follow
        on DELETE silently.

    Even  if  not  defined in the standard (AFAIK) there could be
    others like letting references automatically follow on UPDATE
    to a key value.

    All constraints can be enabled and/or default to be deferred.
    That means, that the RI checks aren't performed when they are
    triggerd.  Instead,  they're checked at transaction end or if
    explicitly invoked by some special statement.  This is really
    important  because  someone  must  be able to setup cyclic RI
    checks that could never be satisfied if the checks  would  be
    performed  immediately.  The  major  problem  on  this is the
    amount of data affected until the checks must  be  performed.
    The number of statements executed, that trigger such deferred
    constraints,   shouldn't   be   limited.   And   one   single
    INSERT/UPDATE/DELETE could affect thousands of rows.

    Due  to these problems I thought, it might not be such a good
    idea to remember CTID's or the like to get back OLD/NEW  rows
    at the time the constraints are checked. Instead I planned to
    misuse the rule system for it. Unfortunately, the rule system
    has  damned  tricky problems itself when it comes to having-,
    distinct and other clauses and extremely  on  aggregates  and
    subselects. These problems would have to get fixed first.  So
    it's a solution that cannot be implemented right now.

    Fallback to CTID remembering though. There are  problems  too
    :-(.   Let's  enhance  the  trigger mechanism with a deferred
    feature. First this requires two additional  bool  attributes
    in  the  pg_trigger  relation  that  tell  if this trigger is
    deferrable and if it is deferred by default.  While at it  we
    should  add another bool that tells if the trigger is enabled
    (ALTER TRIGGER {ENABLE|DISABLE} trigger).

    Second we  need  an  internal  list  of  triggers,  that  are
    currently DEFINED AS DEFERRED. Either because they default to
    it, or the user explicitly asked to deferr it.

    Third we need an internal  list  of  triggers  that  must  be
    invoked later because at the time an event occured where they
    should have been triggered, they appeared in the  other  list
    and  their  execution  is  delayed  until  transaction end or
    explicit execution. This list must remember the  OID  of  the
    trigger   to  invoke  (to  identify  the  procedure  and  the
    arguments), the relation that  caused  the  trigger  and  the
    CTID's of the OLD and NEW row.

    That  last  list  could  grow  extremely!  Think of a trigger
    that's executing commands over SPI  which  in  turn  activate
    deferred  triggers.  Since  the order of trigger execution is
    very  important  for  RI,  I  can't   see   any   chance   to
    simplify/condense  this  information.  Thus it is 16 bytes at
    least per deferred trigger call (2 OID's plus  2  CTID's).  I
    think one or more temp files would fit best for this.

    A last tricky point is if one of a bunch of deferred triggers
    is explicitly called for execution. At this time, the entries
    for  it  in  the  temp  file(s) must get processed and marked
    executed (maybe by overwriting  the  triggers  OID  with  the
    invalid  OID)  while  other  trigger events still have to get
    recorded.

    Needless to say that reading thousands of those entries  just
    to find a few isn't good on performance. But better have this
    special case slow that dealing with hundreds of temp files or
    other overhead slowing down the usual case where ALL deferred
    triggers get called at transaction end.

    Trigger invocation is simple now - fetch the OLD and NEW rows
    by  CTID  and  execute  the  trigger  as  done by the trigger
    manager. Oh - well - vacuum shouldn't touch  relations  where
    deferred   triggers  are  outstanding.   Might  require  some
    special lock entry - Vadim?

    Did I miss something?


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Hannu Krosing
Дата:
Сообщение: Re: [HACKERS] Re: HISTORY for 6.5.2
Следующее
От: wieck@debis.com (Jan Wieck)
Дата:
Сообщение: Re: [HACKERS] Status on Jan Wieck