Обсуждение: INSERT or UPDATE

Поиск
Список
Период
Сортировка

INSERT or UPDATE

От
"James B. Byrne"
Дата:
I have spent the last couple of days reading up on SQL, of which I
know very little, and PL/pgSQl, of which I know less.  I am trying
to decide how best to approach the following requirement.

Given a legal name and a common name and associated details, we wish
to insert this information into a table, entities. As well, we
believe it useful to allow a specific entity more than one common
name.  So, at the moment we are considering having another table,
identifiers, that takes entity_id = entity.id (synthetic sequenced
PK for entities), the identifier_type (always 'AKNA' for this
collection of identifiers) and identifier_value =
entity.common_name.

This seems straight forward enough when initially inserting an
entity.  However, it is conceivable that over the lifetime of the
system a particular entity might change its common name. For example
the former "John Tash Enterprises" might become popularly known as
"JTE Inc." while the legal name remains unchanged.

When we update the entity record and set the common_name = "JTE
Inc." then we need insert an identifier row to match.  However,
identifiers for a given entity can be maintained separately from the
entity itself. It is therefore possible, indeed likely, that the
identifier "JTE Inc." for that entity already exists.  Likely, but
not certain.  In any case, the old identifier row remains unchanged
after the new is inserted.

The issue then is how to determine on an UPDATE entities whether it
is necessary to INSERT a new identifier using values provided from
the entities row.

From what I have gathered, what one does is simply insert the new
identifiers row. If there is a primary key conflict then the update
fails, which the function handles gracefully. If not, then it
succeeds.

I have also formed the opinion that what one does is write a
function or functions, such as fn_aknau(entity_id, name), and tie
these with triggers to the appropriate actions on entities such as:

CREATE TRIGGER tr_entities_aioru AFTER INSERT OR UPDATE
            ON entities
           FOR EACH ROW
       EXECUTE fn_aknai(entities.id, 'AKNA', entities.common_name);

Is my appreciation correct or am I missing the mark entirely?  Is
this considered the proper place and means to accomplish this sort
of task in an RDBMS?  Does it belong elsewhere? Am I correct in
inferring that the values in the columns id and common_name will be
those of entities AFTER the insert or update and that these will be
available to the body of the function?  Is the trigger dependent
upon a SUCCESSFUL INSERT or UPDATE of entities or will it be called
regardless? Must the function be written in PL/pgSQl (or similar PL)
or could this function be written in straight SQL? Should it be
straight SQL if possible?  What should the function return, if
anything?

Fairly basic stuff I am sure but somewhat mystifying for me at the
moment.  Any help would be appreciated.

--
***          E-Mail is NOT a SECURE channel          ***
James B. Byrne                mailto:ByrneJB@Harte-Lyne.ca
Harte & Lyne Limited          http://www.harte-lyne.ca
9 Brockley Drive              vox: +1 905 561 1241
Hamilton, Ontario             fax: +1 905 561 0757
Canada  L8E 3C3


Re: INSERT or UPDATE

От
"James B. Byrne"
Дата:
On Mon, April 6, 2009 17:00, Dann Corbit wrote:
.
>
> It is a difficult question.
>
> For instance, there are many possibilities when a collision occurs.
>
> I guess that for some collisions, sharing the name is OK.
>

I failed to explicitly state what the PK looked like.

  entity_id(entities.id) +
  identifier_type ('AKNA') +
  identifier_value(entities.common_name)

There will only be a PK collision when we attempt to add a duplicate
common name for the same entity, which means it already exists and
does not need to be added again.


--
***          E-Mail is NOT a SECURE channel          ***
James B. Byrne                mailto:ByrneJB@Harte-Lyne.ca
Harte & Lyne Limited          http://www.harte-lyne.ca
9 Brockley Drive              vox: +1 905 561 1241
Hamilton, Ontario             fax: +1 905 561 0757
Canada  L8E 3C3



Re: INSERT or UPDATE

От
"Dann Corbit"
Дата:
> -----Original Message-----
> From: James B. Byrne [mailto:byrnejb@harte-lyne.ca]
> Sent: Monday, April 06, 2009 2:06 PM
> To: Dann Corbit
> Cc: pgsql-general@postgresql.org
> Subject: RE: [GENERAL] INSERT or UPDATE
>
> On Mon, April 6, 2009 17:00, Dann Corbit wrote:
> .
> >
> > It is a difficult question.
> >
> > For instance, there are many possibilities when a collision occurs.
> >
> > I guess that for some collisions, sharing the name is OK.
> >
>
> I failed to explicitly state what the PK looked like.
>
>   entity_id(entities.id) +
>   identifier_type ('AKNA') +
>   identifier_value(entities.common_name)
>
> There will only be a PK collision when we attempt to add a duplicate
> common name for the same entity, which means it already exists and
> does not need to be added again.

The pedagogic solution for this type of problem is called merge.
The last I knew, PostgreSQL did not directly support merge.
So you can accomplish the same thing in two stages:
1. Check for existence and perform an update if the key is present
2. If the key is not present, then perform an insert.

Again, this may or may not be the right thing to do.

Re: INSERT or UPDATE

От
Thomas Kellerer
Дата:
Dann Corbit wrote on 06.04.2009 23:15:
>>> I guess that for some collisions, sharing the name is OK.
>>>
>> I failed to explicitly state what the PK looked like.
>>
>>   entity_id(entities.id) +
>>   identifier_type ('AKNA') +
>>   identifier_value(entities.common_name)
>>
>> There will only be a PK collision when we attempt to add a duplicate
>> common name for the same entity, which means it already exists and
>> does not need to be added again.
>
> The pedagogic solution for this type of problem is called merge.
> The last I knew, PostgreSQL did not directly support merge.
> So you can accomplish the same thing in two stages:
> 1. Check for existence and perform an update if the key is present
> 2. If the key is not present, then perform an insert.

You don't actually need to check for existence. Just do the update, if no rows
were updated, you can insert (UPDATE will do an existence check anyway)

Thomas

Re: INSERT or UPDATE

От
"James B. Byrne"
Дата:
On Mon, April 6, 2009 17:15, Dann Corbit wrote:

>
> The pedagogic solution for this type of problem is called merge.
> The last I knew, PostgreSQL did not directly support merge.
> So you can accomplish the same thing in two stages:
> 1. Check for existence and perform an update if the key is present
> 2. If the key is not present, then perform an insert.
>
> Again, this may or may not be the right thing to do.
>

Forgive my obtuseness, but what does the preliminary SELECT
accomplish?  When the trigger fires we already know whether or not
the entities row existed previously, what we are deciding is how to
handle the concurrent identifiers table entry.

I initially thought along these lines (select insert/update
depending on the return value) but I gradually realized that it did
not matter whether the identifier row was already there or not.  If
it exists then an UNIQUE key constraint prevents duplicates. If it
does not exist then the INSERT succeeds.  The previous identifier
associated with the original common name has to remain on file to
allow lookups by former names.  Thus, we never update an identifier
row in this fashion.

--
***          E-Mail is NOT a SECURE channel          ***
James B. Byrne                mailto:ByrneJB@Harte-Lyne.ca
Harte & Lyne Limited          http://www.harte-lyne.ca
9 Brockley Drive              vox: +1 905 561 1241
Hamilton, Ontario             fax: +1 905 561 0757
Canada  L8E 3C3


Re: INSERT or UPDATE

От
"Dann Corbit"
Дата:
> -----Original Message-----
> From: James B. Byrne [mailto:byrnejb@harte-lyne.ca]
> Sent: Monday, April 06, 2009 5:16 PM
> To: Dann Corbit
> Cc: pgsql-general@postgresql.org
> Subject: RE: [GENERAL] INSERT or UPDATE
>
>
> On Mon, April 6, 2009 17:15, Dann Corbit wrote:
>
> >
> > The pedagogic solution for this type of problem is called merge.
> > The last I knew, PostgreSQL did not directly support merge.
> > So you can accomplish the same thing in two stages:
> > 1. Check for existence and perform an update if the key is present
> > 2. If the key is not present, then perform an insert.
> >
> > Again, this may or may not be the right thing to do.
> >
>
> Forgive my obtuseness, but what does the preliminary SELECT
> accomplish?  When the trigger fires we already know whether or not
> the entities row existed previously, what we are deciding is how to
> handle the concurrent identifiers table entry.
>
> I initially thought along these lines (select insert/update
> depending on the return value) but I gradually realized that it did
> not matter whether the identifier row was already there or not.  If
> it exists then an UNIQUE key constraint prevents duplicates. If it
> does not exist then the INSERT succeeds.  The previous identifier
> associated with the original common name has to remain on file to
> allow lookups by former names.  Thus, we never update an identifier
> row in this fashion.

If a transaction involves rows where some succeed and some fail, all
will roll back.
If that is the desired behavior, or if all operations are singleton,
then you won't see any problems.


Re: INSERT or UPDATE

От
"James B. Byrne"
Дата:
On Mon, April 6, 2009 20:23, Dann Corbit wrote:

>
> If a transaction involves rows where some succeed and some fail,
> all will roll back. If that is the desired behavior, or if all
> operations are singleton, then you won't see any problems.
>

Do I understand correctly that this means that even if the function
"handles" a failed insert then if the function occurs inside a
transaction then that transaction fails and is rolled back
regardless?


--
***          E-Mail is NOT a SECURE channel          ***
James B. Byrne                mailto:ByrneJB@Harte-Lyne.ca
Harte & Lyne Limited          http://www.harte-lyne.ca
9 Brockley Drive              vox: +1 905 561 1241
Hamilton, Ontario             fax: +1 905 561 0757
Canada  L8E 3C3


Re: INSERT or UPDATE

От
"Dann Corbit"
Дата:
> -----Original Message-----
> From: James B. Byrne [mailto:byrnejb@harte-lyne.ca]
> Sent: Monday, April 06, 2009 5:43 PM
> To: Dann Corbit
> Cc: pgsql-general@postgresql.org
> Subject: RE: [GENERAL] INSERT or UPDATE
>
>
> On Mon, April 6, 2009 20:23, Dann Corbit wrote:
>
> >
> > If a transaction involves rows where some succeed and some fail,
> > all will roll back. If that is the desired behavior, or if all
> > operations are singleton, then you won't see any problems.
> >
>
> Do I understand correctly that this means that even if the function
> "handles" a failed insert then if the function occurs inside a
> transaction then that transaction fails and is rolled back
> regardless?

It depends on how it is handled.  You could certainly (for instance) use
a cursor and perform the operations one by one.


Re: INSERT or UPDATE TRIGGER

От
"James B. Byrne"
Дата:
This is what I have come up with.  Comments are welcomed.


      CREATE OR REPLACE FUNCTION hll_pg_fn_ident_insert()
      RETURNS TRIGGER AS $pg_fn$
        -- ROW AFTER TRIGGER
        -- trigger passes identifier_type, _value and _description
        -- received as ARGV[0], ARGV[1] and ARGV[2]
      BEGIN
        INSERT INTO identifiers(
            entity_id,
            identifier_type,
            identifier_value,
            identifier_description)
          VALUES(
            NEW.id,
            TG_ARGV[0],
            TG.ARGV[1],
            TG_ARGV[2]);

      -- Assume the INSERT fails because of a unique key violation,
      --   (entity_id + identifier_type + identifier_value)
      --
      -- This does not matter since we only need ensure that this
      -- alias exists, so handle the exception and return:

      EXCEPTION
        WHEN unique_violation THEN
          -- do nothing
          NULL;
      END;
      $pg_fn$ LANGUAGE plpgsql;

      COMMENT ON FUNCTION hll_pg_fn_ident_insert IS
        'Used by entities trigger. Inserts a corresponding
identifiers row.'

      CREATE TRIGGER hll_pg_tr_entity_identifier_akna
        AFTER INSERT OR UPDATE ON entities
        FOR EACH ROW EXECUTE PROCEDURE hll_pg_fn_identifier_insert(
          "AKNA", entities.entity_common_name, "Common Name
auto-insert");

      COMMENT ON TRIGGER hll_pg_tr_entity_identifier_akna IS
        'Inserts an alias identifier for common name if one does not
exist'




--
***          E-Mail is NOT a SECURE channel          ***
James B. Byrne                mailto:ByrneJB@Harte-Lyne.ca
Harte & Lyne Limited          http://www.harte-lyne.ca
9 Brockley Drive              vox: +1 905 561 1241
Hamilton, Ontario             fax: +1 905 561 0757
Canada  L8E 3C3


Re: INSERT or UPDATE TRIGGER

От
"James B. Byrne"
Дата:
I am testing the trigger function that I wrote.  Is there a way to
increase the logging detail level for just a single database
instance?  The manual indicates not, but just in case I am
misreading things I am asking here?


--
***          E-Mail is NOT a SECURE channel          ***
James B. Byrne                mailto:ByrneJB@Harte-Lyne.ca
Harte & Lyne Limited          http://www.harte-lyne.ca
9 Brockley Drive              vox: +1 905 561 1241
Hamilton, Ontario             fax: +1 905 561 0757
Canada  L8E 3C3


Re: INSERT or UPDATE TRIGGER

От
"Chris Spotts"
Дата:
Not in regards to logging detail, but that function in general...
I'm pretty new to postgres, so I could be totally wrong in this, but I think
this thread
http://archives.postgresql.org/pgsql-performance/2008-03/msg00204.php
may pertain if you see some performance degradation with that trigger.
Like I said, somebody correct me if I'm way off base.
Chris

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of James B. Byrne
Sent: Tuesday, April 07, 2009 1:52 PM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] INSERT or UPDATE TRIGGER

I am testing the trigger function that I wrote.  Is there a way to
increase the logging detail level for just a single database
instance?  The manual indicates not, but just in case I am
misreading things I am asking here?


--
***          E-Mail is NOT a SECURE channel          ***
James B. Byrne                mailto:ByrneJB@Harte-Lyne.ca
Harte & Lyne Limited          http://www.harte-lyne.ca
9 Brockley Drive              vox: +1 905 561 1241
Hamilton, Ontario             fax: +1 905 561 0757
Canada  L8E 3C3


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: INSERT or UPDATE TRIGGER

От
Tom Lane
Дата:
"James B. Byrne" <byrnejb@harte-lyne.ca> writes:
> I am testing the trigger function that I wrote.  Is there a way to
> increase the logging detail level for just a single database
> instance?

ALTER DATABASE foo SET log_min_messages = whatever;

Note this will only affect subsequently-started sessions.  Also,
if memory serves, you have to be superuser to set this particular
variable.

            regards, tom lane

Re: INSERT or UPDATE TRIGGER

От
"James B. Byrne"
Дата:
On Tue, April 7, 2009 15:09, Tom Lane wrote:

>
> ALTER DATABASE foo SET log_min_messages = whatever;
>
> Note this will only affect subsequently-started sessions.  Also,
> if memory serves, you have to be superuser to set this particular
> variable.

Thanks.  Am I correct to infer from the output this generates that
log_min_messages = debug is primarily for developers of PG itself?
I am poking in the dark here.  What I want to do is to determine if
the trigger is firing and whether the function works as intended.
At the moment I am not seeing anything show up in the secondary
table so I have done something wrong.  Is there some way of getting
PG to tell me what it is doing?

--
***          E-Mail is NOT a SECURE channel          ***
James B. Byrne                mailto:ByrneJB@Harte-Lyne.ca
Harte & Lyne Limited          http://www.harte-lyne.ca
9 Brockley Drive              vox: +1 905 561 1241
Hamilton, Ontario             fax: +1 905 561 0757
Canada  L8E 3C3


Re: INSERT or UPDATE TRIGGER

От
Tom Lane
Дата:
"James B. Byrne" <byrnejb@harte-lyne.ca> writes:
> I am poking in the dark here.  What I want to do is to determine if
> the trigger is firing and whether the function works as intended.
> At the moment I am not seeing anything show up in the secondary
> table so I have done something wrong.  Is there some way of getting
> PG to tell me what it is doing?

You might find it more useful to add some elog(LOG) statements to the
trigger body.

            regards, tom lane

Re: INSERT or UPDATE TRIGGER

От
"James B. Byrne"
Дата:
On Tue, April 7, 2009 16:07, Tom Lane wrote:

>
> You might find it more useful to add some elog(LOG) statements to
> the trigger body.
>

Thank you again.  I will go through section 44.2 tonight.


--
***          E-Mail is NOT a SECURE channel          ***
James B. Byrne                mailto:ByrneJB@Harte-Lyne.ca
Harte & Lyne Limited          http://www.harte-lyne.ca
9 Brockley Drive              vox: +1 905 561 1241
Hamilton, Ontario             fax: +1 905 561 0757
Canada  L8E 3C3


Re: INSERT or UPDATE

От
"Dann Corbit"
Дата:
> -----Original Message-----
> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
> owner@postgresql.org] On Behalf Of James B. Byrne
> Sent: Monday, April 06, 2009 1:46 PM
> To: pgsql-general@postgresql.org
> Subject: [GENERAL] INSERT or UPDATE
>
> I have spent the last couple of days reading up on SQL, of which I
> know very little, and PL/pgSQl, of which I know less.  I am trying
> to decide how best to approach the following requirement.
>
> Given a legal name and a common name and associated details, we wish
> to insert this information into a table, entities. As well, we
> believe it useful to allow a specific entity more than one common
> name.  So, at the moment we are considering having another table,
> identifiers, that takes entity_id = entity.id (synthetic sequenced
> PK for entities), the identifier_type (always 'AKNA' for this
> collection of identifiers) and identifier_value =
> entity.common_name.
>
> This seems straight forward enough when initially inserting an
> entity.  However, it is conceivable that over the lifetime of the
> system a particular entity might change its common name. For example
> the former "John Tash Enterprises" might become popularly known as
> "JTE Inc." while the legal name remains unchanged.
>
> When we update the entity record and set the common_name = "JTE
> Inc." then we need insert an identifier row to match.  However,
> identifiers for a given entity can be maintained separately from the
> entity itself. It is therefore possible, indeed likely, that the
> identifier "JTE Inc." for that entity already exists.  Likely, but
> not certain.  In any case, the old identifier row remains unchanged
> after the new is inserted.
>
> The issue then is how to determine on an UPDATE entities whether it
> is necessary to INSERT a new identifier using values provided from
> the entities row.
>
> From what I have gathered, what one does is simply insert the new
> identifiers row. If there is a primary key conflict then the update
> fails, which the function handles gracefully. If not, then it
> succeeds.
>
> I have also formed the opinion that what one does is write a
> function or functions, such as fn_aknau(entity_id, name), and tie
> these with triggers to the appropriate actions on entities such as:
>
> CREATE TRIGGER tr_entities_aioru AFTER INSERT OR UPDATE
>             ON entities
>            FOR EACH ROW
>        EXECUTE fn_aknai(entities.id, 'AKNA', entities.common_name);
>
> Is my appreciation correct or am I missing the mark entirely?  Is
> this considered the proper place and means to accomplish this sort
> of task in an RDBMS?  Does it belong elsewhere? Am I correct in
> inferring that the values in the columns id and common_name will be
> those of entities AFTER the insert or update and that these will be
> available to the body of the function?  Is the trigger dependent
> upon a SUCCESSFUL INSERT or UPDATE of entities or will it be called
> regardless? Must the function be written in PL/pgSQl (or similar PL)
> or could this function be written in straight SQL? Should it be
> straight SQL if possible?  What should the function return, if
> anything?
>
> Fairly basic stuff I am sure but somewhat mystifying for me at the
> moment.  Any help would be appreciated.

It is a difficult question.

For instance, there are many possibilities when a collision occurs.

I guess that for some collisions, sharing the name is OK.

Consider two different fictional companies (hopefully in different
domains):
Jet Propulsion Industries Incorporated == JPI Inc. (makes jet engines)
Journey Protection Investments Inc. == JPI Inc. (underwrites travel
insurance)
Probably, they don't have a legal battle because they have completely
different domains.
So it seems OK for both companies to relate to this entity if it is only
used as a label.

On the other hand, you may have a typographical error on data entry for
a computer firm.
If you label a company as "IBM" when it should have been "IBN" I guess
that won't make anyone happy.

I think that the real issue is that you must truly and carefully
identify your business rules and model those in the database structure.
Said another way, "How would a human handle this issue given a name
collision?" If the answer is not obvious, then maybe you need to write
an exceptions log and handle each case by hand that is not solved by a
simple and clear to understand rule.