Re: insert fail gracefully if primary key already exists

Поиск
Список
Период
Сортировка
От Bartosz Dmytrak
Тема Re: insert fail gracefully if primary key already exists
Дата
Msg-id CAD8_UcbeNKuj5szmVfR=P9AeDG2aopLjAvOcLbKPkJ20FsFopw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: insert fail gracefully if primary key already exists  (Alessandro Gagliardi <alessandro@path.com>)
Ответы Re: insert fail gracefully if primary key already exists  (Alessandro Gagliardi <alessandro@path.com>)
Список pgsql-novice
To answer Your trigger question:
in my example trigger is devided into 2 parts:
  1. trigger itself:  checkTrigger
  2. trigger funtion: "myschema"."checkTriggerFunction"   
according to this part of doumentation: "The trigger will be associated with the specified table or view and will execute the specified function function_name when certain events occur." I think this describes   trigger in essential part, so my intention is to create object (trigger) related with table and execute certain action (trigger function) when specific event occurs (this event is INSERT into this table). I also would like to perform this action before insert to ignore insert in some case (that is why trigger function returns NULL under certain condition)

this is special kind of function (has to be marked as return TRIGGER). This function is intended to do the "job". Plpgsql trigger functions are very smart - they could recognize the table on which trigger is fired, event (INSERT, UPDATE, DELETE), and other interesting things - full description of possible trigger variables is under above link.
Very interesting are NEW and OLD variables. These variables contain record of the table on which trigger function is called (only for row level trigger - in my example trigger declaration FOR EACH ROW).
NEW holds row being inserted - just before it is written to the table, could be modified in fly (is null for delete). OLD holds existing row - just before update (is null for insert). That is why in my example it is possible to query against Your table (with known structure) like this:

SELECT 1 FROM "myschema".mytable WHERE "MyKey" = NEW."MyKey"

this exactly means: select whatever (1) from required table (mytable) where Primary Key column ("MyKey") value match value from inserted row Primary Key column (NEW).

It is possible to write one generic trigger functions to work with few tables, so eg. for audit table  and log all changes in the rows, it is possible to write generic trigger function using TG_TABLE_NAME, TG_TABLE_SCHEMA and TG_OP variables - simple example is here: http://www.postgresql.org/docs/9.1/static/plpgsql-trigger.html#PLPGSQL-TRIGGER-AUDIT-EXAMPLE

if trigger function (row level) returns NEW then NEW record is written to table (could be modified by trigger function). If trigger function returns NULL then nothing is written to table, because "NEW" record is modified to be nothing. 

hope this will help You to understand my example.

Regards,
Bartek


2012/2/17 Alessandro Gagliardi <alessandro@path.com>
First, to answer your question, in this case, I do not expect the data to be different from what is already stored, so I should be able to safely ignore the insert if the primary key already exists. (Though an answer to the alternative "UPSERT" question would also be welcome as that may prove to be an issue as well.)


BEGIN
    INSERT INTO blocks (block_id, user_id, created, shared, private, moment_type) VALUES (E'4f3adf9cfa63b31c77000b7c', E'4ee02f599c6c3f6d360076ce', '2012-02-14T22:26:36.721980'::timestamp, false, false, E'ambient')
 
Looks like semicolon here is missed :) Semicolon is required after SQL statement.
 
EXCEPTION
    WHEN unique_violation THEN
        NULL;
END;

But I get:

ERROR: syntax error at or near "INSERT"
SQL state: 42601
Character: 11

I feel like I must be doing something stupid here, but I can't see what it is.

At the same time, I was also thinking using a trigger might be preferable so I don't have to mess with application code. But I'm a little mystified by how your example would work. The checkTriggerFunction makes reference to NEW."MyKey" and yet there is no mention of that in checkTrigger. Does it automagically divine that from the insert statement?

Thanks,
-Alessandro


On Tue, Feb 14, 2012 at 12:40 AM, Bartosz Dmytrak <bdmytrak@eranet.pl> wrote:
You can user few different solutions:

1. You can perform insert using stored procedure. Inside this procedure You can use BEGIN...EXCEPTION...END construct - this will handle duplicate key error. You can find this useful: http://www.postgresql.org/docs/9.1/static/plpgsql-control-structures.html - there is example of such approach.

2. You can also check if such row exists before insert (also in stored procedure):
CREATE OR REPLACE FUNCTION "myschema"."InsertIfNotExists" (IN "vKeyValue" int, IN "vValue1" text, IN "vValue2" text)
RETURNS VOID
AS
$BODY$
BEGIN
     IF NOT EXISTS (SELECT 1 FROM "myschema".mytable WHERE "MyKey" = "vKeyValue") THEN
         INSERT INTO "myschema".mytable ("MyKey", "Value1", "Value2")
         VALUES ("vKeyValue", "vValue1", "vValue2");

    END IF;
END;
$BODY$ LANGUAGE plpgsql 

3. You can create trigger before insert which will check if record already exists:
trigger function could look like this:
CREATE OR REPLACE FUNCTION "myschema"."checkTriggerFunction" ()
RETURNS TRIGGER
AS
$BODY$
BEGIN
IF EXISTS (SELECT 1 FROM "myschema".mytable WHERE "MyKey" = NEW."MyKey") THEN
RETURN NULL;
ELSE
RETURN NEW;
END IF;
END;
$BODY$
LANGUAGE plpgsql;

and trigger:
CREATE TRIGGER "checkTrigger"
  BEFORE INSERT
  ON "myschema".mytable
  FOR EACH ROW
  EXECUTE PROCEDURE "myschema"."checkTriggerFunction"();


Another question is: do You want to update record if primary key exists or simply ignore insert?
It should be noticed, every check impacts performance.

Regards,
Bartek



2012/2/13 Alessandro Gagliardi <alessandro@path.com>
Sorry I wasn't clear. The application is written in Python. It services requests that may go to an external API or to our production database (which is Mongo). I should add that we have a somewhat unusual situation in which we have two parallel databases: MongoDB for OLTP and newer PostgreSQL for OLAP. Because Postgres is only consuming, it would waste precious cycles to have the client check to see if a record already exists in Postgres before trying to insert it. I'd rather let Postgres deal with that (which it does well enough anyway with my primary key constraints). My Postgres instance is hosted by Heroku and they provide me with a log which is currently being spammed by these "duplicate key value violates unique constraint" errors making it impossible to see if there are other errors I need to be paying more attention to. 


On Sat, Feb 11, 2012 at 12:08 AM, Bartosz Dmytrak <bdmytrak@eranet.pl> wrote:
Could You provide more details?
PostgreSQL does not allow cross database queries.


Regards,
Bartek


2012/2/10 Alessandro Gagliardi <alessandro@path.com>
I have a situation where I am gradually filling in a table from another database as the need requires. As it currently is, whenever a record comes up, it tries to insert it into my database. If the row is already there, the primary key collides and the insert fails. And that's fine except that it makes for a really cluttered log (making it hard to find errors that I really do need to pay attention to). I'm wondering if there's a better (i.e. cleaner, but possibly also more efficient) way to do this. 

Thoughts?

Thank you in advance,
-Alessandro





В списке pgsql-novice по дате отправления:

Предыдущее
От: Léa Massiot
Дата:
Сообщение: Re: Clusters list - Windows PostgreSQL server
Следующее
От: Alessandro Gagliardi
Дата:
Сообщение: Re: insert fail gracefully if primary key already exists