Обсуждение: Most effective insert or replace

Поиск
Список
Период
Сортировка

Most effective insert or replace

От
Sergei Politov
Дата:
Hi,

  As far as I recall postgres does not have built-in support for "insert or replace" feature.
  But there is a lot of ways to obtain the same result.
  The problem is that performance of these techniques is quite bad.
  It is about two times slower than simple insert.

  I tried the following ways:
1) Add the following rule on insert:
CREATE RULE replace_dummy AS
  ON INSERT TO dummy
  WHERE
    EXISTS(SELECT 1 FROM dummy WHERE key = NEW.key)
  DO INSTEAD
     (UPDATE dummy SET value = NEW.value);

2) Use the function:
CREATE FUNCTION merge_dummy(ikey int, ivalue text) RETURNS VOID AS
$$
BEGIN
    UPDATE dummy SET value = ivalue WHERE key = ikey;
    IF found THEN
        RETURN;
    END IF;
    INSERT INTO dummy VALUES (ikey, ivalue);
    RETURN;
END;
$$
LANGUAGE plpgsql;
3) Last the most effective in a short period, but seems produces a lot of work for vacuum.
    Add extra column, (time int) into table. I can guarantee that next insert has time greater than previous.
    And use the following rule:
create rule dummy_insert as on insert to dummy do also
    delete from dummy
        where key == NEW.key and time != NEW.time;

  Please comment these ways and propose effective ways to simulate "insert or replace" behavior.
  Also in may case I'm making a lot of inserts in a batch.

  Note: insert or replace I meant.
  Suggest we have:
  dummy with columns: key int, value text.
  Filled with:
  insert into dummy values (1, "one"), (2, "two"), (3, "three")
  When user tries to "insert or replace" pair into this table then in should be inserted if there is no row with the same key.
  Otherwise value of appropriate row is updated.

Best Regards,
  Sergei

Re: Most effective insert or replace

От
Matthew Wakeling
Дата:
On Fri, 3 Jul 2009, Sergei Politov wrote:
>   As far as I recall postgres does not have built-in support for "insert or replace" feature.

>   Please comment these ways and propose effective ways to simulate "insert or replace" behavior.
>   Also in may case I'm making a lot of inserts in a batch.

A few years ago I researched this, and came up with the following method
as seeming the fastest:

BEGIN;
DELETE FROM table WHERE id IN (big long list);
COPY table FROM STDIN BINARY;
COMMIT;

However, our circumstances may not be the same as yours for the following
reasons:

1. We are updating whole rows indexed by primary key, not just a single
    field in each row.
2. We are able to use the COPY command - indeed we wrote a fair amount of
    Java to enable batching, background writing, and COPY support. See
    http://www.flymine.org/api/index.html?org/intermine/sql/writebatch/Batch.html
    and http://www.intermine.org/
3. HOT has been invented since then, and it won't play well with this
    method.

Matthew

--
 Trying to write a program that can't be written is... well, it can be an
 enormous amount of fun!                 -- Computer Science Lecturer