Re: Danger of idiomatic plpgsql loop for merging data

Поиск
Список
Период
Сортировка
От J. Greg Davidson
Тема Re: Danger of idiomatic plpgsql loop for merging data
Дата
Msg-id 1355958611.1781.1280892747641.JavaMail.root@zimbra.well.com
обсуждение исходный текст
Ответ на Danger of idiomatic plpgsql loop for merging data  ("J. Greg Davidson" <jgd@well.com>)
Ответы Re: Danger of idiomatic plpgsql loop for merging data  (Merlin Moncure <mmoncure@gmail.com>)
Список pgsql-general
Hi fellow PostgreSQL hackers,

First, a thank you to Merlin for commenting on my earlier post!
I've run into another dangerous problem since the earlier post.

I began converting from the plpgsql loop idiom for merging data
into a COALESCE(find(), create(), find()) idiom and ran into a
problem with the latter.  I'll hold the full code to the end of
this post and summarize the situation first.

I'm using this idiom in 48 get_ functions associated with
48 different tables!  In all cases, the functions should be
idempotent with monotonic side effects, i.e. when the data
desired is not present (the first time they are called),
they create the data; in all cases, they return a reference
(primary key value) to the data.

Here's the new problem: Depending on the idiom I use, when I
nest a call to one of these get_ functions as an argument to
another function, the outer function does not see the new data!

To be more specific, in the call:
   SELECT access_foo( get_foo(foo_data) )
where access_foo expects a reference to a row of TABLE foos
and get_foo returns such a reference, possibly creating the
desired row,

when I write get_foo() with the plpgsql loop idiom, it seems
to always return a reference which access_foo can use immediately.
On the other hand, when I use the COALESCE(find(), create(), find())
idiom and the get_foo() function created a new row, access_foo
fails to find it!

In all cases saying:
   SELECT get_foo(foo_data);
   SELECT access_foo( get_foo(foo_data) );
works fine since if the data needed to be added, it was done
in the separate earlier transaction.

Because of this problem, I'm abandoning my original preference
for the COALESCE(find(), create(), find()) idiom and I'm adding
a check to the plpgsql LOOP idiom to prevent it going infinite.

For those who'd like to see the gory details, here is the
code, simplified as much as possible (and with a suffix on
the type name to please Merlin:):

-- (0) The table in question:

-- The trailing underscores can be read as "field" or "slot"
-- which is sometimes useful to avoid clashes with reserved
-- words, local variables, etc.

CREATE TABLE foos (
  ref_ foo_reftype PRIMARY KEY DEFAULT next_foo_ref();
  name_ text UNIQUE NOT NULL;
);

-- (1) The idiom from the PostgreSQL reference manual, which
-- unfortunately can go into an infinite loop if a trigger
-- should raise a unique_violation exception.

-- The underscore prefix can be read as "local"
-- and is sometimes useful to avoid name clashes, etc.

CREATE OR REPLACE
FUNCTION get_foo(text) RETURNS foo_reftype AS $$
DECLARE
  _ref foo_reftype;
BEGIN
  LOOP
    SELECT ref_ INTO _ref
      FROM foos WHERE name_ = $1;
    IF FOUND THEN RETURN _ref; END IF;
    BEGIN
      INSERT INTO foos(name_) VALUES($1);
    EXCEPTION
      WHEN unique_violation THEN
      -- maybe another thread?
    END;
  END LOOP;
END;
$$ LANGUAGE plpgsql STRICT;

-- (2) Where I was originally going
-- to avoid the infinite loop problem,
-- and also hoping to get better performance
-- on the most common case where the
-- first call to old_foo() finds the row
-- (since SQL functions are inlined into
-- the execution plan):

CREATE OR REPLACE
FUNCTION old_foo(text) RETURNS foo_reftype AS $$
  SELECT ref_ FROM foos WHERE name_ = $1
$$ LANGUAGE SQL STRICT;

CREATE OR REPLACE
FUNCTION new_foo(text) RETURNS foo_reftype AS $$
DECLARE
  this regprocedure := 'new_foo(text)';
  _ref foo_reftype;
BEGIN
  INSERT INTO foos(name_) VALUES ($1)
    RETURNING ref_ INTO _ref;
  RETURN _ref;
EXCEPTION
  WHEN unique_violation THEN
    -- maybe another thread?
    RAISE NOTICE '% "%" unique_violation', this, $1;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql STRICT;

CREATE OR REPLACE
FUNCTION get_foo(text) RETURNS foo_reftype AS $$
  SELECT COALESCE(
    old_foo($1), new_foo($1), old_foo($1)
  )
$$ LANGUAGE sql STRICT;

-- (3) Where I'm going now, although it feels
-- like a patch (I hate :-( patches!):

CREATE OR REPLACE
FUNCTION get_foo(text) RETURNS foo_reftype AS $$
DECLARE
  _ref foo_reftype;
  killroy_was_here boolean := false;
  this regprocedure := 'get_foo(text)';
BEGIN
  LOOP
    SELECT ref_ INTO _ref
      FROM foos WHERE name_ = $1;
    IF FOUND THEN RETURN _foo; END IF;
    IF killroy_was_here THEN
      RAISE EXCEPTION '% "%" loops!', this, $1;
    END IF;
    killroy_was_here := true;
    BEGIN
      INSERT INTO foos(name_) VALUES($1);
    EXCEPTION
      WHEN unique_violation THEN -- another thread?
      RAISE NOTICE '% "%" unique_violation', this, $1;
    END;
  END LOOP;
END;
$$ LANGUAGE plpgsql;

-- end code

_Greg

P.S. Sorry about the excessive underscores!
I've been burned by having local variables
confused with fields, so when I need to clearly
differentiate such things, I use this convention.

J. Greg Davidson

В списке pgsql-general по дате отправления:

Предыдущее
От: "Joshua D. Drake"
Дата:
Сообщение: Re: deleting db cluster
Следующее
От: "Joshua D. Drake"
Дата:
Сообщение: Re: problem with pg_standby