Обсуждение: possible race condition in trigger functions on insert operations?
Bit of a trigger NOOB Q:
I am trying to use a trigger function to automatically populate new
rows in a table with a public ID of the form YYYY-NNN such that the
42nd row created in 2011 would get the ID "2011-042". Each row is
associated via an iasid column with a row in an audit table that has a
timestamp column called created. This works OK, but I am worried about
two rows getting the same case_no if they come in at the same time
(whatever that means):
declare
case_yr integer;
yr_case_count bigint;
begin
select date_part('year', created) into case_yr
from audit
where audit.sid = NEW.iasid;
select count(*) into yr_case_count
from fwa_case, audit
where fwa_case.iasid=audit.sid
and date_part('year', created) = case_yr;
NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
to_char(1+yr_case_count, 'FM000');
return NEW;
end;
Do I have to worry about this, or does ACID bail me out? If the
former, what do I do? I am thinking first put a uniqueness constraint
on the column and then figure out how to do retries in a trigger
function.
kenneth
-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Kenneth Tilton
Sent: Tuesday, November 22, 2011 12:26 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] possible race condition in trigger functions on insert
operations?
Bit of a trigger NOOB Q:
I am trying to use a trigger function to automatically populate new rows in
a table with a public ID of the form YYYY-NNN such that the 42nd row
created in 2011 would get the ID "2011-042". Each row is associated via an
iasid column with a row in an audit table that has a timestamp column called
created. This works OK, but I am worried about two rows getting the same
case_no if they come in at the same time (whatever that means):
declare
case_yr integer;
yr_case_count bigint;
begin
select date_part('year', created) into case_yr
from audit
where audit.sid = NEW.iasid;
select count(*) into yr_case_count
from fwa_case, audit
where fwa_case.iasid=audit.sid
and date_part('year', created) = case_yr;
NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
to_char(1+yr_case_count, 'FM000');
return NEW;
end;
Do I have to worry about this, or does ACID bail me out? If the former, what
do I do? I am thinking first put a uniqueness constraint on the column and
then figure out how to do retries in a trigger function.
kenneth
--------------------------------------------------------
Why can't you just use a sequence? Yes, gaps end up being possible but that
usually is not a big issue and you will avoid any possibility of collision.
David J.
Kenneth Tilton <ktilton@mcna.net> wrote:
> Bit of a trigger NOOB Q:
>
> I am trying to use a trigger function to automatically populate new
> rows in a table with a public ID of the form YYYY-NNN such that the
> 42nd row created in 2011 would get the ID "2011-042". Each row is
> associated via an iasid column with a row in an audit table that has a
> timestamp column called created. This works OK, but I am worried about
> two rows getting the same case_no if they come in at the same time
> (whatever that means):
>
> declare
> case_yr integer;
> yr_case_count bigint;
> begin
> select date_part('year', created) into case_yr
> from audit
> where audit.sid = NEW.iasid;
>
> select count(*) into yr_case_count
> from fwa_case, audit
> where fwa_case.iasid=audit.sid
> and date_part('year', created) = case_yr;
>
> NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
> to_char(1+yr_case_count, 'FM000');
> return NEW;
> end;
If i where you, i would not use such a column. What happens if you
insert/delete a record?
I would use something like to_char(row_number() over (...),'FM000') to
count while select.
Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect. (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly." (unknown)
Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
On Tue, Nov 22, 2011 at 11:25 AM, Kenneth Tilton <ktilton@mcna.net> wrote:
> Bit of a trigger NOOB Q:
>
> I am trying to use a trigger function to automatically populate new
> rows in a table with a public ID of the form YYYY-NNN such that the
> 42nd row created in 2011 would get the ID "2011-042". Each row is
> associated via an iasid column with a row in an audit table that has a
> timestamp column called created. This works OK, but I am worried about
> two rows getting the same case_no if they come in at the same time
> (whatever that means):
>
> declare
> case_yr integer;
> yr_case_count bigint;
> begin
> select date_part('year', created) into case_yr
> from audit
> where audit.sid = NEW.iasid;
>
> select count(*) into yr_case_count
> from fwa_case, audit
> where fwa_case.iasid=audit.sid
> and date_part('year', created) = case_yr;
>
> NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
> to_char(1+yr_case_count, 'FM000');
> return NEW;
> end;
>
> Do I have to worry about this, or does ACID bail me out? If the
> former, what do I do? I am thinking first put a uniqueness constraint
> on the column and then figure out how to do retries in a trigger
> function.
ACID does not bail you out -- you've put no synchonization in to
prevent to concurrent counts coming at roughly the same time and
getting the same answer. A uniqueness constraint is definitely a good
idea. In terms of doing a gapless sequence generally, see here:
http://www.varlena.com/GeneralBits/130.php. Basically the general
idea is to keep a counter field somewhere that you lock and update.
merlin
On Tue, Nov 22, 2011 at 12:48 PM, David Johnston <polobo@yahoo.com> wrote:
> -----Original Message-----
> From: pgsql-general-owner@postgresql.org
> [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Kenneth Tilton
> Sent: Tuesday, November 22, 2011 12:26 PM
> To: pgsql-general@postgresql.org
> Subject: [GENERAL] possible race condition in trigger functions on insert
> operations?
>
> Bit of a trigger NOOB Q:
>
> I am trying to use a trigger function to automatically populate new rows in
> a table with a public ID of the form YYYY-NNN such that the 42nd row
> created in 2011 would get the ID "2011-042". Each row is associated via an
> iasid column with a row in an audit table that has a timestamp column called
> created. This works OK, but I am worried about two rows getting the same
> case_no if they come in at the same time (whatever that means):
>
> declare
> case_yr integer;
> yr_case_count bigint;
> begin
> select date_part('year', created) into case_yr
> from audit
> where audit.sid = NEW.iasid;
>
> select count(*) into yr_case_count
> from fwa_case, audit
> where fwa_case.iasid=audit.sid
> and date_part('year', created) = case_yr;
>
> NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
> to_char(1+yr_case_count, 'FM000');
> return NEW;
> end;
>
> Do I have to worry about this, or does ACID bail me out? If the former, what
> do I do? I am thinking first put a uniqueness constraint on the column and
> then figure out how to do retries in a trigger function.
>
> kenneth
> --------------------------------------------------------
>
> Why can't you just use a sequence?
The sequence has to be within the year. Someone suggested a cron job
to reset the sequence at the beginning of the year but I find that
alternative unappealing for some reason.
-kt
On Tue, Nov 22, 2011 at 12:53 PM, Andreas Kretschmer
<akretschmer@spamfence.net> wrote:
> Kenneth Tilton <ktilton@mcna.net> wrote:
>
>> Bit of a trigger NOOB Q:
>>
>> I am trying to use a trigger function to automatically populate new
>> rows in a table with a public ID of the form YYYY-NNN such that the
>> 42nd row created in 2011 would get the ID "2011-042". Each row is
>> associated via an iasid column with a row in an audit table that has a
>> timestamp column called created. This works OK, but I am worried about
>> two rows getting the same case_no if they come in at the same time
>> (whatever that means):
>>
>> declare
>> case_yr integer;
>> yr_case_count bigint;
>> begin
>> select date_part('year', created) into case_yr
>> from audit
>> where audit.sid = NEW.iasid;
>>
>> select count(*) into yr_case_count
>> from fwa_case, audit
>> where fwa_case.iasid=audit.sid
>> and date_part('year', created) = case_yr;
>>
>> NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
>> to_char(1+yr_case_count, 'FM000');
>> return NEW;
>> end;
>
> If i where you, i would not use such a column. What happens if you
> insert/delete a record?
We only do logical deletes. Not sure what you mean about inserts --
that is what I am working on, and they always should get the next
highest sequence number in a year.
>
> I would use something like to_char(row_number() over (...),'FM000') to
> count while select.
Unfortunately it must be within the year, not overall.
-kt
Thanks, that's perfect.
-kenneth
On Tue, Nov 22, 2011 at 12:53 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Tue, Nov 22, 2011 at 11:25 AM, Kenneth Tilton <ktilton@mcna.net> wrote:
>> Bit of a trigger NOOB Q:
>>
>> I am trying to use a trigger function to automatically populate new
>> rows in a table with a public ID of the form YYYY-NNN such that the
>> 42nd row created in 2011 would get the ID "2011-042". Each row is
>> associated via an iasid column with a row in an audit table that has a
>> timestamp column called created. This works OK, but I am worried about
>> two rows getting the same case_no if they come in at the same time
>> (whatever that means):
>>
>> declare
>> case_yr integer;
>> yr_case_count bigint;
>> begin
>> select date_part('year', created) into case_yr
>> from audit
>> where audit.sid = NEW.iasid;
>>
>> select count(*) into yr_case_count
>> from fwa_case, audit
>> where fwa_case.iasid=audit.sid
>> and date_part('year', created) = case_yr;
>>
>> NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
>> to_char(1+yr_case_count, 'FM000');
>> return NEW;
>> end;
>>
>> Do I have to worry about this, or does ACID bail me out? If the
>> former, what do I do? I am thinking first put a uniqueness constraint
>> on the column and then figure out how to do retries in a trigger
>> function.
>
> ACID does not bail you out -- you've put no synchonization in to
> prevent to concurrent counts coming at roughly the same time and
> getting the same answer. A uniqueness constraint is definitely a good
> idea. In terms of doing a gapless sequence generally, see here:
> http://www.varlena.com/GeneralBits/130.php. Basically the general
> idea is to keep a counter field somewhere that you lock and update.
>
> merlin
>
Kenneth Tilton <ktilton@mcna.net> wrote:
> On Tue, Nov 22, 2011 at 12:53 PM, Andreas Kretschmer
> <akretschmer@spamfence.net> wrote:
> > Kenneth Tilton <ktilton@mcna.net> wrote:
> >
> >> Bit of a trigger NOOB Q:
> >>
> >> I am trying to use a trigger function to automatically populate new
> >> rows in a table with a public ID of the form YYYY-NNN such that the
> >> 42nd row created in 2011 would get the ID "2011-042". Each row is
> >> associated via an iasid column with a row in an audit table that has a
> >> timestamp column called created. This works OK, but I am worried about
> >> two rows getting the same case_no if they come in at the same time
> >> (whatever that means):
> >>
> >> declare
> >> case_yr integer;
> >> yr_case_count bigint;
> >> begin
> >> select date_part('year', created) into case_yr
> >> from audit
> >> where audit.sid = NEW.iasid;
> >>
> >> select count(*) into yr_case_count
> >> from fwa_case, audit
> >> where fwa_case.iasid=audit.sid
> >> and date_part('year', created) = case_yr;
> >>
> >> NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
> >> to_char(1+yr_case_count, 'FM000');
> >> return NEW;
> >> end;
> >
> > If i where you, i would not use such a column. What happens if you
> > insert/delete a record?
>
> We only do logical deletes. Not sure what you mean about inserts --
inserts that change the numbering...
> that is what I am working on, and they always should get the next
> highest sequence number in a year.
>
> >
> > I would use something like to_char(row_number() over (...),'FM000') to
> > count while select.
>
> Unfortunately it must be within the year, not overall.
That is not a problem - over (...) with, for instance, PARTITION BY ...
Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect. (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly." (unknown)
Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
On Tue, Nov 22, 2011 at 2:05 PM, Andreas Kretschmer
<akretschmer@spamfence.net> wrote:
> Kenneth Tilton <ktilton@mcna.net> wrote:
>
>> On Tue, Nov 22, 2011 at 12:53 PM, Andreas Kretschmer
>> <akretschmer@spamfence.net> wrote:
>> > Kenneth Tilton <ktilton@mcna.net> wrote:
>> >
>> >> Bit of a trigger NOOB Q:
>> >>
>> >> I am trying to use a trigger function to automatically populate new
>> >> rows in a table with a public ID of the form YYYY-NNN such that the
>> >> 42nd row created in 2011 would get the ID "2011-042". Each row is
>> >> associated via an iasid column with a row in an audit table that has a
>> >> timestamp column called created. This works OK, but I am worried about
>> >> two rows getting the same case_no if they come in at the same time
>> >> (whatever that means):
>> >>
>> >> declare
>> >> case_yr integer;
>> >> yr_case_count bigint;
>> >> begin
>> >> select date_part('year', created) into case_yr
>> >> from audit
>> >> where audit.sid = NEW.iasid;
>> >>
>> >> select count(*) into yr_case_count
>> >> from fwa_case, audit
>> >> where fwa_case.iasid=audit.sid
>> >> and date_part('year', created) = case_yr;
>> >>
>> >> NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
>> >> to_char(1+yr_case_count, 'FM000');
>> >> return NEW;
>> >> end;
>> >
>> > If i where you, i would not use such a column. What happens if you
>> > insert/delete a record?
>>
>> We only do logical deletes. Not sure what you mean about inserts --
>
> inserts that change the numbering...
>
>
>> that is what I am working on, and they always should get the next
>> highest sequence number in a year.
>>
>> >
>> > I would use something like to_char(row_number() over (...),'FM000') to
>> > count while select.
>>
>> Unfortunately it must be within the year, not overall.
>
> That is not a problem - over (...) with, for instance, PARTITION BY ...
Ah, OK, I did not understand what they meant by "partition" when I
checked out row_number. I'll give that a try.
First Q: does a row have a row_number at the time of insert/before?
Or will I be updating the row in the insert/after? I should know
soon... :)
-kenneth
-----Original Message-----
From: Kenneth Tilton [mailto:ktilton@mcna.net]
Sent: Tuesday, November 22, 2011 1:52 PM
To: David Johnston
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] possible race condition in trigger functions on
insert operations?
On Tue, Nov 22, 2011 at 12:48 PM, David Johnston <polobo@yahoo.com> wrote:
> -----Original Message-----
> From: pgsql-general-owner@postgresql.org
> [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Kenneth
> Tilton
> Sent: Tuesday, November 22, 2011 12:26 PM
> To: pgsql-general@postgresql.org
> Subject: [GENERAL] possible race condition in trigger functions on
> insert operations?
>
> Bit of a trigger NOOB Q:
>
> I am trying to use a trigger function to automatically populate new
> rows in a table with a public ID of the form YYYY-NNN such that the
> 42nd row created in 2011 would get the ID "2011-042". Each row is
> associated via an iasid column with a row in an audit table that has a
> timestamp column called created. This works OK, but I am worried about
> two rows getting the same case_no if they come in at the same time
(whatever that means):
>
> declare
> case_yr integer;
> yr_case_count bigint;
> begin
> select date_part('year', created) into case_yr
> from audit
> where audit.sid = NEW.iasid;
>
> select count(*) into yr_case_count
> from fwa_case, audit
> where fwa_case.iasid=audit.sid
> and date_part('year', created) = case_yr;
>
> NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
> to_char(1+yr_case_count, 'FM000');
> return NEW;
> end;
>
> Do I have to worry about this, or does ACID bail me out? If the
> former, what do I do? I am thinking first put a uniqueness constraint
> on the column and then figure out how to do retries in a trigger function.
>
> kenneth
> --------------------------------------------------------
>
> Why can't you just use a sequence?
The sequence has to be within the year. Someone suggested a cron job to
reset the sequence at the beginning of the year but I find that alternative
unappealing for some reason.
----------------------------------------------
Just create a single sequence for each year and then call the proper one
on-the-fly. You can create multiple sequences in advance and possible even
auto-create the sequence the first time one is attempted to be used in a
given year. If you can live with possible (but probably unlikely) gaps in
the sequence then all the concurrency will be handled for you and you can
focus on writing a function that, given a year, will return the proper
value.
David J.
On Tue, Nov 22, 2011 at 2:43 PM, David Johnston <polobo@yahoo.com> wrote: > Just create a single sequence for each year and then call the proper one > on-the-fly. You can create multiple sequences in advance and possible even > auto-create the sequence the first time one is attempted to be used in a > given year. If you can live with possible (but probably unlikely) gaps in > the sequence then all the concurrency will be handled for you and you can > focus on writing a function that, given a year, will return the proper > value. I personally think the 'record the next to be inserted value' in a table somewhere is better unless you are trying to support a lot of concurrent operations. Also the gap issue is more likely to come up than you're letting on -- a rolled back transaction is all it takes. merlin
Pre-created sequences per year. Not a bad idea at all. I had not been
worrying about gaps but I checked with the user and they definitely do
not want gaps. But as long as we do not delete, is there still a risk
of gaps?
I am leaning towards a serial column maintained by postgres, using
row_number partitioned by year ordered by the serial column. No gaps
since we never delete, and no concurrency issues. I think. :)
Might also go for allocating the IDs from a separate table.
-kenneth
On Tue, Nov 22, 2011 at 3:43 PM, David Johnston <polobo@yahoo.com> wrote:
> -----Original Message-----
> From: Kenneth Tilton [mailto:ktilton@mcna.net]
> Sent: Tuesday, November 22, 2011 1:52 PM
> To: David Johnston
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] possible race condition in trigger functions on
> insert operations?
>
> On Tue, Nov 22, 2011 at 12:48 PM, David Johnston <polobo@yahoo.com> wrote:
>> -----Original Message-----
>> From: pgsql-general-owner@postgresql.org
>> [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Kenneth
>> Tilton
>> Sent: Tuesday, November 22, 2011 12:26 PM
>> To: pgsql-general@postgresql.org
>> Subject: [GENERAL] possible race condition in trigger functions on
>> insert operations?
>>
>> Bit of a trigger NOOB Q:
>>
>> I am trying to use a trigger function to automatically populate new
>> rows in a table with a public ID of the form YYYY-NNN such that the
>> 42nd row created in 2011 would get the ID "2011-042". Each row is
>> associated via an iasid column with a row in an audit table that has a
>> timestamp column called created. This works OK, but I am worried about
>> two rows getting the same case_no if they come in at the same time
> (whatever that means):
>>
>> declare
>> case_yr integer;
>> yr_case_count bigint;
>> begin
>> select date_part('year', created) into case_yr
>> from audit
>> where audit.sid = NEW.iasid;
>>
>> select count(*) into yr_case_count
>> from fwa_case, audit
>> where fwa_case.iasid=audit.sid
>> and date_part('year', created) = case_yr;
>>
>> NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
>> to_char(1+yr_case_count, 'FM000');
>> return NEW;
>> end;
>>
>> Do I have to worry about this, or does ACID bail me out? If the
>> former, what do I do? I am thinking first put a uniqueness constraint
>> on the column and then figure out how to do retries in a trigger function.
>>
>> kenneth
>> --------------------------------------------------------
>>
>> Why can't you just use a sequence?
>
> The sequence has to be within the year. Someone suggested a cron job to
> reset the sequence at the beginning of the year but I find that alternative
> unappealing for some reason.
>
> ----------------------------------------------
>
> Just create a single sequence for each year and then call the proper one
> on-the-fly. You can create multiple sequences in advance and possible even
> auto-create the sequence the first time one is attempted to be used in a
> given year. If you can live with possible (but probably unlikely) gaps in
> the sequence then all the concurrency will be handled for you and you can
> focus on writing a function that, given a year, will return the proper
> value.
>
> David J.
>
>
>
On Tue, Nov 22, 2011 at 3:52 PM, Merlin Moncure <mmoncure@gmail.com> wrote: > On Tue, Nov 22, 2011 at 2:43 PM, David Johnston <polobo@yahoo.com> wrote: >> Just create a single sequence for each year and then call the proper one >> on-the-fly. You can create multiple sequences in advance and possible even >> auto-create the sequence the first time one is attempted to be used in a >> given year. If you can live with possible (but probably unlikely) gaps in >> the sequence then all the concurrency will be handled for you and you can >> focus on writing a function that, given a year, will return the proper >> value. > > I personally think the 'record the next to be inserted value' in a > table somewhere is better unless you are trying to support a lot of > concurrent operations. Also the gap issue is more likely to come up > than you're letting on -- a rolled back transaction is all it takes. Yeah, using a table seems to have the advantage of being about the only thing that would work, though I think row_number properly used (with an invariant ordering and no deletes) also works. -kenneth