Обсуждение: Concurrent updates

Поиск
Список
Период
Сортировка

Concurrent updates

От
Steve Erickson
Дата:

I have a table that I want to use as a queue with all functionality (Insert, update, delete) embodied in a stored procedure.  Inserts and deletes are no problem.  An external program would call the stored procedure to get one or more emails to work on, selecting on “state=’N’”, then updating the row so “state=’P’”.  My problem is having multiple threads calling the stored procedure simultaneously and getting the same row(s).  Selecting FOR UPDATE won’t work as, if thread #1 gets 3 rows and thread #2 starts before thread #1 completes (Commits), thread #2 will select the same 3 rows as thread #1 except, since thread #1 will update the state (changing the state to ‘P’) so that those rows no longer meet thread #2 criteria, and thread #2 will receive zero rows.  The table looks like:

 

CREATE TABLE dss.stage_email

(

  emailid bigserial NOT NULL,

  email_path text,

  state character(1) DEFAULT 'N'::bpchar, -- N = New Email, P=Processing, D=Deleting

  fetch_date timestamp without time zone DEFAULT now(),

  ingest_date timestamp without time zone

)

 

Steve Erickson

Senior Developer

266 East 7th Street, Floor 4

Saint Paul, MN 55101

651.925.3237 office

612.242.1343 cell

 

NOTICE:  This email, including any attachments, is covered by the Electronic Communications Privacy Act, is confidential and may be legally privileged.  If you are not the intended recipient, you are hereby notified that any retention, dissemination, distribution or copying of this communication is strictly prohibited.  If you have received this communication in error, please immediately notify our office by return email or at our telephone number (651) 925-3200.  Then delete and destroy all copies of this email and any attachments.

 

Thank you.

 

Re: Concurrent updates

От
Albe Laurenz
Дата:
Steve Erickson wrote:
> I have a table that I want to use as a queue with all functionality (Insert, update, delete) embodied
> in a stored procedure.  Inserts and deletes are no problem.  An external program would call the stored
> procedure to get one or more emails to work on, selecting on "state='N'", then updating the row so
> "state='P'".  My problem is having multiple threads calling the stored procedure simultaneously and
> getting the same row(s).  Selecting FOR UPDATE won't work as, if thread #1 gets 3 rows and thread #2
> starts before thread #1 completes (Commits), thread #2 will select the same 3 rows as thread #1
> except, since thread #1 will update the state (changing the state to 'P') so that those rows no longer
> meet thread #2 criteria, and thread #2 will receive zero rows.  The table looks like:
>
> CREATE TABLE dss.stage_email
> (
>   emailid bigserial NOT NULL,
>   email_path text,
>   state character(1) DEFAULT 'N'::bpchar, -- N = New Email, P=Processing, D=Deleting
>   fetch_date timestamp without time zone DEFAULT now(),
>   ingest_date timestamp without time zone
> )

Strange, it works fine for me.

SELECT * FROM stage_email;

 emailid | email_path | state |         fetch_date         | ingest_date
---------+------------+-------+----------------------------+-------------
       1 | one        | N     | 2013-03-19 09:25:35.25905  |
       2 | two        | N     | 2013-03-19 09:25:38.682343 |
       3 | three      | N     | 2013-03-19 09:25:42.162118 |
       4 | four       | N     | 2013-03-19 09:25:45.626052 |
(4 rows)

Session 1: BEGIN;
Session 1: SELECT * FROM stage_email WHERE state = 'N' ORDER BY emailid LIMIT 2 FOR UPDATE;

 emailid | email_path | state |         fetch_date         | ingest_date
---------+------------+-------+----------------------------+-------------
       1 | one        | N     | 2013-03-19 09:25:35.25905  |
       2 | two        | N     | 2013-03-19 09:25:38.682343 |
(2 rows)

Session 2: BEGIN;
Session 2: SELECT * FROM stage_email WHERE state = 'N' ORDER BY emailid LIMIT 2 FOR UPDATE;
  (Session 2 blocks)

Session 1: UPDATE stage_email SET state = 'P' WHERE emailid IN (1, 2);
Session 1: COMMIT;

Now session 2 gets this result:

 emailid | email_path | state |         fetch_date         | ingest_date
---------+------------+-------+----------------------------+-------------
       3 | three      | N     | 2013-03-19 09:25:42.162118 |
       4 | four       | N     | 2013-03-19 09:25:45.626052 |
(2 rows)

So session 2 can continue processing the next two entries.


Maybe you can illustrate your problem with an example.

Yours,
Laurenz Albe