Re: Serialization, Locking...implement processing Queue with a table

From: Tom Lane
Subject: Re: Serialization, Locking...implement processing Queue with a table
Date:
Msg-id: 8777.1052749406@sss.pgh.pa.us
In reply to: Serialization, Locking...implement processing Queue with a table  ("D. Dante Lorenso" <dante@lorenso.com>)
List: pgsql-general
"D. Dante Lorenso" <dante@lorenso.com> writes:
> How should I go about implementing a synchronized process id
> queue that will select one unique row from a table at a time
> and make the selection be safe with concurrent accesses?

You can find various discussions of this in the archives, but a
reasonable way to proceed is:

1. The table of pending or in-process jobs has a column "processor_id"
that is zero for pending jobs and equal to the (unique) processor number
for active jobs.  (Assume for the moment that completed jobs are removed
from the table entirely.)
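
For example, a minimal sketch of such a table (only job_id and
processor_id are assumed by the recipe below; the payload column is just
a placeholder for whatever describes a job):

    CREATE TABLE job_table (
        job_id        serial PRIMARY KEY,
        processor_id  integer NOT NULL DEFAULT 0,  -- 0 = pending
        payload       text                         -- job description
    );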

2. When idle, you try to reserve a job like so:

    BEGIN;
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    SELECT job_id, ... FROM job_table
        WHERE processor_id = 0 LIMIT 1 FOR UPDATE;

The SELECT has three possible outcomes:

2a: One row is returned.  You do

    UPDATE job_table SET processor_id = $me
        WHERE job_id = $jobid;
    COMMIT;

and then go about executing the job.  When done, delete the row from
job_table and try to get another one.
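
The completion step is then just a one-row delete, for instance:

    DELETE FROM job_table WHERE job_id = $jobid;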

2b: No row is returned: no jobs are pending.  Commit your transaction,
sleep for an appropriate delay period, and try again.

2c: You get a "can't serialize" failure.  This will happen if two
processors try to reserve the same row at the same time.  In this case,
roll back your transaction, sleep for a short interval (maybe a few
msec) and try again.  You don't want to sleep as long as normal in this
case, since there might be another available job.

(Note that you *cannot* do this in a plpgsql function, since it cannot
start or commit a transaction; these commands have got to be directly
issued by the application.)
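
Roughly, the exchange your application has with the server in case 2c
looks like this (the sleep happens client-side, between the ROLLBACK and
the fresh BEGIN):

    BEGIN;
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    SELECT job_id, ... FROM job_table
        WHERE processor_id = 0 LIMIT 1 FOR UPDATE;
    -- server reports the can't-serialize error here
    ROLLBACK;
    -- sleep a few msec in the application, then start over at BEGIN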


Assuming that there aren't a vast number of pending jobs at any time,
this should work pretty well without even bothering with an index on
job_table.  You will want to vacuum it often though (at least every few
hundred job completions, I'd think).
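
(That is, have a cron job or the processors themselves issue something
like

    VACUUM job_table;

every so often.)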

Now, what if you wanted to remember completed jobs?  I'd actually
recommend transferring the records of completed jobs to a different
table.  But if you really want to keep them in the same table, maybe
add a boolean "completed" field, and make the initial SELECT be

    SELECT job_id, ... FROM job_table
        WHERE processor_id = 0 AND NOT completed LIMIT 1 FOR UPDATE;

Now you *will* need an index to keep things speedy.  I'd try a partial
index on processor_id with condition "NOT completed".  You'll still
need frequent vacuums.
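
Assuming "completed" is a boolean column that defaults to false, the
completion step and the partial index might look like this (the index
name is just illustrative):

    -- mark the job done instead of deleting its row
    UPDATE job_table SET completed = true WHERE job_id = $jobid;

    -- partial index covering only the rows the queueing SELECT cares about
    CREATE INDEX job_pending_idx ON job_table (processor_id)
        WHERE NOT completed;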

            regards, tom lane

