PostgreSQL as advanced job queuing system

Поиск
Список
Период
Сортировка
Hello List,

i am playing with the idea to implement a job queuing system using PostgreSQL. To meet requirements the system needs to
offersome advanced 
 
features compared to "classic" queuing systems:

- users can create new queues at any time
- queues have priorities
- priorities of queues can change at any time
- jobs in queues with the highest priority should be processed first

A simple schema could look like this:

    create table queues (
      id integer primary key,
      priority integer not null default 0
    );

    create table jobs (
      id serial primary key,
      queue_id integer not null references queues(id)
    );

Right now i am stuck writing an efficient query for dequeuing. This is what i got so far:

    insert into queues (id, priority) values (1, 0), (2, 1), (3, 1);
    insert into jobs (queue_id) select 1 from generate_series(1, 1000000);
    insert into jobs (queue_id) select 2 from generate_series(1, 1000000);
    insert into jobs (queue_id) select 3 from generate_series(1, 1000000);

Here is a simple dequeuing query that is super efficient, but ignores priority:

    select * from jobs limit 1 for update of jobs skip locked;
    --  id | queue_id
    -- ----+----------
    --   1 |        1
    -- (1 row)
    --
    -- Time: 2.641 ms

This is my naive query obeying priority. This is very slow and i could not get it any faster:

    select *
    from jobs
    join queues on queues.id = jobs.queue_id
    order by priority desc
    limit 1
    for update of jobs skip locked;
    --    id    | queue_id | id | priority
    -- ---------+----------+----+----------
    --  1000001 |        2 |  2 |        1
    -- (1 row)
    --
    -- Time: 1116.631 ms (00:01.117)

Here is the query plan:

   --                                                                QUERY PLAN
   --

-----------------------------------------------------------------------------------------------------------------------------------------
   --  Limit  (cost=66225.61..66225.63 rows=1 width=28) (actual time=1333.139..1333.142 rows=1 loops=1)
   --    ->  LockRows  (cost=66225.61..103725.61 rows=3000000 width=28) (actual time=1333.137..1333.139 rows=1
loops=1)
   --          ->  Sort  (cost=66225.61..73725.61 rows=3000000 width=28) (actual time=1333.110..1333.112 rows=1
loops=1)
   --                Sort Key: queues.priority DESC
   --                Sort Method: external merge  Disk: 111568kB
   --                ->  Hash Join  (cost=60.85..51225.61 rows=3000000 width=28) (actual time=0.064..660.317
rows=3000000loops=1)
 
   --                      Hash Cond: (jobs.queue_id = queues.id)
   --                      ->  Seq Scan on jobs  (cost=0.00..43275.00 rows=3000000 width=14) (actual
time=0.027..253.868rows=3000000 loops=1)
 
   --                      ->  Hash  (cost=32.60..32.60 rows=2260 width=14) (actual time=0.021..0.022 rows=3 loops=1)
   --                            Buckets: 4096  Batches: 1  Memory Usage: 33kB
   --                            ->  Seq Scan on queues  (cost=0.00..32.60 rows=2260 width=14) (actual
time=0.011..0.013rows=3 loops=1)
 
   --  Planning Time: 0.430 ms
   --  Execution Time: 1347.448 ms
   -- (13 rows)

Any ideas for a more efficient implementation?

Thank you for your time,
ushi




В списке pgsql-general по дате отправления:

Предыдущее
От: Vijaykumar Jain
Дата:
Сообщение: Re: uncommitted xmin 3100586 from before xid cutoff 10339367 needs to be frozen
Следующее
От: Dominique Devienne
Дата:
Сообщение: Re: PostgreSQL as advanced job queuing system