Re: Postgresql PL parallel processing inside Postgresql function....

Поиск
Список
Период
Сортировка
От David Johnston
Тема Re: Postgresql PL parallel processing inside Postgresql function....
Дата
Msg-id 017c01cdd7c9$65bf5a10$313e0e30$@yahoo.com
обсуждение исходный текст
Ответ на Postgresql PL parallel processing inside Postgresql function....  (Misa Simic <misa.simic@gmail.com>)
Список pgsql-general

From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Misa Simic
Sent: Tuesday, December 11, 2012 10:52 AM
To: pgsql
Subject: [GENERAL] Postgresql PL parallel processing inside Postgresql function....

 

Hi,

 

I have a table of bunch of records:

 

MainTable:

-MainID

-Other attributes....

 

 

Several Tables what Track Activities about MainIDs...

 

 

And one routine written as plpgsql function what Loops trough MainTable and for each record, calls ProccessTheThing(MainID),

 

What again sequentially calls, several plpgsql functions bellow for one thing...

 

 

Main problem is, every single record in MainTable waits to previous record be processed to can continue processing....

 

Because of they are totally independent, I would like if possible to process several things in the same time if possible?

 

But inside Postgresql (db_link is acceptable - I could not figure out full picture... I mean, I am aware about the way to send async query to the same db, but could not figure out full flow... i.e. sent first bunch of 10 or 20, 50 - get notice when  1 of them finsihed so can send next one... etc etc)

 

i.e. Outside Postgresql, It would be possible to from client app, take all MainID from MainTable, loop and call async ProccessTheThing function through several threads... 

 

Any Suggestions?

 

Many Thanks,

 

Misa

 

 

Currently, application level concurrency must be implemented at the client layer.  PostgreSQL transactions (connections really) are single-threaded.

 

That said, quickly scanning the dblink documentation I guess it would be possible to setup one named connection for each “thread”, place a single async query onto each connection, and then continually loop until all connections are cleared (optionally adding a new query as connections become freed up).  I have never actually used dblink so this is just an idea-starter.

 

DO $$ -- (sample code only, exact syntax not attempted)

DECALRE conn_1 varchar; conn_1_send integer; conn_2 varchar; any_are_busy boolean; any_actions_todo Boolean;

BEGIN;

 

conn_1 := dblink_connect(…);

conn_2 := dblink_connect(…);

 

LOOP

any_are_busy := FALSE;

any_actions_todo := ???;

 

              IF NOT dblink_is_busy(…) THEN

                              IF any_actions_todo THEN

                                              conn_1_send := dblink_send_query(…);

                                              IF conn_1_send = 1 THEN any_are_busy := TRUE ELSE … END IF;

                              END IF;

              END IF;

             

IF (NOT any_are_busy) AND (NOT any_actions_todo) THEN

              EXIT --loop

END IF;

 

END;

$$

;

 

David J.

 

В списке pgsql-general по дате отправления:

Предыдущее
От: Jeff Janes
Дата:
Сообщение: Re: large database
Следующее
От: Alex Burkoff
Дата:
Сообщение: Re: execution plan is wrong, or the query ?