Re: On-demand running query plans using auto_explain and signals

From: Shulgin, Oleksandr
Subject: Re: On-demand running query plans using auto_explain and signals
Date:
Msg-id CACACo5SR1OJz3F-fJJQq1_DcqK+xBDHnbaZ+D5QVrcHScBQr_A@mail.gmail.com
In response to: Re: On-demand running query plans using auto_explain and signals  (Pavel Stehule <pavel.stehule@gmail.com>)
Responses: Re: On-demand running query plans using auto_explain and signals  (Pavel Stehule <pavel.stehule@gmail.com>)
List: pgsql-hackers
On Wed, Sep 2, 2015 at 11:16 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:


2015-09-02 11:01 GMT+02:00 Shulgin, Oleksandr <oleksandr.shulgin@zalando.de>:
On Tue, Sep 1, 2015 at 7:02 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

But do we really need the slots mechanism?  Would it not be OK to just let the LWLock do the sequencing of concurrent requests?  Given that we are only going to use one message queue per cluster, there's not much concurrency to be gained by introducing slots, I believe.

I'm afraid of problems in production. When a queue is tied to a particular process, all problems should clear up once that process ends. One message queue per cluster means restarting the cluster when some pathological problem occurs - and sometimes you cannot restart a production cluster for a week, or even weeks. The slots are more robust.

Yes, but in your implementation the slots themselves don't have a queue/buffer.  Did you intend to have a message queue per slot?

The message queue cannot be reused, so I expect one slot per caller to be used for passing parameters - the message queue would be created/released on demand by the caller.

I don't believe the message queue really cannot be reused.  What would stop us from calling shm_mq_create() on the queue struct again?
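For illustration, shm_mq_create() just (re)initializes the queue header at the address you pass it, so calling it again over the same region gives you a fresh, empty queue.  A minimal self-contained analogue of that idea (plain C, no PostgreSQL internals; the toy_mq type and names are invented for this sketch):

```c
#include <stddef.h>
#include <string.h>

/* Toy byte queue living in a caller-provided memory region,
 * the way shm_mq lives in a DSM segment. */
typedef struct {
    size_t size;   /* capacity of data[] */
    size_t head;   /* read position */
    size_t tail;   /* write position */
    char   data[]; /* flexible array member: payload follows header */
} toy_mq;

/* Analogue of shm_mq_create(): overwrite the header in place,
 * discarding whatever state the previous user left behind. */
toy_mq *toy_mq_create(void *address, size_t size)
{
    toy_mq *mq = address;
    mq->size = size - offsetof(toy_mq, data);
    mq->head = 0;
    mq->tail = 0;
    return mq;
}

/* Append len bytes; returns 0 on success, -1 if the queue is full. */
int toy_mq_send(toy_mq *mq, const char *buf, size_t len)
{
    if (mq->tail + len > mq->size)
        return -1;
    memcpy(mq->data + mq->tail, buf, len);
    mq->tail += len;
    return 0;
}

/* Bytes written but not yet consumed. */
size_t toy_mq_pending(const toy_mq *mq)
{
    return mq->tail - mq->head;
}
```

Re-running toy_mq_create() over a used region leaves a clean queue, which is the property I'm counting on for reuse.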

To give you an idea, in my current prototype I have only the following struct:

typedef struct {
    LWLock   *lock;           /* protects the fields below */
    /*CmdStatusInfoSlot slots[CMDINFO_SLOTS];*/
    pid_t     target_pid;     /* backend being queried; 0 = channel free */
    pid_t     sender_pid;     /* backend making the request */
    int       request_type;
    int       result_code;
    shm_mq    buffer;
} CmdStatusInfo;

An instance of this is allocated in shared memory once, using a BUFFER_SIZE of 8k.

In pg_cmdstatus() I take the LWLock and check whether target_pid is 0, which means nobody else is using this communication channel at the moment.  If that's the case, I set the pids and request_type and initialize the mq buffer.  Otherwise I just sleep and retry acquiring the lock (a timeout should probably be added here).
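Sketched outside the backend environment, the claiming logic looks roughly like this (a pthread mutex stands in for the LWLock, and the channel/function names are invented for illustration):

```c
#include <pthread.h>
#include <sys/types.h>

typedef struct {
    pthread_mutex_t lock;    /* stands in for the LWLock */
    pid_t target_pid;        /* 0 means the channel is free */
    pid_t sender_pid;
    int   request_type;
} channel;

/* Try to claim the single communication channel; returns 0 on
 * success, -1 if some other sender currently holds it. */
int channel_try_claim(channel *ch, pid_t target, pid_t sender, int type)
{
    int rc = -1;
    pthread_mutex_lock(&ch->lock);
    if (ch->target_pid == 0) {
        ch->target_pid = target;
        ch->sender_pid = sender;
        ch->request_type = type;
        rc = 0;
    }
    pthread_mutex_unlock(&ch->lock);
    return rc;
}

/* Release the channel once the response has been consumed,
 * making it available to the next caller. */
void channel_release(channel *ch)
{
    pthread_mutex_lock(&ch->lock);
    ch->target_pid = 0;
    ch->sender_pid = 0;
    pthread_mutex_unlock(&ch->lock);
}
```

A caller that gets -1 back would sleep briefly and retry, which is where the timeout mentioned above comes in.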

What sort of pathological problems are you concerned about?  The communicating backends should just detach from the message queue properly and have some timeout configured to prevent deadlocks.  Other than that, I don't see how having N slots really helps the problem: in case of pathological problems you will just deplete them all sooner or later.

I'm afraid of unexpected problems :) - any part of signal handling or multiprocess communication is fragile. Slots are simple and can simply be attached to any process without the need to allocate/free any memory.

Yes, but do slots solve the actual problem?  If there is only one message queue, you still have the same problem regardless of the number of slots you decide to have.

--
Alex

In the pgsql-hackers list, by delivery date:

Previous
From: Amit Langote
Date:
Message: Re: Horizontal scalability/sharding
Next
From: Fujii Masao
Date:
Message: Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: HEAD seems to generate larger WAL regarding GIN index