Thread: shm_mq inconsistent behavior of SHM_MQ_DETACHED

shm_mq inconsistent behavior of SHM_MQ_DETACHED

From: Petr Jelinek
Date:
Hi,

I was playing with shm_mq and found some slightly odd behavior when
detaching after sending messages.

The following sequence behaves as expected (the receiver gets 2 messages):
P1 -> set_sender
P1 -> attach
P2 -> set_receiver
P2 -> attach
P1 -> send
P2 -> receive
P1 -> send
P1 -> detach
P2 -> receive
P2 -> detach

But if the first receive happens only after the sender detaches, as in this sequence:
P1 -> set_sender
P1 -> attach
P2 -> set_receiver
P2 -> attach
P1 -> send
P1 -> send
P1 -> detach
P2 -> receive

I get SHM_MQ_DETACHED on the receiver even though there are messages in 
the ring buffer.

The reason for this behavior is that mqh_counterparty_attached is only
set by shm_mq_receive. This does not seem to be consistent - I would
either expect to always get SHM_MQ_DETACHED when the other party has
detached, or to always get all remaining messages that are in the queue
(and I would strongly prefer the latter).

Maybe shm_mq_get_bytes_written should be used to determine whether there
is something left for us to read in the receiver if we hit the
!mqh_counterparty_attached code path with a detached sender?


--
Petr Jelinek                  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: shm_mq inconsistent behavior of SHM_MQ_DETACHED

From: Robert Haas
Date:
On Tue, Apr 22, 2014 at 9:55 AM, Petr Jelinek <petr@2ndquadrant.com> wrote:
> I was playing with shm_mq and found a little odd behavior with detaching
> after sending messages.
>
> Following sequence behaves as expected (receiver gets 2 messages):
> P1 -> set_sender
> P1 -> attach
> P2 -> set_receiver
> P2 -> attach
> P1 -> send
> P2 -> receive
> P1 -> send
> P1 -> detach
> P2 -> receive
> P2 -> detach
>
> But if I do first receive after detach like in this sequence:
> P1 -> set_sender
> P1 -> attach
> P2 -> set_receiver
> P2 -> attach
> P1 -> send
> P1 -> send
> P1 -> detach
> P2 -> receive
>
> I get SHM_MQ_DETACHED on the receiver even though there are messages in the
> ring buffer.

That's a bug.

> The reason for this behavior is that mqh_counterparty_attached is only set
> by shm_mq_receive. This does not seem to be consistent - I would either
> expect to get SHM_MQ_DETACHED always when other party has detached or always
> get all remaining messages that are in queue (and I would strongly prefer
> the latter).
>
> Maybe the shm_mq_get_bytes_written should be used to determine if there is
> something left for us to read in the receiver if we hit the
> !mqh_counterparty_attached code path with detached sender?

That's probably not a good idea, because there could be just a partial
message left in the buffer, if the sender died midway through writing
it.  I suspect that attacking the problem that way will lead to a
bunch of nasty edge cases.
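
To make the partial-message concern concrete, here is a minimal sketch using a toy queue, not the real shm_mq code. Everything in it (ToyQueue, toy_write, toy_has_unread_bytes, toy_has_complete_message, the 4-byte length prefix and the 64-byte buffer) is a hypothetical stand-in invented for illustration. It only shows that "some bytes were written" and "a whole message is available" are different conditions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: a flat buffer holding length-prefixed messages. */
typedef struct ToyQueue
{
    unsigned char data[64];
    size_t        bytes_written;   /* advanced by the sender */
    size_t        bytes_read;      /* advanced by the receiver */
} ToyQueue;

/* Append raw bytes; a sender that dies midway simply stops calling this. */
static void
toy_write(ToyQueue *q, const void *src, size_t nbytes)
{
    assert(nbytes <= sizeof(q->data) - q->bytes_written);
    memcpy(q->data + q->bytes_written, src, nbytes);
    q->bytes_written += nbytes;
}

/* Naive check: is there anything at all left to read? */
static bool
toy_has_unread_bytes(const ToyQueue *q)
{
    return q->bytes_written > q->bytes_read;
}

/* Stricter check: is a whole length-prefixed message available? */
static bool
toy_has_complete_message(const ToyQueue *q)
{
    size_t   avail = q->bytes_written - q->bytes_read;
    uint32_t len;

    if (avail < sizeof(len))
        return false;           /* not even a full length word yet */
    memcpy(&len, q->data + q->bytes_read, sizeof(len));
    return avail - sizeof(len) >= len;
}
```

If a sender writes the length word for a 10-byte message and then only 5 payload bytes before dying, toy_has_unread_bytes() returns true while toy_has_complete_message() returns false, so a receiver relying on the byte counter alone would believe there is something to deliver when there is not.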

I'm thinking that the problem really revolves around
shm_mq_wait_internal().  It returns true if the queue is attached but
not detached, and false if either the detach has already happened, or
if we establish via the background worker handle that it will never
come.  But in the case of receiving, we want to treat
attached-then-detached as a success case, not a failure case.
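
The distinction can be sketched with a toy model. This is not the patch discussed in this thread and not the real shm_mq API: ToyPeerState, ToyResult, toy_wait_for_attach, toy_receive_strict and toy_receive_lenient are all hypothetical names, and the queue is reduced to a count of complete messages. The sketch assumes only what the paragraph above says: the wait step reports success solely for "attached and not detached", while a receiver should still drain the queue after an attach followed by a detach.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: what the receiver knows about the sender. */
typedef enum
{
    TOY_WILL_NEVER_ATTACH,        /* sender is known never to show up */
    TOY_ATTACHED,                 /* sender attached, still there */
    TOY_ATTACHED_THEN_DETACHED    /* sender attached, then went away */
} ToyPeerState;

typedef enum
{
    TOY_SUCCESS,
    TOY_WOULD_BLOCK,
    TOY_DETACHED
} ToyResult;

/* Mirrors the described wait step: true only if attached and not detached. */
static bool
toy_wait_for_attach(ToyPeerState sender)
{
    return sender == TOY_ATTACHED;
}

/* Treats any wait failure as "detached", even when messages are queued. */
static ToyResult
toy_receive_strict(ToyPeerState sender, int queued_messages)
{
    assert(queued_messages >= 0);
    if (!toy_wait_for_attach(sender))
        return TOY_DETACHED;
    return queued_messages > 0 ? TOY_SUCCESS : TOY_WOULD_BLOCK;
}

/* Treats attached-then-detached as success while messages remain. */
static ToyResult
toy_receive_lenient(ToyPeerState sender, int queued_messages)
{
    assert(queued_messages >= 0);
    if (sender == TOY_WILL_NEVER_ATTACH)
        return TOY_DETACHED;
    if (queued_messages > 0)
        return TOY_SUCCESS;       /* drain what was sent before the detach */
    if (sender == TOY_ATTACHED_THEN_DETACHED)
        return TOY_DETACHED;      /* nothing left and nobody to send more */
    return TOY_WOULD_BLOCK;
}
```

With two queued messages and a sender in TOY_ATTACHED_THEN_DETACHED, toy_receive_strict() reports TOY_DETACHED (the behavior in the report), while toy_receive_lenient() reports TOY_SUCCESS and only falls back to TOY_DETACHED once the queue is empty.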

Can you see if the attached patch fixes it?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: shm_mq inconsistent behavior of SHM_MQ_DETACHED

From: Petr Jelinek
Date:
On 28/04/14 15:36, Robert Haas wrote:
> On Tue, Apr 22, 2014 at 9:55 AM, Petr Jelinek <petr@2ndquadrant.com> wrote:
>>
>> But if I do first receive after detach like in this sequence:
>> P1 -> set_sender
>> P1 -> attach
>> P2 -> set_receiver
>> P2 -> attach
>> P1 -> send
>> P1 -> send
>> P1 -> detach
>> P2 -> receive
>>
>> I get SHM_MQ_DETACHED on the receiver even though there are messages in the
>> ring buffer.
>
> That's a bug.
>
> I'm thinking that the problem really revolves around
> shm_mq_wait_internal().  It returns true if the queue is attached but
> not detached, and false if either the detach has already happened, or
> if we establish via the background worker handle that it will never
> come.  But in the case of receiving, we want to treat
> attached-then-detached as a success case, not a failure case.
>
> Can you see if the attached patch fixes it?
>


Yes, the patch fixes it for me.


--
Petr Jelinek                  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: shm_mq inconsistent behavior of SHM_MQ_DETACHED

From: Robert Haas
Date:
On Mon, Apr 28, 2014 at 4:24 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
> Yes, the patch fixes it for me.

OK.  I committed it.  Thanks for the report.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company