Discussion: shm_mq inconsistent behavior of SHM_MQ_DETACHED
Hi,

I was playing with shm_mq and found a little odd behavior with detaching after sending messages.

Following sequence behaves as expected (receiver gets 2 messages):
P1 -> set_sender
P1 -> attach
P2 -> set_receiver
P2 -> attach
P1 -> send
P2 -> receive
P1 -> send
P1 -> detach
P2 -> receive
P2 -> detach

But if I do first receive after detach like in this sequence:
P1 -> set_sender
P1 -> attach
P2 -> set_receiver
P2 -> attach
P1 -> send
P1 -> send
P1 -> detach
P2 -> receive

I get SHM_MQ_DETACHED on the receiver even though there are messages in the ring buffer.

The reason for this behavior is that mqh_counterparty_attached is only set by shm_mq_receive. This does not seem to be consistent - I would either expect to get SHM_MQ_DETACHED always when other party has detached or always get all remaining messages that are in queue (and I would strongly prefer the latter).

Maybe the shm_mq_get_bytes_written should be used to determine if there is something left for us to read in the receiver if we hit the !mqh_counterparty_attached code path with detached sender?

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Apr 22, 2014 at 9:55 AM, Petr Jelinek <petr@2ndquadrant.com> wrote:
> I was playing with shm_mq and found a little odd behavior with detaching
> after sending messages.
>
> Following sequence behaves as expected (receiver gets 2 messages):
> P1 -> set_sender
> P1 -> attach
> P2 -> set_receiver
> P2 -> attach
> P1 -> send
> P2 -> receive
> P1 -> send
> P1 -> detach
> P2 -> receive
> P2 -> detach
>
> But if I do first receive after detach like in this sequence:
> P1 -> set_sender
> P1 -> attach
> P2 -> set_receiver
> P2 -> attach
> P1 -> send
> P1 -> send
> P1 -> detach
> P2 -> receive
>
> I get SHM_MQ_DETACHED on the receiver even though there are messages in the
> ring buffer.

That's a bug.

> The reason for this behavior is that mqh_counterparty_attached is only set
> by shm_mq_receive. This does not seem to be consistent - I would either
> expect to get SHM_MQ_DETACHED always when other party has detached or always
> get all remaining messages that are in queue (and I would strongly prefer
> the latter).
>
> Maybe the shm_mq_get_bytes_written should be used to determine if there is
> something left for us to read in the receiver if we hit the
> !mqh_counterparty_attached code path with detached sender?

That's probably not a good idea, because there could be just a partial message left in the buffer, if the sender died midway through writing it. I suspect that attacking the problem that way will lead to a bunch of nasty edge cases.

I'm thinking that the problem really revolves around shm_mq_wait_internal(). It returns true if the queue is attached but not detached, and false if either the detach has already happened, or if we establish via the background worker handle that it will never come. But in the case of receiving, we want to treat attached-then-detached as a success case, not a failure case.

Can you see if the attached patch fixes it?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
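
The distinction drawn in the message above can be illustrated with a small toy model in C. This is only a sketch under stated assumptions, not PostgreSQL's shm_mq code and not the patch discussed in this thread: every name in it (toy_mq, toy_mq_handle, toy_send, toy_receive, wait_attached_strict, wait_attached_for_receive, run_scenario) is hypothetical, the buffer is linear rather than a ring, messages carry a one-byte length prefix, and nothing blocks or touches shared memory. It shows a receiver that learns about the sender lazily, a "strict" wait that fails once the sender has detached, a receive-friendly wait that accepts attached-then-detached, and why a trailing partial message is still reported as detached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: hypothetical names, NOT PostgreSQL's shm_mq API. */
typedef enum { TOY_MQ_SUCCESS, TOY_MQ_WOULD_BLOCK, TOY_MQ_DETACHED } toy_mq_result;

typedef struct {
    unsigned char buf[256];   /* simplified linear buffer, not a real ring */
    size_t written;           /* bytes written by the sender */
    size_t read;              /* bytes consumed by the receiver */
    bool sender_attached;     /* sender attached at some point */
    bool sender_detached;     /* sender has since detached */
} toy_mq;

typedef struct {
    toy_mq *mq;
    bool counterparty_attached;   /* receiver-side cache, set lazily in receive */
} toy_mq_handle;

/* Append a one-byte length prefix and the first payload_written bytes of the
 * payload. payload_written < len simulates a sender that died midway. */
void toy_send(toy_mq *mq, const char *data, size_t len, size_t payload_written)
{
    assert(len <= 255 && payload_written <= len);
    assert(mq->written + 1 + payload_written <= sizeof(mq->buf));
    mq->buf[mq->written++] = (unsigned char) len;
    memcpy(mq->buf + mq->written, data, payload_written);
    mq->written += payload_written;
}

/* Semantics described as problematic: attached and not yet detached. */
bool wait_attached_strict(const toy_mq *mq)
{
    return mq->sender_attached && !mq->sender_detached;
}

/* Receive-friendly semantics: attached-then-detached counts as success. */
bool wait_attached_for_receive(const toy_mq *mq)
{
    return mq->sender_attached;
}

/* Receive one complete message into out (at least 255 bytes). */
toy_mq_result toy_receive(toy_mq_handle *h, bool strict,
                          unsigned char *out, size_t *len)
{
    toy_mq *mq = h->mq;

    if (!h->counterparty_attached) {
        bool ok = strict ? wait_attached_strict(mq)
                         : wait_attached_for_receive(mq);
        if (!ok)
            return TOY_MQ_DETACHED;
        h->counterparty_attached = true;
    }

    size_t avail = mq->written - mq->read;
    if (avail >= 1) {
        size_t need = mq->buf[mq->read];
        if (avail >= 1 + need) {
            memcpy(out, mq->buf + mq->read + 1, need);
            *len = need;
            mq->read += 1 + need;
            return TOY_MQ_SUCCESS;
        }
    }
    /* No complete message: a partial one can never be finished once the
     * sender is gone, so counting bytes alone would be misleading. */
    return mq->sender_detached ? TOY_MQ_DETACHED : TOY_MQ_WOULD_BLOCK;
}

/* Replay: sender attaches, sends `complete` full messages (plus an optional
 * truncated one), detaches; only then does the receiver start receiving.
 * Returns the number of messages the receiver got. */
int run_scenario(bool strict, int complete, bool add_partial)
{
    toy_mq mq = {0};
    toy_mq_handle rx = { &mq, false };
    unsigned char out[256];
    size_t len;
    int received = 0;

    mq.sender_attached = true;
    for (int i = 0; i < complete; i++)
        toy_send(&mq, "hello", 5, 5);
    if (add_partial)
        toy_send(&mq, "hello", 5, 2);
    mq.sender_detached = true;

    while (toy_receive(&rx, strict, out, &len) == TOY_MQ_SUCCESS)
        received++;
    return received;
}
```

In this model, run_scenario(true, 2, false) mirrors the reported sequence and delivers nothing, while run_scenario(false, 2, false) delivers both messages before reporting the detach. With a truncated third message, run_scenario(false, 2, true) still delivers exactly two, which is the edge case that makes a bytes-written check unreliable.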
On 28/04/14 15:36, Robert Haas wrote:
> On Tue, Apr 22, 2014 at 9:55 AM, Petr Jelinek <petr@2ndquadrant.com> wrote:
>>
>> But if I do first receive after detach like in this sequence:
>> P1 -> set_sender
>> P1 -> attach
>> P2 -> set_receiver
>> P2 -> attach
>> P1 -> send
>> P1 -> send
>> P1 -> detach
>> P2 -> receive
>>
>> I get SHM_MQ_DETACHED on the receiver even though there are messages in the
>> ring buffer.
>
> That's a bug.
>
> I'm thinking that the problem really revolves around
> shm_mq_wait_internal(). It returns true if the queue is attached but
> not detached, and false if either the detach has already happened, or
> if we establish via the background worker handle that it will never
> come. But in the case of receiving, we want to treat
> attached-then-detached as a success case, not a failure case.
>
> Can you see if the attached patch fixes it?
>

Yes, the patch fixes it for me.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Apr 28, 2014 at 4:24 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
> Yes, the patch fixes it for me.

OK. I committed it. Thanks for the report.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company