Re: Parallel Seq Scan

From Amit Kapila
Subject Re: Parallel Seq Scan
Date
Msg-id CAA4eK1+ArakRB4pAOftQrADh4KkBfVzAobaYWz9GKtiV80numw@mail.gmail.com
In reply to Re: Parallel Seq Scan  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Responses Re: Parallel Seq Scan  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
List pgsql-hackers
On Mon, Mar 16, 2015 at 9:40 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>
> On 13-03-2015 PM 11:03, Amit Kapila wrote:
> > On Fri, Mar 13, 2015 at 7:15 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> >>
> >> I don't think this is the right fix; the point of that code is to
> >> remove a tuple queue from the funnel when it gets detached, which is a
> >> correct thing to want to do.  funnel->nextqueue should always be less
> >> than funnel->nqueues; how is that failing to be the case here?
> >>
> >
> > I could not reproduce the issue, neither the exact scenario is
> > mentioned in mail.  However what I think can lead to funnel->nextqueue
> > greater than funnel->nqueues is something like below:
> >
> > Assume 5 queues, so value of funnel->nqueues will be 5 and
> > assume value of funnel->nextqueue is 2, so now let us say 4 workers
> > got detached one-by-one, so for such a case it will always go in else loop
> > and will never change funnel->nextqueue whereas value of funnel->nqueues
> > will become 1.
> >
>
> Or if the just-detached queue happens to be the last one, we'll make
> shm_mq_receive() to read from a potentially already-detached queue in the
> immediately next iteration.

Isn't the last-queue case already handled by the code below?
    else
    {
        --funnel->nqueues;
        if (funnel->nqueues == 0)
        {
            if (done != NULL)
                *done = true;
            return NULL;
        }

> That seems to be caused by not having updated the
> funnel->nextqueue. With the returned value being SHM_MQ_DETACHED, we'll again
> try to remove it from the queue. In this case, it causes the third argument to
> memcpy be negative and hence the segfault.
>
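
The negative length described above can be illustrated with a little arithmetic. This is only a sketch: it assumes the length passed to the copy is computed as the element size times (nqueues - nextqueue), which is not shown in this mail, and the helper names tail_entries() and tail_bytes() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of queue entries that sit after the slot being removed.
 * ASSUMPTION: the real code derives its copy length from a similar
 * (nqueues - nextqueue) expression; this helper is illustrative only.
 */
static long
tail_entries(int nqueues, int nextqueue)
{
    assert(nqueues >= 0 && nextqueue >= 0);
    return (long) nqueues - (long) nextqueue;
}

/*
 * The same count expressed in bytes, as a size_t length argument would
 * see it.  A negative count wraps around to a huge unsigned value.
 */
static size_t
tail_bytes(int nqueues, int nextqueue)
{
    return sizeof(void *) * (size_t) tail_entries(nqueues, nextqueue);
}
```

With 5 queues and nextqueue = 2 the tail is 3 entries. Once nqueues has dropped to 1 while nextqueue is still 2, the count is -1 and the byte length wraps to a value close to the maximum of size_t, which is consistent with a crash inside the copy.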

In any case, I think we need some handling for such cases.
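
One possible shape for such handling is sketched here. It is a toy model, not the patch: ToyFunnel and toy_funnel_remove_current() are hypothetical names, and plain ints stand in for the queue handles. The idea is to copy only a non-negative tail and to wrap nextqueue back to 0 whenever it would otherwise point past the end.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_QUEUES 8

/* Hypothetical stand-in for the funnel; ints replace the real queue handles. */
typedef struct ToyFunnel
{
    int nqueues;                 /* number of queues still attached */
    int nextqueue;               /* index of the queue to read next */
    int queue[TOY_MAX_QUEUES];   /* toy queue identifiers */
} ToyFunnel;

/*
 * Drop the queue at index nextqueue after it reported "detached".
 * Returns 0 when no queues remain, 1 otherwise.
 */
static int
toy_funnel_remove_current(ToyFunnel *f)
{
    int ntail;

    /* Invariant the caller must maintain: nextqueue is a valid index. */
    assert(f->nextqueue >= 0 && f->nextqueue < f->nqueues);

    --f->nqueues;
    if (f->nqueues == 0)
        return 0;

    /* Entries that followed the removed slot; zero if it was the last one. */
    ntail = f->nqueues - f->nextqueue;
    if (ntail > 0)
        memmove(&f->queue[f->nextqueue],
                &f->queue[f->nextqueue + 1],
                sizeof(f->queue[0]) * (size_t) ntail);

    /* If the removed slot was the last one, wrap around to the first queue. */
    if (f->nextqueue >= f->nqueues)
        f->nextqueue = 0;

    return 1;
}
```

In this model, removing the last of five queues leaves four queues with nextqueue reset to 0, so the next read never touches the detached slot, and removing a middle queue shifts the tail down by one.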

> I can't seem to really figure out the other problem of waiting forever in
> WaitLatch() 
>

The reason seems to be that, in certain scenarios, the way we set the latch
before exiting needs some more thought.  Currently we set the latch in
HandleParallelMessageInterrupt(), which does not seem to be sufficient.

> By the way, you can try reproducing this with the example I posted on Friday.
>

Sure.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
