Обсуждение: Re: pg_background (and more parallelism infrastructure patches)

Поиск
Список
Период
Сортировка

Re: pg_background (and more parallelism infrastructure patches)

От
Alvaro Herrera
Дата:
On Fri, Jul 25, 2014 at 02:11:32PM -0400, Robert Haas wrote:

> +    pq_mq_busy = true;
> +
> +    iov[0].data = &msgtype;
> +    iov[0].len = 1;
> +    iov[1].data = s;
> +    iov[1].len = len;
> +
> +    Assert(pq_mq_handle != NULL);
> +    result = shm_mq_sendv(pq_mq_handle, iov, 2, false);
> +
> +    pq_mq_busy = false;

Don't you need a PG_TRY block here to reset pq_mq_busy?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: pg_background (and more parallelism infrastructure patches)

От
Robert Haas
Дата:
On Fri, Jul 25, 2014 at 4:16 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> On Fri, Jul 25, 2014 at 02:11:32PM -0400, Robert Haas wrote:
>> +     pq_mq_busy = true;
>> +
>> +     iov[0].data = &msgtype;
>> +     iov[0].len = 1;
>> +     iov[1].data = s;
>> +     iov[1].len = len;
>> +
>> +     Assert(pq_mq_handle != NULL);
>> +     result = shm_mq_sendv(pq_mq_handle, iov, 2, false);
>> +
>> +     pq_mq_busy = false;
>
> Don't you need a PG_TRY block here to reset pq_mq_busy?

No.  If shm_mq_sendv is interrupted, we can't use the shm_mq any more.
But since that should only happen if an interrupt arrives while the
queue is full, I think that's OK.  (Think about the alternatives: if
the queue is full, we have no way of notifying the launching process
without waiting for it to retrieve the results, but it might not do
that right away, and if we've been killed we need to die *now* not
later.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: pg_background (and more parallelism infrastructure patches)

От
Amit Kapila
Дата:
On Sat, Jul 26, 2014 at 9:32 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Jul 25, 2014 at 4:16 PM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
> > On Fri, Jul 25, 2014 at 02:11:32PM -0400, Robert Haas wrote:
> >> +     pq_mq_busy = true;
> >> +
> >> +     iov[0].data = &msgtype;
> >> +     iov[0].len = 1;
> >> +     iov[1].data = s;
> >> +     iov[1].len = len;
> >> +
> >> +     Assert(pq_mq_handle != NULL);
> >> +     result = shm_mq_sendv(pq_mq_handle, iov, 2, false);
> >> +
> >> +     pq_mq_busy = false;
> >
> > Don't you need a PG_TRY block here to reset pq_mq_busy?
>
> No.  If shm_mq_sendv is interrupted, we can't use the shm_mq any more.
> But since that should only happen if an interrupt arrives while the
> queue is full, I think that's OK.

I think here not only on interrupt, but any other error in this
function shm_mq_sendv() path (one example is WaitLatch())
could lead to same behaviour.

> (Think about the alternatives: if
> the queue is full, we have no way of notifying the launching process
> without waiting for it to retrieve the results, but it might not do
> that right away, and if we've been killed we need to die *now* not
> later.)

So in such cases what is the advise to users, currently they will
see the below message:
postgres=# select * from pg_background_result(5124) as (x int);
ERROR:  lost connection to worker process with PID 5124

One way is to ask them to check logs, but what about if they want
to handle error and take some action?

Another point about error handling is that to execute the sql in
function pg_background_worker_main(), it starts the transaction
which I think doesn't get aborted if error occurs and similarly handling
for timeout seems to be missing in error path.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: pg_background (and more parallelism infrastructure patches)

От
Amit Kapila
Дата:
On Mon, Sep 8, 2014 at 10:39 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Jul 26, 2014 at 9:32 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> > On Fri, Jul 25, 2014 at 4:16 PM, Alvaro Herrera
> > <alvherre@2ndquadrant.com> wrote:
> > > On Fri, Jul 25, 2014 at 02:11:32PM -0400, Robert Haas wrote:
> > >> +     pq_mq_busy = true;
> > >> +
> > >> +     iov[0].data = &msgtype;
> > >> +     iov[0].len = 1;
> > >> +     iov[1].data = s;
> > >> +     iov[1].len = len;
> > >> +
> > >> +     Assert(pq_mq_handle != NULL);
> > >> +     result = shm_mq_sendv(pq_mq_handle, iov, 2, false);
> > >> +
> > >> +     pq_mq_busy = false;
> > >
> > > Don't you need a PG_TRY block here to reset pq_mq_busy?
> >
> > No.  If shm_mq_sendv is interrupted, we can't use the shm_mq any more.
> > But since that should only happen if an interrupt arrives while the
> > queue is full, I think that's OK.
>
> I think here not only on interrupt, but any other error in this
> function shm_mq_sendv() path (one example is WaitLatch())
> could lead to same behaviour.
>
> > (Think about the alternatives: if
> > the queue is full, we have no way of notifying the launching process
> > without waiting for it to retrieve the results, but it might not do
> > that right away, and if we've been killed we need to die *now* not
> > later.)
>
> So in such cases what is the advise to users, currently they will
> see the below message:
> postgres=# select * from pg_background_result(5124) as (x int);
> ERROR:  lost connection to worker process with PID 5124
>
> One way is to ask them to check logs, but what about if they want
> to handle error and take some action?
>
> Another point about error handling is that to execute the sql in
> function pg_background_worker_main(), it starts the transaction
> which I think doesn't get aborted if error occurs

For this I was just referring error handling code of
StartBackgroundWorker(), however during shutdown of process, it
will call AbortOutOfAnyTransaction() which will take care of aborting
transaction.

> and similarly handling
> for timeout seems to be missing in error path.

As we are anyway going to exit on error, so not sure, if this will be
required, however having it for clean exit seems to be better.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: pg_background (and more parallelism infrastructure patches)

От
Robert Haas
Дата:
On Mon, Sep 8, 2014 at 1:09 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> > Don't you need a PG_TRY block here to reset pq_mq_busy?
>>
>> No.  If shm_mq_sendv is interrupted, we can't use the shm_mq any more.
>> But since that should only happen if an interrupt arrives while the
>> queue is full, I think that's OK.
>
> I think here not only on interrupt, but any other error in this
> function shm_mq_sendv() path (one example is WaitLatch())
> could lead to same behaviour.

Probably true.  But I think we generally count on that to be no-fail,
or close to it, so I'm not really worried about it.

>> (Think about the alternatives: if
>> the queue is full, we have no way of notifying the launching process
>> without waiting for it to retrieve the results, but it might not do
>> that right away, and if we've been killed we need to die *now* not
>> later.)
>
> So in such cases what is the advise to users, currently they will
> see the below message:
> postgres=# select * from pg_background_result(5124) as (x int);
> ERROR:  lost connection to worker process with PID 5124
>
> One way is to ask them to check logs, but what about if they want
> to handle error and take some action?

They have to check the logs.  To reiterate what I said before, there
is no reasonable way to both have the background worker terminate
quickly and also guarantee that the full error message is received by
the process that started it.  You have to pick one, and I stick by the
one I picked.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: pg_background (and more parallelism infrastructure patches)

От
Robert Haas
Дата:
On Mon, Sep 8, 2014 at 2:05 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Another point about error handling is that to execute the sql in
>> function pg_background_worker_main(), it starts the transaction
>> which I think doesn't get aborted if error occurs
>
> For this I was just referring error handling code of
> StartBackgroundWorker(), however during shutdown of process, it
> will call AbortOutOfAnyTransaction() which will take care of aborting
> transaction.

Yeah.

>> and similarly handling
>> for timeout seems to be missing in error path.
>
> As we are anyway going to exit on error, so not sure, if this will be
> required, however having it for clean exit seems to be better.

Can you be more specific?

I'm generally of the view that there's little point in spending time
cleaning things up that will go away anyway when the process exits.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: pg_background (and more parallelism infrastructure patches)

От
Amit Kapila
Дата:
On Tue, Sep 9, 2014 at 2:32 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Sep 8, 2014 at 2:05 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >> Another point about error handling is that to execute the sql in
> >> function pg_background_worker_main(), it starts the transaction
> >> which I think doesn't get aborted if error occurs
> >
> > For this I was just referring error handling code of
> > StartBackgroundWorker(), however during shutdown of process, it
> > will call AbortOutOfAnyTransaction() which will take care of aborting
> > transaction.
>
> Yeah.
>
> >> and similarly handling
> >> for timeout seems to be missing in error path.
> >
> > As we are anyway going to exit on error, so not sure, if this will be
> > required, however having it for clean exit seems to be better.
>
> Can you be more specific?

I was thinking to handle errors similar to what PostgreMain does,
however if it is going to exit then it might no be worth.

> I'm generally of the view that there's little point in spending time
> cleaning things up that will go away anyway when the process exits.

Yeah, in that case it might not make much sense.  The argument here
could be why it has to exit, why can't it wait till the launcher asks it
to exit.  You have mentioned in previous mail that you want to stick to
the approach taken by patch, however it is not clear why you think that
is better.  If I try to think about how the worker backend should behave
incase the master backend assigns some task to worker, then I think it
would be sensible for worker to not exit after completing it's task (either
it has finished the execution of work assigned or otherwise) as master
backend can assign further tasks to the same worker.  Another advantage
would be that setup and tear down cost of worker will be saved.  Now If
we just think w.r.t pg_background it might not be so important to let
worker wait till launcher asks to quit, however as you have mentioned
in your mail upthread that you are expecting this kind of set-up for actual
parallelism, so it might not be completely insensible to expect worker
to wait till it has been asked to quit.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com