On Wed, Apr 1, 2015 at 7:30 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> > Patch fixes the problem and now for Rescan, we don't need to Wait
>> > for workers to finish.
>>
>> I realized that there is a problem with this. If an error occurs in
>> one of the workers just as we're deciding to kill them all, then the
>> error won't be reported.
>
> We are sending SIGTERM to worker for terminating the worker, so
> if the error occurs before the signal is received then it should be
> sent to master backend. Am I missing something here?

The master only checks for messages at intervals - each
CHECK_FOR_INTERRUPTS(), basically. So when the master terminates the
workers, any errors generated after the last check for messages will
be lost.
>> Also, the new code to propagate
>> XactLastRecEnd won't work right, either.
>
> As we are generating FATAL error on termination of worker
> (bgworker_die()), so won't it be handled in AbortTransaction path
> by below code in parallel-mode patch?

That will asynchronously flush the WAL, but if the master goes on to
commit, we've got to wait synchronously for the WAL flush, and possibly
for sync rep.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company