On Sat, Mar 28, 2020 at 6:46 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> On Sat, Mar 28, 2020 at 06:28:38AM +0530, Amit Kapila wrote:
> > > Hm, but I caused a crash *without* adding CHECK_FOR_INTERRUPTS, just
> > > kill+sleep. The kill() could come from running pg_cancel_backend(). And the
> > > sleep() just encourages a context switch, which can happen at any time.
> >
> > pg_sleep internally uses CHECK_FOR_INTERRUPTS() due to which it would
> > have accepted the signal sent via pg_cancel_backend(). Can you try
> > your scenario by temporarily removing CHECK_FOR_INTERRUPTS from
> > pg_sleep() or maybe better by using OS Sleep call?
>
> Ah, that explains it. Right, I'm not able to induce a crash with usleep().
>
> Do you want me to resend a patch without that change? I feel like continuing
> to trade patches is more likely to introduce new errors or lose someone else's
> changes than to make much progress. The patch has been through enough
> iterations and it's very easy to miss an issue if I try to eyeball it.
>
I can do it, but we have to agree on the other two points: (a) I still
feel that the switch to the truncate phase should be done at the place
from where we call lazy_truncate_heap, and (b) lazy_cleanup_index
should switch the error phase back after calling index_vacuum_cleanup.
I have explained my reasoning for these points a few emails back.
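
To make points (a) and (b) concrete, here is a rough, standalone sketch of
the control flow being proposed. It is not the patch and does not use the
real vacuumlazy.c definitions: ErrPhase, VacState, set_phase,
cleanup_one_index, truncate_heap and vacuum_rel are all hypothetical
stand-ins, and the real calls are only mentioned in comments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the actual vacuumlazy.c types. */
typedef enum
{
    PHASE_UNKNOWN,
    PHASE_SCAN_HEAP,
    PHASE_INDEX_CLEANUP,
    PHASE_TRUNCATE
} ErrPhase;

typedef struct
{
    ErrPhase phase;     /* phase the error context callback would report */
} VacState;

/* Set a new phase and return the previous one so the caller can restore it. */
ErrPhase
set_phase(VacState *st, ErrPhase new_phase)
{
    ErrPhase old;

    assert(st != NULL);
    old = st->phase;
    st->phase = new_phase;
    return old;
}

/* Point (b): the index cleanup step restores the caller's phase on return. */
void
cleanup_one_index(VacState *st)
{
    ErrPhase saved = set_phase(st, PHASE_INDEX_CLEANUP);

    /* the real code would call index_vacuum_cleanup() here */

    set_phase(st, saved);
}

/* Stand-in for the truncate step; it does not touch the phase itself. */
void
truncate_heap(VacState *st)
{
    (void) st;
}

/* Hypothetical driver showing where each switch happens. */
void
vacuum_rel(VacState *st)
{
    set_phase(st, PHASE_SCAN_HEAP);
    cleanup_one_index(st);

    /* Point (a): switch at the call site, not inside the callee. */
    set_phase(st, PHASE_TRUNCATE);
    truncate_heap(st);
}
```

With this shape, a caller that is in PHASE_SCAN_HEAP is still in that phase
after cleanup_one_index() returns, and the truncate phase is entered only
where vacuum_rel() decides to call truncate_heap().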
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com