On Mon, Apr 18, 2016 at 05:48:17PM +0300, Teodor Sigaev wrote:
> >>Added, see attached patch (based on v3.1)
> >
> >With this applied, I am getting a couple errors I have not seen before
> >after extensive crash recovery testing:
> >ERROR: attempted to delete invisible tuple
> >ERROR: unexpected chunk number 1 (expected 2) for toast value
> >100338365 in pg_toast_16425
> Huh, it seems this is not related to GIN at all... Indexes don't touch the
> toast machinery. The only place where this error can occur is heap_delete(),
> when deleting an already-deleted tuple.
Like you, I would not expect gin_alone_cleanup-4.patch to cause such an error.
I get the impression Jeff has a test case that he had run in many iterations
against the unpatched baseline. I also get the impression that a similar or
smaller number of its iterations against gin_alone_cleanup-4.patch triggered
these two errors (once apiece, or multiple times?). Jeff, is that right? If
so, until we determine the cause, we should assume the cause arrived in
gin_alone_cleanup-4.patch. An error in pointer arithmetic or locking might
corrupt an unrelated buffer, leading to this symptom.
> >I've restarted the test harness with intentional crashes turned off,
> >to see if the problems are related to crash recovery or are more
> >generic than that.
> >
> >I've never seen these particular problems before, so don't have much
> >insight into what might be going on or how to debug it.
Could you describe the test case in sufficient detail for Teodor to reproduce
your results?
> Check my reasoning: In version 4 I added remembering of the tail of the
> pending list in the blknoFinish variable. When we read the page that was the
> tail at cleanup start, we set the cleanupFinish variable, and after cleaning
> that page we stop further cleanup. Any insert made during cleanup will be
> placed after blknoFinish (corner case: in that page), so vacuum should not
> miss tuples marked as deleted.
Would any hacker volunteer to review Teodor's reasoning here?
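
To make that reasoning easier to check, here is a toy model of the stopping
rule as I read it from Teodor's description. It is only a sketch, not the code
in gin_alone_cleanup-4.patch: PendingList, pending_append() and toy_cleanup()
are hypothetical names, the pending list is a plain array, and concurrent
inserters are simulated by appending new pages after each cleaned page. Only
blknoFinish and cleanupFinish are names taken from the quoted text. The corner
case of an insert landing in the tail page itself is not modeled.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_PAGES 16

/* Toy pending list (hypothetical): block numbers in list order, head first. */
typedef struct
{
    int blkno[MAX_PAGES];   /* block number of each pending-list page */
    int cleaned[MAX_PAGES]; /* 1 once the cleanup pass has processed the page */
    int npages;
} PendingList;

/* Append one page at the tail, as an insert during cleanup would. */
static void
pending_append(PendingList *list, int blkno)
{
    assert(list->npages < MAX_PAGES);
    list->blkno[list->npages] = blkno;
    list->cleaned[list->npages] = 0;
    list->npages++;
}

/*
 * Clean pages from the head, stopping after the page that was the tail when
 * cleanup started. After each cleaned page, inserts_per_step new pages are
 * appended to simulate concurrent inserts. Returns the number of pages cleaned.
 */
static int
toy_cleanup(PendingList *list, int inserts_per_step, int next_blkno)
{
    int blknoFinish;
    int cleanupFinish = 0;
    int ncleaned = 0;
    int i;

    if (list->npages == 0)
        return 0;

    /* Remember the tail as it was at cleanup start. */
    blknoFinish = list->blkno[list->npages - 1];

    for (i = 0; i < list->npages && !cleanupFinish; i++)
    {
        int k;

        /* Reaching the remembered tail: finish after cleaning this page. */
        if (list->blkno[i] == blknoFinish)
            cleanupFinish = 1;

        list->cleaned[i] = 1;
        ncleaned++;

        /* Simulated concurrent inserts always land after blknoFinish. */
        for (k = 0; k < inserts_per_step && list->npages < MAX_PAGES; k++)
            pending_append(list, next_blkno++);
    }
    return ncleaned;
}
```

In this model, every page present at cleanup start gets cleaned and every page
appended afterwards is left for the next pass, which is the property the
argument depends on. Whether the patch actually maintains it under the real
locking rules is the part that still needs review.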
Thanks,
nm