On Fri, Oct 26, 2012 at 7:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Claudio Freire <klaussfreire@gmail.com> writes:
>> Because once you've accessed that last index page, it would be rather
>> trivial finding out how many duplicate tids are in that page and, with
>> a small CPU cost (no disk access if you don't query other index pages)
>> you could verify the assumption of near-uniqueness.
>
> I thought about that too, but I'm not sure how promising the idea is.
> In the first place, it's not clear when to stop counting duplicates, and
> in the second, I'm not sure we could get away with not visiting the heap
> to check for tuple liveness. There might be a lot of apparent
> duplicates in the index that just represent unreaped old versions of a
> frequently-updated endpoint tuple. (The existing code is capable of
> returning a "wrong" answer if the endpoint tuple is dead, but I don't
> think it matters much in most cases. I'm less sure such an argument
> could be made for dup-counting.)

Would checking the visibility map be too bad? An index page's worth of
tuples should also fit within a single page of the visibility map.
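
As a rough sanity check of that claim, here is a back-of-the-envelope
sketch. The constants are assumptions on my part, not something stated
above: the default 8 kB block size, a 24-byte page header, and one
visibility bit per heap page. The function names and the sample block
numbers are made up for illustration.

```python
# Back-of-the-envelope sketch; every constant here is an assumption.
BLCKSZ = 8192            # assumed default block size, in bytes
PAGE_HEADER_BYTES = 24   # assumed page header size, in bytes
BITS_PER_HEAP_PAGE = 1   # assumed: one all-visible bit per heap page


def heap_pages_per_vm_page(blcksz=BLCKSZ, header=PAGE_HEADER_BYTES,
                           bits=BITS_PER_HEAP_PAGE):
    """Number of heap pages whose visibility bits fit in one map page."""
    return (blcksz - header) * 8 // bits


def vm_pages_touched(heap_blocks, per_vm_page=None):
    """Set of visibility map page numbers covering the given heap blocks."""
    if per_vm_page is None:
        per_vm_page = heap_pages_per_vm_page()
    return {blk // per_vm_page for blk in heap_blocks}


if __name__ == "__main__":
    print(heap_pages_per_vm_page())
    # Hypothetical leaf page whose tids point at 400 nearby heap blocks.
    print(vm_pages_touched(range(1000, 1400)))
```

Under those assumptions one map page covers tens of thousands of heap
pages, so tids that point at nearby heap blocks land in the same map
page. Tids scattered across a very large table could still touch more
than one.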