Discussion: CLUSTER versus broken HOT chains

CLUSTER versus broken HOT chains

From: Tom Lane
Date:
I believe I've worked out what's going on in bug #5985.  The example
script contains an UPDATE on a table, then a creation of an index,
then a CLUSTER on that index, all within one transaction.  If the
UPDATE does any HOT updates, then the index is going to be marked
with indcheckxmin horizon equal to current transaction, because of
our inadequate detection of whether HOT chains are really broken
with respect to a new index.  Then when CLUSTER tries to invoke the
planner to see whether the index should be preferred over a
seqscan-and-sort, plancat.c decides that the index isn't usable yet,
so it doesn't add the index to the relation's IndexOptInfo list,
causing the reported failure "index nnn does not belong to table mmm"
in plan_cluster_use_sort.  This failure is new in 9.1 because we did
not try to use the planner in this way in previous versions, but just
always did an indexscan.
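
The failure mechanism described above can be modelled with a small toy program. This is only a sketch under stated assumptions: `ToyIndex`, `collect_usable` and `list_contains` are invented names, the OIDs are made up, and the single boolean `creator_xmin_is_old` stands in for the real transaction-horizon comparison that plancat.c performs. It is not the PostgreSQL code itself; it only shows how an index that is filtered out of the planner's list later looks as if it "does not belong" to the table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a pg_index entry (not the real struct). */
typedef struct ToyIndex
{
    unsigned oid;
    bool     indisvalid;          /* false after a failed CREATE INDEX CONCURRENTLY */
    bool     indcheckxmin;        /* set when HOT chains may be broken for this index */
    bool     creator_xmin_is_old; /* assumption: true once the creating xact is old enough */
} ToyIndex;

/*
 * Mimics the planner-side filter: copy the OIDs of usable indexes into
 * "usable" and return how many were kept.  An index that has indcheckxmin
 * set and was created too recently is silently skipped.
 */
static size_t
collect_usable(const ToyIndex *all, size_t n, unsigned *usable)
{
    size_t kept = 0;

    assert(usable != NULL);
    for (size_t i = 0; i < n; i++)
    {
        if (!all[i].indisvalid)
            continue;
        if (all[i].indcheckxmin && !all[i].creator_xmin_is_old)
            continue;           /* not usable yet: left out of the list */
        usable[kept++] = all[i].oid;
    }
    return kept;
}

/* Linear search of the usable list, as a later lookup by OID would do. */
static bool
list_contains(const unsigned *list, size_t n, unsigned oid)
{
    for (size_t i = 0; i < n; i++)
        if (list[i] == oid)
            return true;
    return false;
}
```

With one ordinary index and one index created in the same transaction as a HOT update (`indcheckxmin` set, `creator_xmin_is_old` false), only the first survives `collect_usable`, so a subsequent `list_contains` lookup for the second index fails even though the index really does belong to the table.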

Now, over in cluster.c we find the following interesting bit of
commentary:
    /*
     * Disallow if index is left over from a failed CREATE INDEX CONCURRENTLY;
     * it might well not contain entries for every heap row, or might not even
     * be internally consistent.  (But note that we don't check indcheckxmin;
     * the worst consequence of following broken HOT chains would be that we
     * might put recently-dead tuples out-of-order in the new table, and there
     * is little harm in that.)
     */
    if (!OldIndex->rd_index->indisvalid)
        ereport(ERROR,
                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                 errmsg("cannot cluster on invalid index \"%s\"",
                        RelationGetRelationName(OldIndex))));


So this leads me to a few thoughts:

1. Now that we have the seqscan-and-sort code path, it'd be possible to
support CLUSTER on a not-indisvalid index, at least when it's a btree
index.  We just have to force it into the seqscan-and-sort code path.

2. We could deal with a not-usable-because-of-indcheckxmin-horizon
index by forcing an indexscan, which is alleged to be safe by the
above comment, or (if it's btree) by forcing a seqscan-and-sort.
The problem is that we won't know which way is cheaper.  I suspect
however that the seqscan way is usually cheaper and we wouldn't lose
much by forcing that whenever we can.

3. Or we could kluge up the planner so it doesn't ignore "unusable"
indexes when invoked for this purpose.  That seems fairly messy though.

Thoughts?
        regards, tom lane


Re: CLUSTER versus broken HOT chains

From: Tom Lane
Date:
I wrote:
> I believe I've worked out what's going on in bug #5985.
> ...
> So this leads me to a few thoughts:

> 1. Now that we have the seqscan-and-sort code path, it'd be possible to
> support CLUSTER on a not-indisvalid index, at least when it's a btree
> index.  We just have to force it into the seqscan-and-sort code path.

> 2. We could deal with a not-usable-because-of-indcheckxmin-horizon
> index by forcing an indexscan, which is alleged to be safe by the
> above comment, or (if it's btree) by forcing a seqscan-and-sort.
> The problem is that we won't know which way is cheaper.  I suspect
> however that the seqscan way is usually cheaper and we wouldn't lose
> much by forcing that whenever we can.

> 3. Or we could kluge up the planner so it doesn't ignore "unusable"
> indexes when invoked for this purpose.  That seems fairly messy though.

On closer inspection I notice that there's a related failure mode in
this code: if you have IgnoreSystemIndexes turned on, and you try to
CLUSTER any system catalog, you'll get exactly the same type of failure
because plancat.c will not generate an IndexOptInfo for the target
index.

This leads me to think that the problem is not at either end but in the
middle: plan_cluster_use_sort's error handling is a few bricks shy of a
load.  Specifically, rather than supposing that not finding the index in
the IndexOptInfo list is an error, it should suppose that that's an
expected condition indicating that it's unwise to use the index.  So it
should just return "true" as a recommendation to do seqscan-and-sort
instead.
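
The proposed change in behavior can be sketched roughly as follows. This is a toy model, not the actual planner code: `ToyIndexInfo`, `toy_cluster_use_sort` and the cost parameters are hypothetical names introduced for illustration, and the final cost comparison is a simplified assumption about how the two strategies are weighed. The point is only the handling of the "not found" case, which becomes a recommendation rather than an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of the planner's index list. */
typedef struct ToyIndexInfo
{
    unsigned indexoid;
    double   indexscan_cost;    /* assumed: estimated cost of a full indexscan */
} ToyIndexInfo;

/*
 * Returns true if seqscan-and-sort is recommended over an indexscan.
 *
 * If the target index is missing from the list, the planner has already
 * judged it unusable (for example because of indcheckxmin, or because
 * system indexes are being ignored), so recommend seqscan-and-sort
 * instead of reporting an error.
 */
static bool
toy_cluster_use_sort(const ToyIndexInfo *indexlist, size_t n,
                     unsigned indexoid, double seqscan_sort_cost)
{
    const ToyIndexInfo *found = NULL;

    assert(indexlist != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        if (indexlist[i].indexoid == indexoid)
        {
            found = &indexlist[i];
            break;
        }
    }

    if (found == NULL)
        return true;            /* expected condition, not an error */

    return seqscan_sort_cost < found->indexscan_cost;
}
```

In this sketch a caller never needs a separate error path for an index the planner declined to list; it simply gets "true" and takes the seqscan-and-sort route.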

I'm not going to bother with point #1 above, as it seems like a
feature addition not a bug fix, and a feature of little real use at
that.  The actual bug can be cured pretty easily just by adjusting
plan_cluster_use_sort to understand about this case.
        regards, tom lane