Jim Nasby wrote:
> It looks like the biggest complaint (aside from allowing a limited number of
> tuples to be moved) is in [1] and [2], where Tom is saying that you can't
> simply call heap_update() like this without holding an exclusive lock on the
> table. Is that because we're not actually changing the tuple?
That's nonsense -- obviously UPDATE can do heap_update without an
exclusive lock on the table, so the explanation must be something else.
I think his actual complaint was that you can't remove the old tuple
until concurrent readers of the table have already finished scanning it,
or you get into a situation where they might need to read the page in
which the original version resided, but your mini-vacuum already removed
it. So before removing it you need to wait until they are all finished.
This is the reason I mentioned CREATE INDEX CONCURRENTLY: if you wait
until those transactions are all gone (like CIC does), you are then free
to remove the old versions of the tuple, because you know that all
readers have a snapshot new enough to see the new version of the tuple.
> Another issue is both HOT and KeyUpdate; I think we need to completely
> ignore/over-ride that stuff for this.
You don't need anything for HOT, because cross-page updates are never
HOT. Not sure what you mean about KeyUpdate, but yeah you might need
something there -- obviously, you don't want to create multixacts
unnecessarily.
> Instead of adding forcefsm, I think it would be more useful to accept a
> target block number. That way we can actually control where the new tuple
> goes.
Whatever makes the most sense, I suppose. (Maybe we shouldn't consider
this a tweaked heap_update -- which is already complex enough -- but a
separate heapam entry point.)
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services