On 2015-07-03 18:38:37 -0400, Tom Lane wrote:
> > Why exactly? The first truncation in the (sub)xact would have assigned a
> new relfilenode, why do we need another one? The file in question will
> go away on crash/rollback in any case, and no other transaction can see
> it yet.
Consider:
BEGIN;
CREATE TABLE;
INSERT largeval;
TRUNCATE;
INSERT 1;
COPY;
INSERT 2;
COMMIT;
INSERT 1 is going to be WAL logged. For that to work correctly TRUNCATE
has to be WAL logged, as otherwise there'll be conflicting/overlapping
tuples on the target page.
But:
The truncation itself is not fully wal logged, neither is the COPY. Both
rely on heap_sync()/immedsync(). For that to be correct the current
relfilenode's truncation may *not* be wal-logged, because the contents
of the COPY or the truncation itself will only be on-disk, not in the
WAL.
Only being on-disk but not in the WAL is a problem if we crash and
replay the truncate record.
> I'm prepared to believe that some bit of logic is doing the wrong
> thing in this state, but I do not agree that truncate-in-place is
> unworkable.
Unless we're prepared to make everything that potentially WAL logs
something do the rel->rd_createSubid == mySubid && dance, I can't see
that working.