On 2015-07-03 19:26:05 +0200, Andres Freund wrote:
> On 2015-07-03 19:02:29 +0200, Andres Freund wrote:
> > Maybe I'm just daft right now (35C outside, 32 inside, so ...), but I'm
> > right now missing how the whole "skip wal logging if relation has just
> > been truncated" optimization can ever actually be crashsafe unless we
> > use a new relfilenode (which we don't!).
>
> We actually used to use a different relfilenode, but optimized that
> away: cab9a0656c36739f59277b34fea8ab9438395869
>
> commit cab9a0656c36739f59277b34fea8ab9438395869
> Author: Tom Lane <tgl@sss.pgh.pa.us>
> Date: Sun Aug 23 19:23:41 2009 +0000
>
> Make TRUNCATE do truncate-in-place when processing a relation that was created
> or previously truncated in the current (sub)transaction. This is safe since
> if the (sub)transaction later rolls back, we'd just discard the rel's current
> physical file anyway. This avoids unreasonable growth in the number of
> transient files when a relation is repeatedly truncated. Per a performance
> gripe a couple weeks ago from Todd Cook.
>
> to me the reasoning here looks flawed.

It looks to me like we need to renege on this a bit. I think we can still
be more efficient than the general codepath: we can drop the old
relfilenode immediately. But pg_class.relfilenode has to differ from the
old one after the truncation.
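
To make the concern concrete, here is a toy model of crash recovery. It is
not PostgreSQL code: the record names, the replay() function and the
relfilenode numbers are made up for illustration. It assumes the unlogged
write is something like a COPY whose data is fsync'ed at commit instead of
being WAL-logged, while the earlier insert and the truncation are
WAL-logged.

```python
def replay(wal, disk):
    """Replay toy WAL records over the files that survived the crash."""
    files = {node: list(rows) for node, rows in disk.items()}
    for op, node, payload in wal:
        if op == "insert":
            files.setdefault(node, []).append(payload)
        elif op == "truncate":
            files[node] = []
        elif op == "drop":
            files.pop(node, None)
    return files

# Truncate in place: the unlogged rows live in the same file (1) that the
# logged truncate record refers to, so replay wipes them.
in_place = replay(
    wal=[("insert", 1, "old row"), ("truncate", 1, None)],
    disk={1: ["copied row"]},
)

# New relfilenode: the old file (1) is dropped right away and the unlogged
# rows live in file 2, which no replayed record touches.
new_node = replay(
    wal=[("insert", 1, "old row"), ("drop", 1, None)],
    disk={2: ["copied row"]},
)

print(in_place)
print(new_node)
```

In the first case the fsync'ed rows do not survive replay; in the second
they do, and the old file can still be removed immediately.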