Hi,
On 2015-12-10 18:36:32 +0100, Andres Freund wrote:
> On 2015-12-10 12:19:12 -0500, Robert Haas wrote:
> > > The real problem there imo isn't that the copy_relation_data() doesn't
> > > deal with 0 block tables, but that ATExecSetTableSpace() doesn't have a
> > > unlogged table specific codepath like heap_create_with_catalog() has.
> >
> > It looks to me like somewhere we need to do log_smgrcreate(...,
> > INIT_FORKNUM) in the unlogged table case.
>
> Yes.
>
> > RelationCreateStorage()
> > skips this for the main forknum of an unlogged table, which seems OK,
> > but there's nothing that even attempts it for the init fork, which
> > does not seem OK.
>
> We unfortunately can't trivially delegate that work to
> RelationCreateStorage(). E.g. heap_create() documents that only the main
> fork is created :(
>
> > I guess that logic should happen in
> > ATExecSetTableSpace just after smgrcreate(dstrel, forkNum, false).
>
> Looks like it's the easiest place.
> > > A second problem is that the smgrimmedsync() in copy_relation_data()
> > > isn't called for the init fork of unlogged relations, even if it needs
> > > to.
Here's a patch doing that. It's not yet fully polished, but I wanted to
get it out, because I noticed one thing:
In ATExecSetTableSpace(), for !main forks, we currently call
smgrcreate(), but not log_smgrcreate(). Even for PERSISTENT
relations. That seems a bit odd to me. It currently seems to be without
further consequence because, if there's actual data in the fork, we'll
just create the relation in _mdfd_getseg(); or we can cope with the
relation not being there. But to me that feels wrong.
It seems better to do the log_smgrcreate() for RELPERSISTENCE_PERMANENT,
not just INIT_FORKNUM. What do you guys think?
> It sounds worthwhile to check that other locations rewriting tables,
> e.g. cluster/vacuum full/reindex are safe.
Seems to be ok, on a first glance.
Andres