On Thu, Jul 11, 2019 at 2:30 AM Michael Paquier <michael@paquier.xyz> wrote:
On Wed, Jul 10, 2019 at 09:19:03AM -0700, Andres Freund wrote:
> On July 10, 2019 9:12:18 AM PDT, Magnus Hagander <magnus@hagander.net> wrote:
>> That would be fine, if we actually knew. Should we (or have we already?)
>> defined a rule that they are not allowed to use the same naming standard
>> unless they have the same type of header?
>
> No, don't think we have already. There's the related problem of
> what to include in base backups, too.
Yes. This one needs a careful design and I am not sure exactly what that would be. At least one new callback would be needed, called from basebackup.c to decide if a given file should be backed up or not based on a path.
That wouldn't be at all enough, of course. We have to think of everybody who uses the pg_start_backup/pg_stop_backup functions (including the deprecated versions we don't want to get rid of :P). So whatever it is, it has to be externally reachable. And just calling something before you start your backup won't be enough, as files can show up during the backup, etc.
Having a strict naming standard would help a lot with that, then you'd just need the metadata. For example, one could say that each non-default storage engine has to put all their files in a subdirectory, and inside that subdirectory they can name them whatever they want. If we do that, then all a backup tool would need to know about is all the possible subdirectories in the current installation (and *that* doesn't change frequently).
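To make the subdirectory idea concrete, here is a rough sketch in Python of what an external backup tool could do with such a rule. Everything in it is an assumption for illustration: the `AM_SUBDIRS` set, the `owning_am` helper, and the `base/<dboid>/<subdir>/...` layout are hypothetical and not an existing PostgreSQL interface.

```python
from pathlib import PurePosixPath

# Hypothetical metadata: the subdirectory names claimed by non-default
# storage engines in this installation. How a tool would obtain this list
# is exactly the open question; here it is simply hard-coded.
AM_SUBDIRS = {"zheap"}


def owning_am(relpath, am_subdirs=AM_SUBDIRS):
    """Return the AM subdirectory that owns relpath, or None.

    relpath is relative to the data directory. The assumed layout is
    base/<dboid>/<am_subdir>/<anything the AM likes>.
    """
    parts = PurePosixPath(relpath).parts
    # Need at least base, dboid, subdir and one file name below it.
    if len(parts) >= 4 and parts[0] == "base" and parts[2] in am_subdirs:
        return parts[2]
    return None
```

With that, the tool only has to track the set of subdirectory names; a file that appears in the middle of the backup is classified by its path alone, without asking the server anything.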
But then how do you make sure that a path applies to one table AM or another, by using a regex given by all table AMs to see if there is a match? How do we handle conflicts? I am not sure either that it is a good design to restrict table AMs to have a given format for paths as that actually limits the possibilities when it comes to splitting data across multiple files for attributes and/or tablespaces. (I am a pessimistic guy by nature.)
As long as the restriction contains enough wildcards, it should hopefully be enough :) E.g. data/base/1234/zheap/whatever.they.like.
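
A quick sketch of how the wildcard match and the conflict question could look, again in Python and purely as an illustration. The `AM_PATTERNS` table and `matching_ams` helper are made up for this example; only `fnmatch.fnmatchcase` is a real standard-library call. Note that its `*` also matches `/`, so the pattern is looser than a shell glob.

```python
from fnmatch import fnmatchcase

# Hypothetical: one wildcard pattern per table AM, in the spirit of
# data/base/1234/zheap/whatever.they.like
AM_PATTERNS = {"zheap": "data/base/*/zheap/*"}


def matching_ams(path, patterns=AM_PATTERNS):
    """Return the sorted names of all AMs whose pattern matches path.

    An empty list means no non-default AM claims the file; more than one
    entry means two AMs registered overlapping patterns (a conflict).
    """
    return sorted(am for am, pat in patterns.items() if fnmatchcase(path, pat))
```

If every AM is confined to its own subdirectory name, the result should never have more than one entry, so a conflict check reduces to making sure two AMs do not register the same subdirectory.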