Re: Drop type "smgr"?

From: Thomas Munro
Subject: Re: Drop type "smgr"?
Message-ID: CA+hUKGKJqEt+sW7Q+7a9JQY6WUSdvrRS2cWByQYO+pPDnJjbwQ@mail.gmail.com
In reply to: Re: Drop type "smgr"?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: Drop type "smgr"?
List: pgsql-hackers

On Fri, Mar 1, 2019 at 4:09 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > On Thu, Feb 28, 2019 at 7:37 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Thomas Munro <thomas.munro@gmail.com> writes:
> >>> Our current thinking is that smgropen() should know how to map a small
> >>> number of special database OIDs to different smgr implementations
>
> >> Hmm.  Maybe mapping based on tablespaces would be a better idea?
>
> > In the undo log proposal (about which more soon) we are using
> > tablespaces for their real purpose, so we need that OID.  If you SET
> > undo_tablespaces = foo then future undo data created by your session
> > will be written there, which might be useful for putting that IO on
> > different storage.
>
> Meh.  That's a point, but it doesn't exactly seem like a killer argument.
> Just in the abstract, it seems much more likely to me that people would
> want per-database special rels than per-tablespace special rels.  And
> I think your notion of a GUC that can control this is probably pie in
> the sky anyway: if we can't afford to look into the catalogs to resolve
> names at this code level, how are we going to handle a GUC?

I have this working like so:

* undo logs have a small amount of metadata in shared memory, which is
stored in a file at checkpoint time; all changes are WAL-logged, and
the metadata is visible to users in the pg_stat_undo_logs view
* one of the properties of an undo log is its tablespace (the point
here being that it's not in a catalog)
* you don't need access to any catalogs to find the backing files for
a RelFileNode (the path via tablespace symlinks is derivable from
spcNode)
* therefore you can find your way from an UndoLogRecPtr in (say) a
zheap page to the relevant blocks on disk without any catalog access;
this should work even in the apparently (but not actually) circular
case of a pg_tablespace catalog that is stored in zheap (not something
we can do right now, but hypothetically speaking), and has undo data
that is stored in some non-default tablespace that must be consulted
while scanning the catalog (not that I'm suggesting it would
necessarily be a good idea to support catalogs in non-default
tablespaces; I'm just addressing your theoretical point)
* the GUC is used to resolve tablespace names to OIDs only by sessions
that are writing, when selecting (or creating) an undo log to attach
to and begin writing into; those sessions have no trouble reading the
catalog to do so without problematic circularities, as above
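
To make the "no catalog access" point concrete, here is a minimal
sketch of how a file path can be derived from the three OIDs in a
RelFileNode alone. It follows the layout used for ordinary relation
files; the struct and function names are made up for illustration, and
version_dir stands in for the version-specific subdirectory that exists
under each tablespace symlink. Forks and segment numbers are left out.

```c
#include <stdio.h>

typedef unsigned int Oid;

/* Well-known tablespace OIDs, as defined in PostgreSQL's catalog headers. */
#define DEFAULTTABLESPACE_OID 1663
#define GLOBALTABLESPACE_OID  1664

/* Simplified stand-in for RelFileNode (hypothetical name). */
typedef struct
{
    Oid spcNode;    /* tablespace */
    Oid dbNode;     /* database */
    Oid relNode;    /* relation file number */
} RelFileNodeSketch;

/*
 * Build the path of the main-fork file relative to the data directory.
 * Only arithmetic on the OIDs is needed; nothing is looked up in a catalog.
 * Returns what snprintf() returns.
 */
static int
sketch_relpath(char *buf, size_t len, RelFileNodeSketch rnode,
               const char *version_dir)
{
    if (rnode.spcNode == GLOBALTABLESPACE_OID)
        return snprintf(buf, len, "global/%u", rnode.relNode);

    if (rnode.spcNode == DEFAULTTABLESPACE_OID)
        return snprintf(buf, len, "base/%u/%u",
                        rnode.dbNode, rnode.relNode);

    /* Any other tablespace is reached through its symlink in pg_tblspc. */
    return snprintf(buf, len, "pg_tblspc/%u/%s/%u/%u",
                    rnode.spcNode, version_dir,
                    rnode.dbNode, rnode.relNode);
}
```

For example, spcNode = 1663, dbNode = 5, relNode = 16384 yields
"base/5/16384", while a user-created tablespace with OID 20000 yields a
path starting with "pg_tblspc/20000/".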

Seems to work; the main complications so far were coming up with
reasonable behaviour and interlocking when you drop tablespaces that
contain undo logs (short version: if they're not needed for snapshots
or rollback, they are dropped, wasting the rest of their undo address
space; otherwise they prevent the tablespace from being dropped, with
a clear message to that effect).

It doesn't make any sense to put things like clog or any other SLRU in
a non-default tablespace though.  It's perfectly OK if not all smgr
implementations know how to deal with tablespaces, and the SLRU
support should just not support that.
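
As a rough illustration of the idea quoted at the top (smgropen()
mapping a few special database OIDs to different implementations),
combined with the point that an implementation may simply refuse
non-default tablespaces, something like the following would do. The
reserved database OIDs, the enum and both function names are
hypothetical and not taken from any patch.

```c
typedef unsigned int Oid;

#define DEFAULTTABLESPACE_OID 1663  /* real value from PostgreSQL */

/* Hypothetical reserved database OIDs, for illustration only. */
#define SKETCH_UNDO_DB_OID 9000
#define SKETCH_CLOG_DB_OID 9001

typedef enum
{
    SMGR_SKETCH_MD,     /* ordinary relation files */
    SMGR_SKETCH_UNDO,   /* undo logs */
    SMGR_SKETCH_SLRU    /* clog and other SLRUs */
} SmgrSketchWhich;

/* Pick an implementation from the database OID alone. */
static SmgrSketchWhich
sketch_smgr_which(Oid dbNode)
{
    switch (dbNode)
    {
        case SKETCH_UNDO_DB_OID:
            return SMGR_SKETCH_UNDO;
        case SKETCH_CLOG_DB_OID:
            return SMGR_SKETCH_SLRU;
        default:
            return SMGR_SKETCH_MD;
    }
}

/*
 * Return 1 if the chosen implementation can store data in spcNode.
 * The SLRU-style implementation only accepts the default tablespace;
 * the others accept any tablespace.
 */
static int
sketch_smgr_accepts_tablespace(SmgrSketchWhich which, Oid spcNode)
{
    if (which == SMGR_SKETCH_SLRU)
        return spcNode == DEFAULTTABLESPACE_OID;
    return 1;
}
```

The undo implementation keeps spcNode free to mean a real tablespace,
which is why the dispatch here keys on the database OID instead.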

> The real reason I'm concerned about this, though, is that for either
> a database or a tablespace, you can *not* get away with having a magic
> OID just hanging in space with no actual catalog row matching it.
> If nothing else, you need an entry there to prevent someone from
> reusing the OID for another purpose.  And a pg_database row that
> doesn't correspond to a real database is going to break all kinds of
> code, starting with pg_upgrade and the autovacuum launcher.  Special
> rows in pg_tablespace are much less likely to cause issues, because
> of the precedent of pg_global and pg_default.

GetNewObjectId() never returns values < FirstNormalObjectId.
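
In other words, a hard-coded OID below that threshold cannot be handed
out to a user object later. A stripped-down sketch of the relevant
behaviour, assuming normal (non-bootstrap, non-standalone) operation
and leaving out locking and WAL logging, looks roughly like this; the
struct and function names are invented for the example.

```c
#include <stdint.h>

typedef uint32_t Oid;

/* Same value as FirstNormalObjectId in PostgreSQL's access/transam.h. */
#define FirstNormalObjectId 16384

/* Hypothetical stand-in for the shared OID counter. */
typedef struct
{
    Oid nextOid;
} OidCounterSketch;

/*
 * Hand out the next OID.  If the 32-bit counter has wrapped around (or
 * otherwise points into the reserved range), skip ahead so that values
 * below FirstNormalObjectId are never returned.
 */
static Oid
sketch_get_new_object_id(OidCounterSketch *counter)
{
    if (counter->nextOid < FirstNormalObjectId)
        counter->nextOid = FirstNormalObjectId;

    return counter->nextOid++;
}
```

When the counter sits at the largest 32-bit value, the next call
returns that value, the counter wraps to zero, and the call after that
returns 16384 rather than anything in the reserved range.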

I don't think it's impossible for someone to want to put SMGRs in a
catalog of some kind some day.  Even though the ones for clog, undo
etc would still probably need special hard-coded treatment as
discussed, I suppose it's remotely possible that someone might some
day figure out a useful way to allow extensions that provide different
block storage (nvram?  zfs zvols?  encryption? (see Haribabu's reply))
but I don't have any specific ideas about that or feel inclined to
design something for unknown future use.

-- 
Thomas Munro
https://enterprisedb.com

