Re: including PID or backend ID in relpath of temp rels

Поиск
Список
Период
Сортировка
От Jim Nasby
Тема Re: including PID or backend ID in relpath of temp rels
Дата
Msg-id 8BFFD9DF-0808-4792-94AB-739BE9C22E76@decibel.org
обсуждение исходный текст
Ответ на Re: including PID or backend ID in relpath of temp rels  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: including PID or backend ID in relpath of temp rels  (Robert Haas <robertmhaas@gmail.com>)
Re: including PID or backend ID in relpath of temp rels  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
On May 6, 2010, at 10:24 PM, Robert Haas wrote:
> On Tue, May 4, 2010 at 3:03 PM, Alvaro Herrera
> <alvherre@commandprompt.com> wrote:
>>>>> [smgr.c,inval.c] Do we need to call CacheInvalidSmgr for temporary
>>>>> relations?  I think the only backend that can have an smgr reference
>>>>> to a temprel other than the owning backend is bgwriter, and AFAICS
>>>>> bgwriter will only have such a reference if it's responding to a
>>>>> request by the owning backend to unlink the associated files, in which
>>>>> case (I think) the owning backend will have no reference.
>
> This turns out to be wrong, I think.  It seems that what we do is
> prevent backends other than the opening backend from reading pages
> from non-local temp rels into private or shared buffers, but we don't
> actually prevent them from having smgr references.  This allows
> autovacuum to drop them, for example, in an anti-wraparound situation.
> (Thanks to Tom for helping me get my head around this better.)
>
>>>> Hmm, wasn't there a proposal to have the owning backend delete the files
>>>> instead of asking the bgwriter to?
>>>
>>> I did propose that upthread; it may have been proposed previously
>>> also. This might be worth doing independently of the rest of the patch
>>> (which I'm starting to fear is doomed, cue ominous soundtrack) since
>>> it would reduce the chance of orphaning data files and possibly
>>> simplify the logic also.
>>
>> +1 for doing it separately, but hopefully that doesn't mean the rest of
>> this patch is doomed ...
>
> Doom has been averted.  Proposed patch attached.  It passes regression
> tests and seems to work, but could use additional testing and, of
> course, some code-reading also.
>
> Some notes on this patch:
>
> It seems prett clear that it isn't desirable to simply add backend ID
> to RelFileNode, because there are too many places using RelFileNode
> already for purposes where the backend ID can be inferred from
> context, such as buffer headers and most of xlog.  Instead, I
> introduced BackendRelFileNode, which consists of an ordinary
> RelFileNode augmented with a backend ID, and use that only where
> needed.  In particular, the smgr layer must use BackendRelFileNode
> throughout, since it operates on both permanent and temporary
> relations. smgr invalidations must also include the backend ID.  xlog
> generally happens only for non-temporary relations and can thus
> continue to use an ordinary RelFileNode; however, commit/abort records
> must use BackendRelFileNode as there may be physical storage
> associated with temporary relations that must be unlinked.
> Communication with the bgwriter must use BackendRelFileNode for
> similar reasons. The relcache now stores rd_backend rather than
> rd_islocaltemp so that it remains straightforward to call smgropen()
> based on a relcache entry. Some smgr functions no longer require an
> isTemp argument, because they can infer the necessary information from
> their BackendRelFileNode.  smgrwrite() and smgrextend() now take a
> skipFsync argument rather than an isTemp argument.
>
> I'm not totally sure whether it makes sense to do what we were talking
> about above, viz, transfer unlink responsibility for temp rels from
> the bgwriter to the owning backend.  I haven't done that here.  Nor
> have I implemented any kind of improved temporary file cleanup
> strategy, though I hope such a thing is possible.

Any particular reason not to use directories to help organize things? IE:

base/database_oid/pg_temp_rels/backend_pid/relfilenode

Perhaps relfilenode should be something else.

This seems to have several advantages:

1: It's more organized. If you want to see all the files for a single backend you have just one place to look. Finding
everythingis still easy via filesystem find. 
2: Cleanup becomes easier. When a backend exits, it's entire directory goes away. On server start, everything under
pg_temp_relsgoes away. Unfortunately we still have a race condition with cleaning up if a backend dies and can't run
it'sown cleanup, though I think that anytime that happens we're going to restart everything anyway. 
3: It separates all the temporary stuff away from real files.

The only downside I see is some extra code to create the backend_pid directory.
--
Jim C. Nasby, Database Architect                   jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jim Nasby
Дата:
Сообщение: Re: Partitioning/inherited tables vs FKs
Следующее
От: Jim Nasby
Дата:
Сообщение: Re: GUCs that need restart