Re: POC: Cleaning up orphaned files using undo logs

From Andres Freund
Subject Re: POC: Cleaning up orphaned files using undo logs
Date
Msg-id 20190814160612.74n63huwy2lscvov@alap3.anarazel.de
In response to Re: POC: Cleaning up orphaned files using undo logs  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
Hi,

On 2019-08-13 13:53:59 +0530, Dilip Kumar wrote:
> On Tue, Jul 30, 2019 at 1:32 PM Andres Freund <andres@anarazel.de> wrote:
> > > +     /* Loop until we have fetched all the buffers in which we need to write. */
> > > +     while (size > 0)
> > > +     {
> > > +             bufidx = UndoGetBufferSlot(context, rnode, cur_blk, RBM_NORMAL);
> > > +             xact_info->idx_undo_buffers[index++] = bufidx;
> > > +             size -= (BLCKSZ - starting_byte);
> > > +             starting_byte = UndoLogBlockHeaderSize;
> > > +             cur_blk++;
> > > +     }
> >
> > So, this locks a very large number of undo buffers at the same time, do
> > I see that correctly?  What guarantees that there are no deadlocks due
> > to multiple buffers locked at the same time (I guess the order inside
> > the log)? What guarantees that this is a small enough number that we can
> > even lock all of them at the same time?
> 
> I think we are locking them in block order, which should avoid
> deadlocks.  I have explained this in the comments.

Sorry for harping on this so much: but please, please, *always* document
things like this *immediately*. This is among the most crucial things to
document. There shouldn't need to be a reviewer prodding you to do so
many months after the code has been written. For one, you've likely
forgotten details by then; more importantly, dependencies on the
locking scheme will have crept into further places - if it's not well
thought through, that can be hard to undo. And it wastes reviewer /
reader bandwidth.
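
To make that concrete for the eventual comment: the property being relied
on is presumably that every backend takes the content locks of the undo
buffers it has prepared in ascending block-number order within the log, so
no two backends can each wait on a buffer the other already holds. A
minimal sketch of that invariant (illustrative names only, not the patch's
actual functions):

#include "postgres.h"
#include "storage/bufmgr.h"

/*
 * Sketch only: deadlock freedom rests on a single global lock order -
 * here, ascending block number within an undo log.  The caller must have
 * collected 'bufs' already sorted that way.
 */
static void
lock_prepared_undo_buffers(Buffer *bufs, int nbufs)
{
	for (int i = 0; i < nbufs; i++)
		LockBuffer(bufs[i], BUFFER_LOCK_EXCLUSIVE);
}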


> > Why do we need to lock all of them at the same time? That's not clear to
> > me.
> 
> Because this is called outside the critical section, we keep locked all
> the buffers that we want to update inside the critical section under a
> single WAL record.

I don't understand this explanation. What does keeping the buffers
locked have to do with the critical section? As explained in a later
email, I think the current approach is not acceptable - but even without
those issues, I don't see why we couldn't just lock the buffers at a
later stage?
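
For reference, the pattern I'd expect is the one used elsewhere in the
tree: keep the buffers merely pinned while preparing, and take the content
locks only right before the critical section in which the pages are
modified and the WAL record is assembled. A rough sketch - the undo page
manipulation, the rmgr id and the record type are placeholders, and note
that a single WAL record can only register a limited number of blocks
anyway:

#include "postgres.h"
#include "access/xloginsert.h"
#include "miscadmin.h"
#include "storage/bufmgr.h"

static XLogRecPtr
insert_prepared_undo(Buffer *bufs, int nbufs, RmgrId rmid, uint8 info)
{
	XLogRecPtr	recptr;
	int			i;

	/* lock in the agreed (block-number) order, right before modifying */
	for (i = 0; i < nbufs; i++)
		LockBuffer(bufs[i], BUFFER_LOCK_EXCLUSIVE);

	START_CRIT_SECTION();

	XLogBeginInsert();
	for (i = 0; i < nbufs; i++)
	{
		/* ... copy the undo record bytes into BufferGetPage(bufs[i]) ... */
		MarkBufferDirty(bufs[i]);
		XLogRegisterBuffer(i, bufs[i], REGBUF_KEEP_DATA);
	}
	recptr = XLogInsert(rmid, info);
	/* PageSetLSN() on each modified page would go here */

	END_CRIT_SECTION();

	for (i = 0; i < nbufs; i++)
		UnlockReleaseBuffer(bufs[i]);

	return recptr;
}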




> > > +     for (i = 0; i < context->nprepared_undo_buffer; i++)
> > > +     {
> >
> > How large do we expect this to get at most?
> >
> In BeginUndoRecordInsert we are computing it
> 
> + /* Compute number of buffers. */
> + nbuffers = (nprepared + MAX_UNDO_UPDATE_INFO) * MAX_BUFFER_PER_UNDO;

Since nprepared is variable, that doesn't really answer the question.
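To illustrate with made-up numbers: if MAX_BUFFER_PER_UNDO were 2 and
MAX_UNDO_UPDATE_INFO were 2, a caller preparing 100 records would size the
array at (100 + 2) * 2 = 204 slots - i.e. the number of prepared (and thus
potentially simultaneously locked) buffers grows linearly with whatever
nprepared the caller chooses, with no hard cap visible here.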



Greetings,

Andres Freund


