Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
От | Andres Freund |
---|---|
Тема | Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints |
Дата | |
Msg-id | 20210905202800.ji4fnfs3xzhvo7l6@alap3.anarazel.de обсуждение исходный текст |
Ответ на | Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints (Dilip Kumar <dilipbalaut@gmail.com>) |
Ответы |
Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
(Dilip Kumar <dilipbalaut@gmail.com>)
|
Список | pgsql-hackers |
On 2021-09-05 14:22:51 +0530, Dilip Kumar wrote: > On Sat, Sep 4, 2021 at 3:24 AM Andres Freund <andres@anarazel.de> wrote: > > > Hi, > > > > On 2021-09-03 14:25:10 +0530, Dilip Kumar wrote: > > > Yeah, we can surely lock the relation as described by Robert, but IMHO, > > > while creating the database we are already holding the exclusive lock on > > > the database and there is no one else allowed to be connected to the > > > database, so do we actually need to bother about the lock for the > > > correctness? > > > > The problem is that checkpointer, bgwriter, buffer reclaim don't care about > > the database of the buffer they're working on... The exclusive lock on the > > database doesn't change anything about that. > > > But these directly operate on the buffers and In my patch, whether we are > reading the pg_class for identifying the relfilenode or we are copying the > relation block by block we are always holding the lock on the buffer. I don't think a buffer lock is really sufficient. See e.g. code like: static void InvalidateBuffer(BufferDesc *buf) { ... /* * We assume the only reason for it to be pinned is that someone else is * flushing the page out. Wait for them to finish. (This could be an * infinite loop if the refcount is messed up... it would be nice to time * out after awhile, but there seems no way to be sure how many loops may * be needed. Note that if the other guy has pinned the buffer but not * yet done StartBufferIO, WaitIO will fall through and we'll effectively * be busy-looping here.) */ if (BUF_STATE_GET_REFCOUNT(buf_state) != 0) { UnlockBufHdr(buf, buf_state); LWLockRelease(oldPartitionLock); /* safety check: should definitely not be our *own* pin */ if (GetPrivateRefCount(BufferDescriptorGetBuffer(buf)) > 0) elog(ERROR, "buffer is pinned in InvalidateBuffer"); WaitIO(buf); goto retry; } IOW, currently we assume that you're only allowed to pin a block in a relation while you hold a lock on the relation. It might be a good idea to change that, but it's not as trivial as one might think - consider e.g. dropping a relation's buffers while holding an exclusive lock: If there's potential concurrent reads of that buffer we'd be in trouble. > 3. While copying the relation whether to use the bufmgr or directly use the > smgr? > > If we use the bufmgr then maybe we can avoid flushing some of the buffers > to the disk and save some I/O but in general we copy from the template > database so there might not be a lot of dirty buffers and we might not save > anything I would assume the big benefit would be that the *target* database does not have to be written out / shared buffer is immediately populated. Greetings, Andres Freund
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Tom LaneДата:
Сообщение: Re: use-regular-expressions-to-simplify-less_greater-and-not_equals.patch