Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

From Dilip Kumar
Subject Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
Msg-id CAFiTN-uQU+0OMQp9KAL+rt0xRWMNFANtf+755KKWwDFumSs19A@mail.gmail.com
In response to Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints  (Andres Freund <andres@anarazel.de>)
Responses Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On Mon, Sep 6, 2021 at 1:58 AM Andres Freund <andres@anarazel.de> wrote:
> On 2021-09-05 14:22:51 +0530, Dilip Kumar wrote:
> > But these directly operate on the buffers and In my patch, whether we are
> > reading the pg_class for identifying the relfilenode or we are copying the
> > relation block by block we are always holding the lock on the buffer.
>
> I don't think a buffer lock is really sufficient. See e.g. code like:

I agree that a buffer lock alone is not sufficient, but here we are talking about the case where we already hold the exclusive lock on the database in addition to the buffer lock. So cases like the one below, which should be reached only when dropping a relation, must be protected by the database exclusive lock, and the other examples, such as buffer reclaim or the checkpointer, should be protected by the buffer pin + lock. Having said that, I am not against acquiring the relation lock in our case. I agree that if there is an assumption that holding a buffer pin requires holding the relation lock, then it is better not to break that.


> static void
> InvalidateBuffer(BufferDesc *buf)
> {
> ...
>         /*
>          * We assume the only reason for it to be pinned is that someone else is
>          * flushing the page out.  Wait for them to finish.  (This could be an
>          * infinite loop if the refcount is messed up... it would be nice to time
>          * out after awhile, but there seems no way to be sure how many loops may
>          * be needed.  Note that if the other guy has pinned the buffer but not
>          * yet done StartBufferIO, WaitIO will fall through and we'll effectively
>          * be busy-looping here.)
>          */
>         if (BUF_STATE_GET_REFCOUNT(buf_state) != 0)
>         {
>                 UnlockBufHdr(buf, buf_state);
>                 LWLockRelease(oldPartitionLock);
>                 /* safety check: should definitely not be our *own* pin */
>                 if (GetPrivateRefCount(BufferDescriptorGetBuffer(buf)) > 0)
>                         elog(ERROR, "buffer is pinned in InvalidateBuffer");
>                 WaitIO(buf);
>                 goto retry;
>         }
>
> IOW, currently we assume that you're only allowed to pin a block in a relation
> while you hold a lock on the relation. It might be a good idea to change that,
> but it's not as trivial as one might think - consider e.g. dropping a
> relation's buffers while holding an exclusive lock: If there's potential
> concurrent reads of that buffer we'd be in trouble.
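
To make the hazard above concrete, here is a minimal single-threaded toy model in C. It is only a sketch: `ToyRelation`, `ToyBuffer`, and every `toy_*` function are hypothetical names invented for illustration, not PostgreSQL's `BufferDesc`, lock manager, or bufmgr API. It assumes that a reader who takes the relation lock first is kept out while a dropper holds the exclusive lock, and that an invalidation attempt cannot succeed while any pin is held.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins (assumptions for illustration, not PostgreSQL structures). */
typedef struct ToyRelation
{
    bool exclusive_locked;   /* true while a "dropper" holds the relation lock */
} ToyRelation;

typedef struct ToyBuffer
{
    int  refcount;           /* pins currently held on this buffer */
    bool valid;              /* does it still map a block of the relation? */
} ToyBuffer;

/* Reader that follows the convention: relation lock first, then pin. */
static bool
toy_pin_with_rel_lock(const ToyRelation *rel, ToyBuffer *buf)
{
    if (rel->exclusive_locked || !buf->valid)
        return false;        /* a real backend would block on the lock here */
    buf->refcount++;
    return true;
}

/* Reader that skips the relation lock and relies on the pin alone. */
static bool
toy_pin_without_rel_lock(ToyBuffer *buf)
{
    if (!buf->valid)
        return false;
    buf->refcount++;
    return true;
}

/* Drop one pin previously taken by either reader. */
static void
toy_unpin(ToyBuffer *buf)
{
    assert(buf->refcount > 0);
    buf->refcount--;
}

/*
 * One invalidation attempt. Returns false while the buffer is pinned; that
 * is the situation in which the quoted code would wait and retry.
 */
static bool
toy_try_invalidate(ToyBuffer *buf)
{
    if (buf->refcount != 0)
        return false;
    buf->valid = false;
    return true;
}
```

In this model, with `exclusive_locked` set, `toy_pin_with_rel_lock` refuses the pin and `toy_try_invalidate` succeeds on the first attempt. A reader using `toy_pin_without_rel_lock` gets its pin regardless, and `toy_try_invalidate` keeps failing until `toy_unpin` is called, which corresponds to the retry path in the quoted code.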

> > 3. While copying the relation whether to use the bufmgr or directly use the
> > smgr?
> >
> > If we use the bufmgr then maybe we can avoid flushing some of the buffers
> > to the disk and save some I/O but in general we copy from the template
> > database so there might not be a lot of dirty buffers and we might not save
> > anything
>
> I would assume the big benefit would be that the *target* database does not
> have to be written out / shared buffer is immediately populated.

Okay, that makes sense. In fact, if we use shared buffers for the destination database's relations, we don't even have the locking issue, because that database is not yet accessible to anyone, right?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
