Re: Cache relation sizes?

Поиск
Список
Период
Сортировка
От Andy Fan
Тема Re: Cache relation sizes?
Дата
Msg-id CAKU4AWpUNQtzAggKMJ+kNZkmNwi56ghec1NuxQmJjneEPDTwYg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Cache relation sizes?  (Thomas Munro <thomas.munro@gmail.com>)
Ответы Re: Cache relation sizes?  (Thomas Munro <thomas.munro@gmail.com>)
Список pgsql-hackers
Hi Thomas,

Thank you for your quick response.

On Thu, Dec 17, 2020 at 3:05 PM Thomas Munro <thomas.munro@gmail.com> wrote:
Hi Andy,

On Thu, Dec 17, 2020 at 7:29 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
> I spent one day studying the patch and I want to talk about one question for now.
> What is the purpose of calling smgrimmedsync to evict a DIRTY sr (what will happen
> if we remove it and the SR_SYNCING and SR_JUST_DIRTIED flags)?

The underlying theory of the patch is that after fsync() succeeds, the
new size is permanent, but before that it could change at any time
because of asynchronous activities in the kernel*.  Therefore, if you
want to evict the size from shared memory, you have to fsync() the
file first.  I used smgrimmedsync() to do that.

*I suspect that it can really only shrink with network file systems
such as NFS and GlusterFS, where the kernel has to take liberties to
avoid network round trips for every file extension, but I don't know
the details on all 9 of our supported OSes and standards aren't very
helpful for understanding buffered I/O.

Let me try to understand your point.  Suppose process 1 extends a file to
2 blocks from 1 block, and fsync is not called, then a). the lseek *may* still 
return 1 based on the comments in the ReadBuffer_common ("because
of buggy Linux kernels that sometimes return an seek SEEK_END result 
that doesn't account for a recent write").  b). When this patch is introduced, 
we would get 2 blocks for sure if we can get the size from cache. c). After
user reads the 2 blocks from cache and then the cache is evicted, and user
reads the blocks again, it is possible to get 1 again.  

Do we need to consider case a) in this patch? and Is case c). the situation you
want to avoid and do we have to introduce fsync to archive this? Basically I
think case a) should not happen by practice so we don't need to handle case
c) and finally we don't need the fsync/SR_SYNCING/SR_JUST_DIRTIED flag.
I'm not opposed to adding them, I am just interested in the background in case
you are also willing to discuss it.  

I would suggest the below change for smgrnblock_fast,  FYI

@@ -182,7 +182,11 @@ smgrnblocks_fast(SMgrRelation reln, ForkNumber forknum)
                /*
                 * The generation doesn't match, the shared relation must have been
                 * evicted since we got a pointer to it.  We'll need to do more work.
+                * and invalid the fast path for next time.
                 */
+               reln->smgr_shared = NULL;
        }

-- 
Best Regards
Andy Fan

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fujii Masao
Дата:
Сообщение: Re: Use standard SIGHUP and SIGTERM handlers in autoprewarm module
Следующее
От: Bharath Rupireddy
Дата:
Сообщение: Re: Fail Fast In CTAS/CMV If Relation Already Exists To Avoid Unnecessary Rewrite, Planning Costs