Re: Cache relation sizes?
From | Konstantin Knizhnik |
---|---|
Subject | Re: Cache relation sizes? |
Date | |
Msg-id | d3c04ab1-9d93-6687-85de-e7b96c101759@postgrespro.ru |
In reply to | Re: Cache relation sizes? (Thomas Munro <thomas.munro@gmail.com>) |
List | pgsql-hackers |
On 16.11.2020 10:11, Thomas Munro wrote:
> On Tue, Aug 4, 2020 at 2:21 PM Thomas Munro <thomas.munro@gmail.com> wrote:
>> On Tue, Aug 4, 2020 at 3:54 AM Konstantin Knizhnik
>> <k.knizhnik@postgrespro.ru> wrote:
>>> This shared relation cache can easily store relation size as well.
>>> In addition it will solve a lot of other problems:
>>> - noticeable overhead of local relcache warming
>>> - large memory consumption in case of larger number of relations
>>> O(max_connections*n_relations)
>>> - sophisticated invalidation protocol and related performance issues
>>> Certainly access to shared cache requires extra synchronization. But DDL
>>> operations are relatively rare.
>>> So in most cases we will have only shared locks. May be overhead of
>>> locking will not be too large?
>> Yeah, I would be very happy if we get a high performance shared
>> sys/rel/plan/... caches in the future, and separately, having the
>> relation size available in shmem is something that has come up in
>> discussions about other topics too (tree-based buffer mapping,
>> multi-relation data files, ...).
>> ...
> After recent discussions about the limitations of relying on SEEK_END
> in a nearby thread[1], I decided to try to prototype a system for
> tracking relation sizes properly in shared memory. Earlier in this
> thread I was talking about invalidation schemes for backend-local
> caches, because I only cared about performance. In contrast, this new
> system has SMgrRelation objects that point to SMgrSharedRelation
> objects (better names welcome) that live in a pool in shared memory,
> so that all backends agree on the size. The scheme is described in
> the commit message and comments. The short version is that smgr.c
> tracks the "authoritative" size of any relation that has recently been
> extended or truncated, until it has been fsync'd. By authoritative, I
> mean that there may be dirty buffers in that range in our buffer pool,
> even if the filesystem has vaporised the allocation of disk blocks and
> shrunk the file.
>
> That is, it's not really a "cache". It's also not like a shared
> catalog, which Konstantin was talking about... it's more like the pool
> of inodes in a kernel's memory. It holds all currently dirty SRs
> (SMgrSharedRelations), plus as many clean ones as it can fit, with
> some kind of reclamation scheme, much like buffers. Here, "dirty"
> means the size changed.
>
> Attached is an early sketch, not debugged much yet (check undir
> contrib/postgres_fdw fails right now for a reason I didn't look into),
> and there are clearly many architectural choices one could make
> differently, and more things to be done... but it seemed like enough
> of a prototype to demonstrate the concept and fuel some discussion
> about this and whatever better ideas people might have...
>
> Thoughts?
>
> [1] https://www.postgresql.org/message-id/flat/OSBPR01MB3207DCA7EC725FDD661B3EDAEF660%40OSBPR01MB3207.jpnprd01.prod.outlook.com

I noticed that there are several fragments like this:

    if (!smgrexists(rel->rd_smgr, FSM_FORKNUM))
        smgrcreate(rel->rd_smgr, FSM_FORKNUM, false);

    fsm_nblocks_now = smgrnblocks(rel->rd_smgr, FSM_FORKNUM);

I wonder whether it would be more efficient, and would simplify the code,
to add a "create_if_not_exists" parameter to smgrnblocks? That would avoid
an extra hash lookup and the explicit checks for fork presence in
multiple places.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
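
The suggestion in the message can be illustrated with a small stand-alone model. This is only a sketch and not PostgreSQL code: `ToySMgrRelation`, the `toy_*` functions, and the `lookups` counter are hypothetical stand-ins invented for illustration, and `toy_smgrnblocks_ext` is an assumed name for a variant of `smgrnblocks` that takes the proposed "create_if_not_exists" flag. The counter stands in for whatever per-call lookup the real prototype performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t BlockNumber;

/* Fork numbers, mirroring the names used in the thread. */
typedef enum ForkNumber
{
    MAIN_FORKNUM = 0,
    FSM_FORKNUM,
    VISIBILITYMAP_FORKNUM,
    INIT_FORKNUM
} ForkNumber;

#define NUM_FORKS (INIT_FORKNUM + 1)

/* Hypothetical stand-in for an smgr relation; not the real SMgrRelation. */
typedef struct ToySMgrRelation
{
    bool        exists[NUM_FORKS];   /* has the fork been created? */
    BlockNumber nblocks[NUM_FORKS];  /* current size of each fork */
    int         lookups;             /* number of simulated lookups so far */
} ToySMgrRelation;

/* Simulates the per-call lookup cost that every entry point pays. */
void
toy_lookup(ToySMgrRelation *reln, ForkNumber forknum)
{
    assert((int) forknum >= 0 && (int) forknum < NUM_FORKS);
    reln->lookups++;
}

bool
toy_smgrexists(ToySMgrRelation *reln, ForkNumber forknum)
{
    toy_lookup(reln, forknum);
    return reln->exists[forknum];
}

void
toy_smgrcreate(ToySMgrRelation *reln, ForkNumber forknum)
{
    toy_lookup(reln, forknum);
    if (!reln->exists[forknum])
    {
        reln->exists[forknum] = true;
        reln->nblocks[forknum] = 0;
    }
}

BlockNumber
toy_smgrnblocks(ToySMgrRelation *reln, ForkNumber forknum)
{
    toy_lookup(reln, forknum);
    assert(reln->exists[forknum]);   /* caller must have created the fork */
    return reln->nblocks[forknum];
}

/*
 * Hypothetical combined call: one lookup, and the fork is created on
 * demand when create_if_not_exists is true.
 */
BlockNumber
toy_smgrnblocks_ext(ToySMgrRelation *reln, ForkNumber forknum,
                    bool create_if_not_exists)
{
    toy_lookup(reln, forknum);
    if (!reln->exists[forknum])
    {
        assert(create_if_not_exists);
        reln->exists[forknum] = true;
        reln->nblocks[forknum] = 0;
    }
    return reln->nblocks[forknum];
}
```

In this model, the three-call pattern on a relation with no FSM fork performs three simulated lookups, while a single `toy_smgrnblocks_ext(&reln, FSM_FORKNUM, true)` performs one and leaves the relation in the same state. Whether the saving matters in the real code depends on how expensive the lookup is in the actual patch, which this sketch does not measure.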