Re: caching query results
От | wieck@debis.com (Jan Wieck) |
---|---|
Тема | Re: caching query results |
Дата | |
Msg-id | m12cUIr-0003lKC@orion.SAPserv.Hamburg.dsh.de обсуждение исходный текст |
Ответ на | Re: caching query results (Karel Zak <zakkr@zf.jcu.cz>) |
Ответы |
Re: caching query results
(Karel Zak <zakkr@zf.jcu.cz>)
|
Список | pgsql-hackers |
Karel Zak wrote: > On Mon, 3 Apr 2000, Jan Wieck wrote: > > It is good idea. What exactly is a key? If I good understand this key > is for query identification only. Right? Right. Imagine a querytree (after overhaul) that looks like this: +------+ | SORT | +------+ ^ | +-----------------------------+ | JOIN | | atts:rel1.att1, rel2.att2 | | qual: rel1.att2 = rel2.att1 | +-----------------------------+ ^ ^ | | +------------------+ +------------------+ | SCAN | | SCAN | | rel: rel1 | | rel: rel2 | | atts: att1, att2 | | atts:att1, att2 | +------------------+ +------------------+ which is a node structure describing a query of: SELECT rel1.att1, rel2.att2 FROM rel1, rel2 WHERE rel1.att2 = rel2.att1; The "key" identifying this querytree now could look like SORT(JOIN(1.1,2.2;SCAN(78991;1,2),SCAN(78995;1,2);)) 78991 and 78995 are the OIDs of rel1 and rel2. So the key is a very simplified description of what the query does,and maybe the qualification should be included too. But it's enough to find a few candidates to look at closeron the node level out of hundreds of cached plans. > > These keys could be managed in a shared LRU table, and if the > > My current code is based on HASH table with keys and query&plan is > saved in special for a plan created MemoryContext (it is good for > a example SPI_freeplan()). IIRC our hash table code insists on using global, per backend memory. I thought about managing the entire querycachewith a new type of memory context, using different routines for palloc()/pfree(), working in a sharedmemory area only and eventually freeing longest unused plans until allocation fits. Let's see if usinghash tables here would be easy or not. > > same key appears a number of times (0-n), it's entire > > querytree + plan (after planning) will be saved into the > > shared mem. > > Here I not understend. Why is here any time checking? There's not that big of a win if you do all the shared memory overhead for any query at it's first occurance.Such a generic query cache only makes sense for queries that occur often. So at it's first to n-th occurancewe only count by key and after we know that it's one of these again'n'again thingies, we pay the cache overhead. Also I think, keeping the number of exclusive cache locks (for writing) as small as possible would be a good idea WRT concurrency. > IMHO users can use PREPARE / EXECUTE for same query. Suggested idea is > really good if this query cache will in shared memory and more backends > can use it. Exactly that's the idea. And since the postmaster will hold the shared memory as it does for the block and syscache, it'll survive even times of no DB activity. > Good. It is solution for 'known-query' and allow it skip any steps in the > query path. But we still not have any idea for cached plans validity. What > if user changes oid for any operator, drop column (etc)? That's why the key is only good to find "candidates". The cacheing has to look very close to the nodes in the tree and maybe compare down to pg_attribute oid's etc. to decide if it's really the same query or not. > Is really sure that this will faster? (it must create key for nodes, > search same query in any table (cache), copy new query&plan to cache > ..etc.) Only some timing code put into backends in various real world databases can tell how much of the entire processingtime is spent in the optimizer. And I'd not be surprised if most of the time is already spent during the parse step, which we cannot skip by this technique. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #========================================= wieck@debis.com (Jan Wieck) #
В списке pgsql-hackers по дате отправления:
Следующее
От: Tom LaneДата:
Сообщение: Re: Should pg_dump refuse to run if DB has different version?