Hi,
On 2016-10-11 17:27:56 +0530, Dilip Kumar wrote:
> I would like to propose a patch for pushing down the scan key to heap.
I think that makes sense. Both because scankey eval is faster than
generic expression eval, and because it saves a lot of function calls in
heavily filtered cases.
> However in future when we try to expand the support for complex
> expressions, then we need to be very careful while selecting
> pushable expression. It should not happen that we push something very
> complex, which may cause contention with other write operation (as
> HeapKeyTest is done under page lock).
I don't think it's a good idea to do this under the content lock in any
case. But luckily I don't think we have to do so at all.
Due to pagemode - which is used by pretty much everything iterating over
heaps, and definitely seqscans - the key test already happens without
the content lock held, in heapgettup_pagemode():
/** ----------------* heapgettup_pagemode - fetch next heap tuple in page-at-a-time mode** Same API as
heapgettup,but used in page-at-a-time mode** The internal logic is much the same as heapgettup's too, but there are
some*differences: *****we do not take the buffer content lock**** (that only needs to* happen inside heapgetpage), and
weiterate through just the tuples listed* in rs_vistuples[] rather than all tuples on the page. Notice that* lineindex
is0-based, where the corresponding loop variable lineoff in* heapgettup is 1-based.* ----------------*/
static void
heapgettup_pagemode(HeapScanDesc scan, ScanDirection dir, int nkeys,
ScanKeykey)
.../* * advance the scan until we find a qualifying tuple or run out of stuff * to scan */for (;;){ while (linesleft
>0) { lineoff = scan->rs_vistuples[lineindex]; lpp = PageGetItemId(dp, lineoff);
Assert(ItemIdIsNormal(lpp));
... /* * if current tuple qualifies, return it. */ if (key != NULL) {
bool valid;
HeapKeyTest(tuple, RelationGetDescr(scan->rs_rd), nkeys, key, valid); if
(valid) { scan->rs_cindex = lineindex; return; } }
> Instructions: (Cpu instructions measured with callgrind tool):
Note that callgrind's numbers aren't very meaningful in these days. CPU
pipelining and speculative execution/reordering makes them very
inaccurate.
Greetings,
Andres Freund