On Mon, Nov 14, 2016 at 9:44 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>> Also, what if we abandoned the idea of pushing qual evaluation all the
>> way down into the heap and just tried to do HeapKeyTest in SeqNext
>> itself? Would that be almost as fast, or would it give up most of the
>> benefits?
> This we can definitely test. I will test and post the data.
>
> Thanks for the suggestion.
Here are some results with this approach...
create table test(a int, b varchar, c varchar, d int);
insert into test values(generate_series(1,10000000), repeat('x', 30),
repeat('y', 30), generate_series(1,10000000));
analyze test;
query: explain analyze select * from test where a < $1;
selectivity head patch1 patch2 patch3
10.00% 840 460 582 689
20.00% 885 563 665 752
50.00% 1076 786 871 910
80.00% 1247 988 1055 1099
100.00% 1386 1123 1193 1237
patch1: Original patch (heap_scankey_pushdown_v1.patch), only
supported for fixed length datatype and use heap_getattr.
patch2: Switches memory context in HeapKeyTest + Store tuple in slot
and use slot_getattr instead of heap_getattr.
patch3: call HeapKeyTest in SeqNext after storing slot, and also
switch memory context before calling HeapKeyTest.
Summary: (performance data)
----------------------------------------
1. At 10% selectivity patch1 shows > 40% gain which reduced to 30% in
patch2 and finally it drops to 17% in patch3.
2. I think patch1 wins over patch2, because patch1 avoid call to ExecStoreTuple.
3. Patch2 wins over patch3 because patch2 can reject tuple in
heapgettup_pagemode whereas patch3 can do it in SeqNext, so patch2 can
avoid some function calls instructions.
Summary (various patches and problems)
-----------------------------------------------------
Patch1:
problem1: This approach has performance problem in some cases (quals
on many columns of the table + high selectivity (>50%)).
problem2: HeapKeyTest uses ExecutorContext so we need to block any
datatype which needs memory allocation during eval functions.
Patch2: Patch2 solves both the problems of patch1, but exposes
executor items to heap and it's not a good design.
Patch3: This solves all the problems exists in patch1+patch2, but
performance is significantly less.
I haven't yet tested patch3 with TPCH, I will do that once machine is available.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com