Hello,
At Tue, 12 Jul 2016 11:42:55 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in
<20160712.114255.156540680.horiguchi.kyotaro@lab.ntt.co.jp>
> > 3% slower for local 1*seqscan (2-parallel)
> > 14% slower for append-4*seqscan (no-parallel)
> > 19% faster for append-4*foreignscan (all scans on one connection)
> > 78% faster for append-4*foreignscan (scans have dedicated connections)
> >
> > ExecProcNode might be able to be optimized a bit.
> > ExecAppend seems to need some fix.
After some refactoring, the degradation for a simple seqscan is
reduced to 1.4%, and that for "Append(SeqScan())" is reduced to
1.7%. The gains are the same as in the previous measurement. The
scale has been changed from the previous measurement in some test
cases.
t0  : (SeqScan()) (2 parallel)
pl  : (Append(4 * SeqScan()))
pf0 : (Append(4 * ForeignScan())) all ForeignScans are on the same connection.
pf1 : (Append(4 * ForeignScan())) all ForeignScans have their own connections.

patched-O2
       time(ms)  stddev(ms)  gain from unpatched (%)
t0      4121.27        1.1     -1.44
pl      1757.41        0.94    -1.73
pf0     6458.99      192.4     20.26
pf1     1747.4        24.81    78.39

unpatched-O2
       time(ms)  stddev(ms)
t0      4062.6         1.95
pl      1727.45        9.41
pf0     8100.47       24.51
pf1     8086.52       33.53

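
The gain figures are consistent with the relative difference
(unpatched - patched) / unpatched * 100. That formula is inferred
from the numbers rather than stated anywhere, so the following is
only a small sketch of it; gain_percent is a name made up for
illustration.

```c
#include <assert.h>

/*
 * Relative gain in percent of a patched run over an unpatched run.
 * Positive means the patched run was faster, negative means slower.
 * (Assumed formula, inferred from the measurements.)
 */
double
gain_percent(double unpatched_ms, double patched_ms)
{
    assert(unpatched_ms > 0.0);
    return (unpatched_ms - patched_ms) / unpatched_ms * 100.0;
}
```

For example, gain_percent(8086.52, 1747.4) gives the pf1 figure and
gain_percent(4062.6, 4121.27) gives the (negative) t0 figure.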
> > In addition to the above, I will try reentrant ExecAsyncWaitForNode
> > or something.
After some consideration, I found that ExecAsyncWaitForNode
cannot be made reentrant, because reentering it would let control
pass into async-unaware nodes while some nodes are still not
ready, which is an inconsistent state. To prevent such reentry, I
assigned node identifiers in depth-first order so that the
ancestor-descendant relationship can be checked in a simple way
(nested-set model), and ExecAsyncConfigureWait is now called only
for the descendant nodes of the planstate given as the parameter.
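
The idea of the depth-first numbering can be sketched roughly as
below. This is only an illustration of the nested-set check, not
the code in the patch: SketchNode, assign_ids and is_descendant
are hypothetical names, and the real planstate structure is
different. Each node remembers its own id and the largest id in
its subtree, so the descendant test becomes a range comparison.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 4

/* Hypothetical stand-in for a plan state node (not PostgreSQL's PlanState). */
typedef struct SketchNode
{
    int         id;         /* assigned in depth-first (pre-order) order */
    int         last_id;    /* largest id found in this node's subtree */
    int         nchildren;
    struct SketchNode *children[MAX_CHILDREN];
} SketchNode;

/*
 * Number the subtree rooted at 'node', starting from 'next_id'.
 * Returns the first id that is still unused afterwards.
 */
int
assign_ids(SketchNode *node, int next_id)
{
    int         i;

    assert(node != NULL);
    node->id = next_id++;
    for (i = 0; i < node->nchildren; i++)
        next_id = assign_ids(node->children[i], next_id);
    node->last_id = next_id - 1;
    return next_id;
}

/*
 * True if 'node' lies within the subtree of 'ancestor'.  In this
 * sketch a node is treated as being inside its own subtree.
 */
int
is_descendant(const SketchNode *ancestor, const SketchNode *node)
{
    return node->id >= ancestor->id && node->id <= ancestor->last_id;
}
```

With such a check, a wait started at some planstate can skip any
not-ready node for which is_descendant() is false, instead of
configuring waits for nodes outside its own subtree.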
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center