Re: Hash Join cost estimates

Поиск
Список
Период
Сортировка
От Stephen Frost
Тема Re: Hash Join cost estimates
Дата
Msg-id 20130404201612.GM4361@tamriel.snowman.net
обсуждение исходный текст
Ответ на Re: Hash Join cost estimates  (Stephen Frost <sfrost@snowman.net>)
Ответы Re: Hash Join cost estimates  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Hash Join cost estimates  ("ktm@rice.edu" <ktm@rice.edu>)
Список pgsql-hackers
* Stephen Frost (sfrost@snowman.net) wrote:
> It does look like reducing bucket depth, as I outlined before through
> the use of a 2-level hashing system, might help speed up
> ExecScanHashBucket, as it would hopefully have very few (eg: 1-2)
> entries to consider instead of more.  Along those same lines, I really
> wonder if we're being too generous wrt the bucket-depth goal of '10'
> instead of, say, '1', especially when we've got plenty of work_mem
> available.

Rerunning using a minimally configured build (only --enable-openssl
and --enable-debug passed to configure) with NTUP_PER_BUCKET set to '1'
results in a couple of interesting things-

First, the planner actually picks the plan to hash the small table and
seqscan the big one.  That also, finally, turns out to be *faster* for
this test case.

explain analyze results here:
Hash small table / seqscan big table: http://explain.depesz.com/s/nP1
Hash big table / seqscan small table: http://explain.depesz.com/s/AUv

Here's the oprofile reports:

Hash small table / seqscan big table:
samples  cum. samples  %        cum. %     linenr info                 image name               symbol name
39023    39023         52.8574  52.8574    nodeHash.c:915              postgres                 ExecScanHashBucket
3743     42766          5.0700  57.9273    xact.c:682                  postgres
TransactionIdIsCurrentTransactionId
3110     45876          4.2126  62.1399    nodeHashjoin.c:63           postgres                 ExecHashJoin
2561     48437          3.4689  65.6088    heapam.c:711                postgres                 heapgettup_pagemode
2427     50864          3.2874  68.8962    heapam.c:300                postgres                 heapgetpage
2395     53259          3.2441  72.1403    heaptuple.c:1028            postgres                 slot_deform_tuple
2395     55654          3.2441  75.3843    heaptuple.c:1135            postgres                 slot_getattr
2383     58037          3.2278  78.6122    nodeHash.c:786              postgres                 ExecHashGetHashValue
1811     59848          2.4530  81.0652    tqual.c:1044                postgres                 HeapTupleSatisfiesMVCC
1796     61644          2.4327  83.4979    execScan.c:111              postgres                 ExecScan
1298     62942          1.7582  85.2561    hashfunc.c:517              postgres                 hash_uint32
1274     64216          1.7257  86.9817    execProcnode.c:356          postgres                 ExecProcNode
1011     65227          1.3694  88.3511    heapam.c:1453               postgres                 heap_getnext
905      66132          1.2258  89.5770    execTuples.c:333            postgres                 ExecStoreTuple
858      66990          1.1622  90.7392    fmgr.c:1291                 postgres                 FunctionCall1Coll
835      67825          1.1310  91.8702    execQual.c:668              postgres                 ExecEvalScalarVarFast
834      68659          1.1297  92.9999    mcxt.c:126                  postgres                 MemoryContextReset
818      69477          1.1080  94.1078    nodeSeqscan.c:48            postgres                 SeqNext

Hash big table / seqscan small table:
samples  cum. samples  %        cum. %     linenr info                 image name               symbol name
38612    38612         41.2901  41.2901    nodeHash.c:709              postgres                 ExecHashTableInsert
7435     46047          7.9507  49.2408    (no location information)   no-vmlinux               /no-vmlinux
4900     50947          5.2399  54.4806    aset.c:563                  postgres                 AllocSetAlloc
3803     54750          4.0668  58.5474    xact.c:682                  postgres
TransactionIdIsCurrentTransactionId
3335     58085          3.5663  62.1137    heapam.c:711                postgres                 heapgettup_pagemode
2532     60617          2.7076  64.8213    nodeHash.c:786              postgres                 ExecHashGetHashValue
2523     63140          2.6980  67.5193    memcpy-ssse3-back.S:60      libc-2.15.so             __memcpy_ssse3_back
2518     65658          2.6926  70.2119    heaptuple.c:1028            postgres                 slot_deform_tuple
2378     68036          2.5429  72.7549    heapam.c:300                postgres                 heapgetpage
2374     70410          2.5387  75.2935    heaptuple.c:1135            postgres                 slot_getattr
1852     72262          1.9805  77.2740    nodeHash.c:915              postgres                 ExecScanHashBucket
1831     74093          1.9580  79.2320    tqual.c:1044                postgres                 HeapTupleSatisfiesMVCC
1732     75825          1.8521  81.0841    heapam.c:1453               postgres                 heap_getnext
1320     77145          1.4116  82.4957    nodeHash.c:76               postgres                 MultiExecHash
1219     78364          1.3035  83.7992    heaptuple.c:1529            postgres
minimal_tuple_from_heap_tuple
1212     79576          1.2961  85.0953    execProcnode.c:356          postgres                 ExecProcNode
1209     80785          1.2929  86.3881    hashfunc.c:517              postgres                 hash_uint32
1197     81982          1.2800  87.6682    execScan.c:111              postgres                 ExecScan
1139     83121          1.2180  88.8862    execTuples.c:333            postgres                 ExecStoreTuple
1010     84131          1.0801  89.9662    execTuples.c:662            postgres
ExecFetchSlotMinimalTuple
961      85092          1.0277  90.9939    aset.c:821                  postgres                 AllocSetFree

Looking with opannotate, there's two main hotspots in
ExecScanHashBucket:
12846 17.4001 :        hashTuple = hashtable->buckets[hjstate->hj_CurBucketNo];

and
22100 29.9348 :        hashTuple = hashTuple->next;

I'm certainly curious about those, but I'm also very interested in the
possibility of making NTUP_PER_BUCKET much smaller, or perhaps variable
depending on the work_mem setting.  It's only used in
ExecChooseHashTableSize, so while making it variable or depending on
work_mem could slow planning down a bit, it's not a per-tuple cost item.

Thoughts?
Thanks,
    Stephen

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jeff Davis
Дата:
Сообщение: Re: corrupt pages detected by enabling checksums
Следующее
От: Robert Haas
Дата:
Сообщение: Re: Multi-pass planner