Re: [HACKERS] WIP: [[Parallel] Shared] Hash
From | Rafia Sabih |
Subject | Re: [HACKERS] WIP: [[Parallel] Shared] Hash |
Date | |
Msg-id | CAOGQiiNk5Uri44t+jS5Z3rMTEKshhcTdDEB33JRM=kYSXNwpYw@mail.gmail.com |
In reply to | Re: [HACKERS] WIP: [[Parallel] Shared] Hash (Thomas Munro <thomas.munro@enterprisedb.com>) |
Responses | Re: [HACKERS] WIP: [[Parallel] Shared] Hash (Thomas Munro <thomas.munro@enterprisedb.com>) |
| Re: [HACKERS] WIP: [[Parallel] Shared] Hash (Thomas Munro <thomas.munro@enterprisedb.com>) |
List | pgsql-hackers |
On Thu, Feb 2, 2017 at 1:19 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
> On Thu, Feb 2, 2017 at 3:34 AM, Rafia Sabih
> <rafia.sabih@enterprisedb.com> wrote:
>> 9 | 62928.88 | 59077.909
>
> Thanks Rafia.  At first glance this plan is using the Parallel Shared
> Hash in one place where it should pay off, that is loading the orders
> table, but the numbers are terrible.  I noticed that it uses batch
> files and then has to increase the number of batch files, generating a
> bunch of extra work, even though it apparently overestimated the
> number of rows, though that's only ~9 seconds of ~60.  I am
> investigating.

Hi Thomas,

Apart from the previously reported regression, there appears to be one more issue in this set of patches. At times, a query using parallel hash hangs, and all the workers, including the master, show the following backtrace:

#0  0x00003fff880c7de8 in __epoll_wait_nocancel () from /lib64/power8/libc.so.6
#1  0x00000000104e2718 in WaitEventSetWaitBlock (set=0x100157bde90, cur_timeout=-1, occurred_events=0x3fffdbe69698, nevents=1) at latch.c:998
#2  0x00000000104e255c in WaitEventSetWait (set=0x100157bde90, timeout=-1, occurred_events=0x3fffdbe69698, nevents=1, wait_event_info=134217745) at latch.c:950
#3  0x0000000010512970 in ConditionVariableSleep (cv=0x3ffd736e05a4, wait_event_info=134217745) at condition_variable.c:132
#4  0x00000000104dbb1c in BarrierWaitSet (barrier=0x3ffd736e0594, new_phase=1, wait_event_info=134217745) at barrier.c:97
#5  0x00000000104dbb9c in BarrierWait (barrier=0x3ffd736e0594, wait_event_info=134217745) at barrier.c:127
#6  0x00000000103296a8 in ExecHashShrink (hashtable=0x3ffd73747dc0) at nodeHash.c:1075
#7  0x000000001032c46c in dense_alloc_shared (hashtable=0x3ffd73747dc0, size=40, shared=0x3fffdbe69eb8, respect_work_mem=1 '\001') at nodeHash.c:2618
#8  0x000000001032a2f0 in ExecHashTableInsert (hashtable=0x3ffd73747dc0, slot=0x100158f9e90, hashvalue=2389907270) at nodeHash.c:1476
#9  0x0000000010327fd0 in MultiExecHash (node=0x100158f9800) at nodeHash.c:296
#10 0x0000000010306730 in MultiExecProcNode (node=0x100158f9800) at execProcnode.c:577

The issue is neither deterministic nor straightforwardly reproducible: sometimes, after a make clean, etc., queries run fine, and sometimes they hang again. I wanted to bring this to your notice, hoping you might be faster than me in picking up the exact reason behind this anomaly.

--
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/