Re: Proposed Patch to Improve Performance of Multi-BatchHash Join for Skewed Data Sets
| From | Robert Haas |
|---|---|
| Subject | Re: Proposed Patch to Improve Performance of Multi-BatchHash Join for Skewed Data Sets |
| Date | |
| Msg-id | 603c8f070902260522h4230869fkf91597ad31c30279@mail.gmail.com |
| In response to | Re: Proposed Patch to Improve Performance of Multi-BatchHash Join for Skewed Data Sets (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>) |
| Responses | Re: Proposed Patch to Improve Performance of Multi-BatchHash Join for Skewed Data Sets |
| List | pgsql-hackers |

On Thu, Feb 26, 2009 at 4:22 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> I haven't been following this thread closely, so pardon if this has been
> discussed already.
>
> The patch doesn't seem to change the cost estimates in the planner at all.
> Without that, I'd imagine that the planner rarely chooses a multi-batch hash
> join to begin with.

AFAICS, a multi-batch hash join happens when you are joining two big,
unsorted paths. The planner essentially compares the cost of sorting the
two paths and then merge-joining them versus the cost of a hash join. It
doesn't seem to be unusual for the hash join to come out the winner,
although admittedly I haven't played with it a ton. You certainly could
try to model it in the costing algorithm, but I'm not sure how much
benefit you'd get out of it: if you're doing this a lot you're probably
better off creating indices.

> Joshua, in the tests that you've been running, did you have to rig the
> planner with "enable_mergejoin=off" or similar, to get the queries to use
> hash joins?

I didn't have to fiddle with anything, but Josh's tests were more exhaustive.

...Robert
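
Editor's sketch: the comparison described above (sort both inputs and merge-join them, versus hash-join them) can be illustrated with a toy cost model. This is not PostgreSQL's actual costing code. The function names, the per-row constants, and the spill formula are all assumptions chosen only to show the shape of the trade-off.

```python
import math


def merge_join_cost(outer_rows, inner_rows, cpu_per_compare=1.0):
    """Toy cost: sort each input (n * log2 n comparisons), then one merge pass."""
    sort_cost = sum(n * math.log2(n) for n in (outer_rows, inner_rows) if n > 1)
    merge_cost = outer_rows + inner_rows
    return cpu_per_compare * (sort_cost + merge_cost)


def hash_join_cost(outer_rows, inner_rows, rows_fit_in_memory,
                   cpu_per_row=1.0, io_per_row=2.0):
    """Toy cost: build a hash table on the inner input, probe with the outer.

    If the inner input does not fit in memory, the join runs in several
    batches; rows outside the first batch are assumed to be written to
    disk once and read back once (hypothetical spill model).
    """
    num_batches = max(1, math.ceil(inner_rows / rows_fit_in_memory))
    cpu = cpu_per_row * (inner_rows + outer_rows)
    spill = 0.0
    if num_batches > 1:
        spilled_fraction = (num_batches - 1) / num_batches
        # One write plus one read for every spilled row of both inputs.
        spill = io_per_row * 2 * spilled_fraction * (inner_rows + outer_rows)
    return cpu + spill, num_batches


def choose_join(outer_rows, inner_rows, rows_fit_in_memory):
    """Return the cheaper strategy under the toy model and the batch count."""
    mj = merge_join_cost(outer_rows, inner_rows)
    hj, num_batches = hash_join_cost(outer_rows, inner_rows, rows_fit_in_memory)
    return ("hash" if hj <= mj else "merge"), num_batches


if __name__ == "__main__":
    # Two large unsorted inputs where only a tenth of the inner side fits in memory.
    print(choose_join(1_000_000, 1_000_000, 100_000))
```

Under these made-up constants, the hash join stays cheaper than sorting both sides even when it needs multiple batches. That is consistent with the observation above that a multi-batch hash join can win without any planner settings being changed.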