On Thu, May 27, 2010 at 3:34 PM, Slava Moudry <smoudry@4info.com> wrote:
> 1) Is there a way to force plan that uses hashaggregate for the second
> query?
No, although if you crank work_mem up high enough you should get it, I think.
> 2) I am not trying to achieve any particular execution time for the
> query, but I noticed that when "disk sort" kicks in (and that happens
> eventually once the dataset is large enough) the query drastically slows
> down, even if there is no physical IO going on. I wonder if it's possible to
> have predictable performance rather than sudden drop.
No. The planner has to choose one algorithm or the other - there's
not really a way it can do a mix.
> 3) Why hashAggregate plan uses so much less memory (work_mem) than the
> plan with groupAggregate/sort? HashAggregate plan for Query1 works even with
> work_mem='2GB'; The second plan decides to use disk sort even with
> work_mem='4GB'. Why sort is so memory greedy? Are there any plans to address
> the sorting memory efficiency issues?
Well, if you select more columns, then the tuples that are buffered in
memory take up more space, right? Twice the columns = twice the
memory.
What I'd be curious to know is how accurate the memory estimates are -
figure out what the lowest value of work_mem needed to get a
particular plan is and then compare that to the amount of memory used
when you execute the query...
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company