Обсуждение: Re: multiple runs of the same query cause out of memory - WAS [Re: capturing/viewing sort_mem utilization on a per query basis]

Поиск
Список
Период
Сортировка
Lonni J Friedman <netllama@gmail.com> writes:
> We have found that this query can indeed cause terrible things to
> happen to postgresql - it can run several times with response times of
> a few seconds, and then the next time cause postgres to go out of
> control on memory usage.  (I've had to reboot my machine twice to get
> control).  On a beefier machine with more memory and two CPUs, the
> query worked okay for a few dozen repetitions (with a few in
> parallel), then eventually one instance would cause the postgres
> process to spiral out of control and consume more
> and more memory at the rate of a megabyte every second or so.

> I'm not quite sure where to go from here, but this is definitely
> reproducable now.  Help?!

That's a bit hard to believe --- if nothing is changing, the query
should get processed the same way every time.

Can you package up a test case for other people to look at?

            regards, tom lane

On Wed, 23 Feb 2005 19:56:56 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Lonni J Friedman <netllama@gmail.com> writes:
> > We have found that this query can indeed cause terrible things to
> > happen to postgresql - it can run several times with response times of
> > a few seconds, and then the next time cause postgres to go out of
> > control on memory usage.  (I've had to reboot my machine twice to get
> > control).  On a beefier machine with more memory and two CPUs, the
> > query worked okay for a few dozen repetitions (with a few in
> > parallel), then eventually one instance would cause the postgres
> > process to spiral out of control and consume more
> > and more memory at the rate of a megabyte every second or so.
>
> > I'm not quite sure where to go from here, but this is definitely
> > reproducable now.  Help?!
>
> That's a bit hard to believe --- if nothing is changing, the query
> should get processed the same way every time.

Its definitely not the same every time after this additional testing.

>
> Can you package up a test case for other people to look at?

I could try, but i'm not sure what you'd need.  Right now, i can only
replicate this behavior with a snapshot of this one customer's DB.
Running on the same schema, with little or no data, it always
completes fine, so it looks to be partially data driven in nature, or
perhaps its the volume of data.



--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
L. Friedman                                    netllama@gmail.com
LlamaLand                       http://netllama.linux-sxs.org

On Thu, 24 Feb 2005 22:46:02 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Lonni J Friedman <netllama@gmail.com> writes:
> > Oh, also, do you see a time difference in running this query with &
> > without the genetic optimizer?
>
> Yeah, but that's no surprise.
>
> The reason I couldn't replicate your problem was I was trying 8.0.
> What I've found so far is that the 8.0 GEQO consistently finds a good
> plan while 7.4 usually finds a bad to awful plan :-(.  In about 30 tries
> with 7.4, I got two OOMs and two that I gave up on after upwards of 20
> minutes runtime.  The runtimes of the rest were all over the map ---
> up to 30 times slower than the plan found with geqo off.
>
> In about 200 tries, 8.0 all but once found a plan no worse than 50%
> slower than what it found with geqo off; and even the outlier was only
> 3x slower than the best plan.
>
> So at least on this example, the GEQO tweaking we did for 8.0 really
> paid off.
>
> As long as you're on 7.4, disabling GEQO may be a good answer.  Rather
> than just setting geqo = off, I'd advise bumping geqo_threshold up a
> notch or two.  That will allow this query to be planned by the regular
> planner, without buying into doing impossibly complex queries that way.

Thanks for your help on this.  Immediately, we've disabled the genetic
optimizer, and we'll tinker with increaseing the threashold parameter
to see if that produces more stability.


--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
L. Friedman                                    netllama@gmail.com
LlamaLand                       http://netllama.linux-sxs.org