Re: Trying to track down weird query stalls

Поиск
Список
Период
Сортировка
От dan@sidhe.org
Тема Re: Trying to track down weird query stalls
Дата
Msg-id 56921.199.172.169.36.1238438533.squirrel@localhost
обсуждение исходный текст
Ответ на Re: Trying to track down weird query stalls  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: Trying to track down weird query stalls
Re: Trying to track down weird query stalls
Список pgsql-performance
> On Mon, Mar 30, 2009 at 1:50 PM,  <dan@sidhe.org> wrote:
>> I'm running a 64-bit build of Postgres 8.3.5 on AIX 5.3, and have a
>> really
>> strange, annoying transient problem with one particular query stalling.
>>
>> The symptom here is that when this query is made with X or more records
>> in
>> a temp table involved in the join (where X is constant when the problem
>> manifests, but is different between manifestations) the query takes
>> about
>> 20 minutes. When the query is made with X-1 records it takes less than a
>> second. It's just this one query -- for everything else the system's
>> nice
>> and snappy. The machine load's never above 9 (it's a 32 CPU box) and
>> hasn't had less than 60G of free system memory on it.
>>
>> An EXPLAIN ANALYZE of the two queries (with X-1 and X records) is even
>> more bizarre. Not only are the two query plans identical (save trivial
>> differences because of the record count differences) but the explain
>> EXPLAIN ANALYZE total runtimes are near-identical -- the fast case
>> showed
>> 259ms, the slow case 265ms.
>
> log_min_duration_statement is a good place to start, but it sounds
> like the query plan you're getting when you test it by hand isn't the
> same one that's actually executing, so I'm not sure how far that's
> going to get you.  contrib/auto_explain sounds like it would be just
> the thing, but unfortunately that's an 8.4 feature which hasn't been
> released yet.  I'm not sure whether it could be built and run against
> 8.3.

I'm not executing any of the EXPLAINs by hand, because I didn't want to
have to worry about typos or filling in temp tables with test data. Inside
the app the SQL for the problematic query's stored in a variable -- when
the task runs with debugging enabled it first executes the query with
EXPLAIN ANALYZE prepended and dumps the output, then it executes the query
itself. It's possible something's going wrong in that, but the code's
pretty simple.

Arguably in this case the actual query should run faster than the EXPLAIN
ANALYZE version, since the cache is hot. (Though that'd only likely shave
a few dozen ms off the runtime)

-Dan

В списке pgsql-performance по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: Trying to track down weird query stalls
Следующее
От: Robert Haas
Дата:
Сообщение: Re: Trying to track down weird query stalls