Re: odd intermittent query hanging issue

Поиск
Список
Период
Сортировка
От Steve Crawford
Тема Re: odd intermittent query hanging issue
Дата
Msg-id 4FBA6861.7080200@pinpointresearch.com
обсуждение исходный текст
Ответ на odd intermittent query hanging issue  (Aaron Burnett <aburnett@bzzagent.com>)
Список pgsql-general
On 05/20/2012 07:45 AM, Aaron Burnett wrote:
> Hey Steve,
>
> Just wanted to send a note of thanks for pointing me in a few new
> directions on this.
>
> Turns out that the query would swap but not all the time. When it swapped,
> it wouldn't finish, if it didn't swap it would finish in the expected
> time. Not sure why this isn't consistent...
>
> The work_mem was set at 100MB so as a test I increased it to 200MB on the
> session and so far it has consistently finished every time it has run. I'm
> not sure why this has reared it's head now as we run far more complex and
> data intensive queries than this one and they don't swap... so I guess I
> have some forensics to do.

You're welcome. Assuming that changing work_mem is a confirmed solution,
it sounds like the query is running right on the edge of running out of
work_mem and when nudged over that threshold starts to swap. PostgreSQL,
like most other things in computers, does not degrade linearly. Things
work great till you run out of some resource then degrade dramatically.

The work_mem setting specifies memory available for internal sorts and
hash operations. Looking at the original explain plan you posted I see
lots of such operations - perhaps your other queries don't need them:

explain plan:
------------------------------
  Unique  (cost=237234.66..239267.33 rows=406533 width=4) (actual
time=4790.823..4922.621 rows=418843 loops=1)
    ->   Sort  (cost=237234.66..238251.00 rows=406533 width=4) (actual
time=4790.818..4826.201 rows=418879 loops=1)
          Sort Key: member.id
          Sort Method: quicksort  Memory: 31923kB
          ->   Hash Join  (cost=167890.31..199359.99 rows=406533 width=4)
(actual time=3357.406..4532.952 rows=418879 loops=1)
                Hash Cond: (xrcm.member_id = member.id)
                ->   Seq Scan on xrefchannelmember xrcm
(cost=0.00..19273.69 rows=813066 width=4) (actual time=0.015..219.259 row
s=814421 loops=1)
                      Filter: (channel_id = 1)
                ->   Hash  (cost=162586.21..162586.21 rows=424328 width=4)
(actual time=3357.001..3357.001 rows=444626 loops=1)
                      Buckets: 65536  Batches: 1  Memory Usage: 15632kB
                      ->   Seq Scan on member  (cost=66114.02..162586.21
rows=424328 width=4) (actual time=2357.280..3216.076 rows
=444626 loops=1)
                            Filter: (NOT (hashed SubPlan 1))
                            SubPlan 1
                              ->   Seq Scan on xrefcampaignmember
(cost=0.00..57931.82 rows=3272882 width=4) (actual time=0.021..
670.086 rows=3272870 loops=1)
  Total runtime: 4963.134 ms

I'm still bothered by a couple things but I don't have a enough
understanding of internals to provide definitive advice:

1. Why does reindexing seem to temporarily solve the problem?...

2...especially when I don't see indexes being used in the query

I would definitely run explain on some of the alternative versions of
the query posted earlier to see if they do a better job of using
available indexes or reducing the amount of internal hashing/sorting. If
they do, you will have some "ammo" to take back to your developer.

Cheers,
Steve


В списке pgsql-general по дате отправления:

Предыдущее
От: Cédric Villemain
Дата:
Сообщение: Re: Global Named Prepared Statements
Следующее
От: Merlin Moncure
Дата:
Сообщение: Re: Libpq question