Re: Searching for Duplicates and Hosed the System

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Searching for Duplicates and Hosed the System
Дата
Msg-id 19142.1187543991@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Searching for Duplicates and Hosed the System  (Bill Thoen <bthoen@gisnet.com>)
Ответы Re: Searching for Duplicates and Hosed the System
Список pgsql-general
Bill Thoen <bthoen@gisnet.com> writes:
> I knew this would take some time, but what I didn't expect was that about
> an hour into the select, my mouse and keyboard locked up and also I
> couldn't log in from another computer via SSH. This is a Linux machine
> running Fedora Core 6 and PostgresQL is 8.1.4. There's about 50GB free on
> the disc too.

> I finally had to shut the power off and reboot to regain control of my
> computer (that wasn't good idea, either, but eventually I got everything
> working again.)

I've seen Fedora go nuts like that when it ran out of memory.  Once it
starts to swap heavily, performance goes into the tank; and once the
kernel realizes it's in memory trouble, it starts to kill processes
more or less at random.  That might explain why ssh stopped working.

One thing to do to make it more robust is to disable memory overcommit.
I suspect also that configuring it with lots of swap space is
counterproductive, because that just encourages the kernel to allow lots
of swapping.  I haven't actually experimented with that part though.

As for why PG ran the system out of memory, I suspect that the planner
drastically underestimated the number of groups to be created by your
GROUP BY, and thought it could get away with a hash aggregation.  We
don't currently have any provision for spilling hash aggregation to
disk, so if there's a very large number of groups the table just gets
very big :-(.  The planner is not supposed to choose hash agg if the
estimated table size exceeds work_mem ... but if it had out-of-date
statistics to work with it might have gotten the wrong answer.  Have
you ANALYZEd this table recently?  What does EXPLAIN show as the
estimated number of result rows?

            regards, tom lane

В списке pgsql-general по дате отправления:

Предыдущее
От: "Webb Sprague"
Дата:
Сообщение: Re: Postgresql performance in production environment
Следующее
От: "Harpreet Dhaliwal"
Дата:
Сообщение: Re: Transactional DDL