7.1.3 w/ Perl/DBI application hangs

Поиск
Список
Период
Сортировка
От Doran L. Barton
Тема 7.1.3 w/ Perl/DBI application hangs
Дата
Msg-id 20011112120656.A2531@iodynamics.com
обсуждение исходный текст
Ответы Re: 7.1.3 w/ Perl/DBI application hangs
Список pgsql-bugs
I've got a fairly mission-critical Perl/mod_perl application built on top of
PostgreSQL that is causing me grief. I've been battling it for a week or so
and have determined I think the problem is a PostgreSQL bug.

The application is a web-based event-scheduling system for an office of
accountants. The server is an Athlon 1.0Ghz CPU running Red Hat 7.2 and Red
Hat's PostgreSQL 7.1.3 build. The hard disks are UW SCSI disks connected
via software RAID level 1 (complements of Red Hat Linux). There is 256M of
RAM available. I've tweaked the max_connections, shared_buffers, and
sort_mem variables with no change in behavior resulting.

I recently revised the application to make better use of complex queries
in PostgreSQL instead of doing most of the work in Perl - which was making
things quite slow.

The problem manifests itself when these more-complex queries seem to "hang" -
never returning any kind of result whatsoever. When this happens, a 'ps' on
the server shows SELECT queries spinning - taking up CPU and memory. When
frustrated users refresh their browsers, this creates more of these
processes - until someone pages me and I log in and the load average is
something like 12.2.

I do VACUUM ANALYZE daily - even though the data in the database doesn't
change that much. There are approximately 16,000 records in the events
database which comprises about three years of events data (this application
has been running for a couple years). Doing a VACUUM ANALYZE when the
database is "hanging" doesn't change anything. The VACUUM ANALYZE does
return a successful result.

Stopping the postgresql service and restarting also does not fix the
proble. It does kill off the "hanging" processes, but once the database
server is running again, new queries hang just as before.

The ONLY (temporary) solution I've found is:

  (1) shut down Apache
  (2) restart PostgreSQL
  (3) pg_dump the database
  (4) stop Postgresql
  (5) rename the data/ directory and create a new data/ directory
  (6) Initalize the database in the new data/ directory
  (7) recreate users and my database
  (8) restore the data from the previous dump
  (9) restart Apache

After I do the above steps, the application works fine for about 24 hours.

Then the problem begins manifesting itself again.  Another thing I've
noticed is that this problem devlops independently of actual application
usage. For example, I will go through the above steps on a Saturday evening
and users will complain of not being able to use the application on Monday
morning. The application is not used at all on Sundays.

Now, I'm not familiar with some more the more down-and-dirty tools of
debugging PostgreSQL (although I've been using PG for various web-based
application projects since 1995), so please bear with me. I'll run
whatever commands I am instructed. If a developer needs ssh access to
the machine to witness first-hand what is going on, that can be arranged.

This is a serious bug and it is affecting a potentially cool application.
Please help me. :) I'm a PostgreSQL poster child.

BTW: The queries I'm submitting look like this:

    -- Find weekly/biweekly recurring events for user
    SELECT event.* FROM
      event, u_participant WHERE
      event.event_id = u_participant.event_id AND
      u_participant.username = 'rld' AND
      (
        event.recur = 'Bi-Weekly' OR
        event.recur = 'Weekly' ) AND
      event.t_begin < '2001-08-05' AND
      event.recur_end >= '2001-07-29'
    UNION
    -- Find weekly/biweekly recurring events for groups user is in
    SELECT event.* FROM
       event, g_participant, group_user WHERE
       event.event_id = g_participant.event_id AND
       group_user.group_id = g_participant.group_id AND
       group_user.username = 'rld' AND
    (
        event.recur = 'Bi-Weekly' OR
        event.recur = 'Weekly' ) AND
      event.t_begin < '2001-08-05' AND
      event.recur_end >= '2001-07-29'
    UNION
    -- Find monthly recurring events for user
    SELECT event.* FROM
      event, u_participant WHERE
      event.event_id = u_participant.event_id AND
      u_participant.username = 'rld' AND
      event.recur = 'Monthly' AND
      (
         (
           date_part('month', DATE '2001-07-29') =
             date_part('month', DATE '2001-08-05') AND
           (
             date_part('day', event.t_begin) >=
               date_part('day', DATE '2001-07-29')  AND
             date_part('day', event.t_begin) <
               date_part('day', DATE '2001-08-05'))) OR
         (
           (
             date_part('month', DATE '2001-07-29') =
               date_part('month', event.t_begin) AND
             date_part('day', event.t_begin) >=
               date_part('day', DATE '2001-07-29')) OR
           (
             date_part('month', DATE '2001-08-05') =
               date_part('month', event.t_begin) AND
             date_part('day', event.t_begin) <
               date_part('day', DATE '2001-08-05')))) AND

       event.t_begin < '2001-08-05' AND
       event.recur_end >= '2001-07-29'
       UNION ...

... You get the picture.

-=Fozz

--
-------------------------------------------------------------------------
Doran L. Barton <fozz@iodynamics.com> - Chief Super Hero - Iodynamics LLC
< http://www.iodynamics.com/ > - Linux solutions and dynamic websites
 "Cars will not have intercourse on this bridge."
    -- Seen in a Tokyo traffic handbook

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Bug #514: Backend crashes periodically
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Bug #514: Backend crashes periodically