Обсуждение: Postgres processes getting stuck (bug?)

Поиск
Список
Период
Сортировка

Postgres processes getting stuck (bug?)

От
Ciprian Grigoras
Дата:
Hi guys,

I had a question. We're running Postgresql 9.0.7 , and all of a sudden we started seeing unexpected behavior. One table got locked after we ran one simple INSERT on one test item.
Nothing else was running against that table as far as we know, and the query is frozen there now. After some time we tried to terminate it forcefully (pg_terminate_backend), ran that and the return of the command is "true" but the query still stays on, shows up on pg_stat_activity etc.
Reading from the table is fine, but we believe any other inserts / updates are not possible.

Restarting the Postgresql server is not an option at this moment, since it's a production box and another high-demand database is running from that.

Has anyone seen a similar issue (maybe a bug in the 9.0 version ?) where simple statements don't finish and get locked there and can't be killed by the pg_terminate_backed ? What is the cause of this ?
Any reasonable way to find out more details on what caused this, how to prevent it in the future, and how it can be fixed sensitively now ? Thanks !

just fyi, checking the data in the "pg_locks" (for that process that is frozen), shows a bunch of rows all with the same virtualtransactionid. Only one of them has the mode of "ExclusiveLock" (the only record there with a locktype of "virtualxid"), a few have the "RowExclusiveLock" mode and the vast majority have the "AccessShareLock" mode.

Thanks,
Ciprian

Re: Postgres processes getting stuck (bug?)

От
Bob Lunney
Дата:
Check the wait state for the backend process.  I’ve seen this happen when a process ran a per-row trigger that tried to connect to something else, and it filled the ip_conntrack table.  So, kernel level wait along with a whole bunch of locks on the table in question.  

Running pg_terminate_backend() didnt work, as the signal queued behind the kernel wait.  We had to bounce the database to get rid of the problem.  Immediately afterwards we disabled the trigger.

HTH,

Bob Lunney

On Apr 29, 2016, at 1:30 PM, Ciprian Grigoras <ciprian.grigoras@vitals.com> wrote:

Hi guys,

I had a question. We're running Postgresql 9.0.7 , and all of a sudden we started seeing unexpected behavior. One table got locked after we ran one simple INSERT on one test item.
Nothing else was running against that table as far as we know, and the query is frozen there now. After some time we tried to terminate it forcefully (pg_terminate_backend), ran that and the return of the command is "true" but the query still stays on, shows up on pg_stat_activity etc.
Reading from the table is fine, but we believe any other inserts / updates are not possible.

Restarting the Postgresql server is not an option at this moment, since it's a production box and another high-demand database is running from that.

Has anyone seen a similar issue (maybe a bug in the 9.0 version ?) where simple statements don't finish and get locked there and can't be killed by the pg_terminate_backed ? What is the cause of this ?
Any reasonable way to find out more details on what caused this, how to prevent it in the future, and how it can be fixed sensitively now ? Thanks !

just fyi, checking the data in the "pg_locks" (for that process that is frozen), shows a bunch of rows all with the same virtualtransactionid. Only one of them has the mode of "ExclusiveLock" (the only record there with a locktype of "virtualxid"), a few have the "RowExclusiveLock" mode and the vast majority have the "AccessShareLock" mode.

Thanks,
Ciprian

Re: Postgres processes getting stuck (bug?)

От
John R Pierce
Дата:
On 4/29/2016 10:30 AM, Ciprian Grigoras wrote:
We're running Postgresql 9.0.7...

do note, the 9.0 series got up to 9.0.23 before it was discontinued last year.   9.0.7 was released in February 2012, 4+ years ago.

9.0.18 fixed some index corruption bugs in GIST indexes, 9.0.15 fixed a bunch of other data corruption problems, 9.0.13 fixed yet more GiST index problems.      of course, every one of these incremental updates fixed dozens of relatively obscure bugs, you'd need to read the release notes for each version between 9.0.8 and 9.0.23 for the complete list.




-- 
john r pierce, recycling bits in santa cruz

Re: Postgres processes getting stuck (bug?)

От
Jerry Sievers
Дата:
Ciprian Grigoras <ciprian.grigoras@vitals.com> writes:

> Hi guys,
>
> I had a question. We're running Postgresql 9.0.7 , and all of a sudden we started seeing unexpected behavior. One
tablegot locked after we ran one simple INSERT on one 
> test item.
> Nothing else was running against that table as far as we know, and the query is frozen there now. After some time we
triedto terminate it forcefully 
> (pg_terminate_backend), ran that and the return of the command is "true" but the query still stays on, shows up on
pg_stat_activityetc. 
> Reading from the table is fine, but we believe any other inserts / updates are not possible.

What do you get from strace -p $pid?

> Restarting the Postgresql server is not an option at this moment, since it's a production box and another high-demand
databaseis running from that. 
>
> Has anyone seen a similar issue (maybe a bug in the 9.0 version ?) where simple statements don't finish and get
lockedthere and can't be killed by the 
> pg_terminate_backed ? What is the cause of this ?
> Any reasonable way to find out more details on what caused this, how to prevent it in the future, and how it can be
fixedsensitively now ? Thanks ! 
>
> just fyi, checking the data in the "pg_locks" (for that process that is frozen), shows a bunch of rows all with the
samevirtualtransactionid. Only one of them has the 
> mode of "ExclusiveLock" (the only record there with a locktype of "virtualxid"), a few have the "RowExclusiveLock"
modeand the vast majority have the "AccessShareLock" 
> mode.
>
> Thanks,
> Ciprian
>

--
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consulting@comcast.net
p: 312.241.7800


Re: Postgres processes getting stuck (bug?)

От
Ciprian Grigoras
Дата:
Thanks everyone, it could be that the Linux server is problematic (it's got a long uptime).
I ran the strace thing,

  1. [root@box ~]#  strace -p 20825
  2. Process 20825 attached - interrupt to quit
  3. connect(90, {sa_family=AF_FILE, path="/var/centrifydc/daemon2"}, 25
and it kept waiting there with minimal information, nothing else was shown...

On Fri, Apr 29, 2016 at 2:33 PM, Jerry Sievers <gsievers19@comcast.net> wrote:
Ciprian Grigoras <ciprian.grigoras@vitals.com> writes:

> Hi guys,
>
> I had a question. We're running Postgresql 9.0.7 , and all of a sudden we started seeing unexpected behavior. One table got locked after we ran one simple INSERT on one
> test item.
> Nothing else was running against that table as far as we know, and the query is frozen there now. After some time we tried to terminate it forcefully
> (pg_terminate_backend), ran that and the return of the command is "true" but the query still stays on, shows up on pg_stat_activity etc.
> Reading from the table is fine, but we believe any other inserts / updates are not possible.

What do you get from strace -p $pid?

> Restarting the Postgresql server is not an option at this moment, since it's a production box and another high-demand database is running from that.
>
> Has anyone seen a similar issue (maybe a bug in the 9.0 version ?) where simple statements don't finish and get locked there and can't be killed by the
> pg_terminate_backed ? What is the cause of this ?
> Any reasonable way to find out more details on what caused this, how to prevent it in the future, and how it can be fixed sensitively now ? Thanks !
>
> just fyi, checking the data in the "pg_locks" (for that process that is frozen), shows a bunch of rows all with the same virtualtransactionid. Only one of them has the
> mode of "ExclusiveLock" (the only record there with a locktype of "virtualxid"), a few have the "RowExclusiveLock" mode and the vast majority have the "AccessShareLock"
> mode.
>
> Thanks,
> Ciprian
>

--
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consulting@comcast.net
p: 312.241.7800



--


Description: vitals_logo_150.png

Ciprian Grigoras

Software Engineer 
Vitals | 160 Chubb Avenue, Suite 301, Lyndhurst, NJ 07071, USA 

P : 201.459.6275 |  M : 845.517.8710

http://www.vitals.com


Вложения

Re: Postgres processes getting stuck (bug?)

От
Merlin Moncure
Дата:
On Fri, Apr 29, 2016 at 1:09 PM, John R Pierce <pierce@hogranch.com> wrote:
> On 4/29/2016 10:30 AM, Ciprian Grigoras wrote:
>
> We're running Postgresql 9.0.7...
>
>
> do note, the 9.0 series got up to 9.0.23 before it was discontinued last
> year.   9.0.7 was released in February 2012, 4+ years ago.
>
> 9.0.18 fixed some index corruption bugs in GIST indexes, 9.0.15 fixed a
> bunch of other data corruption problems, 9.0.13 fixed yet more GiST index
> problems.      of course, every one of these incremental updates fixed
> dozens of relatively obscure bugs, you'd need to read the release notes for
> each version between 9.0.8 and 9.0.23 for the complete list.


+1 this -- OP is running EOL version of postgres minus 16 or so bugfix
releases.

merlin


Re: Postgres processes getting stuck (bug?)

От
Alvaro Herrera
Дата:
Ciprian Grigoras wrote:

> Has anyone seen a similar issue (maybe a bug in the 9.0 version ?) where
> simple statements don't finish and get locked there and can't be killed by
> the pg_terminate_backed ? What is the cause of this ?

Interesting.  I wonder if you just wrapped around pg_multixact and the
backend is stuck in the loop there.  I have never heard of anyone with
this problem, and wraparound is supposed to work fine in 9.0.  (As I
recall that code is there since 8.1 so it'd be very surprising that it
would have bugs there and not notice all this time).

Since you've been evidently running with this for years and never seen
this problem, I would be surprised if you ever see it again.  Still, if
it does happen, please do grab a stack trace with GDB on the blocked
processes.  Make sure to have debug symbols.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services