Discussion: Never kill -9 postgres client processes on Linux... but why not?

Never kill -9 postgres client processes on Linux... but why not?

From
Wells Oliver
Date:
Had an issue tonight where I had a bunch of stalled queries from a client connection and I just... could... not... kill... them. We disconnected the client machine, turned it off, picked it up, shook it around, yelled at it, and still these idle queries remained in pg_stat_activity.

Then I did select pg_cancel_backend(pid) from pg_stat_activity where client_addr = '..' and they just would... not... go.. away.

So me being the big smart system administrator guy with shell access, I logged in, and did a kill -9 xxx where xxx was the same pid from the pg_stat_activity result and... they finally went away!

Felt good about myself until I realized, well, so did every other connection, and in fact PG momentarily went into recovery mode.

Everything was fine, but a) why is it a bad idea to kill -9 a client PG process, but pg_cancel_backend() is OK-- and b) what to do about stalled PG queries that won't die when you disconnect AND when you pg_cancel_backend() them?

Thanks!

--

Re: Never kill -9 postgres client processes on Linux... but why not?

From
Mark Kirkwood
Date:
Hi,


On 19/04/18 16:40, Wells Oliver wrote:
> Had an issue tonight where I had a bunch of stalled queries from a 
> client connection and I just... could... not... kill... them. We 
> disconnected the client machine, turned it off, picked it up, shook it 
> around, yelled at it, and still these idle queries remained in 
> pg_stat_activity.
>
> Then I did select pg_cancel_backend(pid) from pg_stat_activity where 
> client_addr = '..' and they just would... not... go.. away.
>
> So me being the big smart system administrator guy with shell access, 
> I logged in, and did a kill -9 xxx where xxx was the same pid from the 
> pg_stat_activity result and... they finally went away!
>
> Felt good about myself until I realized, well, so did every other 
> connection, and in fact PG momentarily went into recovery mode.
>
> Everything was fine, but a) why is it a bad idea to kill -9 a client 
> PG process, but pg_cancel_backend() is OK-- and b) what to do about 
> stalled PG queries that won't die when you disconnect AND when you 
> pg_cancel_backend() them?
>

Did you try pg_terminate_backend? I'm guessing that might not have 
worked either...but it is worth trying before belting them with kill -9!

Kill -9 leaves the shared memory the backend was using intact but 
possibly corrupted (partial writes etc), so Postgres shuts down 
everything! This is pretty 'normal' in the db world (I think DB2 does 
the same if you kill -9 a connection from the OS).

regards
Mark
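
To make the mechanics above concrete, here is a minimal shell sketch. It is not Postgres code: the `worker` function and the temporary state file are hypothetical stand-ins for a backend and the shared state it touches, and the script assumes a POSIX-like shell on Linux. It only illustrates two things: a process killed with -9 never runs its cleanup handler, and its parent can see that it died abnormally (exit status 128 + 9), which is the kind of event a supervising process such as the postmaster reacts to.

```shell
#!/bin/sh
# Hypothetical stand-in for shared state: "dirty" while the worker is busy,
# reset to "clean" only if the worker gets a chance to clean up.
state_file=$(mktemp)

worker() {
    # TERM can be caught, so the handler below gets to run.
    trap 'echo clean > "$state_file"; exit 0' TERM
    echo dirty > "$state_file"
    while :; do sleep 1; done
}

run_and_signal() {   # $1 = signal name: TERM or KILL
    worker &
    pid=$!
    sleep 1                      # give the worker time to install its trap
    kill -"$1" "$pid"
    status=0
    wait "$pid" || status=$?     # how the parent learns the child's fate
    last_state=$(cat "$state_file")
    echo "$1: exit status $status, shared state $last_state"
}

run_and_signal TERM
run_and_signal KILL
rm -f "$state_file"
```

On Linux the KILL run should report a status above 128 with the state left dirty, while the TERM run exits cleanly. A parent that sees the former cannot know what the child left half-written, which is why shutting everything down and recovering is the safe response.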


Re: Never kill -9 postgres client processes on Linux... but why not?

From
Tim Cross
Date:


On 19 April 2018 at 14:40, Wells Oliver <wells.oliver@gmail.com> wrote:
> Had an issue tonight where I had a bunch of stalled queries from a client connection and I just... could... not... kill... them. We disconnected the client machine, turned it off, picked it up, shook it around, yelled at it, and still these idle queries remained in pg_stat_activity.
>
> Then I did select pg_cancel_backend(pid) from pg_stat_activity where client_addr = '..' and they just would... not... go.. away.
>
> So me being the big smart system administrator guy with shell access, I logged in, and did a kill -9 xxx where xxx was the same pid from the pg_stat_activity result and... they finally went away!
>
> Felt good about myself until I realized, well, so did every other connection, and in fact PG momentarily went into recovery mode.
>
> Everything was fine, but a) why is it a bad idea to kill -9 a client PG process, but pg_cancel_backend() is OK-- and b) what to do about stalled PG queries that won't die when you disconnect AND when you pg_cancel_backend() them?



Not sure about Postgres-specific reasons not to kill -9, but from a more general perspective, kill -9 should only be used once all the more 'polite' kill requests have been attempted, for example kill -TERM. The problem with kill -9 is that it is a hard, non-catchable kill signal: the process gets no opportunity to handle the request and clean up before it quits. This can leave memory unreleased or corrupted, and other resources unreleased as well. Rather than going straight for kill -9, at least try kill -15 (TERM) first. If that doesn't work, then pull out the big gun.
--
regards,

Tim

--
Tim Cross
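
The "polite first, big gun last" order described above can be sketched as a small shell function. This is a generic illustration under stated assumptions, not a Postgres tool: `stop_politely` and its grace period are made-up names, and the demo target is a plain `sleep` process. For a Postgres backend the last step is the one this thread warns about, so there you would likely stop after TERM (the same signal pg_terminate_backend sends) and investigate instead.

```shell
#!/bin/sh
# stop_politely PID [GRACE_SECONDS]   (hypothetical helper)
# Sends TERM, polls for up to GRACE_SECONDS, and only then falls back to KILL.
# The result is left in $outcome and also printed.
stop_politely() {
    pid=$1
    grace=${2:-10}
    if ! kill -TERM "$pid" 2>/dev/null; then
        outcome="not running"
    else
        outcome="needed KILL"
        while [ "$grace" -gt 0 ]; do
            # kill -0 sends nothing; it only tests whether the pid still exists.
            if ! kill -0 "$pid" 2>/dev/null; then
                outcome="exited after TERM"
                break
            fi
            sleep 1
            grace=$((grace - 1))
        done
        if [ "$outcome" = "needed KILL" ]; then
            kill -KILL "$pid" 2>/dev/null || true
        fi
    fi
    echo "pid $pid: $outcome"
}

# Demo against a harmless process that honours TERM.
sleep 30 &
stop_politely "$!" 5
```

The polling loop is deliberately simple: it checks once per second, so a process that exits immediately is still reported about a second later.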

Re: Never kill -9 postgres client processes on Linux... but why not?

From
Ron
Date:

On 04/18/2018 11:53 PM, Mark Kirkwood wrote:
> Hi,
>
>
> On 19/04/18 16:40, Wells Oliver wrote:
>> Had an issue tonight where I had a bunch of stalled queries from a client 
>> connection and I just... could... not... kill... them. We disconnected 
>> the client machine, turned it off, picked it up, shook it around, yelled 
>> at it, and still these idle queries remained in pg_stat_activity.
>>
>> Then I did select pg_cancel_backend(pid) from pg_stat_activity where 
>> client_addr = '..' and they just would... not... go.. away.
>>
>> So me being the big smart system administrator guy with shell access, I 
>> logged in, and did a kill -9 xxx where xxx was the same pid from the 
>> pg_stat_activity result and... they finally went away!
>>
>> Felt good about myself until I realized, well, so did every other 
>> connection, and in fact PG momentarily went into recovery mode.
>>
>> Everything was fine, but a) why is it a bad idea to kill -9 a client PG 
>> process, but pg_cancel_backend() is OK-- and b) what to do about stalled 
>> PG queries that won't die when you disconnect AND when you 
>> pg_cancel_backend() them?
>>
>
> Did you try pg_terminate_backend? I'm guessing that might not have worked 
> either...but it is worth trying before belting them with kill -9!

+1 to pg_terminate_backend.  On the rare occasion pg_cancel_backend doesn't 
work, I hit the pid with pg_terminate_backend, and that always works.

-- 
Angular momentum makes the world go 'round.
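
For reference, the pg_terminate_backend step suggested above might look like the following sketch. The helper name `terminate_sql`, the example address 192.0.2.10 and the database name in the comment are assumptions for illustration; the function only prints the statement so it can be reviewed before being piped to psql. Running it requires a role that is allowed to signal the target backends, such as a superuser.

```shell
#!/bin/sh
# terminate_sql CLIENT_ADDR   (hypothetical helper)
# Prints, but does not run, a statement asking every backend that serves
# CLIENT_ADDR to terminate. The address is pasted straight into the SQL
# text, so only pass values you typed yourself.
terminate_sql() {
    cat <<EOF
SELECT pid, state, pg_terminate_backend(pid) AS signalled
FROM pg_stat_activity
WHERE client_addr = '$1'
  AND pid <> pg_backend_pid();
EOF
}

# Review the output first; if it looks right, pipe it to psql, e.g.:
#   terminate_sql 192.0.2.10 | psql -X -d yourdb
terminate_sql 192.0.2.10
```

The `pid <> pg_backend_pid()` condition keeps the session that runs the statement from terminating itself when it connects from the same address.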


Re: Never kill -9 postgres client processes on Linux... but why not?

From
Jerry Sievers
Date:
Wells Oliver <wells.oliver@gmail.com> writes:

> Had an issue tonight where I had a bunch of stalled queries from a
> client connection and I just... could... not... kill... them. We
> disconnected the client machine, turned it off, picked it up, shook
> it around, yelled at it, and still these idle queries remained in
> pg_stat_activity.

Idle backends won't respond to a cancel since if they're "idle" there's
nothing to cancel :-)

If OTOH you tried pg_terminate_backend and this too drew no response,
then you might have a case of a backend blocked in a critical section
such as SendV, symptomatic of a backend trying to write to a full pipe.

HTH
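
One way to tell the two cases above apart is to look at the state and wait columns before signalling anything. The sketch below assumes PostgreSQL 9.6 or later (for wait_event_type and wait_event); the helper name `activity_sql` and the example address are made up, and the function only prints the query so it can be piped to psql. For a row in state 'idle', the query column holds the last statement that ran, not one that is still running, which is why there is nothing for pg_cancel_backend to cancel.

```shell
#!/bin/sh
# activity_sql CLIENT_ADDR   (hypothetical helper)
# Prints a query listing what each backend for CLIENT_ADDR is doing.
# The address is pasted into the SQL text; pass trusted input only.
activity_sql() {
    cat <<EOF
SELECT pid, state, wait_event_type, wait_event,
       now() - state_change AS in_state_for,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE client_addr = '$1'
ORDER BY state_change;
EOF
}

# Example: activity_sql 192.0.2.10 | psql -X -d yourdb
activity_sql 192.0.2.10
```

A backend that stays 'active' with a client-related wait event after a terminate request would fit the blocked-write case described above, though that reading is an interpretation rather than something the view states outright.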

>
> Then I did select pg_cancel_backend(pid) from pg_stat_activity where
> client_addr = '..' and they just would... not... go.. away.
>
> So me being the big smart system administrator guy with shell access,
> I logged in, and did a kill -9 xxx where xxx was the same pid from the
> pg_stat_activity result and... they finally went away!
>
> Felt good about myself until I realized, well, so did every other
> connection, and in fact PG momentarily went into recovery mode.
>
> Everything was fine, but a) why is it a bad idea to kill -9 a client
> PG process, but pg_cancel_backend() is OK-- and b) what to do about
> stalled PG queries that won't die when you disconnect AND when you
> pg_cancel_backend() them?
>
> Thanks!
>
> --
> Wells Oliver
> wells.oliver@gmail.com
>
>

-- 
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consulting@comcast.net
p: 312.241.7800