Discussion: Backends "hanging" with strace showing selects?
hi
had strange situation today.
very high load, cpu saturated (and this machine has lots of cores).
i straced one of the backends that was using lots of cpu (it was running some
SELECT query, but I don't know which one, as i wasn't able to start psql).
strace looked like this:
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
i.e. lots (literally hundreds) of such lines, with new ones being added very quickly.
pg version is:
PostgreSQL 8.3.12 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48)
I know it's not much information, but perhaps it will ring someone's bell, and there will be a ready answer as to what went
wrong?
Best regards,
depesz
--
Linkedin: http://www.linkedin.com/in/depesz / blog: http://www.depesz.com/
jid/gtalk: depesz@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007
hubert depesz lubaczewski <depesz@depesz.com> writes:
> hi
> had strange situation today.
> very high load, cpu saturated (and this machine has lots of cores).
> i straced one of backends that was using lots of cpu (it was doing some
> select, but I don't know what as i wasn't able to start psql).
> strace looked like this:
> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
That suggests a lot of contention for a spinlock, but without any
information about what the system was really doing, it's hard to go
further than that.
regards, tom lane
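
For context on why this strace pattern points at a spinlock: a select() call with no file descriptors and only a timeout is a common portable way to sleep for a sub-second interval, and a spinlock that cannot be acquired after some busy-looping typically backs off by sleeping briefly before retrying. The C below is a minimal sketch of that spin-then-sleep pattern under stated assumptions, not PostgreSQL's actual code. All names (demo_spinlock, demo_lock, demo_unlock, demo_sleep_usec, DEMO_SPINS_PER_DELAY, DEMO_SLEEP_USEC) are hypothetical, and the max_sleeps give-up bound exists only so the demo terminates.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical tuning values for this sketch, not PostgreSQL's real settings. */
#define DEMO_SPINS_PER_DELAY 100   /* busy-loop attempts before each sleep */
#define DEMO_SLEEP_USEC      1000  /* 1 ms, the {0, 1000} timeout seen in the trace */

typedef struct {
    atomic_flag held;              /* set while some process owns the lock */
} demo_spinlock;

/* Sleep by calling select() with no file descriptors, only a timeout. */
void demo_sleep_usec(long usec)
{
    struct timeval tv;

    tv.tv_sec = usec / 1000000L;
    tv.tv_usec = usec % 1000000L;
    (void) select(0, NULL, NULL, NULL, &tv);
}

/*
 * Try to take the lock: spin DEMO_SPINS_PER_DELAY times, then sleep 1 ms
 * and start spinning again. Gives up once max_sleeps sleeps have been used.
 * Returns 1 if the lock was acquired, 0 if it gave up.
 * *sleeps receives the number of select() sleeps that were needed.
 */
int demo_lock(demo_spinlock *lock, unsigned max_sleeps, unsigned *sleeps)
{
    *sleeps = 0;
    for (;;) {
        for (int i = 0; i < DEMO_SPINS_PER_DELAY; i++) {
            /* test-and-set returns the previous value: false means we got it */
            if (!atomic_flag_test_and_set(&lock->held))
                return 1;
        }
        if (*sleeps >= max_sleeps)
            return 0;
        demo_sleep_usec(DEMO_SLEEP_USEC);
        (*sleeps)++;
    }
}

/* Release the lock so that another waiter's test-and-set can succeed. */
void demo_unlock(demo_spinlock *lock)
{
    atomic_flag_clear(&lock->held);
}
```

In this sketch, while the lock stays held by someone else, each pass through the outer loop burns CPU in the inner spin and then appears in strace as one select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout) line, which is consistent with the combination of high CPU use and repeated timeouts in the report.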
On Mon, Nov 15, 2010 at 02:52:16PM -0500, Tom Lane wrote:
> hubert depesz lubaczewski <depesz@depesz.com> writes:
> > hi
> > had strange situation today.
>
> > very high load, cpu saturated (and this machine has lots of cores).
>
> > i straced one of backends that was using lots of cpu (it was doing some
> > select, but I don't know what as i wasn't able to start psql).
>
> > strace looked like this:
> > select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
> > select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
> > select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
> > select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
>
> That suggests a lot of contention for a spinlock, but without any
> information about what the system was really doing, it's hard to go
> further than that.
We had ~700 active connections, but it is virtually impossible to tell
what they were doing, as I couldn't connect to get pg_stat_activity.
Best regards,
depesz