Re: Odd blocking (or massively latent) issue - even with EXPLAIN

Поиск
Список
Период
Сортировка
От Jim Vanns
Тема Re: Odd blocking (or massively latent) issue - even with EXPLAIN
Дата
Msg-id 1343054763.12579.110.camel@sys367.ldn.framestore.com
обсуждение исходный текст
Ответ на Re: Odd blocking (or massively latent) issue - even with EXPLAIN  ("Martin French" <Martin.French@romaxtech.com>)
Ответы Re: Odd blocking (or massively latent) issue - even with EXPLAIN  (Craig Ringer <ringerc@ringerc.id.au>)
Re: Odd blocking (or massively latent) issue - even with EXPLAIN  ("Martin French" <Martin.French@romaxtech.com>)
Список pgsql-performance
Thank you all for your replies, I shall try and qualify and confirm...

On Mon, 2012-07-23 at 14:46 +0100, Martin French wrote:
> Hi
>
> >
> > We're seeing SELECT statements and even EXPLAIN (no ANAYLZE)
> > statements hang indefinitely until *something* (we don't know what)
> > releases some kind of resource or no longer becomes a massive bottle
> > neck. These are the symptoms.
>
> Is this in pgAdmin? Or psql on the console?
>
psql

> > However, the system seems healthy - no table ('heavyweight') locks
> are
> > held by any session (this happens with only a few connected
> sessions),
> > all indexes are used correctly, other transactions are writing data
> (we
> > generally only have a few sessions running at a time - perhaps 10)
> etc.
> > etc. In fact, we are writing (or bgwriter is), 2-3 hundred MB/s
> > sometimes.
>
> What is shown in "top" and "iostat" whilst the queries are running?

Generally, lots of CPU churn (90-100%) and a fair bit of I/O wait.
iostat reports massive reads (up to 300MB/s).

> >
> > We regularly run vacuum analyze at quiet periods - generally 1-2s
> daily.

(this is to answer to someone who didn't reply to the list)

We run full scans using vacuumdb so don't just rely on autovacuum. The
small table is so small (<50 tuples) a sequence scan is always
performed.

> > These sessions (that only read data) that are blocked can block from
> > anything from between only 5 minutes to 10s of hours then
> miraculously
> > complete successfully at once.
> >
>
> Are any "blockers" shown in pg_stat_activity?

None. Ever. Nothing in pg_locks either.

> >
> > checkpoint_segments = 128
> > maintenance_work_mem = 256MB
> > synchronous_commit = off
> > random_page_cost = 3.0
> > wal_buffers = 16MB
> > shared_buffers = 8192MB
> > checkpoint_completion_target = 0.9
> > effective_cache_size = 18432MB
> > work_mem = 32MB
> > effective_io_concurrency = 12
> > max_stack_depth = 8MB
> > log_autovacuum_min_duration = 0
> > log_lock_waits = on
> > autovacuum_vacuum_scale_factor = 0.1
> > autovacuum_naptime = 8
> > autovacuum_max_workers = 4
>
> Memory looks reasonably configured to me. effective_cache_size is only
> an indication to the planner and is not actually allocated.

I realise that.

> Is anything being written to the logfiles?

Nothing obvious - and we log a fair amount. No tmp table creations,
no locks held.

To add to this EXPLAIN reports it took only 0.23ms to run (for example)
whereas the wall clock time is more like 20-30 minutes (or up to n hours
as I said where everything appears to click back into place at the same
time).

Thanks.

Jim

> Cheers============================================= Romax Technology
> Limited Rutherford House Nottingham Science & Technology Park
> Nottingham, NG7 2PZ England Telephone numbers: +44 (0)115 951 88 00
> (main) For other office locations see:
> http://www.romaxtech.com/Contact =================================
> =============== E-mail: info@romaxtech.com Website: www.romaxtech.com
> ================================= ================ Confidentiality
> Statement This transmission is for the addressee only and contains
> information that is confidential and privileged. Unless you are the
> named addressee, or authorised to receive it on behalf of the
> addressee you may not copy or use it, or disclose it to anyone else.
> If you have received this transmission in error please delete from
> your system and contact the sender. Thank you for your cooperation.
> =================================================
>

--
Jim Vanns
Systems Programmer
Framestore


В списке pgsql-performance по дате отправления:

Предыдущее
От: Andrew Dunstan
Дата:
Сообщение: Re: Odd blocking (or massively latent) issue - even with EXPLAIN
Следующее
От: Jim Vanns
Дата:
Сообщение: Re: Odd blocking (or massively latent) issue - even with EXPLAIN