Thread: vacuum won't even start

vacuum won't even start

From
Jean-Christophe Praud
Date:
Hi all,

I have a problem on a heavily loaded database: vacuums haven't worked for
about a week. All I get is:

mybase=# vacuum verbose analyze public.mytable;
INFO:  vacuuming "public.mytable"
(I stop it after hours)

Looking with top and iotop, I see the process takes some cpu and disk io
time during several minutes, then it seems to fall asleep.
The process isn't locked according to pg_stat_activity.

My setup:
- postgresql 8.3.7 with contribs ltree and pgcrypto
- OS: debian etch kernel 2.6.24
- HW: 8-core Xeon / 32GB RAM / 3 RAID10 volumes (index, data, pgxlog)
- database size: about 240GB
- millions of queries/day
- 1000 locks continually
- about 200 simultaneous connections
- load: 30%iowait, 60%user, 10%sys


Autovacuum is disabled to prevent it from loading the server during peak
hours.
Regular vacuums run each night as a cron job.

For about a week the nightly vacuums haven't worked. I tried manual ones
to no avail, with the same symptoms as above on small tables (350 rows)
as well as on big ones (almost 1 billion rows).

As the cron-scheduled vacuums no longer run, I now see autovacuums (to
prevent wraparound) running all the time, but their processes don't use
any CPU time or disk I/O.

Autovacuum seems to work well on the pg_catalog schema.

The problem seems to have started with some queries that have been
running for more than 15 hours. I tried to kill them (signal 15), to no
avail.

I can't restart the server as it's a big production server.

We're planning to upgrade the hardware soon, but I suspect we'll have
the same problems in the future as our platform is growing.

Does anyone have any info about this problem, and how to prevent it?

Thanks in advance.

Regards,


--
JC
Ph'nglui  mglw'nafh  Cthulhu  n'gah  Bill  R'lyeh  Wgah'nagl fhtagn!


Re: vacuum won't even start

From
Alvaro Herrera
Date:
Jean-Christophe Praud wrote:
> Hi all,
>
> I've a problem on a heavy loaded database: vacuums don't work since
> about a week. All I got is:
>
> mybase=# vacuum verbose analyze public.mytable;
> INFO:  vacuuming "public.mytable"
> (I stop it after hours)
>
> Looking with top and iotop, I see the process takes some cpu and
> disk io time during several minutes, then it seems to fall asleep.
> The process isn't locked according to pg_stat_activity.

What are your vacuum_cost_% parameters?

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: vacuum won't even start

From
Tom Lane
Date:
Jean-Christophe Praud <jc@steek.com> writes:
> I've a problem on a heavy loaded database: vacuums don't work since
> about a week. All I got is:

> mybase=# vacuum verbose analyze public.mytable;
> INFO:  vacuuming "public.mytable"
> (I stop it after hours)

> Looking with top and iotop, I see the process takes some cpu and disk io
> time during several minutes, then it seems to fall asleep.
> The process isn't locked according to pg_stat_activity.

When vacuum wants to clean up a particular table page, it will wait
until no other process is examining that page; and this wait is not
visible in pg_locks.  Perhaps you have got some queries referencing
those tables that have stopped midway and are just sitting?

Although pg_locks won't immediately show the wait, it could be useful
to help identify the culprit --- look for other processes holding
any type of lock on the table the vacuum is stuck on, and then go to
pg_stat_activity to see how old their current query is.

            regards, tom lane
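
The check described above (find who else holds a lock on the table, then see how old their current query is) could be sketched roughly as follows. This is only an illustration, not something posted in the thread: the query uses the PostgreSQL 8.3 column names (procpid, current_query), 'public.mytable' is the placeholder table name from the original post, and the helper name stale_holders and the one-hour threshold are assumptions. The rows are expected as dicts keyed by column name, however they were fetched.

```python
from datetime import timedelta

# Query against the PostgreSQL 8.3 catalogs discussed in this thread.
# procpid and current_query are the 8.3-era pg_stat_activity column names.
# 'public.mytable' is the placeholder table name from the original post.
LOCK_HOLDERS_SQL = """
SELECT l.pid, l.mode, l.granted, a.query_start, a.current_query
FROM pg_locks l
JOIN pg_stat_activity a ON a.procpid = l.pid
WHERE l.locktype = 'relation'
  AND l.relation = 'public.mytable'::regclass
  AND l.database = (SELECT oid FROM pg_database
                    WHERE datname = current_database())
  AND l.pid <> pg_backend_pid()
"""


def stale_holders(rows, now, min_age=timedelta(hours=1)):
    """Return (pid, mode, age) for granted locks whose current query is at
    least min_age old, oldest query first.

    min_age is an arbitrary threshold chosen for illustration.
    """
    found = [
        (r["pid"], r["mode"], now - r["query_start"])
        for r in rows
        if r["granted"] and now - r["query_start"] >= min_age
    ]
    return sorted(found, key=lambda item: item[2], reverse=True)
```

Running LOCK_HOLDERS_SQL in the affected database and passing the result to stale_holders() would list the oldest lock holders first, which are the most likely candidates for the sessions "just sitting" on the table.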

Re: vacuum won't even start

From
Jean-Christophe Praud
Date:
Alvaro Herrera wrote:
> Jean-Christophe Praud wrote:
> > Hi all,
> >
> > I've a problem on a heavy loaded database: vacuums don't work since
> > about a week. All I got is:
> >
> > mybase=# vacuum verbose analyze public.mytable;
> > INFO:  vacuuming "public.mytable"
> > (I stop it after hours)
> >
> > Looking with top and iotop, I see the process takes some cpu and
> > disk io time during several minutes, then it seems to fall asleep.
> > The process isn't locked according to pg_stat_activity.
>
> What are your vacuum_cost_% parameters?

I've left the default values (not even uncommented in the conf file ;) ):

#vacuum_cost_delay = 0                  # 0-1000 milliseconds
#vacuum_cost_page_hit = 1               # 0-10000 credits
#vacuum_cost_page_miss = 10             # 0-10000 credits
#vacuum_cost_page_dirty = 20            # 0-10000 credits
#vacuum_cost_limit = 200                # 1-10000 credits


-- 
JC
Ph'nglui  mglw'nafh  Cthulhu  n'gah  Bill  R'lyeh  Wgah'nagl fhtagn!
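
With vacuum_cost_delay left at 0, as in the settings quoted above, cost-based vacuum delay is switched off, so a manual VACUUM is not being throttled by these parameters (autovacuum has its own autovacuum_vacuum_cost_delay setting). For comparison, here is a rough sketch of the ceiling that throttling would impose with a nonzero delay. The function name is hypothetical, and the model is deliberately simplified: it assumes every page incurs the same cost and ignores the time spent on the actual work between sleeps.

```python
def max_pages_per_second(cost_delay_ms, cost_limit=200, cost_per_page=10):
    """Rough upper bound on pages vacuum can process per second under
    cost-based delay, or None when throttling is disabled.

    Defaults mirror vacuum_cost_limit = 200 and vacuum_cost_page_miss = 10
    from the configuration quoted above. Simplification: every page is
    assumed to cost cost_per_page credits.
    """
    if cost_delay_ms == 0:
        return None  # vacuum_cost_delay = 0 disables cost-based delay
    pages_per_round = cost_limit / cost_per_page   # pages before each sleep
    rounds_per_second = 1000 / cost_delay_ms       # sleeps per second
    return pages_per_round * rounds_per_second
```

For example, max_pages_per_second(0) returns None (no throttling), while a hypothetical 20 ms delay with all page misses gives 1000.0 pages per second at most.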

Re: vacuum won't even start

From
Jean-Christophe Praud
Date:
Tom Lane wrote:
> Jean-Christophe Praud <jc@steek.com> writes:
> > I've a problem on a heavy loaded database: vacuums don't work since
> > about a week. All I got is:
> >
> > mybase=# vacuum verbose analyze public.mytable;
> > INFO:  vacuuming "public.mytable"
> > (I stop it after hours)
> >
> > Looking with top and iotop, I see the process takes some cpu and disk io
> > time during several minutes, then it seems to fall asleep.
> > The process isn't locked according to pg_stat_activity.
>
> When vacuum wants to clean up a particular table page, it will wait
> until no other process is examining that page; and this wait is not
> visible in pg_locks.  Perhaps you have got some queries referencing
> those tables that have stopped midway and are just sitting?
>
> Although pg_locks won't immediately show the wait, it could be useful
> to help identify the culprit --- look for other processes holding
> any type of lock on the table the vacuum is stuck on, and then go to
> pg_stat_activity to see how old their current query is.
>
>             regards, tom lane

Indeed, the tables I tried to vacuum have locks on them: AccessShareLock
locks belonging to queries that seem to be sleeping. I tried to kill these
queries, but pg_cancel_backend() has no effect, and the processes don't
react to signal 15.

How can I get rid of these blocking queries without restarting the
server? They are not listed as "waiting" in pg_stat_activity.

These queries are MOVE FORWARD on cursors; the underlying query is a
rather complex one (unions, joins, function calls).

Regards,

-- 
JC
Ph'nglui  mglw'nafh  Cthulhu  n'gah  Bill  R'lyeh  Wgah'nagl fhtagn!

Re: vacuum won't even start

From
Tom Lane
Date:
Jean-Christophe Praud <jc@steek.com> writes:
> Indeed, the tables I tried to vacuum have locks on them.
> AccessShareLock belonging to queries which seem sleeping. I tried to
> kill these queries but pg_cancel_backend() has no effect, and the
> process doesn't get the 15 signal.

> How can I get rid of these blocking queries without restarting the
> server ? They are not listed as "waiting" in pg_stat_activity.

Have you tried killing the connected client sessions?

            regards, tom lane

Re: vacuum won't even start

From
Jean-Christophe Praud
Date:
Tom Lane wrote:
> Jean-Christophe Praud <jc@steek.com> writes:
> > Indeed, the tables I tried to vacuum have locks on them.
> > AccessShareLock belonging to queries which seem sleeping. I tried to
> > kill these queries but pg_cancel_backend() has no effect, and the
> > process doesn't get the 15 signal.
> >
> > How can I get rid of these blocking queries without restarting the
> > server ? They are not listed as "waiting" in pg_stat_activity.
>
> Have you tried killing the connected client sessions?
>
>             regards, tom lane

It works!

I had pgbouncer connections hanging for several days.

Thanks for your help :)

Regards,

-- 
JC
Ph'nglui  mglw'nafh  Cthulhu  n'gah  Bill  R'lyeh  Wgah'nagl fhtagn!
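
Since the fix in this thread was to kill the client side of the stuck connections, it may help to map long-running backends to the client address and port they came from, so the matching connections can be found on the client host. The following is a minimal sketch under assumptions: the query uses PostgreSQL 8.3 column names (procpid, current_query), the one-hour cutoff is arbitrary, and the helper name group_by_client is made up for illustration. Rows are expected as dicts keyed by column name.

```python
from collections import defaultdict

# PostgreSQL 8.3 column names (procpid, current_query); in this release a
# plain idle session is reported with current_query = '<IDLE>'.
# The one-hour cutoff is an arbitrary value chosen for illustration.
CLIENT_SESSIONS_SQL = """
SELECT procpid, client_addr, client_port, backend_start, query_start,
       current_query
FROM pg_stat_activity
WHERE current_query <> '<IDLE>'
  AND query_start < now() - interval '1 hour'
ORDER BY query_start
"""


def group_by_client(rows):
    """Map each client address to its (client_port, procpid) pairs, so the
    corresponding connections can be looked up on the client host."""
    clients = defaultdict(list)
    for r in rows:
        clients[r["client_addr"]].append((r["client_port"], r["procpid"]))
    return dict(clients)
```

Run periodically (for instance from the same cron setup that drives the nightly vacuums), a report like this could flag hung client connections before they block vacuum for days.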