Re: Startup process thrashing

Поиск

Список

Период

Сортировка

От	Greg Smith
Тема	Re: Startup process thrashing
Дата	11 декабря 2008 г. 06:03:06
Msg-id	Pine.GSO.4.64.0812110139310.19736@westnet.com обсуждение исходный текст
Ответ на	Re: Startup process thrashing (Phillip Berry <pberry@stellaconcepts.com>)
Ответы	Re: Startup process thrashing
Список	pgsql-general

Дерево обсуждения

On Thu, 11 Dec 2008, Phillip Berry wrote:

> I'm not running PITR and checkpoint_segments is set to 100 as this is
> home to a very write intensive app.

That's weird then.  It shouldn't ever keep around more than 201 WAL
segments.  I've heard one report of a similarly mysterious excess of them,
from Robert Treat, but that was probably caused by a hardware failure.

> 1. Will copying the pg_xlog dir onto a larger partition cause any
> problems?

Ah, I didn't realize the disk it's on was still 100% full from what you
said.  You're not going to get anywhere until that's sorted out.  You
absolutely can move them all to another disk, you just can't delete them
altogether or split them across logical disks.  The usual technique is
something like this:

cd $PGDATA
mv pg_xlog $BIGGER
ln -s $BIGGER/pg_xlog

And then fire the server up in single-user mode to issue the VACUUM it's
begging for.  Once it's back to running in regular mode again, issue a
"CHECKPOINT" command, and it should erase most of those WAL files.  Then
you could stop the server and reverse the above:

cd $PGDATA
unlink pg_xlog
mv $BIGGER/pg_xlog .

While filling that disk and grinding to a halt is bad, it's not
necessarily any worse then the alternative if you didn't have it on a
dedicated filesystem--that this out of control xlog creation might create
an even greater backlog of segments.

> 2. Is there any reason that vacuuming would create so many WAL files?
> I've looked around and can't find any explaination, though it's a
> problem that seems to have happened before.

Nothing should make that many WAL files, normally the only way you can get
into that position is if you're running PITR recovery but not archiving
the segments--so they just keep piling up but the server can't recycle
them.  What's supposed to happen is that every time there is a checkpoint,
it deletes anything beyond 2*checkpoint_segments+1 worth of segments,
that's where the 201 figure I mentioned comes from.

> 3. I installed pgbouncer yesterday for connection pooling, does the fact
> that it holds connections open have any bearing on how rapidly the
> transaction IDs are used up? As in are transaction IDs in anyway related
> to connections? It's a pretty interesting coincidence.

I'm not really familiar with pgbouncer to comment on the specifics of what
it does.  In general, keeping transactions open longer will impact the
pg_clog section of the database, but that shouldn't cause the WAL to grow
out of control.  Once you get this back into production, you certainly
should carefully monitor how much disk space is being taken up on the WAL
disk moving forward.  If the count of them there gets back over 201 again,
that's an indication there's something that shouldn't be happening going
on.  That's the sort of thing you can get more information about

This isn't a real popular time of day for this list, some get some more
ideas might show up once our many European members start posting in
volume.  As a general aside, if you ever find yourself in this position
again, where you've got an urgent database problem, something you might do
in parallel with posting here is trying the IRC channel:
http://wiki.postgresql.org/wiki/IRC2RWNames

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

В списке pgsql-general по дате отправления:

Предыдущее

От: Maximilian Tyrtania
Дата: 11 декабря 2008 г., 05:37:41
Сообщение: Re: tcp_keepalives_idle setting

Следующее

От: "Tim Uckun"
Дата: 11 декабря 2008 г., 07:41:18
Сообщение: Re: Data Replication

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Startup process thrashing

Предыдущее

Следующее