Обсуждение: vacuum, dead rows, usual solutions didn't help

Поиск
Список
Период
Сортировка

vacuum, dead rows, usual solutions didn't help

От
Gábor Farkas
Дата:
hi,

i have a postgresql-8.2.4 db,

and vacuuming it does not remove the dead rows


basically, the problem is this part of the vacuum-output:

"
HINT:  Close open transactions soon to avoid wraparound problems.
INFO:  vacuuming "public.sessions"
INFO:  scanned index "sessions_pkey" to remove 2 row versions
DETAIL:  CPU 0.60s/0.25u sec elapsed 61.57 sec.
INFO:  "sessions": removed 2 row versions in 2 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "sessions_pkey" now contains 6157654 row versions in 52923
pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "sessions": found 2 removable, 6157654 nonremovable row versions
in 478069 pages
DETAIL:  6155746 dead row versions cannot be removed yet.
There were 8735 unused item pointers.
107 pages contain useful free space.
0 pages are entirely empty.
CPU 6.02s/1.58u sec elapsed 598.05 sec.
WARNING:  oldest xmin is far in the past
HINT:  Close open transactions soon to avoid wraparound problems.
INFO:  vacuuming "pg_toast.pg_toast_5525738"
INFO:  index "pg_toast_5525738_index" now contains 13957669 row versions
in 38328 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.40s/0.04u sec elapsed 22.26 sec.
INFO:  "pg_toast_5525738": found 0 removable, 13957669 nonremovable row
versions in 3461686 pages
DETAIL:  13938280 dead row versions cannot be removed yet.
There were 154 unused item pointers.
69 pages contain useful free space.
0 pages are entirely empty.
CPU 39.95s/6.19u sec elapsed 1139.50 sec.
INFO:  analyzing "public.sessions"
INFO:  "sessions": scanned 3000 of 478438 pages, containing 12 live rows
and 38419 dead rows; 12 rows in sample, 1914 estimated total rows
INFO:  free space map contains 26849 pages in 444 relations
DETAIL:  A total of 30736 page slots are in use (including overhead).
30736 page slots are required to track all free space.
Current limits are:  153600 page slots, 1000 relations, using 1005 kB.
"

(the full vacuum-log is at http://www.nekomancer.net/tmp/vacuum.txt)

the "sessions" table hold session-data for a web-application (the code
uses the perl Apache::Session module btw.), so it
changes very often, and is vacuumed every hour (using a cronjob).

previously we were running this application with postgresql-7.4, and
there the vacuuming worked fine. now we migrated this to
postgresql-8.2.4, and it does not want to vacuum it properly.

the migration to 8.2.4 happened approx. one month ago, and this dead-row
count has been growing since then.

what i tried:

ps aux | grep postgres on the db-server, and found some connections that
were quite old. i restarted the applications that "caused" those
connections, so right now there are no too old connections.

pg_stat_activity: the query_start of every entry is on today, for the
entries with null query_start the postgres processes are not older than
2 days.

on the db-server, 4 postgres processes are "idle in transaction", but
none is older than 2 days.

in pg_locks, all the locks that are for the "sessions" table are from
"young" (today-created) connections, and their locks are RowShareLock or
AccessShareLock.


so currently i am out of ideas what to check...

well, actually there is one more idea: maybe the autovacuuming process
somehow "conflicts" with the manual-vacuuming cronjob? is that possible?

any other ideas?

thanks,
gabor

Re: vacuum, dead rows, usual solutions didn't help

От
Tom Lane
Дата:
=?iso-8859-1?Q?G=E1bor?= Farkas <gabor@nekomancer.net> writes:
> basically, the problem is this part of the vacuum-output:

> INFO:  "sessions": found 2 removable, 6157654 nonremovable row versions
> in 478069 pages
> DETAIL:  6155746 dead row versions cannot be removed yet.

The problem is that you've got some old open transactions that could
potentially see those recently-dead rows, so VACUUM can't remove the
rows without breaking MVCC rules for those transactions.

Find the clients that are holding open transactions, and zap 'em.

> previously we were running this application with postgresql-7.4, and
> there the vacuuming worked fine. now we migrated this to
> postgresql-8.2.4, and it does not want to vacuum it properly.

I wonder whether you updated the client-side support libraries?

            regards, tom lane

Re: vacuum, dead rows, usual solutions didn't help

От
"Joshua D. Drake"
Дата:
Gábor Farkas wrote:
> hi,
>
> i have a postgresql-8.2.4 db,
>
> and vacuuming it does not remove the dead rows
>
>
> basically, the problem is this part of the vacuum-output:

> on the db-server, 4 postgres processes are "idle in transaction", but
> none is older than 2 days.

If you have something idle in transaction, your vacuums are useless. You
need to fix your app.

Sincerely,

Joshua D. Drake



Re: vacuum, dead rows, usual solutions didn't help

От
Gábor Farkas
Дата:
Tom Lane wrote:
> =?iso-8859-1?Q?G=E1bor?= Farkas <gabor@nekomancer.net> writes:
>> basically, the problem is this part of the vacuum-output:
>
>> INFO:  "sessions": found 2 removable, 6157654 nonremovable row versions
>> in 478069 pages
>> DETAIL:  6155746 dead row versions cannot be removed yet.
>
> The problem is that you've got some old open transactions that could
> potentially see those recently-dead rows, so VACUUM can't remove the
> rows without breaking MVCC rules for those transactions.
>
> Find the clients that are holding open transactions, and zap 'em.

well, that's the problem :-)

if i do a "ps aux | grep postgres" on the db-server, then

(ignoring the bin/postgres, writer-process and stats-collector-process)

then the oldest process is 3 days old.

but this dead-row-issue we have since weeks.

is it possible that a recently opened connection is blocking the
vacuuming of older-dead-rows?

>
>> previously we were running this application with postgresql-7.4, and
>> there the vacuuming worked fine. now we migrated this to
>> postgresql-8.2.4, and it does not want to vacuum it properly.
>
> I wonder whether you updated the client-side support libraries?

yes, they were updated too.


a related question:

on the db-server, let's say i have 3 databases: A, B, C.

if i have problems vacuuming B, then i only have to look for
processes/transactions touching B, correct? in other words, if  i do a
"ps aux | grep postgres", then i see also the username and the
database-name in the process-list. and if a postgres-process is not
touching B, then i do not have to check it. is this correct?

gabor

Re: vacuum, dead rows, usual solutions didn't help

От
Gábor Farkas
Дата:
Joshua D. Drake wrote:
> Gábor Farkas wrote:
>> hi,
>>
>> i have a postgresql-8.2.4 db,
>>
>> and vacuuming it does not remove the dead rows
>>
>>
>> basically, the problem is this part of the vacuum-output:
>
>> on the db-server, 4 postgres processes are "idle in transaction", but
>> none is older than 2 days.
>
> If you have something idle in transaction, your vacuums are useless. You
> need to fix your app.
>

maybe i described it the wrong way:

there is only 1 process, that is constantly "idle in transaction".

the remaining 3 were only idle-in-transaction at that point. so if i
would keep checking for idle-in-transaction processes, the list of them
would keep changing.

are you saying, that a process should NEVER be idle-in-transaction? not
even for a short time? (like some seconds?)

also, even if it is wrong, can an 'idle-in-transaction' connection that
was opened today block the vacuuming of rows that were deleted yesterday?


thanks,
gabor

Re: vacuum, dead rows, usual solutions didn't help

От
Simon Riggs
Дата:
On Thu, 2008-01-10 at 07:52 +0100, Gábor Farkas wrote:

> the remaining 3 were only idle-in-transaction at that point. so if i
> would keep checking for idle-in-transaction processes, the list of them
> would keep changing.
>
> are you saying, that a process should NEVER be idle-in-transaction? not
> even for a short time? (like some seconds?)

It's OK to be idle-in-transaction, but not OK for that state to last for
days.

> also, even if it is wrong, can an 'idle-in-transaction' connection that
> was opened today block the vacuuming of rows that were deleted yesterday?

Yes, if the rows were deleted after the connection started.

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


Re: vacuum, dead rows, usual solutions didn't help

От
Gábor Farkas
Дата:
Simon Riggs wrote:
>
>> also, even if it is wrong, can an 'idle-in-transaction' connection that
>> was opened today block the vacuuming of rows that were deleted yesterday?
>
> Yes, if the rows were deleted after the connection started.
>

to avoid any potential misunderstandings, i will summarize the situation:

1. the vacuum-cronjob refuses to remove dead rows since 1.jan.2008.

2. i know that no postgres-process is older than 7.jan.2008. (from "ps
aux | grep postgres", and except the postgres-system-processes)

how can this happen?


gabor

Re: vacuum, dead rows, usual solutions didn't help

От
Lars Heidieker
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On 10 Jan 2008, at 11:18, Gábor Farkas wrote:

> Simon Riggs wrote:
>>> also, even if it is wrong, can an 'idle-in-transaction'
>>> connection that was opened today block the vacuuming of rows that
>>> were deleted yesterday?
>> Yes, if the rows were deleted after the connection started.
>
> to avoid any potential misunderstandings, i will summarize the
> situation:
>
> 1. the vacuum-cronjob refuses to remove dead rows since 1.jan.2008.
>
> 2. i know that no postgres-process is older than 7.jan.2008. (from
> "ps aux | grep postgres", and except the postgres-system-processes)
>
> how can this happen?
>
>

To my understanding the question how old the processes are is only
partially of interest,
if a process touches those rows in a transaction just before the
vacuum runs, it can't remove those rows.
So all you need to get in this situation is a transaction that
touches the rows in a transaction and
keeps the transaction alive before vacuum runs.

- --

Viele Grüße,
Lars Heidieker

lars@heidieker.de
http://paradoxon.info

- ------------------------------------

Mystische Erklärungen:
Die mystischen Erklärungen gelten für tief;
die Wahrheit ist, dass sie noch nicht einmal oberflächlich sind.

   -- Friedrich Nietzsche
   [ Die Fröhliche Wissenschaft Buch 3, 126 ]




-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)

iD8DBQFHhfwXcxuYqjT7GRYRAoKcAKCZgW/RI9rWN0/Gkd+c7F3T4WmV0gCg4Y6p
VBxOBw50HJYsHBPFUjuaPa4=
=8d+w
-----END PGP SIGNATURE-----

Re: vacuum, dead rows, usual solutions didn't help

От
Simon Riggs
Дата:
On Thu, 2008-01-10 at 11:18 +0100, Gábor Farkas wrote:
> Simon Riggs wrote:
> >
> >> also, even if it is wrong, can an 'idle-in-transaction' connection that
> >> was opened today block the vacuuming of rows that were deleted yesterday?
> >
> > Yes, if the rows were deleted after the connection started.
> >
>
> to avoid any potential misunderstandings, i will summarize the situation:
>
> 1. the vacuum-cronjob refuses to remove dead rows since 1.jan.2008.
>
> 2. i know that no postgres-process is older than 7.jan.2008. (from "ps
> aux | grep postgres", and except the postgres-system-processes)
>
> how can this happen?

They might be different set of dead rows, just roughly the same numbers
each day.

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


Re: vacuum, dead rows, usual solutions didn't help

От
Erik Jones
Дата:
On Jan 10, 2008, at 6:01 AM, Simon Riggs wrote:

> On Thu, 2008-01-10 at 11:18 +0100, Gábor Farkas wrote:
>> Simon Riggs wrote:
>>>
>>>> also, even if it is wrong, can an 'idle-in-transaction'
>>>> connection that
>>>> was opened today block the vacuuming of rows that were deleted
>>>> yesterday?
>>>
>>> Yes, if the rows were deleted after the connection started.
>>>
>>
>> to avoid any potential misunderstandings, i will summarize the
>> situation:
>>
>> 1. the vacuum-cronjob refuses to remove dead rows since 1.jan.2008.
>>
>> 2. i know that no postgres-process is older than 7.jan.2008. (from
>> "ps
>> aux | grep postgres", and except the postgres-system-processes)
>>
>> how can this happen?
>
> They might be different set of dead rows, just roughly the same
> numbers
> each day.

Or, put another way, this is probably the same problem recurring, not
one constant instance of the issue.

Erik Jones

DBA | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com




Re: vacuum, dead rows, usual solutions didn't help

От
Gábor Farkas
Дата:
Erik Jones wrote:
>
> On Jan 10, 2008, at 6:01 AM, Simon Riggs wrote:
>
>> On Thu, 2008-01-10 at 11:18 +0100, Gábor Farkas wrote:
>>> Simon Riggs wrote:
>>>>
>>>>> also, even if it is wrong, can an 'idle-in-transaction' connection
>>>>> that
>>>>> was opened today block the vacuuming of rows that were deleted
>>>>> yesterday?
>>>>
>>>> Yes, if the rows were deleted after the connection started.
>>>>
>>>
>>> to avoid any potential misunderstandings, i will summarize the
>>> situation:
>>>
>>> 1. the vacuum-cronjob refuses to remove dead rows since 1.jan.2008.
>>>
>>> 2. i know that no postgres-process is older than 7.jan.2008. (from "ps
>>> aux | grep postgres", and except the postgres-system-processes)
>>>
>>> how can this happen?
>>
>> They might be different set of dead rows, just roughly the same numbers
>> each day.
>
> Or, put another way, this is probably the same problem recurring, not
> one constant instance of the issue.


unfortunately, i do not think that's the case, here is why:

this vacuum-process is running every hour, and i have the logs from
roughly 450 vacuum runs.

so, for one specific table, that had these unremovable rows:


the number of "removable dead rows" was between 0 and 11,
and the number of "unremovable dead rows" grew by a number between 0 and
41106 every hour (it was three times zero, and the rest was between 86
and 41106).

so i do not think it happened with different rows, just roughly the same
number.


on the good side, we changed the code for that one process, that kept
being in "idle in transaction", and now the vacuuming works nicely.

and this is still a mystery for me, because i understand that
idle-in-transaction is wrong, but even so, a process that i start today,
in my opinion simply cannot block the recovery of dead rows, that were
deleted yesterday.

but i'm probably misunderstanding something, so if i will have some more
time for this in the future, i will read more about mvcc, and maybe
start a thread here :-)

thanks for all your help,

gabor

Re: vacuum, dead rows, usual solutions didn't help

От
Tom Lane
Дата:
=?ISO-8859-1?Q?G=E1bor_Farkas?= <gabor@nekomancer.net> writes:
> and this is still a mystery for me, because i understand that
> idle-in-transaction is wrong, but even so, a process that i start today,
> in my opinion simply cannot block the recovery of dead rows, that were
> deleted yesterday.

Well, it's not one but two longest-transaction-lifetimes.  That is,
take the oldest transaction that's running now, and consider the
oldest transaction that was running when it started.  VACUUM can
reclaim rows that were deleted by transactions that started before
that one.  You didn't explain your application's behavior exactly,
but does that help?

            regards, tom lane