Discussion: Weird error
I have a Postgres application running right now. The thing is constantly
doing 3-5 updates/sec and 1-2 multi-join selects/sec, and performance is
actually doing all right. Unfortunately, as the system runs, performance
degrades, which I guess has been documented, although I still don't
understand why.

To work around this, I have a cron job that runs every hour and vacuum
analyzes the three tables that are actually updated significantly. Most
of the time it works fine, but recently I've been getting this error:

NOTICE: Child itemid in update-chain marked as unused - can't continue
repair_frag

What causes this and how do I make it stop? When this happens, whatever
table is affected doesn't get analyzed and the database continues its
downward resource spiral.

Thanks in advance,
Philip

* Philip Molter
* DataFoundry.net
* http://www.datafoundry.net/
* philip@datafoundry.net
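For reference, the hourly job described above might look something like
the following sketch. The database name, table names, and script path are
all invented for illustration; the thread never names them.

```shell
# Hypothetical hourly maintenance script -- "mydb" and the three table
# names are placeholders, not taken from the thread.
DB=mydb
TABLES="orders sessions counters"   # the three heavily-updated tables

# Build one VACUUM ANALYZE statement per table.
sql=""
for t in $TABLES; do
    sql="$sql VACUUM ANALYZE $t;"
done

echo "$sql"
# In the real cron job this output would be fed to the server, e.g.:
#   echo "$sql" | psql -d "$DB"
# driven by a crontab entry such as:
#   0 * * * * /usr/local/bin/hourly_vacuum.sh
```

Vacuuming per table rather than the whole database keeps the job short,
which matters when it runs against a live system every hour.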
On Tue, 26 Jun 2001, Philip Molter wrote:
> I have a Postgres application running right now. The thing is
> constantly doing 3-5 updates/sec and 1-2 multi-join selects/sec, and
> performance is actually doing all right. Unfortunately, as the system
> runs, performance degrades, which I guess has been documented, although
> I still don't understand why.
>
> To work around this, I have a cron job that runs every hour and vacuum
> analyzes the three tables that are actually updated significantly. Most
> of the time it works fine, but recently I've been getting this error:
>
> NOTICE: Child itemid in update-chain marked as unused - can't continue
> repair_frag
>
> What causes this and how do I make it stop? When this happens,
> whatever table is affected doesn't get analyzed and the database
> continues its downward resource spiral.

I'm fairly sure you are _supposed_ to run VACUUM ANALYZE when there are
no clients connected to the database. You may have to have your cron job
temporarily suspend remote connectivity while the actions are performed.

-Knight

> Thanks in advance,
> Philip
>
> * Philip Molter
> * DataFoundry.net
> * http://www.datafoundry.net/
> * philip@datafoundry.net
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
On Tue, Jun 26, 2001 at 07:58:55PM -0700, Alex Knight wrote:
: > NOTICE: Child itemid in update-chain marked as unused - can't continue
: > repair_frag
:
: I'm fairly sure you are _supposed_ to run VACUUM ANALYZE when there are
: no clients connected to the database. You may have to have your cron job
: temporarily suspend remote connectivity while the actions are performed.

Hrmm, if that's the case, then that REALLY sucks. I have a system that's
constantly running, forcing me to run VACUUM ANALYZE on it because
Postgres will constantly consume CPU if I don't. However, I need to stop
or suspend my constantly running system to solve the problem. What causes
that resource issue again?

And 90% of the time, I can run VACUUM ANALYZE just fine, without any
errors (in fact, it was running on the hour for about 10 hours before I
got the first warning).

* Philip Molter
* DataFoundry.net
* http://www.datafoundry.net/
* philip@datafoundry.net
On Tue, 26 Jun 2001, Alex Knight wrote:
> On Tue, 26 Jun 2001, Philip Molter wrote:
> > To work around this, I have a cron job that runs every hour and vacuum
> > analyzes the three tables that are actually updated significantly. Most
> > of the time it works fine, but recently I've been getting this error:
> >
> > NOTICE: Child itemid in update-chain marked as unused - can't continue
> > repair_frag
> >
> > What causes this and how do I make it stop? When this happens,
> > whatever table is affected doesn't get analyzed and the database
> > continues its downward resource spiral.
>
> I'm fairly sure you are _supposed_ to run VACUUM ANALYZE when there are
> no clients connected to the database. You may have to have your cron job
> temporarily suspend remote connectivity while the actions are performed.

This is definitely FALSE. Vacuum does not lock the database; it acquires
certain locks while it's vacuuming certain tables. I.e., your clients may
not be able to modify a table while it's being vacuumed.

Regarding the error you are getting: which Postgres version is it? See if
7.1.2 has it fixed...
On Wed, Jun 27, 2001 at 06:16:35AM -0400, Alex Pilosov wrote:
: This is definitely FALSE. Vacuum does not lock the database; it acquires
: certain locks while it's vacuuming certain tables. I.e., your clients
: may not be able to modify a table while it's being vacuumed.
:
: Regarding the error you are getting: which Postgres version is it? See
: if 7.1.2 has it fixed...

I am using 7.1.2. Plenty of memory (512MB, about 300MB used), Linux
2.4.2, SMP. The problem is really intermittent. It's happened twice in
the 24 hours the system has been running (this particular action happens
once an hour).

* Philip Molter
* DataFoundry.net
* http://www.datafoundry.net/
* philip@datafoundry.net
Philip Molter <philip@datafoundry.net> writes:
> NOTICE: Child itemid in update-chain marked as unused - can't continue
> repair_frag

What Postgres version is this? I think we fixed some bugs in that
general vicinity in 7.1.

regards, tom lane
Philip Molter <philip@datafoundry.net> writes:
> I am using 7.1.2.

Drat.

Don't suppose you want to dig in there with a debugger when it happens?
You must be seeing some hard-to-replicate problem in VACUUM's
tuple-chain-moving logic. That stuff is pretty hairy, and I doubt anyone
will be able to intuit what's wrong without close examination of a
failure case.

regards, tom lane
On Wed, Jun 27, 2001 at 11:30:54AM -0400, Tom Lane wrote:
: Philip Molter <philip@datafoundry.net> writes:
: > I am using 7.1.2.
:
: Don't suppose you want to dig in there with a debugger when it happens?
: You must be seeing some hard-to-replicate problem in VACUUM's
: tuple-chain-moving logic. That stuff is pretty hairy, and I doubt
: anyone will be able to intuit what's wrong without close examination
: of a failure case.

Well, considering that we're pushing this into production and the server
was installed from Rawhide RPMs, no, not really. :) Reproducing the
RedHat install locations for this stuff is a pain in the ass. However,
considering that it's not consistent and not continuous, I can work
around it. In the meantime, I'll try to get some detailed logging so that
perhaps I can get a good look at what goes on during a failure case.

* Philip Molter
* DataFoundry.net
* http://www.datafoundry.net/
* philip@datafoundry.net
Tom Lane wrote:
> Philip Molter <philip@datafoundry.net> writes:
> > I am using 7.1.2.
>
> Drat.
>
> Don't suppose you want to dig in there with a debugger when it happens?
> You must be seeing some hard-to-replicate problem in VACUUM's
> tuple-chain-moving logic.

I had a pretty reproducible example 2 years ago. IIRC the situation was
like this: when vacuum starts, xids 10001 and 10004 are alive. If vacuum
finds an update chain (10002 -> 10000 -> 10003), it removes the tuple
(10000) because no xids <= 10000 are alive. Then the chain is broken. The
problem seems to lie in scan_heap. How could vacuum know that the tuple
(10000) must stay alive after vacuum?

regards,
Hiroshi Inoue
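A toy model of the failure Hiroshi describes may make the scenario
concrete. The xids come from his example; the removal rule below is a
deliberately simplified sketch of the decision he attributes to
scan_heap, not the actual 7.1 code:

```shell
# Update chain: the tuple was created by xid 10002, then updated by
# xid 10000, then by xid 10003. Transactions 10001 and 10004 are alive
# when vacuum starts, so 10001 is the oldest live xid.
oldest_live=10001
chain="10002 10000 10003"

kept=""
for xid in $chain; do
    # Simplified rule: a tuple whose creating xid is below every live
    # xid looks removable in isolation -- even when, as here, it is the
    # middle link of an update chain.
    if [ "$xid" -lt "$oldest_live" ]; then
        echo "removing tuple from xid $xid -- the chain is now broken"
    else
        kept="$kept $xid"
    fi
done
echo "surviving links:$kept"
```

Only the middle link (10000) disappears, leaving a chain whose child
pointer leads to an unused item -- plausibly the very state the "Child
itemid in update-chain marked as unused" notice complains about.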
Alex Pilosov wrote:
> This is definitely FALSE. Vacuum does not lock the database; it acquires
> certain locks while it's vacuuming certain tables. I.e., your clients
> may not be able to modify a table while it's being vacuumed.

I've had a vacuum deadlock my database. When I killed the vacuum client
(^C from the command line) my program continued.

--
Joseph Shraibman
jks@selectacast.net
Increase signal to noise ratio. http://www.targabot.com
On Thu, 28 Jun 2001, Joseph Shraibman wrote:
> Alex Pilosov wrote:
> > This is definitely FALSE. Vacuum does not lock the database; it
> > acquires certain locks while it's vacuuming certain tables. I.e.,
> > your clients may not be able to modify a table while it's being
> > vacuumed.
>
> I've had a vacuum deadlock my database. When I killed the vacuum client
> (^C from the command line) my program continued.

Are you sure you mean 'deadlock'? A deadlock is when neither the client
nor vacuum can proceed. What most likely happened is vacuum locking the
table until it's done, and that is normal behavior.

-alex
No, it was deadlocked. Neither vacuum nor my program was doing anything.

Alex Pilosov wrote:
> Are you sure you mean 'deadlock'? A deadlock is when neither the client
> nor vacuum can proceed. What most likely happened is vacuum locking the
> table until it's done, and that is normal behavior.
>
> -alex

--
Joseph Shraibman
jks@selectacast.net
Increase signal to noise ratio. http://www.targabot.com