Обсуждение: could not write to log -> PANIC -> System down
Hello friends, we have some strange problem, postmaster (pg 8.1 /win32) suddenly shutdown because of "no reason". The interesting thing is that this occurs always at almost same time (between 0.00 and 0.30h), in that time period is running system backup (Veristas backupexec agent) - starts at 23:30. The problem occurs randomly. In two of cases we have UPDATE/INSERT operation, but in third case - no. P.S. Why "autovacuum" runs every minute almost? Is this related? Thanks in advanced! ================ LOG (2006-12-14) ================== 2006-12-14 00:00:51 LOG: autovacuum: processing database "mtdb" 2006-12-14 00:01:52 LOG: autovacuum: processing database "mtdb" 2006-12-14 00:02:53 LOG: autovacuum: processing database "mtdb" 2006-12-14 00:03:54 LOG: autovacuum: processing database "mtdb" 2006-12-14 00:04:56 LOG: autovacuum: processing database "mtdb" 2006-12-14 00:06:02 LOG: autovacuum: processing database "mtdb" 2006-12-14 00:06:02 ERROR: could not write block 14725 of relation 1663/16388/61387: Permission denied 2006-12-14 00:06:02 CONTEXT: writing block 14725 of relation 1663/16388/61387 2006-12-14 00:06:03 ERROR: could not write block 14725 of relation 1663/16388/61387: Permission denied 2006-12-14 00:06:03 CONTEXT: writing block 14725 of relation 1663/16388/61387 2006-12-14 00:06:03 WARNING: could not write block 14725 of 1663/16388/61387 2006-12-14 00:06:03 DETAIL: Multiple failures --- write error may be permanent. 2006-12-14 00:06:04 ERROR: could not write block 14725 of relation 1663/16388/61387: Permission denied 2006-12-14 00:06:04 CONTEXT: writing block 14725 of relation 1663/16388/61387 2006-12-14 00:06:04 WARNING: could not write block 14725 of 1663/16388/61387 2006-12-14 00:06:04 DETAIL: Multiple failures --- write error may be permanent. 2006-12-14 00:06:06 ERROR: could not write block 14725 of relation 1663/16388/61387: Permission denied 2006-12-14 00:06:06 CONTEXT: writing block 14725 of relation 1663/16388/61387 2006-12-14 00:06:06 WARNING: could not write block 14725 of 1663/16388/61387 2006-12-14 00:06:06 DETAIL: Multiple failures --- write error may be permanent. 2006-12-14 00:06:06 ERROR: could not write block 14725 of relation 1663/16388/61387: Permission denied 2006-12-14 00:06:06 CONTEXT: writing block 14725 of relation 1663/16388/61387 2006-12-14 00:06:06 WARNING: could not write block 14725 of 1663/16388/61387 2006-12-14 00:06:06 DETAIL: Multiple failures --- write error may be permanent. 2006-12-14 00:06:07 ERROR: could not write block 14725 of relation 1663/16388/61387: Permission denied 2006-12-14 00:06:07 CONTEXT: writing block 14725 of relation 1663/16388/61387 2006-12-14 00:06:07 WARNING: could not write block 14725 of 1663/16388/61387 2006-12-14 00:06:07 DETAIL: Multiple failures --- write error may be permanent. 2006-12-14 00:06:08 ERROR: could not write block 14725 of relation 1663/16388/61387: Permission denied 2006-12-14 00:06:08 CONTEXT: writing block 14725 of relation 1663/16388/61387 2006-12-14 00:06:08 WARNING: could not write block 14725 of 1663/16388/61387 2006-12-14 00:06:08 DETAIL: Multiple failures --- write error may be permanent. 2006-12-14 00:06:09 ERROR: could not write block 14725 of relation 1663/16388/61387: Permission denied 2006-12-14 00:06:09 CONTEXT: writing block 14725 of relation 1663/16388/61387 2006-12-14 00:06:09 WARNING: could not write block 14725 of 1663/16388/61387 2006-12-14 00:06:09 DETAIL: Multiple failures --- write error may be permanent. 2006-12-14 00:06:10 ERROR: could not write block 14725 of relation 1663/16388/61387: Permission denied 2006-12-14 00:06:10 CONTEXT: writing block 14725 of relation 1663/16388/61387 2006-12-14 00:06:10 WARNING: could not write block 14725 of 1663/16388/61387 2006-12-14 00:06:10 DETAIL: Multiple failures --- write error may be permanent. 2006-12-14 00:06:10 ERROR: could not write block 14725 of relation 1663/16388/61387: Permission denied 2006-12-14 00:06:10 CONTEXT: writing block 14725 of relation 1663/16388/61387 2006-12-14 00:06:10 WARNING: could not write block 14725 of 1663/16388/61387 2006-12-14 00:06:10 DETAIL: Multiple failures --- write error may be permanent. 2006-12-14 00:06:11 ERROR: could not write block 14725 of relation 1663/16388/61387: Permission denied 2006-12-14 00:06:11 CONTEXT: writing block 14725 of relation 1663/16388/61387 2006-12-14 00:06:11 WARNING: could not write block 14725 of 1663/16388/61387 2006-12-14 00:06:11 DETAIL: Multiple failures --- write error may be permanent. 2006-12-14 00:06:23 ERROR: could not open relation 1663/16388/61399: Invalid argument 2006-12-14 00:06:39 PANIC: could not write to log file 4, segment 158 at offset 15286272, length 32768: Permission denied 2006-12-14 00:06:39 STATEMENT: INSERT INTO docs_pages_stats(doc_id, lang_id, page, text_mode, rip, cookie) VALUES(2066, 1, 1, 'f', '87.118.162.34', 'a2446edc627fd6f2572b68af83d1b6d4') This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information. 2006-12-14 00:06:39 LOG: server process (PID 9912) was terminated by signal 3 2006-12-14 00:06:39 LOG: terminating any other active server processes 2006-12-14 00:06:39 LOG: all server processes terminated; reinitializing 2006-12-14 00:06:39 LOG: database system was interrupted at 2006-12-14 00:06:12 FLE Standard Time 2006-12-14 00:06:39 LOG: could not read from log file 4, segment 158 at offset 15237120: Permission denied 2006-12-14 00:06:39 LOG: invalid primary checkpoint record 2006-12-14 00:06:39 LOG: could not read from log file 4, segment 158 at offset 14417920: Permission denied 2006-12-14 00:06:39 LOG: invalid secondary checkpoint record 2006-12-14 00:06:39 PANIC: could not locate a valid checkpoint record This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information. 2006-12-14 00:06:39 LOG: startup process (PID 1092) was terminated by signal 3 2006-12-14 00:06:39 LOG: aborting startup due to startup process failure 2006-12-14 00:06:40 LOG: logger shutting down ================ LOG (2006-12-13) ================== 2006-12-13 00:24:34 LOG: autovacuum: processing database "mtdb" 2006-12-13 00:24:46 ERROR: could not read block 14667 of relation 1663/16388/61387: Permission denied 2006-12-13 00:24:56 ERROR: could not read block 14667 of relation 1663/16388/61387: Permission denied 2006-12-13 00:24:58 ERROR: could not read block 14667 of relation 1663/16388/61387: Permission denied 2006-12-13 00:25:01 ERROR: could not read block 14667 of relation 1663/16388/61387: Permission denied 2006-12-13 00:25:07 ERROR: could not read block 14667 of relation 1663/16388/61387: Permission denied 2006-12-13 00:25:23 ERROR: could not open relation 1663/16388/61399: Invalid argument 2006-12-13 00:25:32 ERROR: could not read block 0 of relation 1663/16388/61414: Permission denied 2006-12-13 00:25:35 LOG: autovacuum: processing database "mtdb" 2006-12-13 00:25:36 ERROR: could not read block 0 of relation 1663/16388/61414: Permission denied 2006-12-13 00:25:49 PANIC: could not write to log file 4, segment 153 at offset 7487488, length 8192: Permission denied 2006-12-13 00:25:49 STATEMENT: UPDATE docs SET hits = hits + 1 WHERE id=1449 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information. 2006-12-13 00:25:49 LOG: server process (PID 6200) was terminated by signal 3 2006-12-13 00:25:49 LOG: terminating any other active server processes 2006-12-13 00:25:49 LOG: all server processes terminated; reinitializing 2006-12-13 00:25:49 LOG: database system was interrupted at 2006-12-13 00:23:00 FLE Standard Time 2006-12-13 00:25:49 LOG: could not read from log file 4, segment 153 at offset 7258112: Permission denied 2006-12-13 00:25:49 LOG: invalid primary checkpoint record 2006-12-13 00:25:49 LOG: could not read from log file 4, segment 153 at offset 7020544: Permission denied 2006-12-13 00:25:49 LOG: invalid secondary checkpoint record 2006-12-13 00:25:49 PANIC: could not locate a valid checkpoint record This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information. 2006-12-13 00:25:49 LOG: startup process (PID 9652) was terminated by signal 3 2006-12-13 00:25:49 LOG: aborting startup due to startup process failure 2006-12-13 00:25:50 LOG: logger shutting down ================ LOG (2006-12-05) ================== 2006-12-05 00:05:17 LOG: autovacuum: processing database "mtdb" 2006-12-05 00:05:42 ERROR: value too long for type character varying(255) 2006-12-05 00:05:53 ERROR: value too long for type character varying(255) 2006-12-05 00:06:18 LOG: autovacuum: processing database "mtdb" 2006-12-05 00:06:55 ERROR: could not read block 985 of relation 1663/16388/16605: Permission denied 2006-12-05 00:06:55 ERROR: could not read block 113 of relation 1663/16388/16605: Permission denied 2006-12-05 00:06:55 ERROR: could not read block 4 of relation 1663/16388/16605: Permission denied 2006-12-05 00:07:18 LOG: autovacuum: processing database "mtdb" 2006-12-05 00:07:22 ERROR: could not read block 14023 of relation 1663/16388/61387: Permission denied 2006-12-05 00:07:39 ERROR: could not read block 0 of relation 1663/16388/61407: Invalid argument 2006-12-05 00:07:53 PANIC: could not write to log file 4, segment 93 at offset 13664256, length 8192: Permission denied 2006-12-05 00:07:53 CONTEXT: writing block 8851 of relation 1663/16388/61399 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information. 2006-12-05 00:07:53 LOG: background writer process (PID 8748) was terminated by signal 3 2006-12-05 00:07:53 LOG: terminating any other active server processes 2006-12-05 00:07:53 WARNING: terminating connection because of crash of another server process 2006-12-05 00:07:53 DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2006-12-05 00:07:53 HINT: In a moment you should be able to reconnect to the database and repeat your command. 2006-12-05 00:07:53 WARNING: terminating connection because of crash of another server process 2006-12-05 00:07:53 DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2006-12-05 00:07:53 HINT: In a moment you should be able to reconnect to the database and repeat your command. 2006-12-05 00:07:53 WARNING: terminating connection because of crash of another server process 2006-12-05 00:07:53 DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2006-12-05 00:07:53 HINT: In a moment you should be able to reconnect to the database and repeat your command. 2006-12-05 00:07:53 WARNING: terminating connection because of crash of another server process 2006-12-05 00:07:53 DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2006-12-05 00:07:53 HINT: In a moment you should be able to reconnect to the database and repeat your command. 2006-12-05 00:07:53 LOG: all server processes terminated; reinitializing 2006-12-05 00:07:54 LOG: database system was interrupted at 2006-12-05 00:03:10 FLE Standard Time 2006-12-05 00:07:54 LOG: could not read from log file 4, segment 93 at offset 13574144: Permission denied 2006-12-05 00:07:54 LOG: invalid primary checkpoint record 2006-12-05 00:07:54 LOG: could not read from log file 4, segment 93 at offset 13369344: Permission denied 2006-12-05 00:07:54 LOG: invalid secondary checkpoint record 2006-12-05 00:07:54 PANIC: could not locate a valid checkpoint record This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information. 2006-12-05 00:07:54 LOG: startup process (PID 10032) was terminated by signal 3 2006-12-05 00:07:54 LOG: aborting startup due to startup process failure 2006-12-05 00:07:54 LOG: logger shutting down
On Thu, 2006-12-14 at 11:28, dev wrote:
> Hello friends,
>
> we have some strange problem, postmaster (pg 8.1 /win32)
> suddenly shutdown because of "no reason".
>
> The interesting thing is that this occurs always at
> almost same time (between 0.00 and 0.30h), in that time period is
> running system backup (Veristas backupexec agent) - starts at 23:30.
> The problem occurs randomly.
> In two of cases we have UPDATE/INSERT operation, but in third case - no.
>
> P.S. Why "autovacuum" runs every minute almost? Is this related?
>
> Thanks in advanced!
>
>
> ================ LOG (2006-12-14) ==================
> 2006-12-14 00:00:51 LOG: autovacuum: processing database "mtdb"
>
> 2006-12-14 00:01:52 LOG: autovacuum: processing database "mtdb"
>
> 2006-12-14 00:02:53 LOG: autovacuum: processing database "mtdb"
>
> 2006-12-14 00:03:54 LOG: autovacuum: processing database "mtdb"
>
> 2006-12-14 00:04:56 LOG: autovacuum: processing database "mtdb"
>
> 2006-12-14 00:06:02 LOG: autovacuum: processing database "mtdb"
>
> 2006-12-14 00:06:02 ERROR: could not write block 14725 of relation
> 1663/16388/61387: Permission denied
>
> 2006-12-14 00:06:02 CONTEXT: writing block 14725 of relation
> 1663/16388/61387
Looks like something is changing the permissions on this file or
something.
Is your backup agent (vertias backupexec) doing a file system backup of
the $PGDATA directory? This is probably not a good idea, especially if
it changes perms / locks the files while it is doing a backup. Either
way, a file system backup is not the proper way to backup a pgsql
instance, unless it is combined with PITR recovery.
Best answer is to not let the backup agent hit the $PGDATA directory,
but rather to run a backup with pg_dump or pg_dumpall to some other
directory and have your backup agent back that file up.
dev wrote: > Hello friends, > we have some strange problem, postmaster (pg 8.1 /win32) > suddenly shutdown because of "no reason". > The interesting thing is that this occurs always at > almost same time (between 0.00 and 0.30h), in that time period is > running system backup (Veristas backupexec agent) - starts at 23:30. > The problem occurs randomly. > In two of cases we have UPDATE/INSERT operation, but in third case - no. > ... > ================ LOG (2006-12-14) ================== > ... > 2006-12-14 00:06:02 ERROR: could not write block 14725 of relation > 1663/16388/61387: Permission denied Win32? I would suspect some sort of file sharing violation. That's the nature of the beast. -Glen
On 12/14/06, Scott Marlowe <smarlowe@g2switchworks.com> wrote: > > > > On Thu, 2006-12-14 at 11:28, dev wrote: > > Hello friends, > > > > we have some strange problem, postmaster (pg 8.1 /win32) > > suddenly shutdown because of "no reason". > > > > The interesting thing is that this occurs always at > > almost same time (between 0.00 and 0.30h), in that time period is > > running system backup (Veristas backupexec agent) - starts at 23:30. > > The problem occurs randomly. > > In two of cases we have UPDATE/INSERT operation, but in third case - no. > > > > P.S. Why "autovacuum" runs every minute almost? Is this related? > > > > Thanks in advanced! > > > > > > ================ LOG (2006-12-14) ================== > > 2006-12-14 00:00:51 LOG: autovacuum: processing database "mtdb" > > > > 2006-12-14 00:01:52 LOG: autovacuum: processing database "mtdb" > > > > 2006-12-14 00:02:53 LOG: autovacuum: processing database "mtdb" > > > > 2006-12-14 00:03:54 LOG: autovacuum: processing database "mtdb" > > > > 2006-12-14 00:04:56 LOG: autovacuum: processing database "mtdb" > > > > 2006-12-14 00:06:02 LOG: autovacuum: processing database "mtdb" > > > > 2006-12-14 00:06:02 ERROR: could not write block 14725 of relation > > 1663/16388/61387: Permission denied > > > > 2006-12-14 00:06:02 CONTEXT: writing block 14725 of relation > > 1663/16388/61387 > Is your backup agent (vertias backupexec) doing a file system backup of > the $PGDATA directory? This is probably not a good idea, especially if > it changes perms / locks the files while it is doing a backup. Either > way, a file system backup is not the proper way to backup a pgsql > instance, unless it is combined with PITR recovery. > > Best answer is to not let the backup agent hit the $PGDATA directory, > but rather to run a backup with pg_dump or pg_dumpall to some other > directory and have your backup agent back that file up. > problem is veritas which has a special kernel driver which can lock any file even if it is in use by an application. obviously, you do not want to do raw file system backup of your database folder. I would check out eSilo (disclaimer: I work at this company) for backup solutions that are specialized towards databases. merlin
That should not occur if NetBackup is at all a recent version and you're on WinXP or Win2k3. The backup client should be using Volume Shadow Copy. You should only have file locking issues on Windows 2000 or if your partitions are FAT32 (which is a terrible idea). Of course, it's Windows. "Should not" is often a suggestion, it seems. As a port, postmaster.exe was presumably not written with VSS in mind, so it might object to the shadow copy instantiation (which, again, it *should* not be able to do). No idea on the frequent autovacuuming. Do you do a lot of deletes? -- Brandon Aiken CS/IT Systems Engineer -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Merlin Moncure Sent: Thursday, December 14, 2006 4:00 PM To: Scott Marlowe Cc: dev; pgsql general Subject: Re: [GENERAL] could not write to log -> PANIC -> System down On 12/14/06, Scott Marlowe <smarlowe@g2switchworks.com> wrote: > > > > On Thu, 2006-12-14 at 11:28, dev wrote: > > Hello friends, > > > > we have some strange problem, postmaster (pg 8.1 /win32) > > suddenly shutdown because of "no reason". > > > > The interesting thing is that this occurs always at > > almost same time (between 0.00 and 0.30h), in that time period is > > running system backup (Veristas backupexec agent) - starts at 23:30. > > The problem occurs randomly. > > In two of cases we have UPDATE/INSERT operation, but in third case - no. > > > > P.S. Why "autovacuum" runs every minute almost? Is this related? > > > > Thanks in advanced! > > > > > > ================ LOG (2006-12-14) ================== > > 2006-12-14 00:00:51 LOG: autovacuum: processing database "mtdb" > > > > 2006-12-14 00:01:52 LOG: autovacuum: processing database "mtdb" > > > > 2006-12-14 00:02:53 LOG: autovacuum: processing database "mtdb" > > > > 2006-12-14 00:03:54 LOG: autovacuum: processing database "mtdb" > > > > 2006-12-14 00:04:56 LOG: autovacuum: processing database "mtdb" > > > > 2006-12-14 00:06:02 LOG: autovacuum: processing database "mtdb" > > > > 2006-12-14 00:06:02 ERROR: could not write block 14725 of relation > > 1663/16388/61387: Permission denied > > > > 2006-12-14 00:06:02 CONTEXT: writing block 14725 of relation > > 1663/16388/61387 > Is your backup agent (vertias backupexec) doing a file system backup of > the $PGDATA directory? This is probably not a good idea, especially if > it changes perms / locks the files while it is doing a backup. Either > way, a file system backup is not the proper way to backup a pgsql > instance, unless it is combined with PITR recovery. > > Best answer is to not let the backup agent hit the $PGDATA directory, > but rather to run a backup with pg_dump or pg_dumpall to some other > directory and have your backup agent back that file up. > problem is veritas which has a special kernel driver which can lock any file even if it is in use by an application. obviously, you do not want to do raw file system backup of your database folder. I would check out eSilo (disclaimer: I work at this company) for backup solutions that are specialized towards databases. merlin ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly
On Thu, Dec 14, 2006 at 05:13:30PM -0500, Brandon Aiken wrote: > Of course, it's Windows. "Should not" is often a suggestion, it seems. > As a port, postmaster.exe was presumably not written with VSS in mind, > so it might object to the shadow copy instantiation (which, again, it > *should* not be able to do). Any backup system that is not transparent to processes running on the system seems to be flawed by design. > No idea on the frequent autovacuuming. Do you do a lot of deletes? In those messages "processing" just means it woke up to see if there was anything to do. It probably didn't do anything. Waking up every minute is not that big a deal... Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Вложения
Martijn van Oosterhout <kleptog@svana.org> writes: > On Thu, Dec 14, 2006 at 05:13:30PM -0500, Brandon Aiken wrote: >> No idea on the frequent autovacuuming. Do you do a lot of deletes? > In those messages "processing" just means it woke up to see if there > was anything to do. It probably didn't do anything. Waking up every > minute is not that big a deal... And more to the point, that *is* the default cycle interval, so the behavior is expected. You can change autovacuum_naptime or whatever it's called if you want something different. regards, tom lane
My understanding of VSS is that only one non-VSS aware app can access the data at any one time. All I meant was that if their NetBackup version was old that they probably cannot benefit from VSS since I doubt the Win32 PG port knows about it either. Brandon Aiken -----Original Message----- From: Martijn van Oosterhout [mailto:kleptog@svana.org] Sent: Thursday, December 14, 2006 5:23 PM To: Brandon Aiken Cc: Merlin Moncure; pgsql-general@postgresql.org Subject: Re: [GENERAL] could not write to log -> PANIC -> System down On Thu, Dec 14, 2006 at 05:13:30PM -0500, Brandon Aiken wrote: > Of course, it's Windows. "Should not" is often a suggestion, it seems. > As a port, postmaster.exe was presumably not written with VSS in mind, > so it might object to the shadow copy instantiation (which, again, it > *should* not be able to do). Any backup system that is not transparent to processes running on the system seems to be flawed by design. > No idea on the frequent autovacuuming. Do you do a lot of deletes? In those messages "processing" just means it woke up to see if there was anything to do. It probably didn't do anything. Waking up every minute is not that big a deal... Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
On Fri, Dec 15, 2006 at 03:54:01AM -0500, Brandon Aiken wrote: > My understanding of VSS is that only one non-VSS aware app can access > the data at any one time. All I meant was that if their NetBackup > version was old that they probably cannot benefit from VSS since I doubt > the Win32 PG port knows about it either. VSS in general can back up files that are open, *regardless* of wether the application that holds it open knows about it. It works the same way as a SAN snapshot. Also, most pre-VSS software have special modules to deal with open files. But they were usually an addon, so if they're on an old enough version of NetBackup not to have VSS, it's certainly possible that there's an addon module required that's not present. //Magnus