Re: [bug fix] PITR corrupts the database cluster
| From | Andres Freund |
|---|---|
| Subject | Re: [bug fix] PITR corrupts the database cluster |
| Date | |
| Msg-id | 20130724105943.GB27288@alap2.anarazel.de |
| In reply to | [bug fix] PITR corrupts the database cluster ("MauMau" <maumau307@gmail.com>) |
| Responses | Re: [bug fix] PITR corrupts the database cluster |
| List | pgsql-hackers |
On 2013-07-24 19:30:09 +0900, MauMau wrote:
> I've encountered a bug of PITR that corrupts the database. I'm willing to
> submit the patch to fix it, but I'm wondering what approach is appropriate.
> Could you give me your opinions?
>
> [Problem]
> I cannot connect to the database after performing the following steps:
>
> 1. CREATE DATABASE mydb;
> 2. Take a base backup with pg_basebackup.
> 3. DROP DATABASE mydb;
> 4. Shutdown the database server with "pg_ctl stop".
> 5. Recover the database cluster to the point where the base backup
> completed, i.e., before dropping mydb. The contents of recovery.conf is:
> restore_command = 'cp /arc/dir/%f %p'
> recovery_target_timeline = 'latest'
> recovery_target_time = 'STOP TIME recorded in the backup history file which
> was created during base backup'

For a second I wanted to say that it's a user error because you should
just set recovery_target_lsn based on the END WAL location in the backup
history file. Unfortunately recovery_target_lsn doesn't exist. It
should.

> I expected to be able to connect to mydb because I recovered to the point
> before dropping mydb. However, I cannot connect to mydb because the
> directory for mydb does not exist. The entry for mydb exists in
> pg_database.

> [Cause]
> DROP DATABASE emits the below WAL records:
>
> 1. System catalog changes including deletion of a tuple for mydb in
> pg_database
> 2. Deletion of directories for the database
> 3. Transaction commit

> <Approach 1>
> During recovery, when the WAL record for directory deletion is found, just
> record that fact for later replay (in a hash table keyed by xid). When the
> corresponding transaction commit record is found, replay the directory
> deletion record.

I think that's too much of a special case implementation.

> <Approach 2>
> Like the DROP TABLE/INDEX case, piggyback the directory deletion record on
> the transaction commit record, and eliminate the directory deletion record
> altogether.

I don't think burdening commit records with that makes sense. It's just
not a common enough case.

What we imo could do would be to drop the tablespaces in a *separate*
transaction *after* the transaction that removed the pg_tablespace
entry. Then an "incomplete actions" logic similar to btree and gin could
be used to remove the database directory if we crashed between the two
transactions.

So:
TXN1 does:
* remove catalog entries
* drop buffers
* XLogInsert(XLOG_DBASE_DROP_BEGIN)

TXN2:
* remove_dbtablespaces
* XLogInsert(XLOG_DBASE_DROP_FINISH)

The RM_DBASE_ID resource manager would then grow a rm_cleanup callback
(which would perform TXN2 if we failed in between) and a
rm_safe_restartpoint which would prevent restartpoints from occurring on
standby between both.

The same should probably be done for CREATE DATABASE because that currently
can result in partially copied databases lying around.

Greetings,

Andres Freund

--
Andres Freund                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
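
The two-transaction scheme proposed above can be sketched as a toy replay model. This is only an illustration in Python, not PostgreSQL code: the `Cluster` class, the `CREATE_DB` record type, and the `recover` helper are hypothetical, and only the names XLOG_DBASE_DROP_BEGIN and XLOG_DBASE_DROP_FINISH are taken from the proposal. It also assumes, for simplicity, that TXN1 is replayed atomically together with its commit. Under those assumptions it shows that wherever recovery stops, the catalog and the directories end up in agreement.

```python
from dataclasses import dataclass, field

# Record types. The two DROP_* names come from the proposal above;
# everything else is a hypothetical toy model, not PostgreSQL code.
CREATE_DB = "CREATE_DB"
DROP_BEGIN = "XLOG_DBASE_DROP_BEGIN"    # logged by TXN1 (catalog entry removed)
DROP_FINISH = "XLOG_DBASE_DROP_FINISH"  # logged by TXN2 (directories removed)


@dataclass
class Cluster:
    catalog: set = field(default_factory=set)      # stands in for pg_database
    directories: set = field(default_factory=set)  # stands in for database dirs
    incomplete: set = field(default_factory=set)   # drops begun but not finished

    def redo(self, kind, db):
        """Replay one WAL record."""
        if kind == CREATE_DB:
            self.catalog.add(db)
            self.directories.add(db)
        elif kind == DROP_BEGIN:
            # Assumption: TXN1 is replayed atomically with its commit.
            self.catalog.discard(db)
            self.incomplete.add(db)
        elif kind == DROP_FINISH:
            self.directories.discard(db)
            self.incomplete.discard(db)
        else:
            raise ValueError(f"unknown record type: {kind}")

    def cleanup(self):
        """Analogue of rm_cleanup: perform TXN2 for every pending drop."""
        for db in sorted(self.incomplete):
            self.directories.discard(db)
        self.incomplete.clear()

    def safe_restartpoint(self):
        """Analogue of rm_safe_restartpoint: unsafe while a drop is pending."""
        return not self.incomplete


def recover(wal, stop_after):
    """Replay the first stop_after records, then run end-of-recovery cleanup."""
    cluster = Cluster()
    for kind, db in wal[:stop_after]:
        cluster.redo(kind, db)
    cluster.cleanup()
    return cluster


WAL = [
    (CREATE_DB, "mydb"),
    (DROP_BEGIN, "mydb"),
    (DROP_FINISH, "mydb"),
]

if __name__ == "__main__":
    for stop in range(1, len(WAL) + 1):
        c = recover(WAL, stop)
        # Catalog and directories should agree at every stopping point.
        print(stop, sorted(c.catalog), sorted(c.directories))
```

Stopping after the first record corresponds to the PITR scenario in the report: the drop has not begun, so both the catalog entry and the directory survive. Stopping between the two DROP records leaves a pending drop that `cleanup` completes, which is the role the rm_cleanup callback would play.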