Re: Allow WAL information to recover corrupted pg_controldata
| From | Amit kapila | 
|---|---|
| Subject | Re: Allow WAL information to recover corrupted pg_controldata | 
| Date | |
| Msg-id | 6C0B27F7206C9E4CA54AE035729E9C382850626B@szxeml509-mbx | 
| In response to | Re: Allow WAL information to recover corrupted pg_controldata (Cédric Villemain <cedric@2ndquadrant.com>) | 
| Responses | Re: Allow WAL information to recover corrupted pg_controldata | 
| List | pgsql-hackers | 
> > > I guess my first question is: why do we need this? There are lots of
> > > things in the TODO list that someone wanted once upon a time, but
> > > they're not all actually important. Do you have reason to believe
> > > that this one is? It's been six years since that email, so it's worth
> > > asking if this is actually relevant.

> > As far as I know the pg_control is not WAL protected, which means if it
> > gets corrupt due to any reason (disk crash during flush, so written
> > partially), it might lead to failure in recovery of database.

> AFAIR pg_controldata fit on a disk sector so it can not be half written.

It can be corrupt due to some other reasons as well, like a torn disk sector.
As pg_resetxlog already has a mechanism to recover a corrupt pg_control file,
it is already considered that the file can be corrupt in some cases. The
suggested patch improves the logic to recover a corrupt control file.
That is the reason I felt it would be relevant to do this patch.

________________________________________
From: Cédric Villemain [cedric@2ndquadrant.com]
Sent: Saturday, June 16, 2012 2:19 AM
To: pgsql-hackers@postgresql.org
Cc: Amit kapila; 'Robert Haas'
Subject: Re: [HACKERS] Allow WAL information to recover corrupted pg_controldata

On Friday 15 June 2012 03:27:11, Amit Kapila wrote:
> > I guess my first question is: why do we need this? There are lots of
> > things in the TODO list that someone wanted once upon a time, but
> > they're not all actually important. Do you have reason to believe
> > that this one is? It's been six years since that email, so it's worth
> > asking if this is actually relevant.
>
> As far as I know the pg_control is not WAL protected, which means if it
> gets corrupt due to any reason (disk crash during flush, so written
> partially), it might lead to failure in recovery of database.

AFAIR pg_controldata fit on a disk sector so it can not be half written.

> So user can use pg_resetxlog to recover the database. Currently
> pg_resetxlog works on guessed values for pg_control.
> However this implementation can improve the logic that instead of guessing,
> it can try to regenerate the values from WAL.
> This implementation can allow better recovery in certain circumstances.
>
> > The deadline for patches for this CommitFest is today, so I think you
> > should target any work you're starting now for the NEXT CommitFest.
>
> Oh, I am sorry, as this was my first time I was not fully aware of the
> deadline.
>
> However I still seek your opinion whether it makes sense to work on this
> feature.
>
>
> -----Original Message-----
> From: Robert Haas [mailto:robertmhaas@gmail.com]
> Sent: Friday, June 15, 2012 12:40 AM
> To: Amit Kapila
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Allow WAL information to recover corrupted
> pg_controldata
>
> On Thu, Jun 14, 2012 at 11:39 AM, Amit Kapila <amit.kapila@huawei.com> wrote:
> > I am planning to work on the below Todo list item for this CommitFest
> > Allow WAL information to recover corrupted pg_controldata
> > http://archives.postgresql.org/pgsql-patches/2006-06/msg00025.php
>
> The deadline for patches for this CommitFest is today, so I think you
> should target any work you're starting now for the NEXT CommitFest.
>
> > I wanted to confirm my understanding about the work involved for this
> > patch:
> > The existing patch has following set of problems:
> > 1. Memory leak and linked list code path is not proper
> > 2. lock check for if the server is already running, is removed in
> >    patch which needs to be reverted
> > 3. Refactoring of the code.
> >
> > Apart from above what I understood from the patch is that its intention
> > is to generate values for ControlFile using WAL logs when -r option is
> > used.
> >
> > The change in algorithm from current will be if control file is corrupt
> > which essentially means ReadControlFile() will return False, then it
> > should generate values (checkPointCopy, checkPoint, prevCheckPoint,
> > state) using WAL if -r option is enabled.
> >
> > Also for -r option, it doesn't need to call function FindEndOfXLOG() as
> > that work will be achieved by above point.
> >
> > It will just rewrite the control file and don't do other resets.
> >
> >
> > The algorithm of restoring the pg_control value from old xlog file:
> > 1. Retrieve all of the active xlog files from xlog directory into a
> >    list by increasing order, according their timeline, log id,
> >    segment id.
> > 2. Search the list to find the oldest xlog file of the latest time
> >    line.
> > 3. Search the records from the oldest xlog file of latest time line to
> >    the latest xlog file of latest time line, if the checkpoint record
> >    has been found, update the latest checkpoint and previous
> >    checkpoint.
> >
> > Apart from above some changes in code will be required after the Xlog
> > patch by Heikki.
> >
> > Suggest me if my understanding is correct?
>
> I guess my first question is: why do we need this? There are lots of
> things in the TODO list that someone wanted once upon a time, but
> they're not all actually important. Do you have reason to believe
> that this one is? It's been six years since that email, so it's worth
> asking if this is actually relevant.

--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Development, Expertise and Training
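
Steps 1 and 2 of the restore algorithm quoted in the message above (order the xlog files by timeline, log id, and segment id, then find the oldest file of the latest timeline) can be sketched roughly as follows. This is only an illustration, not code from the patch under discussion: it assumes WAL segment file names are 24 hexadecimal digits (8 each for timeline, log id, and segment id), and the names `parse_segment` and `segments_to_scan` are hypothetical. Step 3, reading the records to locate checkpoint records, is not attempted here.

```python
import re
from typing import List, NamedTuple, Optional

# Assumption: a WAL segment file name is exactly 24 hex digits:
# 8 for the timeline ID, 8 for the log ID, 8 for the segment ID.
SEGMENT_NAME = re.compile(r"[0-9A-Fa-f]{24}")


class Segment(NamedTuple):
    timeline: int
    log_id: int
    seg_id: int
    name: str


def parse_segment(name: str) -> Optional[Segment]:
    """Split a segment file name into its numeric parts (hypothetical helper)."""
    if not SEGMENT_NAME.fullmatch(name):
        return None  # e.g. history files or subdirectories
    return Segment(int(name[0:8], 16), int(name[8:16], 16),
                   int(name[16:24], 16), name)


def segments_to_scan(file_names: List[str]) -> List[str]:
    """Return the segments of the latest timeline, oldest first."""
    # Step 1: keep only segment files, sorted by (timeline, log id, segment id).
    segments = sorted(s for s in map(parse_segment, file_names) if s is not None)
    if not segments:
        return []
    # Step 2: the latest timeline is that of the last sorted entry; the first
    # entry on that timeline is the oldest file a checkpoint search starts from.
    latest = segments[-1].timeline
    return [s.name for s in segments if s.timeline == latest]
```

The first element of the returned list would be the starting point for the checkpoint-record search in step 3, and the last element its end point.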