Re: 'SGT DETAIL: Could not open file "pg_clog/05DC": No such file or directory' - what to do now?

From Tomasz Chmielewski
Subject Re: 'SGT DETAIL: Could not open file "pg_clog/05DC": No such file or directory' - what to do now?
Date
Msg-id 4DC19CD1.7070600@wpkg.org
In reply to Re: 'SGT DETAIL: Could not open file "pg_clog/05DC": No such file or directory' - what to do now?  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Responses Re: 'SGT DETAIL: Could not open file "pg_clog/05DC": No such file or directory' - what to do now?  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Re: 'SGT DETAIL: Could not open file "pg_clog/05DC": No such file or directory' - what to do now?  (Scott Marlowe <scott.marlowe@gmail.com>)
List pgsql-admin
On 04.05.2011 20:14, Kevin Grittner wrote:
> Tomasz Chmielewski<mangoo@wpkg.org>  wrote:
>
>> On 1st May, I saw this message in my postgres log:
>>
>> May  2 06:52:02 db10 postgres[3590]: [29829-1] 2011-05-02 06:52:02
>> SGT ERROR:  could not access status of transaction 1573786613
>> May  2 06:52:02 db10 postgres[3590]: [29829-2] 2011-05-02 06:52:02
>> SGT DETAIL:  Could not open file "pg_clog/05DC": No such file or
>> directory.
>> May  2 06:52:02 db10 postgres[3590]: [29829-3] 2011-05-02 06:52:02
>> SGT STATEMENT:  SELECT 1 FROM core_bill_id_seq FOR UPDATE
>
> You saw errors on the 1st dated for the 2nd?

My bad; it was 2nd, not 1st.


>> Now, I'm not sure what I should do about it. Database behaves
>> "funny", some inserts do not work.
>
> Define "funny".  What happens when you attempt the inserts which
> don't work.  (Copy and paste any error messages.)  Is it all tables?
> All inserts to one table?  Any other discernible pattern?

This repeated many times:

/var/log/postgresql/postgresql_log.1:May  3 18:24:49 db10 postgres[21363]: [26999-1] 2011-05-03 18:24:49 SGT ERROR:  could not access status of transaction 1573786613
/var/log/postgresql/postgresql_log.1-May  3 18:24:49 db10 postgres[21363]: [26999-2] 2011-05-03 18:24:49 SGT DETAIL:  Could not open file "pg_clog/05DC": No such file or directory.
/var/log/postgresql/postgresql_log.1-May  3 18:24:49 db10 postgres[21363]: [26999-3] 2011-05-03 18:24:49 SGT STATEMENT:  SELECT 1 FROM core_wot_seq FOR UPDATE


Today I have this:

/var/log/postgresql/postgresql_log:May  4 22:43:44 db10 postgres[15773]: [555-1] 2011-05-04 22:43:44 SGT ERROR:  could not access status of transaction 1612337841
/var/log/postgresql/postgresql_log-May  4 22:43:44 db10 postgres[15773]: [555-2] 2011-05-04 22:43:44 SGT DETAIL:  Could not open file "pg_clog/0601": No such file or directory.
/var/log/postgresql/postgresql_log-May  4 22:43:44 db10 postgres[15773]: [555-3] 2011-05-04 22:43:44 SGT STATEMENT:  SELECT 1 FROM core_wbl_seq FOR UPDATE

Only these two distinct errors occur, each repeated 10-20 times, on two different tables.
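
For reference, the file names in these errors appear to follow directly from the transaction IDs. Below is a minimal sketch of that arithmetic, assuming a default PostgreSQL build (8 kB pages, 2 status bits per transaction, 32 pages per pg_clog segment file). The helper name clog_segment_for_xid is hypothetical, made up for illustration, and is not part of PostgreSQL.

```python
# Sketch only: maps a transaction ID to the pg_clog segment file that
# should hold its commit status. The constants are assumptions that
# match a default build; a server compiled with a non-default block
# size would need a different BLCKSZ.
BLCKSZ = 8192                  # bytes per page (default build)
CLOG_XACTS_PER_BYTE = 4        # 2 status bits per transaction
SLRU_PAGES_PER_SEGMENT = 32    # pages per pg_clog segment file

# 8192 * 4 * 32 = 1048576 transactions per segment file
XACTS_PER_SEGMENT = BLCKSZ * CLOG_XACTS_PER_BYTE * SLRU_PAGES_PER_SEGMENT


def clog_segment_for_xid(xid: int) -> str:
    """Return the segment file name (4 uppercase hex digits) for an XID."""
    return f"{xid // XACTS_PER_SEGMENT:04X}"


if __name__ == "__main__":
    # The two transaction IDs from the log excerpts above.
    for xid in (1573786613, 1612337841):
        print(xid, "-> pg_clog/" + clog_segment_for_xid(xid))
```

Under those assumptions, the two transaction IDs map to 05DC and 0601, the same files named in the DETAIL lines, so each missing file is exactly the segment that would hold the status of the transaction being looked up.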

The system is used heavily, so a major fault would show up as lots of other errors in other places.
That does not rule out some "minor" fault, though.


>> 4) I may have hardware problems - but this server is running for
>> almost 1 year now, is super stable - servers with hardware issues
>> are likely to show some issues as well
>
> Does the server have ECC memory?  Do you have SMART monitoring of
> the storage system, or something similar?  Any errors showing in any
> system logs?

No errors at all anywhere (dmesg, syslog etc.).

It's a ProLiant DL180 G6, and I think it should have ECC memory; at least I see ECC mentioned in the dmidecode output.

Assuming we can't determine what caused the corruption (bitflip, kernel bug, bad RAM, postgres bug, silent HDD error,
etc.), how should I best recover from this?


--
Tomasz Chmielewski
http://wpkg.org
