Re: production server down

From Joe Conway
Subject Re: production server down
Msg-id 41BFD08A.5000501@joeconway.com
In reply to Re: production server down  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: production server down
List pgsql-hackers
Tom Lane wrote:
>>...
>>pg_control last modified:             Tue Dec 14 15:39:26 2004
>>...
>>Time of latest checkpoint:            Tue Nov  2 17:05:32 2004
> 
> [ blink... ]  That seems like an unreasonable gap between checkpoints,
> especially for a production server.  Can you see an explanation?

Hmmm, this is even more scary. We have two database clusters on this 
server, one on /replica/pgdata, and one on /production/pgdata (ignore 
the names -- /replica is actually the "production" instance at the moment).

# pg_controldata /replica/pgdata
pg_control version number:            72
Catalog version number:               200310211
Database cluster state:               shutting down
pg_control last modified:             Tue Dec 14 15:39:26 2004
Current log file ID:                  0
Next log file segment:                1
Latest checkpoint location:           0/9B0B8C
Prior checkpoint location:            0/9AA1B4
Latest checkpoint's REDO location:    0/9B0B8C
Latest checkpoint's UNDO location:    0/0
Latest checkpoint's StartUpID:        12
Latest checkpoint's NextXID:          536
Latest checkpoint's NextOID:          17142
Time of latest checkpoint:            Tue Nov  2 17:05:32 2004
Database block size:                  8192
Blocks per segment of large relation: 131072
Maximum length of identifiers:        64
Maximum number of function arguments: 32
Date/time type storage:               64-bit integers
Maximum length of locale name:        128
LC_COLLATE:                           C
LC_CTYPE:                             C

# pg_controldata /production/pgdata
pg_control version number:            72
Catalog version number:               200310211
Database cluster state:               shutting down
pg_control last modified:             Tue Nov  2 21:57:49 2004
Current log file ID:                  0
Next log file segment:                1
Latest checkpoint location:           0/9B0B8C
Prior checkpoint location:            0/9AA1B4
Latest checkpoint's REDO location:    0/9B0B8C
Latest checkpoint's UNDO location:    0/0
Latest checkpoint's StartUpID:        12
Latest checkpoint's NextXID:          536
Latest checkpoint's NextOID:          17142
Time of latest checkpoint:            Tue Nov  2 17:05:32 2004
Database block size:                  8192
Blocks per segment of large relation: 131072
Maximum length of identifiers:        64
Maximum number of function arguments: 32
Date/time type storage:               64-bit integers
Maximum length of locale name:        128
LC_COLLATE:                           C
LC_CTYPE:                             C

I have no idea how this happened, but those two outputs look identical 
apart from the "pg_control last modified" date. The space used is about 
what I'd expect:

# du -h --max-depth=1 /replica
403G    /replica/pgdata

# du -h --max-depth=1 /production
201G    /production/pgdata

The "/production/pgdata" cluster has not been in use since Nov 2. But 
we've been loading data aggressively into "/replica/pgdata".
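
The comparison of the two pg_controldata outputs above can also be done mechanically rather than by eye. The following is a minimal sketch, assuming each output has first been saved to a file; the function name control_diff and the /tmp paths are hypothetical and not part of the original thread.

```shell
# Hypothetical helper (not from the original thread): print only the lines
# that differ between two saved pg_controldata outputs.
control_diff() {
    # diff prefixes lines unique to the first file with "<" and lines unique
    # to the second with ">"; grep keeps just those and drops diff's hunk
    # headers. "|| true" keeps the exit status 0 when nothing differs.
    diff "$1" "$2" | grep '^[<>]' || true
}

# Example use (the /tmp paths are placeholders):
#   pg_controldata /replica/pgdata    > /tmp/replica.ctl
#   pg_controldata /production/pgdata > /tmp/production.ctl
#   control_diff /tmp/replica.ctl /tmp/production.ctl
```

Run against the two outputs shown above, this should report only the "pg_control last modified" line from each cluster, since every other field matches.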

Any theories on how we screwed up?

Joe

