Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again

Поиск
Список
Период
Сортировка
От Maxim Boguk
Тема Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again
Дата
Msg-id CAK-MWwT_5N1jwSH8F84ZD2iPrPCuuBoA0Tu-Ujw8+SXMg1OELQ@mail.gmail.com
обсуждение исходный текст
Ответы Re: Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-general
One of servers under my support 2 days ago produced the next error:

ERROR:  could not read block 135 in file "base/16404/118881486": read only 0 of 8192 bytes

Server version 9.0.6. No db or server crashes or db recovery happen on that server since setup.
Server is db backend for very large web project under quite a heavy load (10k+ request per second to db).
No other strange errors found in the logs since server was put to production half year ago.

Now what happened (from DB log):

2012-02-17 22:35:58 MSK 14333 [vxid:340/18229999 txid:2341883282] [DELETE] LOG:  duration: 5.669 ms  execute <unnamed>: delete from agency_statistics
2012-02-17 22:35:58 MSK 14333 [vxid:340/18230000 txid:0] [BIND] LOG:  duration: 2.787 ms  bind <unnamed>: insert into agency_statistics (employer_id, area_id, area_pa
th, professional_area_id, vacancies_number) select e.employer_id, vb.area_id, a.path, s.professional_area_id, count(distinct v.vacancy_id) from vacancy v inner join employer e on v.employer
_id = e.employer_id inner join vacancy_body vb on v.vacancy_body_id = vb.vacancy_body_id inner join vacancy_body_specialization vs on vb.vacancy_body_id = vs.vacancy_body_id inner join spec
ialization s on s.specialization_id = vs.specialization_id inner join area a on vb.area_id = a.area_id where v.archived_status in (0) and v.disabled = false and e.category = $1 and e.state
= $2 and e.manager_id > 0 group by e.employer_id, vb.area_id, a.path, s.professional_area_id
2012-02-17 22:35:59 MSK 14333 [vxid:340/18230000 txid:2341883336] [INSERT] LOG:  duration: 726.700 ms  execute <unnamed>: insert into agency_statistics (employer_id,
area_id, area_path, professional_area_id, vacancies_number) select e.employer_id, vb.area_id, a.path, s.professional_area_id, count(distinct v.vacancy_id) from vacancy v inner join employer
 e on v.employer_id = e.employer_id inner join vacancy_body vb on v.vacancy_body_id = vb.vacancy_body_id inner join vacancy_body_specialization vs on vb.vacancy_body_id = vs.vacancy_body_id
 inner join specialization s on s.specialization_id = vs.specialization_id inner join area a on vb.area_id = a.area_id where v.archived_status in (0) and v.disabled = false and e.category =
 $1 and e.state = $2 and e.manager_id > 0 group by e.employer_id, vb.area_id, a.path, s.professional_area_id
2012-02-17 22:35:59 MSK 14333 [vxid:340/18230070 txid:0] [BIND] LOG:  duration: 30.195 ms  bind S_35: insert into agency_statistics (employer_id, area_id, area_path,
professional_area_id, vacancies_number, rank) values ($1, $2, $3, $4, $5, 0)
2012-02-17 22:35:59 MSK 14333 [vxid:340/18230082 txid:2341883427] [INSERT] ERROR:  could not read block 135 in file "base/16404/118881486": read only 0 of 8192 bytes
2012-02-17 22:35:59 MSK 14333 [vxid:340/18230082 txid:2341883427] [INSERT] STATEMENT:  insert into agency_statistics (employer_id, area_id, area_path, professional_area_id, vacancies_number, rank) values ($1, $2, $3, $4, $5, 0)

No concurrent writes was happened to that table outside of that connection.
May be autovacuum were fired on that table between delete and first or second inserts (unfortunately it wasn't logged).

I recreated table from scratch and keep the damaged table under another name (through alter table agency_statistics rename to agency_statistics_old).
So I have files to dig into.

What I see in file system:
hh=# SELECT relfilenode from pg_class where relname='agency_statistics_old';
 relfilenode
-------------
   118881486
postgres@db10:~/tmp$ ls -la /var/lib/postgresql/9.0/main/base/16404/118881486
-rw------- 1 postgres postgres 0 2012-02-20 12:04 /var/lib/postgresql/9.0/main/base/16404/118881486

So table file size zero bytes (seems autovacuum truncated that table to 0 bytes).

But index show:
hh=#  SELECT relfilenode from pg_class where relname='agency_statistics_pkey';
 relfilenode
-------------
   118881489
postgres@db10:~/tmp$ ls -la /var/lib/postgresql/9.0/main/base/16404/118881489
-rw------- 1 postgres postgres 4956160 2012-02-20 12:07 /var/lib/postgresql/9.0/main/base/16404/118881489

hh=# SELECT * from pgstatindex('agency_statistics_pkey');
-[ RECORD 1 ]------+--------
version            | 2
tree_level         | 2
index_size         | 4947968
root_block_no      | 295
internal_pages     | 1
leaf_pages         | 601
empty_pages        | 0
deleted_pages      | 1
avg_leaf_density   | 0.45
leaf_fragmentation | 13.14

I out of ideas now.
Any suggestions where and what I should look next?

Kind Regards,
Maksym

--
Maxim Boguk
Senior Postgresql DBA.

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com

LinkedIn profile: http://nz.linkedin.com/in/maximboguk
"If they can send one man to the moon... why can't they send them all?"

МойКруг: http://mboguk.moikrug.ru/
"People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage."


В списке pgsql-general по дате отправления:

Предыдущее
От: John R Pierce
Дата:
Сообщение: Re: Question on Rules
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Unfamous 'could not read block ... in file "...": read only 0 of 8192 bytes' again