TOAST table repeatedly corrupted

Поиск
Список
Период
Сортировка
От Niles Oien
Тема TOAST table repeatedly corrupted
Дата
Msg-id CANQ3m6Nhq8VqtZ8SnC3ZLVdQf_1Upb1R+hsPpbFsS7fJOw+Jqg@mail.gmail.com
обсуждение исходный текст
Ответы TOAST table repeatedly corrupted  ("David G. Johnston" <david.g.johnston@gmail.com>)
Re: TOAST table repeatedly corrupted  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: TOAST table repeatedly corrupted  (Peter Geoghegan <pg@bowt.ie>)
Список pgsql-bugs

I have what I think is a bug in postgres. I'd like to help track it down, if that
is indeed what's going on.

I am running a reasonably recent version of postgres :

data=# select version();
                                                 version                                                  
----------------------------------------------------------------------------------------------------------
 PostgreSQL 9.5.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-17), 64-bit
(1 row)

About every six months, the logs show that vacuuming of
one of our tables (the same table every time) starts to fail :

 grep ERROR postgresql-09.csv

2018-05-09 16:14:03.834 GMT,,,27018,,5af31e4b.698a,1,,2018-05-09 16:14:03 GMT,12/611211,0,ERROR,XX001,"invalid page in block 1374551 of relation base/16384/36298640",,,,,"automatic vacuum of table ""data.pg_toast.pg_toast_36298637""",,,,""
2018-05-09 16:15:03.865 GMT,,,27696,,5af31e87.6c30,1,,2018-05-09 16:15:03 GMT,12/611219,0,ERROR,XX001,"invalid page in block 1374551 of relation base/16384/36298640",,,,,"automatic vacuum of table ""data.pg_toast.pg_toast_36298637""",,,,""
2018-05-09 16:16:03.897 GMT,,,31666,,5af31ec3.7bb2,1,,2018-05-09 16:16:03 GMT,10/1330957,0,ERROR,XX001,"invalid page in block 1374551 of relation base/16384/36298640",,,,,"automatic vacuum of table ""data.pg_toast.pg_toast_36298637""",,,,""


So it looks like a TOAST table has been corrupted. I can see which
table the TOAST table is supporting :

data=# select relname from pg_class where reltoastrelid = 'pg_toast.pg_toast_36298637'::regclass;
    relname    
---------------
 rdvtrack_fd05
(1 row)


And sure enough, I now cannot dump that table :

linux> pg_dump -p5432 -Upostgres -t hmi.rdvtrack_fd05 data > hmi_rdvtrack_fd05.sql
pg_dump: Dumping the contents of table "rdvtrack_fd05" failed: PQgetResult() failed.
pg_dump: Error message from server: ERROR:  compressed data is corrupted
pg_dump: The command was: COPY hmi.rdvtrack_fd05 (recnum, sunum, slotnum, sessionid, sessionns, cparms_sg000, v_bzero, v_bscale, carrrot, cmlon, lonhg, lathg, loncm, midtime, crval1, crval2, crval3, t_start, t_stop, lonspan, coverage, zonalvel, mai, module, source, input, created, bld_vers, backgrnd, rejectlist, cmlon_index, loncm_index, lathg_index, lonhg_index, sg_000_file, sg_001_file, history, datavals, missvals, datamin, datamax, datamean, datarms, dataskew, datakurt, meanmu, meanlat) TO stdout;


I think I can see the disk files where the tables are stored :

data=# select pg_relation_filepath('hmi.rdvtrack_fd05');
 pg_relation_filepath 
----------------------
 base/16384/36298637
(1 row)

data=# select pg_relation_filepath('pg_toast.pg_toast_36298637');
 pg_relation_filepath 
----------------------
 base/16384/36298640
(1 row)


I can run pg_filedump on those two files, in both cases all
the flags seem unremarkable (although I may not be knowing
what to look for) and it runs to the end of the file, giving
me a message like :

*** End of File Encountered. Last Block Read: 131071 ***

I think this is probably a bug? Every time it happens
it affects the same table, hmi.rdvtrack_fd05. The hardware
we're running on is fairly new, disks and memory are
not throwing errors, and we've never had problems with any
other tables. We're running on CentOS 6.8 - again, not brand
new but hardly decrepit, either. I've rebuilt the table
several times. This time I'd like to give the PG community
a chance to chase it down before I do that.

I've put the two data files, the postgres config and
essentially what I've said here online at :



If it helps, the TOAST table looks like this :

data=# \d pg_toast.pg_toast_36298637
TOAST table "pg_toast.pg_toast_36298637"
   Column   |  Type   
------------+---------
 chunk_id   | oid
 chunk_seq  | integer
 chunk_data | bytea


And the table the TOAST table is supporting looks like this :

data=# \d hmi.rdvtrack_fd05
          Table "hmi.rdvtrack_fd05"
    Column    |       Type       | Modifiers 
--------------+------------------+-----------
 recnum       | bigint           | not null
 sunum        | bigint           | 
 slotnum      | integer          | 
 sessionid    | bigint           | 
 sessionns    | text             | 
 cparms_sg000 | text             | 
 v_bzero      | double precision | 
 v_bscale     | double precision | 
 carrrot      | integer          | 
 cmlon        | real             | 
 lonhg        | real             | 
 lathg        | real             | 
 loncm        | real             | 
 midtime      | double precision | 
 crval1       | real             | 
 crval2       | real             | 
 crval3       | double precision | 
 t_start      | double precision | 
 t_stop       | double precision | 
 lonspan      | real             | 
 coverage     | real             | 
 zonalvel     | real             | 
 mai          | real             | 
 module       | text             | 
 source       | text             | 
 input        | text             | 
 created      | double precision | 
 bld_vers     | text             | 
 backgrnd     | text             | 
 rejectlist   | text             | 
 cmlon_index  | bigint           | 
 loncm_index  | bigint           | 
 lathg_index  | bigint           | 
 lonhg_index  | bigint           | 
 sg_000_file  | text             | 
 sg_001_file  | text             | 
 history      | text             | 
 datavals     | integer          | 
 missvals     | integer          | 
 datamin      | real             | 
 datamax      | real             | 
 datamean     | double precision | 
 datarms      | double precision | 
 dataskew     | double precision | 
 datakurt     | double precision | 
 meanmu       | real             | 
 meanlat      | real             | 
Indexes:
    "rdvtrack_fd05_pkey" PRIMARY KEY, btree (recnum)
    "rdvtrack_fd05_carrrot" btree (carrrot)
    "rdvtrack_fd05_cmlon_index" btree (cmlon_index)
    "rdvtrack_fd05_lathg_index" btree (lathg_index)
    "rdvtrack_fd05_loncm_index" btree (loncm_index)
    "rdvtrack_fd05_lonhg_index" btree (lonhg_index)

Niles Oien, noien@nso.edu




--
Niles Oien, National Solar Observatory, Boulder Colorado USA

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Marc Van Olmen
Дата:
Сообщение: Re: BUG #15190: Build configure should have option to disableclock_gettime on MacOS 10.12 and up.
Следующее
От: "David G. Johnston"
Дата:
Сообщение: TOAST table repeatedly corrupted