Обсуждение: unexpected chunk number 2 (expected 0) for toast value ... inpg_toast_18536
unexpected chunk number 2 (expected 0) for toast value ... inpg_toast_18536
От
Karsten Hilbert
Дата:
Dear community, we are seeing the below error on PG 9.6.16 on Debian: (different issue from the psycopg2 one recently posted by me) > /home/marc# pg_dump -p 5432 --username=gm-dbo --dbname=gnumed_v22 --compress=0 --no-sync --format=custom --file=/dev/null > pg_dump: Ausgabe des Inhalts der Tabelle »doc_obj« fehlgeschlagen: PQgetResult() fehlgeschlagen. > pg_dump: Fehlermeldung vom Server: ERROR: unexpected chunk number 2 (expected 0) for toast value 99027 in pg_toast_18536 > pg_dump: Die Anweisung war: COPY blobs.doc_obj (pk, fk_doc, seq_idx, comment, fk_intended_reviewer, data, filename) TOstdout; (to note: column "data" is of type BYTEA) We have been able to identify the row (there may be more) in blobs.doc_obj which leads to the above error. blobs.doc_obj.pk -> 82224 We have ruled out (?) below-PG hardware problems by a successful run of: cp -rv —preserve=all /var/lib/postgresql/9.6 /tmp/ We then tried gnumed_v22=# REINDEX TABLE pg_toast.pg_toast_18536; REINDEX gnumed_v22=# REINDEX TABLE blobs.doc_obj ; REINDEX gnumed_v22=# VACUUM ANALYZE pg_toast.pg_toast_18536; VACUUM gnumed_v22=# VACUUM FULL pg_toast.pg_toast_18536; VACUUM gnumed_v22=# VACUUM ANALYZE blobs.doc_obj ; VACUUM gnumed_v22=# VACUUM FULL blobs.doc_obj ; ERROR: unexpected chunk number 2 (expected 0) for toast value 99027 in pg_toast_18536 We then tried to DELETE the offending row delete from blobs.doc_obj where pk = 82224; but that, again, shows the "unexpected chunk" problem. Now, what else can we try to address the problem short of doing the pg_dump --exclude-table-data=blobs.doc_obj judicious use of COPY-FROM-with-subselect from blobs.doc_obj restore dance ? Many thanks, Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Re: unexpected chunk number 2 (expected 0) for toast value ... inpg_toast_18536
От
Karsten Hilbert
Дата:
On Sun, Mar 15, 2020 at 07:23:49PM +0100, Karsten Hilbert wrote: > We then tried to DELETE the offending row > > delete from blobs.doc_obj where pk = 82224; > > but that, again, shows the "unexpected chunk" problem. According to http://www.databasesoup.com/2013/10/de-corrupting-toast-tables.html an UPDATE of the row is recommended -- should that work better than a DELETE ? I can't find documentation pointing to a fundamental implementation difference that suggests so. Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Re: unexpected chunk number 2 (expected 0) for toast value ... inpg_toast_18536
От
Adrian Klaver
Дата:
On 3/15/20 12:20 PM, Karsten Hilbert wrote: > On Sun, Mar 15, 2020 at 07:23:49PM +0100, Karsten Hilbert wrote: > >> We then tried to DELETE the offending row >> >> delete from blobs.doc_obj where pk = 82224; >> >> but that, again, shows the "unexpected chunk" problem. > > According to > > http://www.databasesoup.com/2013/10/de-corrupting-toast-tables.html > > an UPDATE of the row is recommended -- should that work > better than a DELETE ? > > I can't find documentation pointing to a fundamental > implementation difference that suggests so. https://www.postgresql.org/docs/12/storage-toast.html#STORAGE-TOAST-ONDISK "During an UPDATE operation, values of unchanged fields are normally preserved as-is; so an UPDATE of a row with out-of-line values incurs no TOAST costs if none of the out-of-line values change." > > Karsten > -- > GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B > > -- Adrian Klaver adrian.klaver@aklaver.com
Re: unexpected chunk number 2 (expected 0) for toast value ... inpg_toast_18536
От
Karsten Hilbert
Дата:
On Sun, Mar 15, 2020 at 12:58:53PM -0700, Adrian Klaver wrote: > > > We then tried to DELETE the offending row > > > > > > delete from blobs.doc_obj where pk = 82224; > > > > > > but that, again, shows the "unexpected chunk" problem. > > > > According to > > > > http://www.databasesoup.com/2013/10/de-corrupting-toast-tables.html > > > > an UPDATE of the row is recommended -- should that work > > better than a DELETE ? > > > > I can't find documentation pointing to a fundamental > > implementation difference that suggests so. > > https://www.postgresql.org/docs/12/storage-toast.html#STORAGE-TOAST-ONDISK > > "During an UPDATE operation, values of unchanged fields are normally > preserved as-is; so an UPDATE of a row with out-of-line values incurs no > TOAST costs if none of the out-of-line values change." However, where is the fault in my thinking ? -> An UPDATE actually *would* change the TOASTed BYTEA field (which is corrupt). I had hoped that the DELETE would NOT have to touch the TOAST table at all (and thereby not check the chunks) as "all it needs to do" is mark the row in the *primary* table as not-needed-anymore. I must be misunderstanding something. Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Have you tried to reindex the table? Toast internally forces an index scan, so missing index tuples or an otherwise corrupted toast index would have the same symptoms as toast chunks actually missing.
Regards, Jan
On Sun, Mar 15, 2020, 16:21 Karsten Hilbert <Karsten.Hilbert@gmx.net> wrote:
On Sun, Mar 15, 2020 at 12:58:53PM -0700, Adrian Klaver wrote:
> > > We then tried to DELETE the offending row
> > >
> > > delete from blobs.doc_obj where pk = 82224;
> > >
> > > but that, again, shows the "unexpected chunk" problem.
> >
> > According to
> >
> > http://www.databasesoup.com/2013/10/de-corrupting-toast-tables.html
> >
> > an UPDATE of the row is recommended -- should that work
> > better than a DELETE ?
> >
> > I can't find documentation pointing to a fundamental
> > implementation difference that suggests so.
>
> https://www.postgresql.org/docs/12/storage-toast.html#STORAGE-TOAST-ONDISK
>
> "During an UPDATE operation, values of unchanged fields are normally
> preserved as-is; so an UPDATE of a row with out-of-line values incurs no
> TOAST costs if none of the out-of-line values change."
However, where is the fault in my thinking ?
-> An UPDATE actually *would* change the TOASTed BYTEA field (which is corrupt).
I had hoped that the DELETE would NOT have to touch the TOAST
table at all (and thereby not check the chunks) as "all it
needs to do" is mark the row in the *primary* table as
not-needed-anymore.
I must be misunderstanding something.
Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Re: unexpected chunk number 2 (expected 0) for toast value ... inpg_toast_18536
От
Karsten Hilbert
Дата:
On Sun, Mar 15, 2020 at 05:04:06PM -0400, Jan Wieck wrote: > Have you tried to reindex the table? Toast internally forces an index scan, > so missing index tuples or an otherwise corrupted toast index would have > the same symptoms as toast chunks actually missing. We sure did, but thanks for reminding. Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Re: unexpected chunk number 2 (expected 0) for toast value ... inpg_toast_18536
От
Adrian Klaver
Дата:
On 3/15/20 1:21 PM, Karsten Hilbert wrote: > On Sun, Mar 15, 2020 at 12:58:53PM -0700, Adrian Klaver wrote: > >>>> We then tried to DELETE the offending row >>>> >>>> delete from blobs.doc_obj where pk = 82224; >>>> >>>> but that, again, shows the "unexpected chunk" problem. >>> >>> According to >>> >>> http://www.databasesoup.com/2013/10/de-corrupting-toast-tables.html >>> >>> an UPDATE of the row is recommended -- should that work >>> better than a DELETE ? >>> >>> I can't find documentation pointing to a fundamental >>> implementation difference that suggests so. >> >> https://www.postgresql.org/docs/12/storage-toast.html#STORAGE-TOAST-ONDISK >> >> "During an UPDATE operation, values of unchanged fields are normally >> preserved as-is; so an UPDATE of a row with out-of-line values incurs no >> TOAST costs if none of the out-of-line values change." > > However, where is the fault in my thinking ? > > -> An UPDATE actually *would* change the TOASTed BYTEA field (which is corrupt). > > I had hoped that the DELETE would NOT have to touch the TOAST > table at all (and thereby not check the chunks) as "all it > needs to do" is mark the row in the *primary* table as > not-needed-anymore. > > I must be misunderstanding something. Except it would also need to delete the toast entries as well. > > Karsten > -- > GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B > > -- Adrian Klaver adrian.klaver@aklaver.com
Re: unexpected chunk number 2 (expected 0) for toast value ... inpg_toast_18536
От
Karsten Hilbert
Дата:
On Sun, Mar 15, 2020 at 02:35:39PM -0700, Adrian Klaver wrote: > On 3/15/20 1:21 PM, Karsten Hilbert wrote: > > On Sun, Mar 15, 2020 at 12:58:53PM -0700, Adrian Klaver wrote: > > > > > > > We then tried to DELETE the offending row > > > > > > > > > > delete from blobs.doc_obj where pk = 82224; > > > > > > > > > > but that, again, shows the "unexpected chunk" problem. > > > > > > > > According to > > > > > > > > http://www.databasesoup.com/2013/10/de-corrupting-toast-tables.html > > > > > > > > an UPDATE of the row is recommended -- should that work > > > > better than a DELETE ? > > > > > > > > I can't find documentation pointing to a fundamental > > > > implementation difference that suggests so. > > > > > > https://www.postgresql.org/docs/12/storage-toast.html#STORAGE-TOAST-ONDISK > > > > > > "During an UPDATE operation, values of unchanged fields are normally > > > preserved as-is; so an UPDATE of a row with out-of-line values incurs no > > > TOAST costs if none of the out-of-line values change." > > > > However, where is the fault in my thinking ? > > > > -> An UPDATE actually *would* change the TOASTed BYTEA field (which is corrupt). > > > > I had hoped that the DELETE would NOT have to touch the TOAST > > table at all (and thereby not check the chunks) as "all it > > needs to do" is mark the row in the *primary* table as > > not-needed-anymore. > > > > I must be misunderstanding something. > > Except it would also need to delete the toast entries as well. OK, got that. What I now don't understand is how the UPDATE won't have to touch the TOAST table when the TOASTed value *is* UPDATEd: update blobs.doc_obj set data = '' where pk = the_faulty_row; (data is the BYTEA column) Slightly confused :-) Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Karsten Hilbert <Karsten.Hilbert@gmx.net> writes: >>> According to >>> http://www.databasesoup.com/2013/10/de-corrupting-toast-tables.html >>> an UPDATE of the row is recommended -- should that work >>> better than a DELETE ? > OK, got that. What I now don't understand is how the UPDATE > won't have to touch the TOAST table when the TOASTed value > *is* UPDATEd: > update blobs.doc_obj set data = '' where pk = the_faulty_row; > (data is the BYTEA column) It makes no sense to me either; I wonder if Josh's recipe ever really worked? But it's clearly not working now, and that's what I'd expect, because any mechanism for removing the busted toast reference is going to cause the system to try to mark the toast rows deleted. Since you reindexed the toast table and it still doesn't find the missing chunks, I think the easiest "fix" would be to manually insert rows with the correct chunk_id and chunk_seq, and ideally with chunk_data of the appropriate length. Then deletion of the reference should work. Unfortunately, it seems like you can't do that either, short of hacking up the backend or writing some custom C code, because the executor won't let you open a toast table as result relation :-(. I wonder if we should change it to allow that when allow_system_table_mods is true? This isn't the first time we've seen people need to be able to do surgery on a toast table. regards, tom lane
Re: unexpected chunk number 2 (expected 0) for toast value ... inpg_toast_18536
От
Andres Freund
Дата:
Hi, On 2020-03-15 19:23:49 +0100, Karsten Hilbert wrote: > > /home/marc# pg_dump -p 5432 --username=gm-dbo --dbname=gnumed_v22 --compress=0 --no-sync --format=custom --file=/dev/null > > pg_dump: Ausgabe des Inhalts der Tabelle »doc_obj« fehlgeschlagen: PQgetResult() fehlgeschlagen. > > pg_dump: Fehlermeldung vom Server: ERROR: unexpected chunk number 2 (expected 0) for toast value 99027 in pg_toast_18536 > > pg_dump: Die Anweisung war: COPY blobs.doc_obj (pk, fk_doc, seq_idx, comment, fk_intended_reviewer, data, filename) TOstdout; > > (to note: column "data" is of type BYTEA) > > We have been able to identify the row (there may be more) > in blobs.doc_obj which leads to the above error. > > blobs.doc_obj.pk -> 82224 > > We have ruled out (?) below-PG hardware problems by a > successful run of: > > cp -rv —preserve=all /var/lib/postgresql/9.6 /tmp/ FWIW, I don't think that rules out hardware problems at all. In plenty cases of corruption you can just end up with corrupted on-disk data (swapped blocks, zeroed blocks, randomly different values ...). But obviously it is not at all guaranteed that is the case. Could you describe the "history" of the database? Replication set up, failovers, etc? > Now, what else can we try to address the problem short of > doing the > > pg_dump --exclude-table-data=blobs.doc_obj > > judicious use of COPY-FROM-with-subselect from blobs.doc_obj > > restore > > dance ? A plpgsql function that returns the rows one-by-one and catches the exception is probably your best bet. It could roughly look something like: CREATE OR REPLACE FUNCTION salvage(p_tblname regclass) RETURNS SETOF text LANGUAGE plpgsql AS $$ DECLARE v_row record; BEGIN FOR v_row IN EXECUTE 'SELECT * FROM '||p_tblname::text LOOP BEGIN -- this forces detoasting RETURN NEXT v_row::text; EXCEPTION WHEN internal_error OR data_corrupted OR index_corrupted THEN -- add pkey or something else RAISE NOTICE 'failed to return data'; END; END LOOP; END $$ should work. You can call it like SELECT (salvaged_rec.rec).* FROM (SELECT salvaged_text::salvage_me FROM salvage('salvage_me') AS salvaged_text) AS salvaged_rec(rec) Greetings, Andres Freund
Re: unexpected chunk number 2 (expected 0) for toast value ... inpg_toast_18536
От
Andres Freund
Дата:
Hi, On 2020-03-15 20:11:18 -0400, Tom Lane wrote: > Unfortunately, it seems like you can't do that either, short of > hacking up the backend or writing some custom C code, because the > executor won't let you open a toast table as result relation :-(. > I wonder if we should change it to allow that when > allow_system_table_mods is true? This isn't the first time we've > seen people need to be able to do surgery on a toast table. I'd be mildly in favor. But it's considerably more than just the executor check that'd need to change. We don't the right thing for toast relations in plenty places right now, because we just check for RELKIND_RELATION - which will break junkvars etc. Greetings, Andres Freund
Re: unexpected chunk number 2 (expected 0) for toast value ... inpg_toast_18536
От
Andres Freund
Дата:
Hi, On 2020-03-16 12:44:53 -0700, Andres Freund wrote: > On 2020-03-15 20:11:18 -0400, Tom Lane wrote: > > Unfortunately, it seems like you can't do that either, short of > > hacking up the backend or writing some custom C code, because the > > executor won't let you open a toast table as result relation :-(. > > I wonder if we should change it to allow that when > > allow_system_table_mods is true? This isn't the first time we've > > seen people need to be able to do surgery on a toast table. > > I'd be mildly in favor. But it's considerably more than just the > executor check that'd need to change. We don't the right thing for toast > relations in plenty places right now, because we just check for > RELKIND_RELATION - which will break junkvars etc. Hm, and I wonder if there could be problems with HeapTupleSatisfiesToast() too? It doesn't really forsee much DML being done. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2020-03-16 12:44:53 -0700, Andres Freund wrote: >> On 2020-03-15 20:11:18 -0400, Tom Lane wrote: >>> I wonder if we should change it to allow that when >>> allow_system_table_mods is true? This isn't the first time we've >>> seen people need to be able to do surgery on a toast table. >> I'd be mildly in favor. But it's considerably more than just the >> executor check that'd need to change. We don't the right thing for toast >> relations in plenty places right now, because we just check for >> RELKIND_RELATION - which will break junkvars etc. > Hm, and I wonder if there could be problems with > HeapTupleSatisfiesToast() too? It doesn't really forsee much DML being > done. We've always allowed people to select from toast tables, so if there are planner or executor problems with that, I'd think they'd mostly be bugs that need fixed anyway. Your point about HeapTupleSatisfiesToast is better though. Actually though ... now that I look at the code, I don't understand why tuple deletion would provoke any errors. toast_delete_datum() is not picky about whether it finds consecutive chunk_seq values, or indeed any matching rows at all. So now I think that Karsten's problem traces to writing the query in such a way that something thinks it needs to fetch the pre-update or pre-delete toasted value. Or maybe the query as such is fine, and the problem stems from something like a trigger or logical replication that's trying to fetch the old value? regards, tom lane
Re: unexpected chunk number 2 (expected 0) for toast value ... in pg_toast_18536
От
Andres Freund
Дата:
Hi, On March 16, 2020 1:22:18 PM PDT, Tom Lane <tgl@sss.pgh.pa.us> wrote: >Andres Freund <andres@anarazel.de> writes: >> On 2020-03-16 12:44:53 -0700, Andres Freund wrote: >>> On 2020-03-15 20:11:18 -0400, Tom Lane wrote: >>>> I wonder if we should change it to allow that when >>>> allow_system_table_mods is true? This isn't the first time we've >>>> seen people need to be able to do surgery on a toast table. > >>> I'd be mildly in favor. But it's considerably more than just the >>> executor check that'd need to change. We don't the right thing for >toast >>> relations in plenty places right now, because we just check for >>> RELKIND_RELATION - which will break junkvars etc. > >> Hm, and I wonder if there could be problems with >> HeapTupleSatisfiesToast() too? It doesn't really forsee much DML >being >> done. > >We've always allowed people to select from toast tables, so if there >are planner or executor problems with that, I'd think they'd mostly be >bugs that need fixed anyway. Your point about HeapTupleSatisfiesToast >is better though. The logic to add/extract junkvars for updated/deleted tables, as well as other parts of the modification code paths, weren'texposed so far though. I've tried allowing updates/deletes before (at least deletes are needed to e.g handle duplicate values), I'm fairly confidentthat the junkvar issue is real. Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: unexpected chunk number 2 (expected 0) for toast value ... inpg_toast_18536
От
Karsten Hilbert
Дата:
On Sun, Mar 15, 2020 at 08:11:18PM -0400, Tom Lane wrote: > Karsten Hilbert <Karsten.Hilbert@gmx.net> writes: > >>> According to > >>> http://www.databasesoup.com/2013/10/de-corrupting-toast-tables.html > >>> an UPDATE of the row is recommended -- should that work > >>> better than a DELETE ? > > > OK, got that. What I now don't understand is how the UPDATE > > won't have to touch the TOAST table when the TOASTed value > > *is* UPDATEd: > > update blobs.doc_obj set data = '' where pk = the_faulty_row; > > (data is the BYTEA column) > > It makes no sense to me either; I wonder if Josh's recipe ever > really worked? But it's clearly not working now, and that's > what I'd expect, because any mechanism for removing the busted > toast reference is going to cause the system to try to mark > the toast rows deleted. > > Since you reindexed the toast table and it still doesn't find > the missing chunks, The user has reported that gratuitious and repeated use of REINDEX/VACUUM has eventually led to a consistent database. That one row went missing but can be re-created. Unfortunately, I neither have the original data for testing (it is a medical record database and the client won't hand out copies for obvious reasons) nor can I ascertain the exact order of steps they eventually took. For the record. Thanks, Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Re: unexpected chunk number 2 (expected 0) for toast value ... inpg_toast_18536
От
Karsten Hilbert
Дата:
On Mon, Mar 16, 2020 at 12:38:35PM -0700, Andres Freund wrote: > > We have ruled out (?) below-PG hardware problems by a > > successful run of: > > > > cp -rv —preserve=all /var/lib/postgresql/9.6 /tmp/ > > FWIW, I don't think that rules out hardware problems at all. In plenty > cases of corruption you can just end up with corrupted on-disk data > (swapped blocks, zeroed blocks, randomly different values ...). ... hence the (?) ... > But obviously it is not at all guaranteed that is the case. Could you > describe the "history" of the database? Replication set up, failovers, > etc? No replication, no failovers. There may have been hard shutdowns as in power failure but there's no history of that to relate. > A plpgsql function that returns the rows one-by-one and catches the > exception is probably your best bet. We have done that (in Python) for good measure during recovery procedures. > It could roughly look something like: > > CREATE OR REPLACE FUNCTION salvage(p_tblname regclass) > RETURNS SETOF text > LANGUAGE plpgsql AS > $$ > DECLARE > v_row record; > BEGIN > FOR v_row IN EXECUTE 'SELECT * FROM '||p_tblname::text LOOP > BEGIN > -- this forces detoasting > RETURN NEXT v_row::text; > EXCEPTION WHEN internal_error OR data_corrupted OR index_corrupted THEN > -- add pkey or something else > RAISE NOTICE 'failed to return data'; > END; > END LOOP; > END > $$ > > should work. You can call it like > SELECT (salvaged_rec.rec).* FROM (SELECT salvaged_text::salvage_me FROM salvage('salvage_me') AS salvaged_text) AS salvaged_rec(rec) Thanks for taking the time. Would something like this be a useful addition to the adminpack extension ? Best, Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B