Обсуждение: found xmin x from before relfrozenxid y
Hi, after upgrading to version 11, I see the error pattern "found xmin x from before relfrozenxid y" in different databases on different hosts. From https://www.postgresql.org/docs/10/static/release-10-3.html, I learned that this was an error caused by pg_upgrade, which apparently had been fixed in 10.3. This page also states that refreshing the affected materialized view non-concurrently would fix the problem. My question is now how to infer the affected materialized view from the error message. Is there a way to tell which one to refresh from the xmin or relfrozenxid value? Best Johannes
Вложения
=?UTF-8?Q?Johannes_Gra=c3=abn?= <johannes@selfnet.de> writes: > after upgrading to version 11, I see the error pattern "found xmin x > from before relfrozenxid y" in different databases on different hosts. > From https://www.postgresql.org/docs/10/static/release-10-3.html, I > learned that this was an error caused by pg_upgrade, which apparently > had been fixed in 10.3. This page also states that refreshing the > affected materialized view non-concurrently would fix the problem. > My question is now how to infer the affected materialized view from the > error message. Is there a way to tell which one to refresh from the xmin > or relfrozenxid value? No :-(. I wonder why in the world we didn't make that error message include the relation and block number the tuple was found in. (Well, I see the short answer: the code layer throwing the error doesn't know. But that could be fixed easily enough.) In the meantime, the only answer I can think of offhand is to manually do VACUUM FREEZE on each of your MVs, and then refresh anything that shows up with an error. regards, tom lane
=?UTF-8?Q?Johannes_Gra=c3=abn?= <johannes@selfnet.de> writes: > after upgrading to version 11, I see the error pattern "found xmin x > from before relfrozenxid y" in different databases on different hosts. > From https://www.postgresql.org/docs/10/static/release-10-3.html, I > learned that this was an error caused by pg_upgrade, which apparently > had been fixed in 10.3. This page also states that refreshing the > affected materialized view non-concurrently would fix the problem. > My question is now how to infer the affected materialized view from the > error message. Is there a way to tell which one to refresh from the xmin > or relfrozenxid value? No :-(. I wonder why in the world we didn't make that error message include the relation and block number the tuple was found in. (Well, I see the short answer: the code layer throwing the error doesn't know. But that could be fixed easily enough.) In the meantime, the only answer I can think of offhand is to manually do VACUUM FREEZE on each of your MVs, and then refresh anything that shows up with an error. regards, tom lane
Thanks for your answer. On 21/10/2018 16.24, Tom Lane wrote: > In the meantime, the only answer I can think of offhand is to manually > do VACUUM FREEZE on each of your MVs, and then refresh anything that > shows up with an error. Since I have so many of them, I decided to go for a quick-and-dirty solution (what about REFRESH ALL MATERIALIZED VIEWS in the future?): > DO > $$ > DECLARE command text; > BEGIN > FOR command IN SELECT 'REFRESH MATERIALIZED VIEW '|| nspname||'.'||relname||';' > FROM pg_catalog.pg_class c > LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace > WHERE c.relkind = 'm' > LOOP > RAISE NOTICE '%', command; > EXECUTE command; > END LOOP; > END > $$;
Вложения
Hi, On 2018-10-21 10:24:16 -0400, Tom Lane wrote: > =?UTF-8?Q?Johannes_Gra=c3=abn?= <johannes@selfnet.de> writes: > > after upgrading to version 11, I see the error pattern "found xmin x > > from before relfrozenxid y" in different databases on different hosts. > > From https://www.postgresql.org/docs/10/static/release-10-3.html, I > > learned that this was an error caused by pg_upgrade, which apparently > > had been fixed in 10.3. This page also states that refreshing the > > affected materialized view non-concurrently would fix the problem. > > My question is now how to infer the affected materialized view from the > > error message. Is there a way to tell which one to refresh from the xmin > > or relfrozenxid value? > > No :-(. I wonder why in the world we didn't make that error message > include the relation and block number the tuple was found in. Because it was a really complicated bugfix already, I don't think the answer is more complicated than that. > (Well, I see the short answer: the code layer throwing the error > doesn't know. But that could be fixed easily enough.) I wonder if the better approach wouldn't be to add an errcontext for vaccuum, where continually update the block number etc. Theres plenty of different sources of corruption that'd potentially cause debug messages or errors, and that should get most of them. Greetings, Andres Freund
Hi, On 2018-10-21 10:24:16 -0400, Tom Lane wrote: > =?UTF-8?Q?Johannes_Gra=c3=abn?= <johannes@selfnet.de> writes: > > after upgrading to version 11, I see the error pattern "found xmin x > > from before relfrozenxid y" in different databases on different hosts. > > From https://www.postgresql.org/docs/10/static/release-10-3.html, I > > learned that this was an error caused by pg_upgrade, which apparently > > had been fixed in 10.3. This page also states that refreshing the > > affected materialized view non-concurrently would fix the problem. > > My question is now how to infer the affected materialized view from the > > error message. Is there a way to tell which one to refresh from the xmin > > or relfrozenxid value? > > No :-(. I wonder why in the world we didn't make that error message > include the relation and block number the tuple was found in. Because it was a really complicated bugfix already, I don't think the answer is more complicated than that. > (Well, I see the short answer: the code layer throwing the error > doesn't know. But that could be fixed easily enough.) I wonder if the better approach wouldn't be to add an errcontext for vaccuum, where continually update the block number etc. Theres plenty of different sources of corruption that'd potentially cause debug messages or errors, and that should get most of them. Greetings, Andres Freund