Thread: pg_upgrade and frozen xids


pg_upgrade and frozen xids

From: Natalie Wenz
Date:
What happens to the xid of rows/tables when you upgrade postgres via pg_upgrade?

I upgraded a very large database from 9.6 to 10.1 via pg_upgrade recently, and ever since, the auto vacuum has been
busy on a large legacy table that has experienced no changes since the upgrade. If the whole table had been frozen prior
to the upgrade, would you expect it to stay frozen?
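
(For reference, this is roughly how I have been checking how "frozen" the table looks; the table name below is just a placeholder:)

    -- Illustrative sketch only: show how far the table's relfrozenxid has aged.
    -- 'legacy_table' stands in for the actual relation name.
    SELECT relname,
           relfrozenxid,
           age(relfrozenxid) AS xid_age
    FROM pg_class
    WHERE relname = 'legacy_table';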

Thanks,
Natalie

Re: pg_upgrade and frozen xids

From: Peter Geoghegan
Date:
On Tue, Mar 6, 2018 at 3:12 PM, Natalie Wenz <nataliewenz@ebureau.com> wrote:
> What happens to the xid of rows/tables when you upgrade postgres via pg_upgrade?
>
> I upgraded a very large database from 9.6 to 10.1 via pg_upgrade recently, and ever since, the auto vacuum has been
> busy on a large legacy table that has experienced no changes since the upgrade. If the whole table had been frozen prior
> to the upgrade, would you expect it to stay frozen?

Yes, you're right to expect it to stay frozen.

I wonder if this is due to a (thankfully relatively benign) bug that
was fixed very recently [1]. Are the relations that autovacuum is
processing materialized views?

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=38b41f182a66b67e36e2adf53d078599b1b65483
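
If it is useful to check, a query along these lines (purely a sketch) shows what autovacuum is currently processing and whether any of those relations are materialized views:

    -- Sketch: relations currently being vacuumed and their relkind
    -- (relkind 'm' means materialized view).
    SELECT p.pid, p.relid::regclass AS relation, c.relkind, p.phase
    FROM pg_stat_progress_vacuum p
    JOIN pg_class c ON c.oid = p.relid;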
--
Peter Geoghegan


Re: pg_upgrade and frozen xids

From: Natalie Wenz
Date:
They are not, unfortunately (fortunately?). Just a standard table.

If it helps, the autovacuum always seems to be in the phase "cleaning up indexes" when I look at pg_stat_progress_vacuum. 
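
(Roughly what I am running, for what it's worth:)

    -- The progress view I keep checking; phase has stayed at
    -- "cleaning up indexes" every time I look.
    SELECT pid, relid::regclass AS relation, phase,
           heap_blks_scanned, heap_blks_total, index_vacuum_count
    FROM pg_stat_progress_vacuum;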

On Mar 6, 2018, at 5:25 PM, Peter Geoghegan <pg@bowt.ie> wrote:

> Are the relations that autovacuum is
> processing materialized views?

Re: pg_upgrade and frozen xids

From: bricklen
Date:


On Tue, Mar 6, 2018 at 3:41 PM, Natalie Wenz <nataliewenz@ebureau.com> wrote:
> They are not, unfortunately (fortunately?). Just a standard table.
>
> If it helps, the autovacuum always seems to be in the phase "cleaning up indexes" when I look at pg_stat_progress_vacuum.

We recently upgraded about 1500 Postgres clusters from 9.3 to 10.1 and we are seeing similar issues. Specifically:
* An autovacuum worker stuck vacuuming regular tables for days until they are killed (I've seen "vacuuming indexes" phase), and they are showing 99% CPU in top. pg_cancel|terminate_backend does not usually kill the process, only a hard kill does. In the latter case, two (out of four) times it also crashed Postgres.
* There have been about 10 examples of btree index corruption, across a handful of databases, in different data centres. In two of the cases, it's possible there were hardware issues as the hypervisor those two VM's were running on crashed.

Re: pg_upgrade and frozen xids

From: Alvaro Herrera
Date:
bricklen wrote:

> We recently upgraded about 1500 Postgres clusters from 9.3 to 10.1 and we
> are seeing similar issues. Specifically:
> * An autovacuum worker stuck vacuuming regular tables for days until they
> are killed (I've seen "vacuuming indexes" phase), and they are showing 99%
> CPU in top. pg_cancel|terminate_backend does not usually kill the process,
> only a hard kill does. In the latter case, two (out of four) times it also
> crashed Postgres.
> * There have been about 10 examples of btree index corruption, across a
> handful of databases, in different data centres. In two of the cases, it's
> possible there were hardware issues as the hypervisor those two VM's were
> running on crashed.

I wonder if this may be related to changes in collation algorithm.  If
the indexes are all on text columns, I think you would be better served
by reindexing them all instead of trying to treat this situation as a
code bug.
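
As a rough sketch (the type filter and index name below are illustrative, and REINDEX takes a lock that blocks writes), something like this would find the candidates and rebuild them:

    -- Sketch: btree indexes that include text/varchar columns, i.e. the
    -- ones a collation change could affect.
    SELECT DISTINCT i.indexrelid::regclass AS index_name
    FROM pg_index i
    JOIN pg_class ic ON ic.oid = i.indexrelid
    JOIN pg_am am ON am.oid = ic.relam AND am.amname = 'btree'
    JOIN pg_attribute a ON a.attrelid = i.indrelid
                       AND a.attnum = ANY (i.indkey::int2[])
    WHERE a.atttypid IN ('text'::regtype, 'varchar'::regtype);

    -- Then, for each index found (placeholder name):
    REINDEX INDEX some_index_name;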

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: pg_upgrade and frozen xids

From: Peter Geoghegan
Date:
On Wed, Mar 7, 2018 at 11:48 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> I wonder if this may be related to changes in collation algorithm.  If
> the indexes are all on text columns, I think you would be better served
> by reindexing them all instead of trying to treat this situation as a
> code bug.

I happen to know that bricklen already ran amcheck. There were errors,
but they were not consistent with a collation issue. Rather, it looked
like something was up with the storage layer -- the sibling links of a
pair of pages were not in mutual agreement.

Even if that wasn't something that I knew already, I still would not
suspect opclass misbehavior of any variety. VACUUM doesn't care about
the ordering of items on the page in the case of nbtree. And, it
performs a physical order scan there (albeit with some extra trickery
to prevent races due to concurrent splits). Index tuples that could
end up being unreachable to index scans due to opclass misbehavior
should remain reachable to VACUUM.

-- 
Peter Geoghegan


Re: pg_upgrade and frozen xids

From: Peter Geoghegan
Date:
On Tue, Mar 6, 2018 at 3:41 PM, Natalie Wenz <nataliewenz@ebureau.com> wrote:
> They are not, unfortunately (fortunately?). Just a standard table.
>
> If it helps, the autovacuum always seems to be in the phase "cleaning up
> indexes" when I look at pg_stat_progress_vacuum.

It sounds like it might be a good idea for you to run amcheck on
affected indexes:

https://www.postgresql.org/docs/10/static/amcheck.html

If that doesn't show any issue, you might then use the Github
version's extra "heapallindexed" test on the same indexes:

https://github.com/petergeoghegan/amcheck

The extra "heapallindexed" test might allow you to uncover that some
parts of an index are unreachable. Though using
bt_index_parent_check() instead of bt_index_check() may be just as
effective, and should be considered if you can afford to have a lock
that will block writes to the table during verification.
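
For example (the index name is a placeholder, and the two-argument form assumes the Github version's heapallindexed parameter):

    -- amcheck as shipped in contrib for 10:
    CREATE EXTENSION IF NOT EXISTS amcheck;
    SELECT bt_index_check('some_index'::regclass);

    -- Github version, additionally verifying that every heap tuple has a
    -- matching index tuple:
    SELECT bt_index_check('some_index'::regclass, true);

    -- Stronger check; takes a lock that blocks writes to the table:
    SELECT bt_index_parent_check('some_index'::regclass);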

-- 
Peter Geoghegan


Re: pg_upgrade and frozen xids

From: bricklen
Date:


On Wed, Mar 7, 2018 at 12:01 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> I happen to know that bricklen already ran amcheck. There were errors,
> but they were not consistent with a collation issue. Rather, it looked
> like something was up with the storage layer -- the sibling links of a
> pair of pages were not in mutual agreement.
>
> Even if that wasn't something that I knew already, I still would not
> suspect opclass misbehavior of any variety. VACUUM doesn't care about
> the ordering of items on the page in the case of nbtree. And, it
> performs a physical order scan there (albeit with some extra trickery
> to prevent races due to concurrent splits). Index tuples that could
> end up being unreachable to index scans due to opclass misbehavior
> should remain reachable to VACUUM.

What little detail I've been able to collect so far is below. All for 10.1 clusters.

From the postgres logs, for 6 different databases (across 3 geo regions, of which two were on the same hypervisor). Each one was discovered when autovacuum tried to vacuum them:

ERROR:  could not find left sibling of block 4775 in index "<some index>"
ERROR:  right sibling 13983 of block 7196 is not next child 7246 of block 5208 in index "<some index>"
ERROR:  right sibling 60252 of block 60115 is not next child 60118 of block 60113 in index "<some index>"
ERROR:  right sibling 93058 of block 93057 is not next child 93061 of block 93008 in index "<some index>"
ERROR:  right sibling 10081 of block 10079 is not next child 10084 of block 10046 in index "<some index>"
ERROR:  left link changed unexpectedly in block 13868 of index "<some index>"
ERROR:  right sibling 145 of block 92 is not next child 93 of block 3 in index "<some index>"



A strace from the hung autovac process (before we killed it):

futex(0x7f07b8f575f8, FUTEX_WAIT, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x7f07b8f575f8, FUTEX_WAIT, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x7f07b8f575f8, FUTEX_WAIT, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
...