Discussion: Maximum xid increasing


Maximum xid increasing

From
Craig Milhiser
Date:
Our max xid is over 500m and climbing. Autovacuum (running "to prevent wraparound") and manual vacuuming both return:

WARNING:  oldest xmin is far in the past
HINT:  Close open transactions soon to avoid wraparound problems.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.

The autovacuum previously kicked in at 200m and reduced the maximum xid.   Now there is no reduction.
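
For reference, we are tracking the age with the usual catalog query (paraphrased from memory, but it is essentially the stock pg_database check):

-- per-database transaction ID age: how far datfrozenxid lags behind the current xid
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;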

We have a writer (master) node, 3 readers (slaves) and a hot-standby. We have an ETL vendor hooked up to replicate. All of the replications are working. There is current data in each. We have checked our replication slots. Everything is in use and working.
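
The slot check was along these lines (paraphrased; the columns are from the stock pg_replication_slots view):

-- look for slots that are inactive or that pin an old xmin/catalog_xmin
SELECT slot_name, slot_type, active, xmin, catalog_xmin, restart_lsn
FROM pg_replication_slots;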

On the transaction side, we have terminated every writer-node PID that existed from before this climb started. Almost every client has been restarted.

We are handling 100+ inserts/updates per second. Most transactions complete in ~2 seconds, though some may take up to a minute. There are no long-running idle-in-transaction sessions.
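
The check for open transactions on the writer was roughly this (paraphrased):

-- oldest open snapshots/transactions on the writer, by xmin age
SELECT pid, state, xact_start, age(backend_xmin) AS xmin_age, query
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY xmin_age DESC
LIMIT 10;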

We have been scouring the internet, reading many helpful pages and trying their advice, but we cannot find the problem.

We restored a daily snapshot into another cluster. After vacuuming, with no clients connected, the max xid dropped to 50m, which implies that a restart should fix the issue.

I am searching for any hints; we have not found a smoking gun to focus on. This is a 24x7 operation and we would rather not have to restart the server.

We are on v13.8, AWS RDS.

Is there a way to find which row or table or lock or some open transaction is preventing the vacuuming from lowering xid? Any hints on where to look? 

Thanks

Re: Maximum xid increasing

From
Laurenz Albe
Date:
On Wed, 2023-11-08 at 09:09 -0500, Craig Milhiser wrote:
> Our max xid are over 500m and climbing. Auto vacuuming and manual vacuuming are returning
>
> autovacuuming to prevent wraparound.
> WARNING:  oldest xmin is far in the past
> HINT:  Close open transactions soon to avoid wraparound problems.
> You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
>
> The autovacuum previously kicked in at 200m and reduced the maximum xid.   Now there is no reduction.
>
> We have a writer (master) node, 3 readers (slaves) and a hot-standby. We have an ETL vendor hooked
> up to replicate. All of the replications are working. There is current data in each. We have checked
> our replication slots. Everything is in use and working.
>
> From transactions, we have terminated every writer node PID since before this climb has started.
> Almost every client has been restarted. 
>
> We are handling 100+ insert/updates per second. Most transactions complete in ~2 seconds. Though some
> may take 1 minute.  There are no long running idle in transaction activities. 
>
> We have been scouring the internet reading many helpful pages and trying their advice but we cannot
> find a problem.
>
> We have restored a daily snapshot in another cluster. After vacuuming the max xid dropped to 50m.
> No client was connected. Implying that a restart should fix the issue.
>
> I am searching for any hints. We have not found a smoking gun to focus on.  This is a 24x7 operation
> and would rather not have to restart the server.
>
> We are on v13.8, AWS RDS.
>
> Is there a way to find which row or table or lock or some open transaction is preventing the vacuuming
> from lowering xid? Any hints on where to look? 

If that is not a bug introduced by Amazon, and you are certain that there are no long-running
transactions or stale replication slots, there are two more possibilities (rough checks for both
are sketched after the list):

1. a prepared transaction (check pg_prepared_xacts)

2. a long-running query on a standby, and "hot_standby_feedback = on"
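
Untested sketches, but something like the following should show whether either applies:

-- on the primary: forgotten prepared transactions hold back the xmin horizon
SELECT gid, prepared, owner, database
FROM pg_prepared_xacts
ORDER BY prepared;

-- on each standby (only matters if hot_standby_feedback = on):
SHOW hot_standby_feedback;
SELECT pid, state, xact_start, age(backend_xmin) AS xmin_age, query
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY xmin_age DESC;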

Yours,
Laurenz Albe