Re: Do we need to handle orphaned prepared transactions in the server?

Поиск
Список
Период
Сортировка
От Hamid Akhtar
Тема Re: Do we need to handle orphaned prepared transactions in the server?
Дата
Msg-id CANugjhsA-1ORGbV7mGW-dBLOK=2QJBTDqj5wkH=6-wB5sKj7XA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Do we need to handle orphaned prepared transactions in the server?  (Hamid Akhtar <hamid.akhtar@gmail.com>)
Ответы Re: Do we need to handle orphaned prepared transactions in theserver?  (Bruce Momjian <bruce@momjian.us>)
Re: Do we need to handle orphaned prepared transactions in the server?  (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
Список pgsql-hackers
Here is the v2 of the same patch after rebasing it and running it through pgindent. There are no other code changes.


On Wed, Feb 19, 2020 at 8:04 PM Hamid Akhtar <hamid.akhtar@gmail.com> wrote:
All,

Attached is version 1 of POC patch for notifying of orphaned
prepared transactions via warnings emitted to a client
application and/or log file. It applies to PostgreSQL branch
"master" on top of "e2e02191" commit.

I've tried to keep the patch as less invasive as I could with
minimal impact on vacuum processes, so the performance impact
and the changes are minimal in that area of PostgreSQL core.


- What's in this Patch:

This patch throws warnings when an autovacuum worker encounters
an orphaned prepared transaction. It also throws warnings to a
client when a vacuum command is issued. This patch also
introduces two new GUCs:

(1) max_age_prepared_xacts
- The age after creation of a prepared transaction after which it
will be considered an orphan.

(2) prepared_xacts_vacuum_warn_timeout
- The timeout period for an autovacuum (essentially any of its
worker) to check for orphaned prepared transactions and throw
warnings if any are found.


- What This Patch Does:

If the GUCs are enabled (set to a value higher than -1), an
autovacuum worker running in the background checks if the
timeout has expired. If so, it checks if there are any orphaned
prepared transactions (i.e. their age has exceeded
max_age_prepared_xacts). If it finds any, it throws a warning for
every such transaction. It also emits the total number of orphaned
prepared transactions if one or more are found.

When a vacuum command is issued from within a client, say psql,
in that case, we skip the vacuum timeout check and simply scan
for any orphaned prepared transactions. Warnings are emitted to
the client and log file if any are found.


- About the New GUCs:

= max_age_prepared_xacts:
Sets maximum age after which a prepared transaction is considered an
orphan. It applies when "prepared transactions" are enabled. The
age for a transaction is calculated from the time it was created to
the current time. If this value is specified without units, it is taken
as milliseconds. The default value is -1 which allows prepared
transactions to live forever.

= prepared_xacts_vacuum_warn_timeout:
Sets timeout after which vacuum starts throwing warnings for every
prepared transactions that has exceeded maximum age defined by
"max_age_prepared_xacts". If this value is specified without units,
it is taken as milliseconds. The default value of -1 will disable
this warning mechanism. Setting a too value could potentially fill
up log with orphaned prepared transaction warnings, so this
parameter must be set to a value that is reasonably large to not
fill up log file, but small enough to notify of long running and
potential orphaned prepared transactions. There is no additional
timer or worker introduced with this change. Whenever a vacuum
worker runs, it first checks for any orphaned prepared transactions.
So at best, this GUC serves as a guideline for a vacuum worker
if a warning should be thrown to log file or a client issuing
vacuum command.


- What this Patch Does Not Cover:

The warning is not thrown when user either runs vacuumdb or passes
individual relations to be vacuum. Specifically in case of vacuumdb,
it breaks down a vacuum command to an attribute-wise vacuum command.
So the vacuum command is indirectly run many times. Considering that
we want to emit warnings for every manual vacuum command, this simply
floods the terminal and log with orphaned prepared transactions
warnings. We could potentially handle that, but the overhead of
that seemed too much to me (and I've not invested any time looking
to fix that either). Hence, warnings are not thrown when user runs
vacuumdb and relation specific vacuum.



On Fri, Jan 31, 2020 at 7:02 PM Hamid Akhtar <hamid.akhtar@gmail.com> wrote:


On Thu, Jan 30, 2020 at 8:28 AM Craig Ringer <craig@2ndquadrant.com> wrote:
On Thu, 30 Jan 2020 at 02:04, Hamid Akhtar <hamid.akhtar@gmail.com> wrote:
>
> So having seen the feedback on this thread, and I tend to agree with most of what has been said here, I also agree that the server core isn't really the ideal place to handle the orphan prepared transactions.
>
> Ideally, these must be handled by a transaction manager, however, I do believe that we cannot let database suffer for failing of an external software, and we did a similar change through introduction of idle in transaction timeout behavior.

The difference, IMO, is that idle-in-transaction aborts don't affect
anything we've promised to be durable.

Once you PREPARE TRANSACTION the DB has made a promise that that txn
is durable. We don't have any consistent feedback channel to back to
applications and say "Hey, if you're not going to finish this up we
need to get rid of it soon, ok?". If a distributed transaction manager
gets consensus for commit and goes to COMMIT PREPARED a previously
prepared txn only to find that it has vanished, that's a major
problem, and one that may bring the entire DTM to a halt until the
admin can intervene.

This isn't like idle-in-transaction aborts. It's closer to something
like uncommitting a previously committed transaction.

I do think it'd make sense to ensure that the documentation clearly
highlights the impact of abandoned prepared xacts on server resource
retention and performance, preferably with pointers to appropriate
views. I haven't reviewed the docs to see how clear that is already.

Having seen the documentation, IMHO the document does contain enough 
information for users to understand what issues can be caused by these 
orphaned prepared transactions. 
 

I can also see an argument for a periodic log message (maybe from
vacuum?) warning when old prepared xacts hold xmin down. Including one
sent to the client application when an explicit VACUUM is executed.
(In fact, it'd make sense to generalise that for all xmin-retention).

I think that opens up the debate on what we really mean by "old" and
whether that requires a syntax change when creating a prepared 
transactions as Thomas Kellerer suggested earlier?

I agree that vacuum should periodically throw warnings for any prepared
xacts that are holding xmin down.

Generalising it for all xmin-retention is a fair idea IMHO, though that
does increase the overall scope here. A vacuum process should (ideally)
periodically throw out warnings for anything that is preventing it (including 
orphaned prepared transactions) from doing its routine work so that 
somebody can take necessary actions.


But I'm really not a fan of aborting such txns. If you operate with
some kind of broken global transaction manager that can forget or
abandon prepared xacts, then fix it, or adopt site-local periodic
cleanup tasks that understand your site's needs.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise


--
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950  EMAIL: mailto:hamid.akhtar@highgo.ca
SKYPE: engineeredvirus


--
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950  EMAIL: mailto:hamid.akhtar@highgo.ca
SKYPE: engineeredvirus


--
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950  EMAIL: mailto:hamid.akhtar@highgo.ca
SKYPE: engineeredvirus
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Daniel Gustafsson
Дата:
Сообщение: Re: Change atoi to strtol in same place
Следующее
От: Fujii Masao
Дата:
Сообщение: Re: Minor issues in .pgpass