Обсуждение: BUG #18457: Possible data loss needs to be evaluated

Поиск
Список
Период
Сортировка

BUG #18457: Possible data loss needs to be evaluated

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      18457
Logged by:          Julius Ernesti
Email address:      julius.ernesti@kpi4me.de
PostgreSQL version: 14.10
Operating system:   Rocky Linux 9
Description:

Our services connect to the database via libpq. It is ensured that
connections are not used by two threads at the same time, with several
physical connections to the database. Here's what happened:

A very large transaction, which makes inserts into different tables, could
not be executed due to a unique key constraint. Due to a not yet known
problem, this transaction was not rolled back properly and the connection
was in an invalid state. The Portgresql server log logged the following
messages every second over a period of hours:

2024-05-05 22:15:27.461 CEST [2287043] ERROR:  current transaction is
aborted, commands ignored until end of transaction block
2024-05-05 22:15:27.461 CEST [2287043] STATEMENT:  SAVEPOINT SP_264211

This is probably because a thread has continued to use the broken connection
to execute transactions. I assume that this thread also received
corresponding messages, which I cannot verify due to the lack of logs.

Other processes were still able to communicate successfully with the
database. Inserts were executed and generated rowids, which are generated
via sequences, were also returned. In retrospect, I can no longer find these
rows in the database. This means that inserts were apparently executed, but
no data exists in the final result.

The inserts correspond to the same tables as the failed transaction
mentioned above. As a result, we had a data loss of 1 day's work, which
unfortunately was only noticed afterwards.

Assumption:
I am sure that many unfavorable factors have led to a fatal error here. I
think that bugs from different systems have overlapped. We have not managed
to reproduce the invalid state described. I think that the transactions
described were successfully executed in the database. As we only need to
store data first and do not select it at that point, the error only became
apparent so late. At the moment I assume that the command “STATEMENT:
SAVEPOINT SP_264211” caused my data to be rolled back again without me
noticing in the service. I am sure that there is a link to the failed
transaction, that perhaps some tables were still locked and the rollback
droped those inserted rows.  

My knowledge of postgres has reached its limit here. I would like to
evaluate whether inserts from other connections, which have been
successfully executed, can be rolled back by another connection via the
statements 
2024-05-05 22:15:27.461 CEST [2287043] ERROR: current transaction is
aborted, commands ignored until end of transaction block
2024-05-05 22:15:27.461 CEST [2287043] STATEMENT: SAVEPOINT SP_264211

For the sake of simplicity, here is another example: There are two
connections A and B

Via A: insert into test (id, val) values (2, 'this will be a unique key
constraint error'); --> The Rollback statement will not be executed due to
an unknown error
Via B: insert into test (id, val) values (4, 'no error'); --> Successful
Transaction
Via A: insert into test (id, val) values (5, 'no error, but the is currently
a running transaction') --> Failed. Maybe a rollback is executed, I dont
know
Via B; select * from test where id = 4; --> Empty query 

We are still trying to find out how this could have happened. If the example
described actually leads to successful transactions being canceled again, I
would describe this as a bug in Postgres. Our codes have been running for 4
years and we have never seen an error like this.

Best regards.

Julius Ernesti


Re: BUG #18457: Possible data loss needs to be evaluated

От
Laurenz Albe
Дата:
On Tue, 2024-05-07 at 08:35 +0000, PG Bug reporting form wrote:
> ERROR:  current transaction is aborted, commands ignored until end of transaction block
>
> We are still trying to find out how this could have happened.

This is not a bug, it is expected behavior.
If a statement in a transaction causes an error, all subsequent
statements in that transaction will receive that message until
you end the transaction with ROLLBACK or COMMIT (which will also
roll back).

Yours,
Laurenz Albe



Re: BUG #18457: Possible data loss needs to be evaluated

От
Julius Ernesti
Дата:

Hello,

 

I can understand the behavior. However, if new transactions are opened via new connections and are successfully executed, how can it be that these are also rolled back?, Or is that also an expected behavior?

 

Best Regards.

 

Julius Ernesti

 

Mit freundlichen Grüßen

 

Julius Ernesti

Entwickler

Mob.: +49 173 6816311

Email: julius.ernesti@kpi4me.de

 

Ein Bild, das Text, ClipArt enthält.

Automatisch generierte Beschreibung

 

---------

KPI4me OHG

Kollwitzstraße 2

69168 Wiesloch

 

Web: www.kpi4me.de

 

USt.-ID-Nr.: DE339524073

Handelsregister: HRA 709842
Registergericht: Amtsgericht Mannheim

Geschäftsführer: Olaf Buchal, Michael Schneider

--------------------------------------------------------------------

 

Diese E-Mail kann Betriebs- oder Geschäftsgeheimnisse oder sonstige vertrauliche Informationen enthalten. Sollten Sie diese E-Mail irrtümlich erhalten haben, ist Ihnen eine Kenntnisnahme des Inhalts, eine Vervielfältigung oder Weitergabe der E-Mail ausdrücklich untersagt. Bitte benachrichtigen Sie uns und vernichten Sie die empfangene E-Mail.Vielen Dank.

This e-mail may contain trade secrets or privileged, undisclosed, or otherwise confidential information. If you have received this e-mail in error, you are hereby notified that any review, copying, or distribution of it is strictly prohibited. Please inform us immediately and destroy the original transmittal. Thank you for your cooperation.

 

 

 

 

From: Laurenz Albe <laurenz.albe@cybertec.at>
Date: Tuesday, 7. May 2024 at 13:22
To: Julius Ernesti <julius.ernesti@kpi4me.de>, pgsql-bugs@lists.postgresql.org <pgsql-bugs@lists.postgresql.org>
Subject: Re: BUG #18457: Possible data loss needs to be evaluated

On Tue, 2024-05-07 at 08:35 +0000, PG Bug reporting form wrote:
> ERROR:  current transaction is aborted, commands ignored until end of transaction block
>
> We are still trying to find out how this could have happened.

This is not a bug, it is expected behavior.
If a statement in a transaction causes an error, all subsequent
statements in that transaction will receive that message until
you end the transaction with ROLLBACK or COMMIT (which will also
roll back).

Yours,
Laurenz Albe

Вложения

Re: BUG #18457: Possible data loss needs to be evaluated

От
"David G. Johnston"
Дата:
On Tuesday, May 7, 2024, PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:

Bug reference:      18457
Logged by:          Julius Ernesti
Email address:      julius.ernesti@kpi4me.de
PostgreSQL version: 14.10
Operating system:   Rocky Linux 9
Description:       

For the sake of simplicity, here is another example: There are two
connections A and B

Via A: insert into test (id, val) values (2, 'this will be a unique key
constraint error'); --> The Rollback statement will not be executed due to
an unknown error
Via B: insert into test (id, val) values (4, 'no error'); --> Successful
Transaction
Via A: insert into test (id, val) values (5, 'no error, but the is currently
a running transaction') --> Failed. Maybe a rollback is executed, I dont
know
Via B; select * from test where id = 4; --> Empty query 

If you actually perform that algorithm manually using two psql sessions you will find, if done properly, that the row with id=4 exists.  If it does not then show that work because you will indeed have found a bug.  But as of now you have not demonstrated a bug in PostgreSQL.

David J.
 

Re: BUG #18457: Possible data loss needs to be evaluated

От
Tom Lane
Дата:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> If you actually perform that algorithm manually using two psql sessions you
> will find, if done properly, that the row with id=4 exists.  If it does not
> then show that work because you will indeed have found a bug.  But as of
> now you have not demonstrated a bug in PostgreSQL.

Yeah.  This could be a Postgres problem, or it could be an application
problem.  Given the lack of other similar reports, the a-priori
assumption has to be that it's an application problem.  In any case,
there's no way for the rest of us to investigate since we don't have
access to either the application or data involved.

            regards, tom lane