Re: Transactions involving multiple postgres foreign servers, take 2
| From | Masahiro Ikeda |
|---|---|
| Subject | Re: Transactions involving multiple postgres foreign servers, take 2 |
| Date | |
| Msg-id | 412f81780e15cfb6b3d4905db9000785@oss.nttdata.com |
| In response to | Re: Transactions involving multiple postgres foreign servers, take 2 (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>) |
| List | pgsql-hackers |
On 2020-07-15 15:06, Masahiko Sawada wrote:
> On Tue, 14 Jul 2020 at 09:08, Masahiro Ikeda <ikedamsh@oss.nttdata.com>
> wrote:
>>
>> > I've attached the latest version patches. I've incorporated the review
>> > comments I got so far and improved locking strategy.
>>
>> Thanks for updating the patch!
>> I have three questions about the v23 patches.
>>
>>
>> 1. messages related to user canceling
>>
>> In my understanding, there are two messages
>> which can be output when a user cancels the COMMIT command.
>>
>> A. When prepare is failed, the output shows that
>> committed locally but some error is occurred.
>>
>> ```
>> postgres=*# COMMIT;
>> ^CCancel request sent
>> WARNING: canceling wait for resolving foreign transaction due to user
>> request
>> DETAIL: The transaction has already committed locally, but might not
>> have been committed on the foreign server.
>> ERROR: server closed the connection unexpectedly
>> This probably means the server terminated abnormally
>> before or while processing the request.
>> CONTEXT: remote SQL command: PREPARE TRANSACTION
>> 'fx_1020791818_519_16399_10'
>> ```
>>
>> B. When prepare is succeeded,
>> the output show that committed locally.
>>
>> ```
>> postgres=*# COMMIT;
>> ^CCancel request sent
>> WARNING: canceling wait for resolving foreign transaction due to user
>> request
>> DETAIL: The transaction has already committed locally, but might not
>> have been committed on the foreign server.
>> COMMIT
>> ```
>>
>> In case of A, I think that "committed locally" message can confuse
>> user.
>> Because although messages show committed but the transaction is
>> "ABORTED".
>>
>> I think "committed" message means that "ABORT" is committed locally.
>> But is there a possibility of misunderstanding?
>
> No, you're right. I'll fix it in the next version patch.
>
> I think synchronous replication also has the same problem. It says
> "the transaction has already committed" but it's not true when
> executing ROLLBACK PREPARED.
Thanks for replying and sharing the synchronous replication problem.
> BTW how did you test the case (A)? It says canceling wait for foreign
> transaction resolution but the remote SQL command is PREPARE
> TRANSACTION.
I think the timing of the failure is important when testing 2PC.
Since I don't have a good way to simulate such failures flexibly,
I use the GDB debugger.
The messages in case (A) are output
after performing the following operations.
1. Attach the debugger to a backend process.
2. Set a breakpoint at PreCommit_FdwXact() in CommitTransaction().
   // Before PREPARE.
3. Execute "BEGIN" and insert data into two remote foreign tables.
4. Issue a "COMMIT" command.
5. The backend process stops at the breakpoint.
6. Stop a remote foreign server.
7. Detach the debugger.
   // The backend continues and the prepare fails. TR tries to abort all
   // remote txs.
   // It's unnecessary to resolve remote txs whose prepare failed,
   // isn't it?
8. Send a cancel request.
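
The debugger part of the steps above (2 and 5 to 7) could probably be scripted. The following is only a rough sketch, not something I have run against the patch: it builds a GDB command file and the matching command line. The helper names, the data directory path, and the use of `pg_ctl stop -m fast` to stop the remote server are assumptions for illustration. Steps 3, 4 and 8 would still be done from psql.

```python
import shlex


def build_gdb_script(breakpoint_fn, remote_datadir):
    """Return GDB commands mirroring steps 2 and 5-7 (hypothetical helper)."""
    # Assumption: the remote foreign server is stopped with pg_ctl.
    stop_remote = "pg_ctl stop -D {} -m fast".format(shlex.quote(remote_datadir))
    commands = [
        "break " + breakpoint_fn,  # step 2: breakpoint before PREPARE
        "continue",                # step 5: waits until COMMIT hits the breakpoint
        "shell " + stop_remote,    # step 6: stop a remote foreign server
        "detach",                  # step 7: let the backend continue
        "quit",
    ]
    return "\n".join(commands) + "\n"


def build_gdb_argv(backend_pid, script_path):
    """Return the argv that attaches GDB to the backend and runs the script."""
    return ["gdb", "-batch", "-p", str(backend_pid), "-x", script_path]


if __name__ == "__main__":
    # The path, PID and file name below are placeholders.
    print(build_gdb_script("PreCommit_FdwXact", "/path/to/remote/data"), end="")
    print(" ".join(build_gdb_argv(12345, "case_a.gdb")))
```

The backend PID for step 1 can be taken from `SELECT pg_backend_pid();` in the psql session that will issue the COMMIT.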
BTW, I'm concerned about how to test the 2PC patches.
There are many failure patterns, such as the failure timing,
which server or network fails (and unexpected recovery), and
combinations of those...
Though it would be best to test those failure patterns automatically,
I have no idea how to do that for now, so I check some patterns manually.
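
To give a feel for how quickly the combinations grow, here is a small sketch that enumerates a test matrix. The concrete values in each list are illustrative assumptions on my side, not an agreed list of cases for the patch.

```python
from itertools import product

# Illustrative values only (assumptions, not a definitive list).
TIMINGS = ["before PREPARE", "after PREPARE", "during COMMIT PREPARED"]
TARGETS = ["foreign server", "network"]
RECOVERIES = ["stays down", "recovers unexpectedly"]


def failure_patterns(timings=TIMINGS, targets=TARGETS, recoveries=RECOVERIES):
    """Return every combination of failure timing, failing component and recovery."""
    return [
        {"timing": t, "target": g, "recovery": r}
        for t, g, r in product(timings, targets, recoveries)
    ]


if __name__ == "__main__":
    patterns = failure_patterns()
    for p in patterns:
        print("{timing} / {target} / {recovery}".format(**p))
    print(len(patterns), "patterns")
```

Even this small matrix already has 3 x 2 x 2 = 12 cases, and that is per foreign server.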
> I've incorporated the above your comments in the local branch. I'll
> post the latest version patch after incorporating other comments soon.
OK, thanks.
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION