Re: Transactions over pathological TCP connections

Поиск
Список
Период
Сортировка
От Leon Smith
Тема Re: Transactions over pathological TCP connections
Дата
Msg-id CAPwAf1kGXm6YVKgAumdPR+Hu98Lpbzf8fWxwf4xwS6bmk9DRHg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Transactions over pathological TCP connections  (Leon Smith <leon.p.smith@gmail.com>)
Список pgsql-hackers
I just realized this is essentially an instance of the Two General's Problem;  which is something I feel should have been more obvious to me.

On Tue, Jun 19, 2012 at 5:50 PM, Leon Smith <leon.p.smith@gmail.com> wrote:
On Tue, Jun 19, 2012 at 11:59 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Jun 19, 2012 at 1:56 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The transaction would be committed before a command success report is
> delivered to the client, so I don't think delivered-and-not-marked is
> possible.

...unless you have configured synchronous_commit=off, or fsync=off.

Or unless your disk melts into a heap of slag and you have to restore
from backup.  You can protect against that last case using synchronous
replication.


But hard disk failure isn't in the failure model I was concerned about.   =)    To be perfectly honest,  I'm not too concerned with either hard drive failure or network failure,   as we are deploying on Raid 1+0 database server talking to the client over a redundant LAN,  and using asynchronous (Slony) replication to an identical database server just in case.   No single point of failure is a key philosophy of this app from top to bottom.     Like I said,  this is mostly idle curiosity.

But I'm also accustomed to trying to get work done on shockingly unreliable internet connections.   As a result, network failure is something I think about quite a lot when writing networked applications.   So this is not entirely idle curiosity either.

And thinking about this a bit more,   it's clear that the database has to commit before the result is sent,  on the off chance that the transaction fails and needs to be retried.   And that an explicit transaction block isn't really a solution either,  because   a "BEGIN;  SELECT dequeue_row()"  would get the row to the client without marking it as taken,   but the pathological TCP disconnect could then attack the  following "COMMIT;",  leaving the client to think that the row has not been actually taken when it in fact has.  

It's not clear to me that this is even a solvable problem without modifying the schema to include both a "taken" and a "finished processing" state,  and then letting elements be re-delievered after a period of time.   But this would then allow a pathological demon with the power to cause TCP connects have a single element delivered and processed multiple times.

In any case, thanks for the responses...

Best,
Leon

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Simon Riggs
Дата:
Сообщение: Re: [PATCH 10/16] Introduce the concept that wal has a 'origin' node
Следующее
От: Simon Riggs
Дата:
Сообщение: Re: [PATCH 10/16] Introduce the concept that wal has a 'origin' node