Re: Problem with aborting entire transactions on error

Поиск
Список
Период
Сортировка
От Scott Marlowe
Тема Re: Problem with aborting entire transactions on error
Дата
Msg-id CAOR=d=0_FwZH_1hizXiU64LceBgVzyjd9yU7JYROxT6QJ6609w@mail.gmail.com
обсуждение исходный текст
Ответ на Problem with aborting entire transactions on error  (Zbigniew <zbigniew2011@gmail.com>)
Ответы Re: Problem with aborting entire transactions on error
Список pgsql-general
On Sun, Dec 9, 2012 at 8:20 PM, Zbigniew <zbigniew2011@gmail.com> wrote:
> Hello,
>
> As I read while googling the web, many people complained about this
> before. Couldn't it be made optional (can be even with "default ON")?
> I understand, that there are situations, when it is a must - for
> example, when the rest of queries rely on the result of first ones -
> but there are numerous situations, when just skipping a faulty query
> is all we need.
>
> A simple - but very common - example: I wanted to perform really large
> number of inserts - using transaction, to make it faster - while being
> sure the duplicate entries will be skipped. Of course, this job will
> be done best by server itself, which is keeping an eye on "primary
> key" of the table. Unfortunately: not during a transaction! Any "dupe"
> will trash thousands other (proper) entries immediately.
>
> Why is this? My guess is, there is kind of logic in the code, like this:
>
> if { no error during query } {
>   do it
> } else {
>  withdraw this one
>  rollback entire transaction
> }
>
> Therefore my request - and, as I saw, of many others - would be just
> to introduce a little change:
>
> if { no error during query } {
>   do it
> } else {
>  withdraw this one
>  if { ROLLBACK_ON_ERROR } {
>    rollback entire transaction
>   }
> }
>
> (if there's no ROLLBACK_ON_ERROR - it should carry on with the
> remaining queries)
>
> Is it really so problematic to introduce such code change, allowing
> the users to control this behaviour? Yes, I read about using
> "savepoints" - but I think we agree, it's just cumbersome workaround -
> and not real solution, like my proposal. All we need is either a
> variable to set, or a command, that will allow to modify the present
> functionality in the way described above.

Databases aren't as simple as you imagine.  What you're basically
asking for from the pg engine is for it to enclose every insert into a
subtransaction (i.e. set a savepoint) to check for an error.  The
overhead cost of doing this in pgsql is not cheap.  Now what I would
use for this would be some tool written to interface with pgsql and do
the job for you, like pgloader.  It gives you the simple interface to
do what you're asking.

The idea that everything like this belongs in the database,
particularly coded by the core developers isn't how things generally
get done in pgsql.  We've got a small sharp team of developers working
on BIG things, like covering indexes, HOT updates, query planner
optimizations and so on.  Fixing this problem is core probably doesn't
tickle anyone's itch, and there are so many ways to work around it in
pg that would not work as well in other dbs, like loading to a temp
table and updating in one big query.  That method might run an oracle
db out of rollback space etc, but on pg it's the way it's done.

The thing that often happens is that people learn to do things a
certain way on another db and then think that that experience should
translate to postgresql straight across.  It often does not, because
postgresql is so different in many ways.

Keep in mind with pg transactions you can put almost anything into a
transaction, and the only real limit to the size of a transaction you
can run in it is the size of your hard drives.  This alone lets you
approach problems from a very different perspective than on most other
dbs.

Lastly it's a free (as in beer, as in freedom) database.  People work
on it as much out of love as for money, and if you want to get
traction for a someone else to volunteer their time to make a change,
YOU have to make the case. I'm not a pg hacker.  But I've made a case
several times and gotten someone to make a change a few times for me.
If you think you have a case here you'll have to convince a lot of
people who don't think you do.


В списке pgsql-general по дате отправления:

Предыдущее
От: Jeff Janes
Дата:
Сообщение: Re: Corrupt indexes on slave when using pg_bulkload on master
Следующее
От: François Beausoleil
Дата:
Сообщение: Re: When is archive_cleanup called?