Re: Populating large tables with occasional bad values

From: John T. Dow
Subject: Re: Populating large tables with occasional bad values
Date:
Msg-id: 200806131518.m5DFIrSD018924@web2.nidhog.com
In response to: Populating large tables with occasional bad values  ("John T. Dow" <john@johntdow.com>)
Responses: Re: Populating large tables with occasional bad values
List: pgsql-jdbc
I have tried using a batch, and it looked good at first, but I don't think it will work with Postgres as cleanly as
you'd like.

First, set autocommit to false.

Second, call addBatch once for each insert statement.

Third, executeBatch.

Fourth, a BatchUpdateException is thrown. The exception reports which insert statement had a problem, and
getNextException() gives more details (duplicate key). Only one problem is reported. getUpdateCounts() on the
exception reports what we already know: that the statements up to that point all inserted without problem.

Fifth - now what? I tried to commit at this point, hoping that the first insert statements would be committed, but
they are not. Actually, I was hoping that the entire batch would be processed and then I could see which individual
statements didn't insert (using the update counts) and then redo them individually, but I think that is wishful
thinking.
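
To make that concrete, here is roughly the code I'm describing - just a
minimal sketch; the "items" table, its columns, and the connection details
are placeholders, not anything from a real schema:

import java.sql.*;

public class BatchInsert {
    public static void main(String[] args) throws SQLException {
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/test", "user", "password");
        conn.setAutoCommit(false);                      // first: autocommit off

        PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO items (id, name) VALUES (?, ?)");
        for (int i = 0; i < 100; i++) {                 // second: addBatch per insert
            ps.setInt(1, i);
            ps.setString(2, "row " + i);
            ps.addBatch();
        }

        try {
            ps.executeBatch();                          // third: executeBatch
            conn.commit();
        } catch (BatchUpdateException e) {              // fourth: one problem reported
            // The update counts cover only the statements before the failing one.
            int[] counts = e.getUpdateCounts();
            System.err.println(counts.length + " inserts succeeded before the error");
            // getNextException() carries the details (e.g. the duplicate key).
            SQLException detail = e.getNextException();
            if (detail != null) {
                System.err.println(detail.getMessage());
            }
            // Fifth: committing here does NOT keep the earlier inserts;
            // the transaction is already aborted, so roll back.
            conn.rollback();
        }
    }
}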

Best possible solution with batches? Submit the batch, learn how many statements were accepted before the first
error, and resubmit just those. Process the problem statement by itself. Repeat with the statements left in the
original batch until none are left, then create a new batch and begin anew. If a batch, or a portion thereof, raises
no exception, then of course commit. (This is sketched in code after the next paragraph.)

For this, since portions of batches will be sent twice, the batch shouldn't be too large. Even 10 statements in a
batch would substantially reduce the overhead; there's certainly no need to do thousands.
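
Something along these lines - again just a sketch. The row layout matches the
placeholder table above, and the problem row is simply logged and skipped;
handling it individually would go where the message is printed:

import java.sql.*;
import java.util.List;

public class RetryingBatchInsert {
    static final int BATCH_SIZE = 10;  // small, since portions may be sent twice

    static void insertAll(Connection conn, PreparedStatement ps,
                          List<Object[]> rows) throws SQLException {
        int from = 0;
        while (from < rows.size()) {
            int to = Math.min(from + BATCH_SIZE, rows.size());
            try {
                runBatch(ps, rows, from, to);
                conn.commit();                          // whole portion went in cleanly
                from = to;
            } catch (BatchUpdateException e) {
                conn.rollback();                        // transaction is aborted by now
                ps.clearBatch();                        // in case anything is left queued
                int good = e.getUpdateCounts().length;  // accepted before the first error
                if (good > 0) {
                    runBatch(ps, rows, from, from + good);  // resubmit just those
                    conn.commit();
                }
                SQLException detail = e.getNextException();
                System.err.println("Skipping row " + (from + good) + ": "
                        + (detail != null ? detail.getMessage() : e.getMessage()));
                from += good + 1;                       // step past the problem row
            }
        }
    }

    static void runBatch(PreparedStatement ps, List<Object[]> rows,
                         int from, int to) throws SQLException {
        for (int i = from; i < to; i++) {
            ps.setInt(1, (Integer) rows.get(i)[0]);
            ps.setString(2, (String) rows.get(i)[1]);
            ps.addBatch();
        }
        ps.executeBatch();
    }
}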

In the posting I quote below, the implication is that you can learn about multiple failed statements in the batch,
but I didn't see that happening. It seems that Postgres just quits at the first problem.

John




On Thu, 12 Jun 2008 14:35:30 +0300, tivvpgsqljdbc@gtech-ua.com wrote:

>John T. Dow wrote:
>>> How about batches? Shouldn't they help you in this case?
>>>
>>
>>
>> What happens if the 23rd and 59th rows in a batch of 100 have a problem, such as an invalid numeric value or a
duplicate key?
>>
>> Can the other 98 rows be committed?
>>
>> I haven't tried this.
>>
>
>In postgresql, no (maybe yes with autocommit turned on - did not try),
>because postgresql marks the transaction as rollback-only on the first
>problem. The one thing you can do is to detect which records in the
>batch are wrong (say, #2 and #55) and redo the batch without the
>failing records (with/without appending new records to the batch). This
>would make the server load higher, but would give you only two
>roundtrips instead of 100 roundtrips. Actually, if every batch has
>failing records, your server will have twice as many transactions (this
>is the maximum).
>
>


