Re: INSERTing lots of data

From Szymon Guz
Subject Re: INSERTing lots of data
Date
Msg-id AANLkTimvkkS7CYZtuIbIuTTiaHDJMLWkpPfji41Jh9-b@mail.gmail.com
In reply to INSERTing lots of data  (Joachim Worringen <joachim.worringen@iathh.de>)
Responses Re: INSERTing lots of data  (Joachim Worringen <joachim.worringen@iathh.de>)
Re: INSERTing lots of data  (Martin Gainty <mgainty@hotmail.com>)
List pgsql-general
2010/5/28 Joachim Worringen <joachim.worringen@iathh.de>
Greetings,

my Python application (http://perfbase.tigris.org) repeatedly needs to insert lots of data into an existing, non-empty, potentially large table. Currently, the bottleneck is in the Python application, so I intend to multi-thread it. Each thread should work on a part of the input file.

I have already multi-threaded the query part of the application, which requires using one connection per thread, since cursors are serialized via a single connection.

Provided that
- each thread uses its own connection
- each thread performs all of its INSERTs within a single transaction
- the machine has enough resources

will I get a speedup? Or will table-locking serialize things on the server side?

Suggestions for alternatives are welcome, but the data must go through the Python application via INSERTs (no bulk insert, COPY, etc. possible).


Remember that some Python implementations have a GIL (Global Interpreter Lock), so those threads could end up serialized at the Python level.

It is possible that those inserts will be faster. The speed depends on the table structure, on any constraints and triggers, and even on the database configuration. The best answer is: just check it with some test code. Write a simple multithreaded application, run the inserts, and measure.
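For example, a test could look something like the sketch below. It is only a minimal sketch, assuming psycopg2 as the driver and a made-up table results(run_id integer, value float8); the DSN and the INSERT statement are placeholders to adapt to your real schema:

import threading
import psycopg2

DSN = "dbname=perfbase"        # assumption: adjust to your environment
NUM_THREADS = 4
ROWS_PER_THREAD = 100000

def insert_worker(thread_id):
    # One connection per thread; cursors sharing a connection would be serialized.
    conn = psycopg2.connect(DSN)
    try:
        cur = conn.cursor()
        for i in range(ROWS_PER_THREAD):
            cur.execute("INSERT INTO results (run_id, value) VALUES (%s, %s)",
                        (thread_id, float(i)))
        conn.commit()          # all of this thread's INSERTs in one transaction
    finally:
        conn.close()

threads = [threading.Thread(target=insert_worker, args=(t,))
           for t in range(NUM_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

Comparing the wall-clock time against a single-threaded run of the same total row count will show where the bottleneck is. Note that psycopg2 releases the GIL during query execution, so the threads can overlap at least on the network round-trips and the server-side work.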


regards
Szymon Guz
