Re: Resurrected thread: Speed improvement - Group batch Insert - Rewrite the INSERT at the driver level (using a parameter)

From: Dave Cramer
Subject: Re: Resurrected thread: Speed improvement - Group batch Insert - Rewrite the INSERT at the driver level (using a parameter)
Msg-id: CADK3HHK+aKbwOiBqsyhneaviVCiwf8sWvoysYHS8-cW9OcA28A@mail.gmail.com
In reply to: Resurrected thread: Speed improvement - Group batch Insert - Rewrite the INSERT at the driver level (using a parameter)  (Jeremy Whiting <jwhiting@redhat.com>)
Responses: Re: Resurrected thread: Speed improvement - Group batch Insert - Rewrite the INSERT at the driver level (using a parameter)
List: pgsql-jdbc

Hi Jeremy,

As Oliver pointed out in the response to [1], this would require parsing the query, which we avoid.

I also note that this approach isn't much of an improvement for small batches. I am curious: what real-world use case prompted this experiment?

Dave

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 25 March 2015 at 15:34, Jeremy Whiting <jwhiting@redhat.com> wrote:
Hi,
 I see this conversation [1] occurred back in 2009. I'd like to resurrect the thread.

 In response to the questions by J. W. Ulbts, I have some performance results demonstrating the benefit, and also a response to his suggestion to use COPY.

"Where exactly is the performance benefit that you see coming from?"

 To answer this question, a simple Java JDBC project [2] was created to demonstrate the benefit. The project contains several benchmarks, grouped into individual statements (IndividualStatementsTest) and re-written multi-insert statements (MultiInsertStatementTest). Each group has three configurable statement/row sizes (2/5/51), named "SMALL", "MEDIUM" and "LARGE" respectively. The default sizes are not intended to be representative of any particular use case, since everyone has a different opinion of what is appropriate.
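
 To make the difference between the two groups concrete, here is a minimal sketch of the kind of rewrite under discussion: a single-row parameterized INSERT expanded into one multi-row INSERT. It is not taken from the benchmark project or from pgjdbc. The class name, method name, and table are hypothetical, and the pattern match deliberately handles only the simplest statement shape (one VALUES group, no nested parentheses, no trailing clause such as RETURNING). Anything beyond that shape would need real query parsing.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the pgjdbc implementation.
final class MultiValueInsertSketch {

    // Splits the statement into everything up to and including VALUES (group 1)
    // and the single parenthesised value group at the end (group 2).
    // [^()]* rejects nested parentheses, so function calls in VALUES are refused.
    private static final Pattern SINGLE_VALUES_GROUP =
            Pattern.compile("(?is)^(.*\\bVALUES\\s*)(\\([^()]*\\))\\s*$");

    /** Repeats the value group rowCount times, separated by commas. */
    static String rewrite(String sql, int rowCount) {
        if (rowCount < 1) {
            throw new IllegalArgumentException("rowCount must be >= 1");
        }
        Matcher m = SINGLE_VALUES_GROUP.matcher(sql);
        if (!m.matches()) {
            throw new IllegalArgumentException("unsupported statement shape: " + sql);
        }
        StringBuilder out = new StringBuilder(m.group(1));
        for (int i = 0; i < rowCount; i++) {
            if (i > 0) {
                out.append(", ");
            }
            out.append(m.group(2));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Three batched rows become one statement with three value groups.
        System.out.println(rewrite("INSERT INTO orders (id, item) VALUES (?, ?)", 3));
    }
}
```

 With a rewrite of this kind, a batch of N rows would be sent as one statement with N value groups, and the parameters of every row bound in order, instead of N separate single-row statements.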

 The project is easy to set up and run. Details are in the README file.


 The attached normalized graph of ops/sec demonstrates the benefit at different levels of concurrency. The results are for a client machine and a dedicated server system. Details of the db system are: 32 cores @ 2.90GHz, 283GB memory, and a couple of enterprise SSDs for db storage. WAL and tablespace are on separate devices.


"If your use case is just 'I want to do bulk inserts as fast as possible' then perhaps the newly merged COPY support is a better way to go."

 For applications that use an ORM such as Hibernate, COPY isn't supported, nor is it likely to be. Hibernate has no concept of handling files on the database system.

 What are your thoughts on introducing this optimization into the pgjdbc driver?

Regards,
Jeremy

[1] http://www.postgresql.org/message-id/828427796@web.de
[2] https://github.com/whitingjr/batch-rewrite-statements-perf

--
Jeremy Whiting
Senior Software Engineer, JBoss Performance Team
Red Hat