Insert values() per-statement overhead

Поиск
Список
Период
Сортировка
От Vladimir Sitnikov
Тема Insert values() per-statement overhead
Дата
Msg-id CAB=Je-EDZwVH7FgpDOa0ad8a7a6D=1QoYmvW2kQqSAML2gf+Uw@mail.gmail.com
обсуждение исходный текст
Ответы Re: Insert values() per-statement overhead  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
Hi,

There is a finding that insert(x) values(y);insert(x) values(z);  is
2-4 times slower than insert(..) values(y),(z);
see [1], [2].

In other words, there is a significant per-statement overhead even
though server-prepared statements are properly used.
The issue is reproducible in 9.5rc1.

Is it something that was discussed previously? (I was not able to find
that in archives)
Is it something that can be improved? (e.g. "insert node state"
caching across execute calls, improve performance of "INSERT %u %u"
generation, etc)

Even though I understand there will always be _some_ per-statement
overhead, such a hight overhead plays against common case of using
ORMs.
End-users are just stuck with insert(...) values(...);

1) Java's standard way of batching statements is
"PreparedStatement#addBatch()". Unfortunately, backend protocol does
not support statement batching.
One does not simply teach Hibernate/EclipseLink,etc etc to use
PostgreSQL's-specific COPY.
Note: I am not talking about network roundtrips here. I'm just
highlighting that there is no way to execute "bind bind bind
executebatch" sequence at the protocol level.

2) One might consider "transparent rewrite of insert()  batches into a
single insert() values(),(),() statement" at JDBC driver level, but it
is hard to get right as there is no easy way to parse a query. It is
really expected that every PostgreSQL connector would implement SQL
parser & insert rewriter?

3) Transparent rewrites (including "rewrite inserts to COPY") would
fail to provide "number of modified rows" for each row. Error
semantics is different as well.

4) COPY does not support UPSERT, does it?

My profiler (Instruments in Mac OS) shows that significant time is
spent in standard_ExecutorStart: see [3]
In fact, the time spent in standard_ExecutorStart even exceeds the
time spent in standard_ExecutorRun.

[1]: http://www.postgresql.org/message-id/55130DC8.2070508@redhat.com
[2]: https://github.com/pgjdbc/pgjdbc/pull/491#issuecomment-171780054
[3]: https://github.com/pgjdbc/pgjdbc/pull/491#issuecomment-171908974

Vladimir Sitnikov



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Simon Riggs
Дата:
Сообщение: Re: Stream consistent snapshot via a logical decoding plugin as a series of INSERTs
Следующее
От: Andres Freund
Дата:
Сообщение: Re: Insert values() per-statement overhead