Re: [psycopg] speed concerns with executemany()
From        | Adrian Klaver
Subject     | Re: [psycopg] speed concerns with executemany()
Date        |
Msg-id      | c5457fe5-cefb-7c7b-9a33-6a7ff0b3658f@aklaver.com
In reply to | Re: [psycopg] speed concerns with executemany() (Daniele Varrazzo <daniele.varrazzo@gmail.com>)
Responses   | Re: [psycopg] speed concerns with executemany()
List        | psycopg
On 12/30/2016 02:24 PM, Daniele Varrazzo wrote:
> The implementation of executemany as described by me a few days ago is
> available in this gist, not heavily tested:
>
> https://gist.github.com/dvarrazzo/4204cca5d1cd7c9c95bc814d81e6b83e
>
> I would like to know if anyone sees any shortcoming in this new
> implementation.

A quick test. I added an argument to change the page_size on the
command line.

With NRECS=10000:

aklaver@tito:~> python psycopg_executemany.py -p 10
classic: 0.800544023514 sec
joined: 0.514330863953 sec
aklaver@tito:~> python psycopg_executemany.py -p 100
classic: 0.780461072922 sec
joined: 0.473304986954 sec
aklaver@tito:~> python psycopg_executemany.py -p 1000
classic: 0.820818901062 sec
joined: 0.488647937775 sec

With NRECS=100000:

aklaver@tito:~> python psycopg_executemany.py -p 10
classic: 7.78319811821 sec
joined: 4.18683385849 sec
aklaver@tito:~> python psycopg_executemany.py -p 100
classic: 7.75992202759 sec
joined: 4.06096816063 sec
aklaver@tito:~> python psycopg_executemany.py -p 1000
classic: 7.76269102097 sec
joined: 4.12301802635 sec

The relative difference between the classic and joined versions seems to
hold; you just do not seem to get much benefit from changing the
page_size. Not sure how much that matters, and you do get a clear benefit
from the joined solution.

>
> -- Daniele
>
> On Sun, Dec 25, 2016 at 10:11 AM, Dorian Hoxha <dorian.hoxha@gmail.com> wrote:
>> Sending stuff in big batches + autocommit (fast transactions) + few
>> network calls is performance 101, I thought. I think what "executemany"
>> does should be documented (it looked suspicious when I saw it a long
>> time ago, which is why I didn't use it).
>>
>> On Sat, Dec 24, 2016 at 6:00 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>>>
>>> On 12/23/2016 06:57 PM, Christophe Pettus wrote:
>>>>
>>>>> On Dec 23, 2016, at 18:55, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>>>>> Alright, that I get.
>>>>> Still, the practical outcome is that each INSERT is being
>>>>> done in a transaction (an implicit one), so the transaction
>>>>> overhead comes into play. Or am I missing something?
>>>>
>>>> Nope, not missing a thing. The theory (and it is only that) is that
>>>> when they do the .executemany(), each of those INSERTs pays the
>>>> transaction overhead, while if they do one big INSERT, just that one
>>>> statement does.
>>>
>>> Just ran a quick and dirty test using IPython %timeit.
>>>
>>> With a list of 200 tuples, each of which had 3 integers, INSERTing into:
>>>
>>> test=> \d psycopg_table
>>>     Table "public.psycopg_table"
>>>  Column |  Type   | Modifiers
>>> --------+---------+-----------
>>>  a      | integer |
>>>  b      | integer |
>>>  c      | integer |
>>>
>>> The results were:
>>>
>>> sql = "INSERT INTO psycopg_table VALUES(%s, %s, %s)"
>>>
>>> Without autocommit:
>>>
>>> In [65]: timeit -n 10 cur.executemany(sql, l)
>>> 10 loops, best of 3: 12.5 ms per loop
>>>
>>> With autocommit:
>>>
>>> In [72]: timeit -n 10 cur.executemany(sql, l)
>>> 10 loops, best of 3: 1.71 s per loop
>>>
>>>> --
>>>> -- Christophe Pettus
>>>> xof@thebuild.com
>>>
>>> --
>>> Adrian Klaver
>>> adrian.klaver@aklaver.com
>>>
>>> --
>>> Sent via psycopg mailing list (psycopg@postgresql.org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/psycopg

--
Adrian Klaver
adrian.klaver@aklaver.com
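[Editor's note: for readers who have not opened Daniele's gist, the "joined" strategy under discussion can be sketched roughly as follows. This is my own minimal sketch, not the gist's actual code: it shows only the page/join logic, and uses plain string formatting as a stand-in for psycopg2's server-side quoting (cur.mogrify), so it can be read without a live connection. The helper names `paginate` and `joined_statements` are mine.]

```python
# Sketch of the "joined" executemany strategy: instead of one round trip
# per parameter tuple (the "classic" behaviour), format page_size tuples
# at a time and join them into a single multi-row INSERT per round trip.

def paginate(seq, page_size):
    """Yield successive lists of at most page_size items from seq."""
    page = []
    for item in seq:
        page.append(item)
        if len(page) == page_size:
            yield page
            page = []
    if page:
        yield page

def joined_statements(template, argslist, page_size=100):
    """Build one multi-row INSERT statement per page of parameters.

    In real psycopg2 code each values group would be produced with
    cur.mogrify(template, args) to get proper quoting; plain %-formatting
    is used here only to keep the sketch self-contained.
    """
    for page in paginate(argslist, page_size):
        values = ",".join(template % args for args in page)
        yield "INSERT INTO psycopg_table VALUES %s" % values

rows = [(i, i * 2, i * 3) for i in range(5)]
stmts = list(joined_statements("(%s, %s, %s)", rows, page_size=2))
for s in stmts:
    print(s)
# Five rows with page_size=2 give three statements: 2 + 2 + 1 rows.
```

Because a page of N rows costs one round trip instead of N, most of the win arrives as soon as N is modestly large, which is consistent with the timings above showing little further gain from raising page_size past 10.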
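[Editor's note: the autocommit numbers above (12.5 ms vs 1.71 s) come from a live PostgreSQL server. The toy comparison below illustrates the same transaction-overhead point in a form anyone can run: it deliberately uses sqlite3 from the standard library instead of psycopg2, so no server is needed. The numbers it prints are not comparable to the thread's timings; only the shape of the effect is.]

```python
import sqlite3
import time

# Committing after every INSERT pays per-statement transaction overhead;
# one commit around the whole batch pays it once.

rows = [(i, i * 2, i * 3) for i in range(2000)]

def insert_commit_each(conn):
    """Mimics autocommit: one transaction per INSERT."""
    cur = conn.cursor()
    for r in rows:
        cur.execute("INSERT INTO t VALUES (?, ?, ?)", r)
        conn.commit()

def insert_one_txn(conn):
    """One transaction for the whole batch."""
    cur = conn.cursor()
    cur.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    conn.commit()

for fn in (insert_commit_each, insert_one_txn):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER)")
    t0 = time.perf_counter()
    fn(conn)
    print("%s: %.4f sec" % (fn.__name__, time.perf_counter() - t0))
    conn.close()
```

With an on-disk database the gap widens dramatically, because each commit then forces a sync, which is the analogue of the per-INSERT transaction overhead Christophe describes.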