Re: Best strategy for bulk inserts where some violate unique constraint?
From | Daniele Varrazzo |
---|---|
Subject | Re: Best strategy for bulk inserts where some violate unique constraint? |
Date | |
Msg-id | CA+mi_8bqAV_zK3kZr=yYpCwjkgJ8ie5ABXZtbwQv07G7L02paw@mail.gmail.com |
In reply to | Best strategy for bulk inserts where some violate unique constraint? (Denis Papathanasiou <denis.papathanasiou@gmail.com>) |
Responses | Re: Best strategy for bulk inserts where some violate unique constraint? |
 | Re: Best strategy for bulk inserts where some violate unique constraint? |
List | psycopg |
On Tue, Nov 5, 2013 at 8:14 PM, Denis Papathanasiou <denis.papathanasiou@gmail.com> wrote:

> I'm using psycopg2 to insert multiple rows at once, using a dictionary as
> shown at the bottom of this page:
> https://wiki.postgresql.org/wiki/Psycopg2_Tutorial
>
> I notice that even if just one entry violates a table unique constraint (an
> IntegrityError in psycopg2 parlance), the entire contents of the dictionary
> do not get inserted into the table.
>
> Is there any way to tell postgres: it's ok to reject individual entries that
> may violate table unique constraints, but at the same time allow those which
> do not?
>
> It's too expensive for me to do the inserts one at a time where I could just
> catch the psycopg2.IntegrityError.

By the last example, do you mean the executemany() example? That is not going
to be much faster than repeated execute().

The easiest thing you can do is to switch to autocommit=True and do repeated
execute() with an insert. If one fails you can just ignore the IntegrityError
and go on.

About as easy: you can prepare a statement and execute it repeatedly using
PREPARE/EXECUTE, see <http://www.postgresql.org/docs/9.2/static/sql-prepare.html>.
There is no built-in support for that in psycopg, but you can just execute()
these statements; it may save you something. You can also take a look at this
example of a PREPAREing cursor: <https://gist.github.com/dvarrazzo/3797445>.

However, the fastest way to insert data into Postgres is COPY, see
<http://initd.org/psycopg/docs/cursor.html#cursor.copy_from>. You will have to
present your data as a file. I can't remember what happens when a record fails
the integrity test: I think the others would still be inserted, but you will
have to check. A much more robust strategy is to create a temporary table with
the right schema but without constraints, load the data there using COPY, and
then move the data to the final table using INSERT INTO ... SELECT * FROM
temp_table WHERE ..., specifying a condition that excludes the records that
would fail the constraint.

I would go for COPY: it's by far faster than execute[many], even including
prepare.

-- Daniele
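
A minimal sketch of the autocommit approach described above. The table
`items (id text primary key, body text)`, the DSN, and the sample rows are
hypothetical placeholders, not anything from the original thread:

```python
import psycopg2

# sample data; the last tuple repeats an id on purpose
rows = [("a", "first"), ("b", "second"), ("a", "duplicate key")]

conn = psycopg2.connect("dbname=test")   # adjust the DSN to your setup
conn.autocommit = True                   # each execute() commits on its own
cur = conn.cursor()

for row in rows:
    try:
        cur.execute("INSERT INTO items (id, body) VALUES (%s, %s)", row)
    except psycopg2.IntegrityError:
        # this row violated the unique constraint: skip it and go on
        pass
```

With autocommit enabled, each INSERT runs in its own transaction, so a failed
row does not abort the ones that follow.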
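
The PREPARE/EXECUTE variant could look roughly like this, using the same
hypothetical `items` table; the statement name `bulk_ins` is arbitrary:

```python
import psycopg2

rows = [("a", "first"), ("b", "second"), ("a", "duplicate key")]

conn = psycopg2.connect("dbname=test")
conn.autocommit = True
cur = conn.cursor()

# prepare the statement once per session...
cur.execute("PREPARE bulk_ins (text, text) AS "
            "INSERT INTO items (id, body) VALUES ($1, $2)")

# ...then execute it repeatedly, still skipping duplicates
for row in rows:
    try:
        cur.execute("EXECUTE bulk_ins (%s, %s)", row)
    except psycopg2.IntegrityError:
        pass

cur.execute("DEALLOCATE bulk_ins")
```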
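
And a sketch of the COPY-into-a-staging-table strategy, again against the
hypothetical `items` table; cursor.copy_from() expects a file-like object, so
the rows are serialized into a StringIO buffer here:

```python
import io
import psycopg2

rows = [("a", "first"), ("b", "second"), ("a", "duplicate key")]

conn = psycopg2.connect("dbname=test")
cur = conn.cursor()

# a temporary staging table with the same columns but no constraints
cur.execute("CREATE TEMP TABLE items_stage (LIKE items INCLUDING DEFAULTS)")

# present the data as a tab-separated file-like object and COPY it in
buf = io.StringIO("".join("%s\t%s\n" % row for row in rows))
cur.copy_from(buf, "items_stage", columns=("id", "body"))

# move only the rows that won't violate the unique constraint,
# deduplicating the staged data itself as well
cur.execute("""
    INSERT INTO items (id, body)
    SELECT DISTINCT ON (id) id, body
    FROM items_stage
    WHERE id NOT IN (SELECT id FROM items)
""")
conn.commit()
```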