Re: Check for duplicates before inserting new rows

Поиск
Список
Период
Сортировка
От Chris Sterritt
Тема Re: Check for duplicates before inserting new rows
Дата
Msg-id eaeafafb-b01c-947e-07ab-7d374f1ac19c@yobota.xyz
обсуждение исходный текст
Ответ на Re: Check for duplicates before inserting new rows  (Rich Shepard <rshepard@appl-ecosys.com>)
Список pgsql-general


On 04/09/2020 15:46, Rich Shepard wrote:
On Fri, 4 Sep 2020, Olivier Gautherot wrote:

First of all, what version of PostgreSQL are you using?

Olivier,

12.2.

One way would be to add a UNIQUE constraint and perform for each row of the
source table a INSERT ... ON CONFLICT DO NOTHING (see
https://www.postgresql.org/docs/current/sql-insert.html#SQL-ON-CONFLICT)

If it is a 2-way merge, I would encapsulate the process in a function
(which will create a transaction to protect your process) and add a column
to trace the rows that have been merged. For this purpose, you can use the
ON CONFLICT DO UPDATE ...

Lots for me to ponder.

The new data has attributes stored in two database tables: location and
measurements. I'll need to think about how both could be handled in a single
function. For example, the new data contains a row with a location not
already in the existing location table. That location needs to be added to
the location table and its associated measurement attributes then added to
the existing measurements table.

Thanks for the suggestion,

Rich



Assuming some simple table structures (I've not included PK or FK definitions for simplicity):

create table station (station_id integer, station_data text);

create table measurement (measurement_id bigserial, station_id integer, ameasurement text);

-- Add some test stations
insert into station(station_id, station_data)
values (1, 'station1'), (2, 'station2'), (3, 'station3') ;


*Query to identify new measurements:
select station_id, ameasurement
from (VALUES (1,'1meas1'), -- this represents your new test data set             (1,'1meas2'),             (2,'2meas1'),             (3,'3meas1')) as m(station_id, ameasurement)
except
select station_id, ameasurement
from measurement;

The query above will give you a list of the new values which are not currently stored in table measurement.
Run it and we get all 4 rows returned.

Add a measurement row:
insert into measurement(station_id, ameasurement)
values (2,'2meas1');

Now if you repeat the check for new measurements with the same query as at *, you only get 3 rows.

Cheers, Chris Sterritt

В списке pgsql-general по дате отправления:

Предыдущее
От: Sang Gyu Kim
Дата:
Сообщение: Can I get some advice regarding logical replication server?
Следующее
От: Chris Sterritt
Дата:
Сообщение: Re: Check for duplicates before inserting new rows