Re: Limitting full join to one match

Поиск

Список

Период

Сортировка

От	Sergei Agalakov
Тема	Re: Limitting full join to one match
Дата	6 декабря 2018 г. 06:01:27
Msg-id	0a683148-5330-bd53-c3dc-7fb953e73874@gmail.com обсуждение
Список	pgsql-hackers

Дерево обсуждения

On 12/5/2018 8:30 PM, John W Higgins wrote:

On Wed, Dec 5, 2018 at 4:34 PM Phil Endecott <spam_from_pgsql_lists@chezphil.org> wrote:
Dear Experts,

I have a couple of tables that I want to reconcile, finding rows
that match and places where rows are missing from one table or the
other:

...

So my question is: how can I modify my query to output only two rows,
like this:?

+------------+--------+------------+--------+
| date | amount | date | amount |
+------------+--------+------------+--------+
| 2018-01-01 | 10.00 | 2018-01-01 | 10.00 |
| 2018-02-01 | 5.00 | | |
| | | 2018-03-01 | 8.00 |
| 2018-04-01 | 5.00 | 2018-04-01 | 5.00 |
| 2018-05-01 | 20.00 | 2018-05-01 | 20.00 | 1
| 2018-05-01 | 20.00 | 2018-05-01 | 20.00 | 2
+------------+--------+------------+--------+

Evening Phil,

Window functions are your friend here. I prefer views for this stuff - but subqueries would work just fine.

create view a_rows as (select *,
row_number() OVER (PARTITION BY date, amount) AS pos from a);
create view b_rows as (select *,
row_number() OVER (PARTITION BY date, amount) AS pos from b);

select
a_rows.date,
a_rows.amount,
a_rows.pos,
b_rows.date,
b_rows.amount,
b_rows.pos
from
a_rows full join b_rows using (date,amount,pos);

Example here - http://sqlfiddle.com/#!17/305d6/3

John

Any suggestions anyone?

The best I have found so far is something involving EXCEPT ALL:

db=> select * from a except all select * from b;
db=> select * from b except all select * from a;

That's not ideal, though, as what I ultimately want is something
that lists everything with its status:

+------------+--------+--------+
| date | amount | status |
+------------+--------+--------+
| 2018-01-01 | 10.00 | OK |
| 2018-02-01 | 5.00 | a_only |
| 2018-03-01 | 8.00 | b_only |
| 2018-04-01 | 5.00 | OK |
| 2018-05-01 | 20.00 | OK |
| 2018-05-01 | 20.00 | OK |
+------------+--------+--------+

That would be easy enough to achieve from the JOIN.

Thanks, Phil.

This question is always asked time to time.
I have found an old article with so far the best solution for big tables.
https://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:2151582681236#15393095283923

On the same test data
create table a (date date, amount money);create table b (date date, amount money);insert into a values ('2018-01-01', 10);insert into a values ('2018-02-01', 5);insert into a values ('2018-04-01', 5);insert into a values ('2018-05-01', 20);insert into a values ('2018-05-01', 20);insert into b values ('2018-01-01', 10);insert into b values ('2018-03-01', 8);insert into b values ('2018-04-01', 5);insert into b values ('2018-05-01', 20);insert into b values ('2018-05-01', 20);select tt.date, tt.amount, count(tt.src1) CNT1, count(tt.src2) CNT2 from ( select a.date, a.amount, 1 src1, null::integer src2 from a union all select b.date, b.amount, null::integer src1, 2 src2 from b ) ttgroup by tt.date, tt.amount;date amount cnt1 cnt22018-01-01 $10.00 1 12018-02-01 $5.00 1 02018-03-01 $8.00 0 12018-04-01 $5.00 1 12018-05-01 $20.00 2 2
It requires a sort, so you may want to increase work_mem before execution, and then return it back like
SET work_mem = '512MB';
... run your query
RESET work_mem;

Regards,

Sergei Agalakov

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Limitting full join to one match