Retrieve most recent 1 record from joined table

Поиск
Список
Период
Сортировка
От agharta
Тема Retrieve most recent 1 record from joined table
Дата
Msg-id 53F6F9DA.4060005@gmail.com
обсуждение исходный текст
Ответы Re: Retrieve most recent 1 record from joined table  (Brice André <brice@famille-andre.be>)
Re: Retrieve most recent 1 record from joined table  (Vik Fearing <vik.fearing@dalibo.com>)
Re: Retrieve most recent 1 record from joined table  (Thomas Kellerer <spam_eater@gmx.net>)
Список pgsql-sql
Hi all,
This is my first question, don't hate me please if it is the wrong place.

Mi question is relative seimple, but goes me crazy.

First, create some example tables:

----
create table table1(    t1_ID SERIAL primary key,    t1_value text
);

create table table2(    t2_ID SERIAL primary key,    t2_value text,    t1_id integer
);


create table table3(    t3_ID SERIAL primary key,    t3_value text,    t3_date timestamp,    t2_id integer
);

ALTER TABLE table2 ADD FOREIGN KEY (t1_id) REFERENCES table1 (t1_ID) ON 
DELETE CASCADE;
ALTER TABLE table3 ADD FOREIGN KEY (t2_id) REFERENCES table2 (t2_ID) ON 
DELETE CASCADE;

CREATE INDEX IDX_TABLE1_T1_value ON table1 (t1_value);
CREATE INDEX IDX_TABLE2_T2_value ON table2 (t2_value);
CREATE INDEX IDX_TABLE3_T3_value ON table3 (t3_value);
CREATE INDEX IDX_TABLE3_T3_DATE ON table3 (t3_date);
-----

As you can see, table3 is referenced to table2 and table2 is referenced 
to table1

Next, insert some data on tables:

--one millon records into table1
insert into table1(t1_value) select md5(random()::text) from (select 
generate_series(1,1000000) as a) as c;

--100.000 records on table2 where t1_id is a random number between 1 and 
10.001. This guarantees many records pointing to same table1 t1_id
insert into table2(t2_value, t1_id) select md5(random()::text), 
trunc((random()+1)*10000)::integer from (select 
generate_series(1,100000) as a) as c;

--again, 1.022.401 records on table3 where t2_id is a random number 
between 1 and 10.001. This guarantee many records pointing to same 
table2 t2_id.
-- random numbers (1.022.401) are generated by generated_series function 
with timestamp (1 minute) interval
insert into table3(t3_value, t2_id, t3_date) select md5(random()::text), 
trunc((random()+1)*10000)::integer, c.date_val from (select 
generate_series(timestamp '2013-01-01',timestamp '2014-12-12', interval 
'1 minute') as date_val) as c;


So, now we should have table3 with many rows per table2(t2_id) and many 
rows in table2 with same t1_id.


Now, the question:


Joining the tables, how to get ONLY most recent record per table3(t3_date)??

Query example:

select * from table1 as t1
inner join table2 t2 on (t1.t1_id = t2.t1_id and t2.t2_value like('%ab%') )
inner join table3 t3 on (t2.t2_id = t3.t2_id and t3.t3_date <= timestamp 
'2014-08-20')
order by t3.t2_id, t3.t3_date desc

produces to me this dataset (extract)


t1_id       t1_value      t2_id      t2_value           t1_id      
t3_id         t3_value                             t3_date                            t2_id
17098    74127bc80ca759678892c957b6a34fc7    10020 
9182c1f48cd008e31d781abc45723a10    17098    857683 
0fb6cb380522dd83b6ac4beba2c6e98f    2014-08-19 14:42:00.0    10020
17098    74127bc80ca759678892c957b6a34fc7    10020 
9182c1f48cd008e31d781abc45723a10    17098    857088 
9c3481bfc0bdea51e62b338a1777cde6    2014-08-19 04:47:00.0    10020
17098    74127bc80ca759678892c957b6a34fc7    10020 
9182c1f48cd008e31d781abc45723a10    17098    853208 
253dc2083e70dd9e276867f43889973f    2014-08-16 12:07:00.0    10020
17098    74127bc80ca759678892c957b6a34fc7    10020 
9182c1f48cd008e31d781abc45723a10    17098    851237 
970acf901c4232c178b5dbeda4e44ac8    2014-08-15 03:16:00.0    10020
17098    74127bc80ca759678892c957b6a34fc7    10020 
9182c1f48cd008e31d781abc45723a10    17098    847436 
cb8d52255eaa850f9d7f203092a2ce13    2014-08-12 11:55:00.0    10020
17098    74127bc80ca759678892c957b6a34fc7    10020 
9182c1f48cd008e31d781abc45723a10    17098    819339 
d9e36ad622b1db499b3f623cdd46a811    2014-07-23 23:38:00.0    10020
17098    74127bc80ca759678892c957b6a34fc7    10020 
9182c1f48cd008e31d781abc45723a10    17098    818022 
b9d13239f522e3a22f20d36ea6dab8ad    2014-07-23 01:41:00.0    10020
17098    74127bc80ca759678892c957b6a34fc7    10020 
9182c1f48cd008e31d781abc45723a10    17098    803046 
c36164f3077894a1986c4922dfb632ec    2014-07-12 16:05:00.0    10020
17098    74127bc80ca759678892c957b6a34fc7    10020 
9182c1f48cd008e31d781abc45723a10    17098    788129 
32c966feab2212a29f86bebbaa6dfec9    2014-07-02 07:28:00.0    10020


As you can see, there are many t3_id  per single t2_id.

I need the same dataset, but i want only the most recent (one) record 
per table3, and, of course, directly in the join rule (in production i 
have a long and complex query).

My personal solution (very slow, i can't use it):

select * from table1 as t1
inner join table2 t2 on (t1.t1_id = t2.t1_id and t2.t2_value like('%ab%') )
inner join table3 t3 on (
t3.t3_id = (select t3_id from table3 where t2_id = t2.t2_id and t3_date 
<= timestamp '2014-08-20' order by t3_date desc fetch first 1 rows only)
and t3.t2_id = t2.t2_id
)
order by t3.t2_id, t3.t3_date desc


it gives me the right result, but performances are poor........

Same bad performance result with a function that performs the table3 query.

Suggestions?

Thanks to anyone who can answer to me!!!!

Best regards,

Agharta












В списке pgsql-sql по дате отправления:

Предыдущее
От: Adrian Klaver
Дата:
Сообщение: Re: function call
Следующее
От: Brice André
Дата:
Сообщение: Re: Retrieve most recent 1 record from joined table