Обсуждение: Merging lines with NULLs (with example data)
Good afternoon,
I have a simple problem, and I feel stupid not finding myself what's the
solution... I try to explain shortly, but as I'm not really confident with
my explanation, I provided a complete example with data below.
How can I "merge" this
gday,count_udp,count_tcp
'2005-10-20','','2'
'2005-10-20','3',''
'2005-10-21','','1'
'2005-10-21','5',''
into that:
gday,count_udp,count_tcp
'2005-10-20','3','2'
'2005-10-21','5','1'
in a single query???
Thanks in advance,
MaXX
Here's all the details:
I have a table
CREATE TABLE test
( id serial NOT NULL, tstamp timestamptz, host varchar(80), rulenr int4, act varchar(10), proto varchar(4), src_ip
inet,src_port int4, dst_ip inet, dst_port int4, dir varchar(3), if varchar(5), reported bool, protected bool,
CONSTRAINTpk_ipfw_id PRIMARY KEY (id)
)
WITH OIDS;
Data:
INSERT INTO test VALUES (453639,'2005-10-21
09:39:19+02','akar',600,'Deny','UDP','10.182.144.167',52616,'10.0.0.1',2290,'in','rl0',TRUE,NULL);
INSERT INTO test VALUES (453634,'2005-10-21
09:36:21+02','akar',600,'Deny','UDP','10.182.144.167',52616,'10.0.0.1',2290,'in','rl0',TRUE,NULL);
INSERT INTO test VALUES (453633,'2005-10-21
09:36:20+02','akar',600,'Deny','UDP','10.182.144.167',52616,'10.0.0.1',2290,'in','rl0',TRUE,NULL);
INSERT INTO test VALUES (452610,'2005-10-21
03:33:28+02','akar',600,'Deny','TCP','10.182.174.7',4310,'10.0.0.1',2290,'in','rl0',TRUE,NULL);
INSERT INTO test VALUES (451735,'2005-10-21
00:11:52+02','akar',600,'Deny','UDP','10.216.48.231',6778,'10.0.0.1',2290,'in','rl0',TRUE,NULL);
INSERT INTO test VALUES (451734,'2005-10-21
00:11:50+02','akar',600,'Deny','UDP','10.216.48.231',6778,'10.0.0.1',2290,'in','rl0',TRUE,NULL);
INSERT INTO test VALUES (448277,'2005-10-20
16:31:17+02','akar',600,'Deny','UDP','10.244.165.152',31262,'10.0.0.1',2290,'in','rl0',TRUE,NULL);
INSERT INTO test VALUES (448276,'2005-10-20
16:31:15+02','akar',600,'Deny','UDP','10.244.165.152',31262,'10.0.0.1',2290,'in','rl0',TRUE,NULL);
INSERT INTO test VALUES (448266,'2005-10-20
16:29:08+02','akar',600,'Deny','TCP','10.224.32.85',63891,'10.0.0.1',2290,'in','rl0',TRUE,NULL);
INSERT INTO test VALUES (448265,'2005-10-20
16:29:05+02','akar',600,'Deny','TCP','10.224.32.85',63891,'10.0.0.1',2290,'in','rl0',TRUE,NULL);
INSERT INTO test VALUES (448258,'2005-10-20
16:28:16+02','akar',600,'Deny','UDP','10.244.165.152',31262,'10.0.0.1',2290,'in','rl0',TRUE,NULL);
when I execute a query like this,
select to_date(tstamp,'YYYY-MM-DD')as gday, case when proto='UDP' then count(id)
elseNULL end as count_udp, case when proto='TCP' then count(id) else NULL
end as count_tcp
from test where tstamp >= (now() - interval '$days days' ) and dst_port = $port group by gday, proto
order by gday;
I get:
gday,count_udp,count_tcp
'2005-10-20','','2'
'2005-10-20','3',''
'2005-10-21','','1'
'2005-10-21','5',''
This is not what I want, I want that:
gday,count_udp,count_tcp
'2005-10-20','3','2'
'2005-10-21','5','1'
--
MaXX
In article <djdp5l$1l4f$1@talisker.lacave.net>, MaXX <bs139412@skynet.be> writes: > How can I "merge" this > gday,count_udp,count_tcp > '2005-10-20','','2' > '2005-10-20','3','' > '2005-10-21','','1' > '2005-10-21','5','' > into that: > gday,count_udp,count_tcp > '2005-10-20','3','2' > '2005-10-21','5','1' > in a single query??? Try something like that: SELECT to_date (tstamp,'YYYY-MM-DD') AS gday, sum(CASE WHEN proto = 'UDP' THEN 1 ELSE 0 END) AS count_udp, sum(CASE WHEN proto = 'TCP' THEN 1 ELSE 0 END) AS count_tcp FROM test WHERE tstamp >= now() - INTERVAL '$days DAYS' ANDdst_port = $port GROUP BY gday ORDER BY gday
Harald Fuchs wrote: > In article <djdp5l$1l4f$1@talisker.lacave.net>, > MaXX <bs139412@skynet.be> writes: > > >>How can I "merge" this >>gday,count_udp,count_tcp >>'2005-10-20','','2' >>'2005-10-20','3','' >>'2005-10-21','','1' >>'2005-10-21','5','' > > >>into that: >>gday,count_udp,count_tcp >>'2005-10-20','3','2' >>'2005-10-21','5','1' > > >>in a single query??? > > > Try something like that: > > SELECT to_date (tstamp,'YYYY-MM-DD') AS gday, > sum(CASE WHEN proto = 'UDP' THEN 1 ELSE 0 END) AS count_udp, > sum(CASE WHEN proto = 'TCP' THEN 1 ELSE 0 END) AS count_tcp > FROM test > WHERE tstamp >= now() - INTERVAL '$days DAYS' > AND dst_port = $port > GROUP BY gday > ORDER BY gday > Or, via a subquery: select distinct to_date(tstamp,'YYYY-MM-DD') as gday, ( select count(id) from test t1 where proto='UDP' and to_date(t1.tstamp,'YYYY-MM-DD') = to_date(test.tstamp,'YYYY-MM-DD') ) as count_udp, ( select count(id) from test t1 where proto='TCP' and to_date(t1.tstamp,'YYYY-MM-DD') = to_date(test.tstamp,'YYYY-MM-DD') ) as count_tcp from test where tstamp >= (now() - interval '6 days' ) and dst_port = 2290 order by gday; Harald's solution is better for your particular case and will almost certainly be faster, but subqueries are good to know how to do. :) > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org -- Daryl Richter Platform Author & Director of Technology (( Brandywine Asset Management ) ( "Expanding the Science of Global Investing" ) ( http://www.brandywine.com ))
Thank you, and sorry for the late answer, I was far away from a decent internet connection... I'll try both your solutions, EXPLAIN ANALYSE will elect the winner... In any case that will be cleaner than my dirty hack (2 distinct queries) which generate a lot of garbage... Thanks again, MaXX Daryl Richter wrote: > Harald Fuchs wrote: >> Try something like that: >> SELECT to_date (tstamp,'YYYY-MM-DD') AS gday, >> sum(CASE WHEN proto = 'UDP' THEN 1 ELSE 0 END) AS count_udp, >> sum(CASE WHEN proto = 'TCP' THEN 1 ELSE 0 END) AS count_tcp >> FROM test >> WHERE tstamp >= now() - INTERVAL '$days DAYS' >> AND dst_port = $port >> GROUP BY gday >> ORDER BY gday > Or, via a subquery: > select distinct to_date(tstamp,'YYYY-MM-DD') as gday, > ( select count(id) from test t1 where proto='UDP' and > to_date(t1.tstamp,'YYYY-MM-DD') = to_date(test.tstamp,'YYYY-MM-DD') ) as > count_udp, > ( select count(id) from test t1 where proto='TCP' and > to_date(t1.tstamp,'YYYY-MM-DD') = to_date(test.tstamp,'YYYY-MM-DD') ) as > count_tcp > from test > where tstamp >= (now() - interval '6 days' ) > and dst_port = 2290 > order by gday; > > Harald's solution is better for your particular case and will almost > certainly be faster, but subqueries are good to know how to do. :) -- MaXX