Обсуждение: flagging first row inserted for each "group" of key
I've a web application. I'm logging data related to the landing page and subsequent selected hits. create table track_hit ( hitid serial not null, /* pk? I don't mind if serial wrap around pk could be (hitid, tracking_time) */ esid varchar(32), -- related to session tracking_time timestamp not null default now(), -- some other stats -- first_hit boolean not null default false, -- ??? ); I'd like to be sure I just count one "first hit" in a session (same esid) in an efficient way that means I'd like to mark them with a flag and avoid a group by, min subquery cycle when I'm doing reporting. I can't trust the browser and I just want one first hit for each esid, no matter if they have the same tracking_time. Of course the problem is concurrency, but maybe I missed some pg trick that could help me. I'm on 8.3 and no plan to move to 8.4 shortly, so no windowing functions that will make reporting easier/faster. -- Ivan Sergio Borgonovo http://www.webthatworks.it
In response to Ivan Sergio Borgonovo : > I've a web application. > > I'm logging data related to the landing page and subsequent selected > hits. > > create table track_hit ( > hitid serial not null, > /* pk? I don't mind if serial wrap around > pk could be (hitid, tracking_time) */ > esid varchar(32), -- related to session > tracking_time timestamp not null default now(), > -- some other stats > -- first_hit boolean not null default false, -- ??? > ); > > I'd like to be sure I just count one "first hit" in a session (same > esid) in an efficient way that means I'd like to mark them with a select distinct on (esid) esid, tracking_time from track_hit order by esid, tracking_time; returns only one record for each esid, ordered by tracking_time. Should work with 8.x, maybe sice 7.x (I'm not sure) HTH, Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header) GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431 2EB0 389D 1DC2 3172 0C99
On Thu, 17 Dec 2009 10:38:32 +0100 "A. Kretschmer" <andreas.kretschmer@schollglas.com> wrote: > In response to Ivan Sergio Borgonovo : > > I've a web application. > > > > I'm logging data related to the landing page and subsequent > > selected hits. > > > > create table track_hit ( > > hitid serial not null, > > /* pk? I don't mind if serial wrap around > > pk could be (hitid, tracking_time) */ > > esid varchar(32), -- related to session > > tracking_time timestamp not null default now(), > > -- some other stats > > -- first_hit boolean not null default false, -- ??? > > ); > > > > I'd like to be sure I just count one "first hit" in a session > > (same esid) in an efficient way that means I'd like to mark them > > with a > > select distinct on (esid) esid, tracking_time from track_hit order > by esid, tracking_time; > > returns only one record for each esid, ordered by tracking_time. > Should work with 8.x, maybe sice 7.x (I'm not sure) I think I've tried to resolve a concurrency problem in the wrong place... still... what is the difference between: select min(hitid) as h from track_hit group by esid; and select distinct on (esid) hitid from track_hit order by esid, track_time; I haven't enough test data to see if they perform differently. The second form seems to perform a little bit faster. I'd expect the opposite: the first performing better. I think I'll add an index on track_time for reporting and maybe make primary key (hitid, tracking_time). I don't want to be bothered by hitid wrap around, so I don't want to make it a pk alone, still I may need a pk. thanks -- Ivan Sergio Borgonovo http://www.webthatworks.it