Re: SQL report
От | Rob Sargent |
---|---|
Тема | Re: SQL report |
Дата | |
Msg-id | 4A71B953.4070208@gmail.com обсуждение исходный текст |
Ответ на | SQL report (wkipjohn@gmail.com) |
Ответы |
Re: SQL report
(wkipjohn@gmail.com)
|
Список | pgsql-sql |
I would be curious to know the performance curve for let's say 20K, 40K , 60K, 80K, 100K records. And what sort of indexing you have, whether or not it's clustered, re-built and so on. One could envision partitioning the status table such that recent records were grouped together (on the assumption that they will be most frequently "reported"). wkipjohn@gmail.com wrote: > I have the following senario. > > I have a tracking system. The system will record the status of an > object regularly, all the status records are stored in one table. And > it will keep a history of maximum 1000 status record for each object > it tracks. The maximum objects the system will track is 100,000. Which > means I will potentially have a table size of 100 million records. > > I have to generate a report on the latest status of all objects being > tracked at a particular point in time, and also I have to allow user > to sort and filter on different columes in the status record displayed > in the report. > > The following is a brief description in the status record (they are > not actual code) > > ObjectRecord( > objectId bigint PrimaryKey > desc varchar > ) > > StatusRecord ( > id bigint PrimaryKey > objectId bigint indexed > datetime bigint indexed > capacity double > reliability double > efficiency double > ) > > I have tried to do the following, it works very well with around > 20,000 objects. (The query return in less than 10s) But when I have > 100,000 objects it becomes very very slow. (I don't even have patience > to wait for it to return.... I kill it after 30 mins) > > select * from statusrecord s1 INNER JOIN ( SELECT objectId , > MAX(datetime) AS msdt FROM statusrecord WHERE startDatetime <= > 1233897527657 GROUP BY objectId ) AS s2 ON ( s1.objectId = s2.objectId > AND s1.datetime = s2.msdt ) where ( capacity < 10.0 ) order by > s1.datetime DESC, s1.objectId DESC; > > I did try to write a store procedure like below, for 100,000 objects > and 1000 status records / object, it returns in around 30 mins. > > CREATE OR REPLACE FUNCTION getStatus(pitvalue BIGINT) RETURNS SETOF > statusrecord AS $BODY$ > DECLARE > id VARCHAR; > status statusrecord%ROWTYPE; > BEGIN > FOR object IN SELECT * FROM objectRecord > LOOP > EXECUTE 'SELECT * FROM statusrecord WHERE objectId = ' || > quote_literal(object.objectId) || > ' AND datetime <= ' || quote_literal(pitvalue) || ' ORDER BY datetime > DESC' > INTO status; > IF FOUND THEN > RETURN NEXT status; > END IF; > END LOOP; > RETURN; > END > $BODY$ LANGUAGE plpgsql; > > Just wanna to know if anyone have a different approach to my senario. > Thanks alot. > > John
В списке pgsql-sql по дате отправления: