Re: SQL report
От | Rob Sargent |
---|---|
Тема | Re: SQL report |
Дата | |
Msg-id | 4A7312AA.4080608@gmail.com обсуждение исходный текст |
Ответ на | SQL report (wkipjohn@gmail.com) |
Список | pgsql-sql |
Did you look at the query plans for the various record counts? That might show which index is missing or misinformed :). I wonder if clustering the status table on objectid would help? This does then require maintenance so you might only load it at 75%. wkipjohn@gmail.com wrote: > Hi Rob, > > I have default B-Tree indexes created for each of the indexed columes > and primary key columes. (No multiple columes indexe or NULL FIRST or > DESC/ASC). I am using PostgreSQL 8.3 with the auto vacuum daemon on. I > assume analyse will be automatically run to collect statistics for use > by the planner and there is no maintainance for B-tree indexes once it > is created. (Please point me out if I am wrong about this) > > I will probably try to partition the status table to group more recent > status records together to minimize the dataset I am querying. > > Thx > John > > > On Jul 31, 2009 1:16am, Rob Sargent <robjsargent@gmail.com> wrote: > > I would be curious to know the performance curve for let's say 20K, > 40K , 60K, 80K, 100K records. And what sort of indexing you have, > whether or not it's clustered, re-built and so on. > > > > > > > > One could envision partitioning the status table such that recent > records were grouped together (on the assumption that they will be > most frequently "reported"). > > > > > > > > wkipjohn@gmail.com wrote: > > > > > > I have the following senario. > > > > > > > > I have a tracking system. The system will record the status of an > object regularly, all the status records are stored in one table. And > it will keep a history of maximum 1000 status record for each object > it tracks. The maximum objects the system will track is 100,000. Which > means I will potentially have a table size of 100 million records. > > > > > > > > I have to generate a report on the latest status of all objects > being tracked at a particular point in time, and also I have to allow > user to sort and filter on different columes in the status record > displayed in the report. > > > > > > > > The following is a brief description in the status record (they are > not actual code) > > > > > > > > ObjectRecord( > > > > objectId bigint PrimaryKey > > > > desc varchar > > > > ) > > > > > > > > StatusRecord ( > > > > id bigint PrimaryKey > > > > objectId bigint indexed > > > > datetime bigint indexed > > > > capacity double > > > > reliability double > > > > efficiency double > > > > ) > > > > > > > > I have tried to do the following, it works very well with around > 20,000 objects. (The query return in less than 10s) But when I have > 100,000 objects it becomes very very slow. (I don't even have patience > to wait for it to return.... I kill it after 30 mins) > > > > > > > > select * from statusrecord s1 INNER JOIN ( SELECT objectId , > MAX(datetime) AS msdt FROM statusrecord WHERE startDatetime > > > > > > I did try to write a store procedure like below, for 100,000 objects > and 1000 status records / object, it returns in around 30 mins. > > > > > > > > CREATE OR REPLACE FUNCTION getStatus(pitvalue BIGINT) RETURNS SETOF > statusrecord AS $BODY$ > > > > DECLARE > > > > id VARCHAR; > > > > status statusrecord%ROWTYPE; > > > > BEGIN > > > > FOR object IN SELECT * FROM objectRecord > > > > LOOP > > > > EXECUTE 'SELECT * FROM statusrecord WHERE objectId = ' || > quote_literal(object.objectId) || > > > > ' AND datetime > > INTO status; > > > > IF FOUND THEN > > > > RETURN NEXT status; > > > > END IF; > > > > END LOOP; > > > > RETURN; > > > > END > > > > $BODY$ LANGUAGE plpgsql; > > > > > > > > Just wanna to know if anyone have a different approach to my > senario. Thanks alot. > > > > > > > > John > > > >
В списке pgsql-sql по дате отправления: