Re: VACUUM FULL

Поиск
Список
Период
Сортировка
От Brian McCane
Тема Re: VACUUM FULL
Дата
Msg-id 20020511044230.F99530-100000@fw.mccons.net
обсуждение исходный текст
Ответ на Re: VACUUM FULL  ("Rajesh Kumar Mallah." <mallah@trade-india.com>)
Ответы Re: VACUUM FULL  ("Rajesh Kumar Mallah." <mallah@trade-india.com>)
Список pgsql-admin
I have tried contrib/tsearch, but I do fairly complex score computation
based on word placement (headers,body,graphics,metatags, etc).  And also
do additional computations on the scores when the user does a search.  It
is very quick to use something like tsearch to figure out which pages
match my criterion, but very slow to then compute a valid score for each
page.  Instead, in an attempt to speed things up, a score is computed for
each distinct word on a page, and that score is stored in this table.
Then when a search is requested, I use previously existing search
information and data from this table to compute the new results.

The upshot of all this computing is that I get fairly relevant results,
but I have to do a lot of math to get there.  I have been considering
combining the two methods, doing something like a full text search to find
pages that meet my criterion, and then using this table to actually
compute a pages score.

- brian


On Sat, 11 May 2002, Rajesh Kumar Mallah. wrote:

>
> Hi Brian ,
>
> are you performing full text search in any case?
>
> Apart from optimizing the TABLE/INDEXES (thru VACUUM measures)
>
> i feel using text indexes provided by contrib/tsearch can also
> lead to significant improvement in search performance.
>
> regds
> mallah.
>
> On Saturday 11 May 2002 05:48 am, Brian McCane wrote:
> > Okay, I guess I misunderstood something about "VACUUM FULL".  My
> > understanding was that a VACUUM (without FULL), marked unused records for
> > reuse.  VACUUM FULL moved records from the end of a file to the holes
> > where unused records were stored and truncated the file to free disk
> > space.  So far so good, but....
> >
> > I have had continued loss of performance on one of my largest tables
> > (600,000,000 records).  There are 4 fields in a record like such:
> >
> > CREATE TABLE foo (
> >   a int not null references bar(a) on delete cascade on update no action,
> >   b int not null references baz(b) on delete cascade on update no action,
> >   c int,
> >   d smallint,
> >   primary key(a, b)) ;
> > CREATE INDEX foo_ac ON foo (a,c) ;
> > CREATE INDEX foo_ad on foo (a,d) ;
> > And there are 3 triggers which fire before insert/delete/update.
> >
> > I INSERT/UPDATE/DELETE approximately 300,000 records per day, but this
> > number is increasing on a daily basis as I make changes which improve the
> > performance of my data gathering tools (spiders ;).  Two days ago, it had
> > reached the point where a search for a 3-word term (ie. free news servers)
> > took about 60 seconds.  I have just spent 48 hours running a VACUUM FULL
> > on my table, and now the same search takes < 10 seconds.  I assume that
> > the increase in performance is due to the decrease in table/index size
> > which added up to approximate 1GB of freed space on the machine, which was
> > approximately 4% of the original size of the table and all its indices.
> > But, a 4% decrease in size should not add up to a 84% increase in
> > performance (is that right? I always get the ratio confused :).
> >
> > If all that VACUUM FULL did was move records from file 12345678.6 to file
> > 12345678, the database would still being doing a large number of random
> > accesses on the table.  However, if VACUUM FULL clusters the data
> > according to the primary key, it would still be doing a large number of
> > random access on the table, because the primary key has almost nothing to
> > do with how I actually access the data in real life.  So, is VACUUM FULL
> > looking somewhere in pg_statistics (or pg_stat_user_indexes), to determine
> > which index I actually use most (foo_ad), and then clustering the data
> > that way, or is there some other agent at work here.
> >
> > - brian
> >
> >
> > Wm. Brian McCane                    | Life is full of doors that won't open
> > Search http://recall.maxbaud.net/   | when you knock, equally spaced amid
> > those Usenet http://freenews.maxbaud.net/ | that open when you don't want
> > them to. Auction http://www.sellit-here.com/ | - Roger Zelazny "Blood of
> > Amber"
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 3: if posting/reading through Usenet, please send an appropriate
> > subscribe-nomail command to majordomo@postgresql.org so that your
> > message can get through to the mailing list cleanly
>
> --
> Rajesh Kumar Mallah,
> Project Manager (Development)
> Infocom Network Limited, New Delhi
> phone: +91(11)6152172 (221) (L) ,9811255597 (M)
>
> Visit http://www.trade-india.com ,
> India's Leading B2B eMarketplace.
>
>

Wm. Brian McCane                    | Life is full of doors that won't open
Search http://recall.maxbaud.net/   | when you knock, equally spaced amid those
Usenet http://freenews.maxbaud.net/ | that open when you don't want them to.
Auction http://www.sellit-here.com/ | - Roger Zelazny "Blood of Amber"


В списке pgsql-admin по дате отправления:

Предыдущее
От: "Rajesh Kumar Mallah."
Дата:
Сообщение: Re: VACUUM FULL
Следующее
От: Brian McCane
Дата:
Сообщение: Re: VACUUM FULL