Обсуждение: More Praise for 7.4RC2

Поиск
Список
Период
Сортировка

More Praise for 7.4RC2

От
Reece Hart
Дата:
More praise for 7.4RC2:

I've installed 7.4RC2 and restored a fairly complex 20GB database (from 7.3.4) with ~75M rows in 30 tables and 4 schemas, numerous triggers and constraints, procs in plpgsql and plperl. To say that it all works great hugely underestimates the improvements. Thank you! Great work, guys!


Here's one comparison of a query that choked 7.3.4 and flies in 7.4RC2:
7.3.4=> explain select count(distinct pseq_id) from paprospect2 where pmodel_id in (select pmodel_id from pmsm_prospect2 where pmodelset_id=2) and run_id=1 and svm>11::real;
                                    QUERY PLAN
----------------------------------------------------------------------------------Aggregate  (cost=78639954.60..78639954.60 rows=1 width=4)
   ->  Seq Scan on paprospect2  (cost=0.00..78639951.41 rows=1274 width=4)
         Filter: ((run_id = 1) AND (svm > 11::real) AND (subplan))
         SubPlan
           ->  Materialize  (cost=3.02..3.02 rows=3 width=4)
                 ->  Seq Scan on pmsm_prospect2  (cost=0.00..3.02 rows=3 width=4)
                       Filter: (pmodelset_id = 2)

And with the same data and indices:
7.4RC2=# explain select count(distinct pseq_id) from paprospect2 where pmodel_id in (select pmodel_id from pmsm_prospect2 where pmodelset_id=2) and run_id=1 and svm>11::real;
                                                              QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------- Aggregate  (cost=871.09..871.09 rows=1 width=4)
   ->  Nested Loop  (cost=3.03..870.55 rows=215 width=4)
         ->  HashAggregate  (cost=3.03..3.03 rows=1 width=4)
               ->  Seq Scan on pmsm_prospect2  (cost=0.00..3.02 rows=1 width=4)
                     Filter: (pmodelset_id = 2)
         ->  Index Scan using paprospect2_search1 on paprospect2  (cost=0.00..864.84 rows=215 width=8)
               Index Cond: ((paprospect2.pmodel_id = "outer".pmodel_id) AND (paprospect2.run_id = 1) AND (paprospect2.svm > 11::

The cost estimates give the right qualitative feel. 7.3.4 took >30 minutes for this query whereas 7.4RC2 responds essentially instantaneously.

Again, thanks pgsql-hackers. This is great!

-Reece


P.S. I don't use plpython, but I did try to install the language with Python 1.5 (as I said, I don't use python). This does work with 7.3.4 on the same machine/environment. I got:
createlang: language installation failed: ERROR:  could not load library "/apps/compbio/i686-linux-2.4/opt/postgresql-7.4RC2/lib/plpython.so": /apps/compbio/i686-linux-2.4/opt/postgresql-7.4RC2/lib/plpython.so: undefined symbol: PyDict_Copy

I strongly suspect my user error. However, it's probably worth getting some python user to ensure all's well there.

-- 
Reece Hart, Ph.D.                       rkh@gene.com, http://www.gene.com/
Genentech, Inc.                         650/225-6133 (voice), -5389 (fax)
Bioinformatics and Protein Engineering
1 DNA Way, MS-93                        http://www.in-machina.com/~reece/
South San Francisco, CA  94080-4990     reece@in-machina.com, GPG: 0x25EC91A0

Re: More Praise for 7.4RC2

От
jake@omnimode.com (jake johnson)
Дата:
reece@in-machina.com (Reece Hart) wrote in message news:<1068597494.28850.266.camel@tallac>...
[...snip...]
> The cost estimates give the right qualitative feel. 7.3.4 took >30
> minutes for this query whereas 7.4RC2 responds essentially
> instantaneously.
>
[...snip...]

I also posted about the performance increase of 7.4, but I think that
much of the difference you're seeing (because it's such a large
difference) is probably due to the cleanliness of a newly restored
database from backup.  Perhaps a vacuum analyze could have improved
that 30+ minute response time on 7.3.4.

I too posted about the performance improvement, seeing about 200% at
first after restoring from backup.  After some transactions, however,
I'd say its more like 100% - 125% improvement.  I'm still particularly
impressed with the improvement in the select count(*) query
performance, which is about 100% improved.

Re: More Praise for 7.4RC2

От
Reece Hart
Дата:
On Wed, 2003-11-12 at 09:04, jake johnson wrote:
I also posted about the performance increase of 7.4, but I think that
much of the difference you're seeing (because it's such a large
difference) is probably due to the cleanliness of a newly restored
database from backup.

I agree that this seems likely, except that the 7.3.4 database is vacuumed nightly, and analyzed periodically. And about a week ago I reclustered on the index intended to most facilitate this select. Furthermore, merely hardcoding the subselect result achieves a tremendous improvement (which was the workaround I used). So, I'm pretty sure that it's not a vacuum, index use, or cleanliness issue.

I also meant to add in my original post that the system is a dual 2.4G xeon with 4GB of RAM.

-Reece

-- 
Reece Hart, http://www.in-machina.com/~reece/, GPG:0x25EC91A0 0xD178AAF9

Re: More Praise for 7.4RC2

От
Martijn van Oosterhout
Дата:
Actually, in your case it's probably the new optimisation regarding the use
of IN (subquery). They're now optimised to the same lavel as EXISTS IIRC.

On Wed, Nov 12, 2003 at 05:46:23PM -0800, Reece Hart wrote:
> On Wed, 2003-11-12 at 09:04, jake johnson wrote:
>
> > I also posted about the performance increase of 7.4, but I think that
> > much of the difference you're seeing (because it's such a large
> > difference) is probably due to the cleanliness of a newly restored
> > database from backup.
>
>
> I agree that this seems likely, except that the 7.3.4 database is
> vacuumed nightly, and analyzed periodically. And about a week ago I
> reclustered on the index intended to most facilitate this select.
> Furthermore, merely hardcoding the subselect result achieves a
> tremendous improvement (which was the workaround I used). So, I'm pretty
> sure that it's not a vacuum, index use, or cleanliness issue.
>
> I also meant to add in my original post that the system is a dual 2.4G
> xeon with 4GB of RAM.
>
> -Reece
>
>
>
> --
> Reece Hart, http://www.in-machina.com/~reece/, GPG:0x25EC91A0 0xD178AAF9

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> "All that is needed for the forces of evil to triumph is for enough good
> men to do nothing." - Edmond Burke
> "The penalty good people pay for not being interested in politics is to be
> governed by people worse than themselves." - Plato

Вложения

Re: More Praise for 7.4RC2

От
"scott.marlowe"
Дата:
On Wed, 12 Nov 2003, Reece Hart wrote:

> On Wed, 2003-11-12 at 09:04, jake johnson wrote:
>
> > I also posted about the performance increase of 7.4, but I think that
> > much of the difference you're seeing (because it's such a large
> > difference) is probably due to the cleanliness of a newly restored
> > database from backup.
>
>
> I agree that this seems likely, except that the 7.3.4 database is
> vacuumed nightly, and analyzed periodically. And about a week ago I
> reclustered on the index intended to most facilitate this select.
> Furthermore, merely hardcoding the subselect result achieves a
> tremendous improvement (which was the workaround I used). So, I'm pretty
> sure that it's not a vacuum, index use, or cleanliness issue.

Do you vacuum full every so often?  If not, and if you've been overflowing
your fsm, then your tables will just grow without shrinking.

Also, index growth could be a problem.

The real test is to dump the database and reload it to give 7.3.4 a fair
shake.


Re: More Praise for 7.4RC2

От
Reece Hart
Дата:
On Thu, 2003-11-13 at 10:09, scott.marlowe wrote:
Do you vacuum full every so often?  If not, and if you've been overflowing 
your fsm, then your tables will just grow without shrinking.
Also, index growth could be a problem.

Hmm. I didn't realize that I needed to vacuum full as well -- I thought vacuum was sufficient for performance gains, and that full reclaimed space but didn't result in significant performance gains. I have reindexed infrequently, but since that locks the table I didn't do that (or vacuum full) often. I guess I should try out pg_autovacuum, but I think that full vacuums only to prevent XID wraparound (if age>1.5B transactions), but not for compaction (is this correct?).

The real test is to dump the database and reload it to give 7.3.4 a fair 
shake.
It turns out that I have two copies of this database around at the moment running on 7.3.4. One was a fresh restore, and that's what I used to generate the explain. However, the query was run on the older database which was vacuumed and analyzed (but not vacuum full or reindexed), and on that instance the query took a long time. On the fresh install, it takes 72s. In summary:

7.3.4, long-running db: eons
7.3.4, freshly restored: 72s
7.4RC2, freshly restored: 0.3s

Thanks everyone for feedback and setting me straight. Although the gain isn't as great as I thought, it's still very significant.

-Reece

-- 
Reece Hart, http://www.in-machina.com/~reece/, GPG:0x25EC91A0

Re: More Praise for 7.4RC2

От
"Rick Gigger"
Дата:

Are there any guidelines on how often one should do a reindex?
----- Original Message -----
Sent: Thursday, November 13, 2003 12:50 PM
Subject: Re: [GENERAL] More Praise for 7.4RC2

On Thu, 2003-11-13 at 10:09, scott.marlowe wrote:
Do you vacuum full every so often?  If not, and if you've been overflowing 
your fsm, then your tables will just grow without shrinking.
Also, index growth could be a problem.

Hmm. I didn't realize that I needed to vacuum full as well -- I thought vacuum was sufficient for performance gains, and that full reclaimed space but didn't result in significant performance gains. I have reindexed infrequently, but since that locks the table I didn't do that (or vacuum full) often. I guess I should try out pg_autovacuum, but I think that full vacuums only to prevent XID wraparound (if age>1.5B transactions), but not for compaction (is this correct?).

The real test is to dump the database and reload it to give 7.3.4 a fair 
shake.
It turns out that I have two copies of this database around at the moment running on 7.3.4. One was a fresh restore, and that's what I used to generate the explain. However, the query was run on the older database which was vacuumed and analyzed (but not vacuum full or reindexed), and on that instance the query took a long time. On the fresh install, it takes 72s. In summary:

7.3.4, long-running db: eons
7.3.4, freshly restored: 72s
7.4RC2, freshly restored: 0.3s

Thanks everyone for feedback and setting me straight. Although the gain isn't as great as I thought, it's still very significant.

-Reece

-- 
Reece Hart, http://www.in-machina.com/~reece/, GPG:0x25EC91A0

Re: More Praise for 7.4RC2

От
"scott.marlowe"
Дата:
On Thu, 13 Nov 2003, Reece Hart wrote:

> On Thu, 2003-11-13 at 10:09, scott.marlowe wrote:
>
> > Do you vacuum full every so often?  If not, and if you've been overflowing
> > your fsm, then your tables will just grow without shrinking.
> > Also, index growth could be a problem.
>
>
> Hmm. I didn't realize that I needed to vacuum full as well -- I thought
> vacuum was sufficient for performance gains, and that full reclaimed
> space but didn't result in significant performance gains. I have
> reindexed infrequently, but since that locks the table I didn't do that
> (or vacuum full) often. I guess I should try out pg_autovacuum, but I
> think that full vacuums only to prevent XID wraparound (if age>1.5B
> transactions), but not for compaction (is this correct?).

Assuming your free space map has enough room, and you vacuum (plain
vacuum) often enough, you're dandy.

But, let's say you run 10,000 transactions on a 1,000 row table, then run
a plain vacuum.  Even if your fsm can hold all the space that is free in
that table's space, you've got a sparsely populated table that will take a
while to seq scan through.

I.e. if you don't vacuum (regular) often enough, then your tables may be
quite large.

I've tested out the pg_autovacuum daemon, and it seems to work quite well
for me.

What you're really shooting for is a "steady state" table size.  Let's say
you've got a 10,000 row table, and you average 500 changes an hour on it.
If you vacuum it every day it will probably be fine, as by the end of 24
hours, the table will take up the space of about 22,000 rows.  So, it will
have the same basic performance as if it had 22,000 rows in it.

If you start vacuuming it every hour after that first few days, then it
will always be about 22,000 rows in size, (assuming the rows aren't
noticable bigger or smaller from one version to the next.)

While plain vacuums can reclaim the space at the very end of a table, and
could theoretically make this table shrink over time, it's highly unlikely
to ever drop back down to the approximate 10,000 rows in size started
with.  Vacuum full will drop it back down to somewhere around there.

So, if your table is HIGHLY updated, you may need to run a plain vacuum
very often, and that's where the autovacuum daemon comes in handy.  Just
set it to run every 30 minutes or so, and let it go.  It should only
vacuum the tables that have had lots of change, and leave the others
alone.

With the vacuum delay patch that's in testing for 7.5, it may well be that
running the autovacuum daemon will become acceptable in places where,
right now, vacuum, even the regular kind, produce too much system load /
slow down in the middle of the day.

> > The real test is to dump the database and reload it to give 7.3.4 a fair
> > shake.
>
> It turns out that I have two copies of this database around at the
> moment running on 7.3.4. One was a fresh restore, and that's what I used
> to generate the explain. However, the query was run on the older
> database which was vacuumed and analyzed (but not vacuum full or
> reindexed), and on that instance the query took a long time. On the
> fresh install, it takes 72s. In summary:
>
> 7.3.4, long-running db: eons
> 7.3.4, freshly restored: 72s
> 7.4RC2, freshly restored: 0.3s

You can use the oid2name program in the contrib directory to kinda
research which files are big under those trees and see if it's a table or
index growth problem.

something like:

#su - postgres
$ oid2name
All databases:
---------------------------------
16975  = template0
16976  = postgres

$ cd $PGDATA/base/16976
du -s *|sort -n

1004    16640
12232   109169550
65644   109169548

will list the largest files in the postgres database directory.

oid2name -d postgres |grep 10169548
109169548 = accounts

tells me that it's the accounts table that's taking up all my room.

and so on.


Re: More Praise for 7.4RC2

От
"scott.marlowe"
Дата:
Oh, another good choice for embedding is sleepycat's berkely db database,
or just plain old db style (gdbm lib, or ndbm, or any of a few others)
hash databases.  Simple, non-relational, and fast.

On Thu, 13 Nov 2003, scott.marlowe wrote:

> On Thu, 13 Nov 2003, Reece Hart wrote:
>
> > On Thu, 2003-11-13 at 10:09, scott.marlowe wrote:
> >
> > > Do you vacuum full every so often?  If not, and if you've been overflowing
> > > your fsm, then your tables will just grow without shrinking.
> > > Also, index growth could be a problem.
> >
> >
> > Hmm. I didn't realize that I needed to vacuum full as well -- I thought
> > vacuum was sufficient for performance gains, and that full reclaimed
> > space but didn't result in significant performance gains. I have
> > reindexed infrequently, but since that locks the table I didn't do that
> > (or vacuum full) often. I guess I should try out pg_autovacuum, but I
> > think that full vacuums only to prevent XID wraparound (if age>1.5B
> > transactions), but not for compaction (is this correct?).


Re: More Praise for 7.4RC2

От
Greg Stark
Дата:
Reece Hart <reece@in-machina.com> writes:

> On Thu, 2003-11-13 at 10:09, scott.marlowe wrote:
>
> > Do you vacuum full every so often?  If not, and if you've been overflowing
> > your fsm, then your tables will just grow without shrinking.
> > Also, index growth could be a problem.
>
>
> Hmm. I didn't realize that I needed to vacuum full as well -- I thought
> vacuum was sufficient for performance gains, and that full reclaimed
> space but didn't result in significant performance gains.

plain Vacuum is sufficient if the amount of free space it finds fits within
the free space map. During normal use with frequent vacuums on a system with
well-tuned fsm parameters that should be true.

However on a big heavily used database where the fsm parameters haven't been
raised from the defaults it's possible that it isn't. And on a table where
large batch updates or deletes have been run it's possible to require a vacuum
full after the batch job creates lots of dead tuples.

--
greg

Re: More Praise for 7.4RC2

От
Shridhar Daithankar
Дата:
On Friday 14 November 2003 01:33, Rick Gigger wrote:
> Are there any guidelines on how often one should do a reindex?

Vacuum in 7.4 does take care of index bloat, much better than earlier
versions. So if you run autovacuum daemon with 7.4, then you can do away with
reindex.

Of course testing at your site will provide the best answer. There is no
better tests than that..:-)

 Shridhar


Re: More Praise for 7.4RC2

От
"Markus Wollny"
Дата:
Hello!

> You can use the oid2name program in the contrib directory to kinda
> research which files are big under those trees and see if
> it's a table or
> index growth problem.

I found it a tedious operation, if you want to keep a check on growth of
your databases regularly. So I wrote a litte script which outputs a
sorted comma separated list of all objects within a database - so I can
do

      ./showdbsize foodb >foodb.csv

and import this thing in Excel for further processing or do whatever I
like with it. There may surely be more elegant ways of getting the task
done using another language or just plain bash-scripting and you have to
have oid2name made and installed, but in terms of language php is what I
am most comfortable with - if deemed necessary, anyone might write their
own little tool in their preferred ways and languages; but it's short
enough, so I'll just post it here if anyone wants to make use of it as
it is.

Kind Regards,

   Markus


Here goes:


        #!/usr/bin/php -q
        <?php
        #     showdbsize for PostgreSQL     #
        #                MWollny - 2003     #
        ######################################
        /*         Config Begin             */

        $pg_user = 'postgres';
        $pg_bindir = '/opt/pgsql/bin/';
        $pg_data ='/var/lib/pgsql/data/base/';

        /*          Config End              */
        ######################################
        /* DO NOT EDIT BELOW THIS LINE      */

        $argv=$_SERVER['argv'];
        $argc=$_SERVER['argc'];
        if ($argc != 2 || in_array($argv[1], array('--help', '-help',
'-h', '-?'))) {
        ?>
        This is a commandline PHP script to generate
        a list of object-ids, names and filesizes of
        all tables/indexes within a specified
        POSTGRESQL-database.

          Usage:
          <?php echo $argv[0]; ?> <database>

          <database> is the name of the database you
          wish to generate the list of.

          With the --list, -list, -l or --show,
          -show or -s options, you can get a list of
          all available databases on this server.

          With the --help, -help, -h,
          or -? options, you can get this help.

        <?php } else {

        /* Function to make bytesize numbers human-readable */
        function fsize($file) {
               $a = array("B", "KB", "MB", "GB", "TB", "PB");

               $pos = 0;
               $size = filesize($file);
               while ($size >= 1024) {
                      $size /= 1024;
                       $pos++;
               }

              return round($size,2)." ".$a[$pos];
        }

        /* One Ring To Find Them All */
        $pg_data=$pg_data.'base/';
        $db_exec=$pg_bindir.'oid2name -U '.$pg_user;
        $alldb=`$db_exec`;
        $i=1;
        $lines = explode ("\n", $alldb);
        foreach($lines as $value) {
         if (!strpos($value, "=")===false) {
          $dboid[$i] = trim(substr($value,0,strpos($value, "=")-1));
          $dbname[$i] = trim(substr(strstr($value,'='),2));
          $i++;
          }}


        if (in_array($argv[1], array('--show', '-show', '-s', '-l',
'--list', '-list'))) {
         echo "Databases available on this server:\n";
         foreach($dbname as $value) {echo "  $value\n";}
         die();
         }

        /* Is argument the name of an existing database on this server?
*/
        if (!in_array ($argv[1], $dbname)) {
            die ("Database $argv[1] not found on this server.\n");
        }

        /* Still alive? Okay, give me the OID of that DB! */
        $i=array_search($argv[1], $dbname);
        $use_oid=$dboid[$i];
        $use_name=$dbname[$i];
        $dbdir=$pg_data.$use_oid.'/';
        chdir ($dbdir);

        /* Let's see the list of files of the DB */
        $handle=opendir($dbdir);
        $i=0;
        while ($file = readdir ($handle)) {
            if ($file != "." && $file != "..") {
                $i++;
                $oid[$i]=$file;
            }
        }
        closedir($handle);

        /* Now gather data about actual names and filesizes of these
objects */
        for ($j = 1; $j <= $i; $j++) {
            if (is_numeric($oid[$j]))  {
            $oid_size[$j]=filesize($oid[$j]);
            $oid_hsize[$j]=fsize($oid[$j]);
            $db_exec=$pg_bindir.'oid2name -U '.$pg_user.' -d
'.$use_name.' -o '.$oid[$j];
            $raw_name=`$db_exec`;
            $full_name[$j]=trim(substr(substr(strstr($raw_name,'='), 1),
0, -1));
            # echo "$oid[$j]; $full_name[$j]; $oid_size[$j];
$oid_hsize[$j] \n";
        }}

        /* Sort and output the list so that it can be piped to a
CSV-file */
        asort ($oid_size);
        reset ($oid_size);
        echo "OID; Name; Size (Bytes); Size (readable)\n";
        foreach($oid_size as $key => $tablesize) {
        echo "$oid[$key]; $full_name[$key]; $oid_size[$key];
$oid_hsize[$key] \n";
        }
        } ?>

Re: More Praise for 7.4RC2

От
Reece Hart
Дата:
On Thu, 2003-11-13 at 13:10, scott.marlowe wrote:
So, if your table is HIGHLY updated, you may need to run a plain vacuum 
very often, and that's where the autovacuum daemon comes in handy.  Just 
set it to run every 30 minutes or so, and let it go.  It should only 
vacuum the tables that have had lots of change, and leave the others 
alone.

On Thu, 2003-11-13 at 19:37, Greg Stark wrote:
However on a big heavily used database where the fsm parameters haven't been
raised from the defaults it's possible that it isn't. And on a table where
large batch updates or deletes have been run it's possible to require a vacuum
full after the batch job creates lots of dead tuples.
 Scott & Greg-

Thanks for this info. I'm sure this explains at least part of the problem. I can't remember the sequence of events from several months back, but I did once update ~20M rows of this 40M row this table, and I have also deleted certain sets of rows at various times. Suspecting that I had a swiss-cheese table, I reclustered on an index several times (which, I presume, is at least as good as vacuum (non-full) removing internal free space, with the benefit of optimized row ordering). Since I can't remember the order of operations, it's possible that I timed the slow query at nearly the worst state, and it's the kinda think I only wanted to endure once.

Thanks again,
Reece

-- 
Reece Hart, http://www.in-machina.com/~reece/, GPG:0x25EC91A0

Re: More Praise for 7.4RC2

От
Christopher Browne
Дата:
reece@in-machina.com (Reece Hart) writes:
> On Thu, 2003-11-13 at 10:09, scott.marlowe wrote:
>
>           Do you vacuum full every so often?  If not, and if you've been overflowing
> your fsm, then your tables will just grow without shrinking.
> Also, index growth could be a problem.
>
> Hmm. I didn't realize that I needed to vacuum full as well -- I
> thought vacuum was sufficient for performance gains, and that full
> reclaimed space but didn't result in significant performance
> gains. I have reindexed infrequently, but since that locks the table
> I didn't do that (or vacuum full) often. I guess I should try out
> pg_autovacuum, but I think that full vacuums only to prevent XID
> wraparound (if age>1.5B transactions), but not for compaction (is
> this correct?).

That's not quite correct.  pg_autovacuum NEVER does a VACUUM FULL, and
it is not necessary to do so in order to avoid XID wraparound.  A
"simple VACUUM" suffices for that purpose..

What pg_autovacuum "buys you" is mainly twofold:

 1.  You don't need to schedule batch jobs to vacuum things;

 2.  If you have heavily updated tables, it will vacuum them a lot,
     which should prevent them from "blowing out" the free space map,
     and allow quicker reuse of dead space.

The merits of that aren't infinite, but are not nothing, either.
--
output = reverse("ofni.smrytrebil" "@" "enworbbc")
<http://dev6.int.libertyrms.com/>
Christopher Browne
(416) 646 3304 x124 (land)

Re: More Praise for 7.4RC2

От
Christopher Browne
Дата:
rick@alpinenetworking.com ("Rick Gigger") writes:
> Are there any guidelines on how often one should do a reindex?

When you discover that performance is "sucking" because of table
growth that would be fixed by a reindex.  Unfortunately, there's not
quite a "quick prescription" for how to discover that :-(.
--
(format nil "~S@~S" "cbbrowne" "libertyrms.info")
<http://dev6.int.libertyrms.com/>
Christopher Browne
(416) 646 3304 x124 (land)

Re: More Praise for 7.4RC2

От
Vivek Khera
Дата:
>>>>> "RG" == Rick Gigger <rick@alpinenetworking.com> writes:


RG> Are there any guidelines on how often one should do a reindex?

regularly monitor your index sizes with a query such as this:

SELECT relname,relpages FROM pg_class WHERE relname LIKE 'user_list%' ORDER BY relname;

On my most busy tables, index bloat in PG 7.2 is horrendous.  I have
one index that more than triples in size after about 10 days.  After
reindex it gets to a respectable 22k relpages.

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: khera@kciLink.com       Rockville, MD       +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/

Re: More Praise for 7.4RC2

От
"Rick Gigger"
Дата:
What exactly does that number (relpages) mean?  It seems that my large
tables and indexes have large values for relpages and and small tables and
indexes have small values.

rg

----- Original Message -----
From: "Vivek Khera" <khera@kcilink.com>
Newsgroups: ml.postgres.general
To: <pgsql-general@postgresql.org>
Sent: Friday, November 14, 2003 2:21 PM
Subject: Re: [GENERAL] More Praise for 7.4RC2


> >>>>> "RG" == Rick Gigger <rick@alpinenetworking.com> writes:
>
>
> RG> Are there any guidelines on how often one should do a reindex?
>
> regularly monitor your index sizes with a query such as this:
>
> SELECT relname,relpages FROM pg_class WHERE relname LIKE 'user_list%'
ORDER BY relname;
>
> On my most busy tables, index bloat in PG 7.2 is horrendous.  I have
> one index that more than triples in size after about 10 days.  After
> reindex it gets to a respectable 22k relpages.
>
> --
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Vivek Khera, Ph.D.                Khera Communications, Inc.
> Internet: khera@kciLink.com       Rockville, MD       +1-240-453-8497
> AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>


Re: More Praise for 7.4RC2

От
Martijn van Oosterhout
Дата:
It should be the number of pages in the relation. A page is usually 8K
unless it's changed. Not sure how often it's updated though.

On Fri, Nov 14, 2003 at 03:02:49PM -0700, Rick Gigger wrote:
> What exactly does that number (relpages) mean?  It seems that my large
> tables and indexes have large values for relpages and and small tables and
> indexes have small values.
>
> rg
>
> ----- Original Message -----
> From: "Vivek Khera" <khera@kcilink.com>
> Newsgroups: ml.postgres.general
> To: <pgsql-general@postgresql.org>
> Sent: Friday, November 14, 2003 2:21 PM
> Subject: Re: [GENERAL] More Praise for 7.4RC2
>
>
> > >>>>> "RG" == Rick Gigger <rick@alpinenetworking.com> writes:
> >
> >
> > RG> Are there any guidelines on how often one should do a reindex?
> >
> > regularly monitor your index sizes with a query such as this:
> >
> > SELECT relname,relpages FROM pg_class WHERE relname LIKE 'user_list%'
> ORDER BY relname;
> >
> > On my most busy tables, index bloat in PG 7.2 is horrendous.  I have
> > one index that more than triples in size after about 10 days.  After
> > reindex it gets to a respectable 22k relpages.
> >
> > --
> > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> > Vivek Khera, Ph.D.                Khera Communications, Inc.
> > Internet: khera@kciLink.com       Rockville, MD       +1-240-453-8497
> > AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
> >
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> "All that is needed for the forces of evil to triumph is for enough good
> men to do nothing." - Edmond Burke
> "The penalty good people pay for not being interested in politics is to be
> governed by people worse than themselves." - Plato

Вложения

Re: More Praise for 7.4RC2

От
Greg Stark
Дата:
Reece Hart <reece@in-machina.com> writes:

> Suspecting that I had a swiss-cheese table, I reclustered on an index
> several times (which, I presume, is at least as good as vacuum (non-full)
> removing internal free space, with the benefit of optimized row ordering).

My understanding is that CLUSTER is as good as a VACUUM FULL actually.

--
greg

Re: More Praise for 7.4RC2

От
Alvaro Herrera
Дата:
On Sat, Nov 15, 2003 at 11:26:44AM -0500, Greg Stark wrote:
>
> Reece Hart <reece@in-machina.com> writes:
>
> > Suspecting that I had a swiss-cheese table, I reclustered on an index
> > several times (which, I presume, is at least as good as vacuum (non-full)
> > removing internal free space, with the benefit of optimized row ordering).
>
> My understanding is that CLUSTER is as good as a VACUUM FULL actually.

Well, it's different in that it does the equivalent of a REINDEX on all
indexes.

It's also slower and needs lots of additional space: double the space of
your table and all it's indexes.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
Essentially, you're proposing Kevlar shoes as a solution for the problem
that you want to walk around carrying a loaded gun aimed at your foot.
(Tom Lane)