Thread: VACUUMing for 30 minutes


VACUUMing for 30 minutes

From: <ogjunk-pgjedan@yahoo.com>
Date:
Hello,

I have a DB with about 30 tables, where 2 tables are significantly
larger than the rest, and contain a bit over 100,000 rows.

Every night I do these 3 things:
VACUUM;
ANALYZE;
pg_dump
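
(Aside: the first two steps can also be issued as one combined statement; a
minimal sketch, in case it is relevant to the timing of the nightly run:)

VACUUM ANALYZE;   -- vacuum and then analyze every table in a single command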

I am noticing that the VACUUM part takes nearly 30 minutes, during
which the DB is not very accessible (and a whole lot of load is put on
the machine in general).

Using the pgsession.sh script mentioned earlier, I caught this process
taking a long time:

 31179 | mydb  | otis     | FETCH 100 FROM _pg_dump_cursor

Is there anything one can do to minimize the impact of VACUUM?

I am using PG 7.3.4 on a Linux box with a 1.70GHz Celeron, 1GB RAM, and
a 'regular' IDE disk.

Thanks,
Otis


Re: VACUUMing for 30 minutes

From: <ogjunk-pgjedan@yahoo.com>
Date:
Hello,

I still have this issue of a looooong vacuum process on a DB that is
not really all that big in schema (~ 30 tables) nor size (biggest table
is ~150K rows, another ~120K, all others only a few thousand rows
each).

VACUUMing this DB takes about 30 minutes, and during that time the DB
is pretty unresponsive, although the PG process is not using a lot of
CPU (load ~ 1) nor memory (~20MB for the VACUUM process).

During VACUUM I see these DB sessions:

 28764 | simpydb  | postgres | select count(*) from pg_stat_activity
 25946 | simpydb  | otis     | VACUUM;

My questions are:
- Does it sound normal that such a small DB would need 30 minutes of
vacuuming?  (My iRobot Roomba does my apartment in less time.)

- Should I be giving PG more RAM while it's VACUUMing? (the PG process
running VACUUM is using only 20MB now, but I'm not sure if it needs
more)

Here are some possibly relevant config settings:

shared_buffers = 2048
sort_mem = 4096         # min 64, size in KB
effective_cache_size = 10000

#vacuum_mem = 8192  -- oh, look at that.  Can I freely give it more
without affecting the memory consumption while VACUUM is not running?

Thanks,
Otis


--- ogjunk-pgjedan@yahoo.com wrote:

> Hello,
>
> I have a DB with about 30 tables, where 2 tables are significantly
> larger than the rest, and contain a bit over 100,000 rows.
>
> Every night I do these 3 things:
> VACUUM;
> ANALYZE;
> pg_dump
>
> I am noticing that the VACUUM part takes nearly 30 minutes, during
> which the DB is not very accessible (and a whole lot of load is put
> on
> the machine in general).
>
> Using pgsession.sh script mentioned earlier, I caught this process
> taking a long time:
>
>  31179 | mydb  | otis     | FETCH 100 FROM _pg_dump_cursor
>
> Is there anything one can do to minimize the impact of VACUUM?
>
> I am using PG 7.3.4 on a Linux box with a 1.70GHz Celeron, 1GB RAM,
> and
> a 'regular' IDE disk.
>
> Thanks,
> Otis



Re: VACUUMing for 30 minutes

From: "Joshua D. Drake"
Date:
>During VACUUM I see these DB sessions:
>
> 28764 | simpydb  | postgres | select count(*) from pg_stat_activity
> 25946 | simpydb  | otis     | VACUUM;
>
>My questions are:
>- Does it sounds normal that such a small DB would need 30 minute
>vacuuming?  (My iRobot Rumba does my apartment in less time.)
>
>
It depends... how much I/O capacity do you have, and how active
is the DB (updates/deletes)?

>- Should I be giving PG more RAM while it's VACUUMing? (the PG process
>running VACUUM is using only 20MB now, but I'm not sure if it needs
>more)
>
>
It can definitely help.

>Here are some possibly relevant config settings:
>
>shared_buffers = 2048
>sort_mem = 4096         # min 64, size in KB
>effective_cache_size = 10000
>
>

>#vacuum_mem = 8192  -- oh, look at that.  Can I freely give it more
>without affecting the memory consumption while VACUUM is not running?
>
>

Yes.
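
A minimal sketch of one way to do that, assuming 7.3 lets you SET this
parameter per session (so the server-wide default stays untouched):

SET vacuum_mem = 32768;   -- value in KB, i.e. ~32MB, for this session only
VACUUM;                   -- the extra memory is used only while VACUUM runs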

>Thanks,
>Otis
>
>
>--- ogjunk-pgjedan@yahoo.com wrote:
>
>
>
>>Hello,
>>
>>I have a DB with about 30 tables, where 2 tables are significantly
>>larger than the rest, and contain a bit over 100,000 rows.
>>
>>Every night I do these 3 things:
>>VACUUM;
>>ANALYZE;
>>pg_dump
>>
>>I am noticing that the VACUUM part takes nearly 30 minutes, during
>>which the DB is not very accessible (and a whole lot of load is put
>>on
>>the machine in general).
>>
>>Using pgsession.sh script mentioned earlier, I caught this process
>>taking a long time:
>>
>> 31179 | mydb  | otis     | FETCH 100 FROM _pg_dump_cursor
>>
>>Is there anything one can do to minimize the impact of VACUUM?
>>
>>I am using PG 7.3.4 on a Linux box with a 1.70GHz Celeron, 1GB RAM,
>>and
>>a 'regular' IDE disk.
>>
>>Thanks,
>>Otis
>>
>>
>
>
>


--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL



Re: VACUUMing for 30 minutes

From: Tom Lane
Date:
<ogjunk-pgjedan@yahoo.com> writes:
> VACUUMing this DB takes about 30 minutes, and during that time the DB
> is pretty unresponsive, although the PG process is not using a lot of
> CPU (load ~ 1) nor memory (~20MB for the VACUUM process).

How big is the DB physically ("du $PGDATA" results)?  If you've been lax
about vacuuming or not had your FSM parameters set high enough, there
could be a whole lot of dead space for VACUUM to scan through.  If so,
VACUUM FULL or possibly CLUSTER would be the best way to re-compact the
tables.  (VACUUM VERBOSE on your larger tables would be another way to
investigate this.)
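
For instance (a sketch only; "mytable" is a placeholder for one of your two
big tables):

VACUUM VERBOSE mytable;   -- reports pages vs. tuples for the table and each index
VACUUM FULL mytable;      -- compacts the table in place, but takes an exclusive lock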

The other possibility is that you have a seriously slow disk drive :-(

            regards, tom lane

Re: VACUUMing for 30 minutes

From: <ogjunk-pgjedan@yahoo.com>
Date:
Hello,

--- Tom Lane <tgl@sss.pgh.pa.us> wrote:

> <ogjunk-pgjedan@yahoo.com> writes:
> > VACUUMing this DB takes about 30 minutes, and during that time the
> DB
> > is pretty unresponsive, although the PG process is not using a lot
> of
> > CPU (load ~ 1) nor memory (~20MB for the VACUUM process).
>
> How big is the DB physically ("du $PGDATA" results)?

4.2 GB:

# du -h ~postgres/data
3.6M    /var/lib/pgsql/data/base/1
3.6M    /var/lib/pgsql/data/base/16975
4.0K    /var/lib/pgsql/data/base/16976/pgsql_tmp
4.1G    /var/lib/pgsql/data/base/16976
4.1G    /var/lib/pgsql/data/base
152K    /var/lib/pgsql/data/global
129M    /var/lib/pgsql/data/pg_xlog
1.1M    /var/lib/pgsql/data/pg_clog
4.2G    /var/lib/pgsql/data

> If you've been lax
> about vacuuming or not had your FSM parameters set high enough, there
> could be a whole lot of dead space for VACUUM to scan through.

I've been vacuuming every night, like a good DBwife.

> If so,
> VACUUM FULL or possibly CLUSTER would be the best way to re-compact
> the
> tables.  (VACUUM VERBOSE on your larger tables would be another way
> to
> investigate this.)

I will try VACUUM VERBOSE on the biggest (and most active) table
tonight and report the findings.

> The other possibility is that you have a seriously slow disk drive
> :-(

It looks like I have an ATA-6 drive with only 2MB cache, and the
following throughput:

# /sbin/hdparm -tT /dev/hda


/dev/hda:
 Timing buffer-cache reads:   128 MB in  0.56 seconds =228.57 MB/sec
 Timing buffered disk reads:  64 MB in  1.18 seconds = 54.24 MB/sec

Not SCSI, not RAID, but not the slowest HDD on the continent.

Thanks,
Otis


Re: VACUUMing for 30 minutes

From: <ogjunk-pgjedan@yahoo.com>
Date:
Hello,

So I ran VACUUM VERBOSE (just like that, without specifying a table)
and got some output.  The 2 big tables take 99% of the vacuuming time.


Now, I run VACUUM religiously every night, across all tables, but maybe
that's overkill for the number of updates and inserts in this DB.
Maybe somebody can offer some advice.

Big table #1:


INFO:  --Relation public.url--
INFO:  Index pk_url_id: Pages 29650; Tuples 114184: Deleted 47.
    CPU 3.61s/0.72u sec elapsed 129.67 sec.
INFO:  Index url_href_key: Pages 108190; Tuples 114182: Deleted 47.
    CPU 12.13s/2.30u sec elapsed 922.25 sec.
INFO:  Index ix_url_last_mod_date: Pages 37536; Tuples 114182: Deleted 47.
    CPU 4.39s/0.82u sec elapsed 270.24 sec.
INFO:  Index ix_url_last_code: Pages 38823; Tuples 114182: Deleted 47.
    CPU 3.75s/0.61u sec elapsed 160.87 sec.
INFO:  Removed 47 tuples in 3 pages.
    CPU 0.00s/0.00u sec elapsed 0.02 sec.
INFO:  Pages 303359: Changed 23, Empty 0; Tup 114182: Vac 47, Keep 273, UnUsed 11798506.
    Total CPU 40.30s/7.10u sec elapsed 1564.58 sec.


Big table #2:


INFO:  --Relation public.user_url--
INFO:  Index pk_user_url_id: Pages 1291; Tuples 137980: Deleted 59.
    CPU 0.09s/0.07u sec elapsed 4.71 sec.
INFO:  Index ix_user_url_user_id_url_id: Pages 1633; Tuples 137980: Deleted 59.
    CPU 0.08s/0.04u sec elapsed 1.55 sec.
INFO:  Index ix_user_url_add_date: Pages 1186; Tuples 137980: Deleted 59.
    CPU 0.15s/0.07u sec elapsed 4.38 sec.
INFO:  Index ix_user_url_last_click_date: Pages 1124; Tuples 137980: Deleted 59.
    CPU 0.13s/0.05u sec elapsed 3.39 sec.
INFO:  Index ix_user_url_click_count: Pages 977; Tuples 137980: Deleted 59.
    CPU 0.13s/0.06u sec elapsed 2.07 sec.
INFO:  Removed 59 tuples in 8 pages.
    CPU 0.00s/0.00u sec elapsed 0.03 sec.
INFO:  Pages 5052: Changed 0, Empty 0; Tup 137980: Vac 59, Keep 31, UnUsed 215734.
    Total CPU 0.86s/0.35u sec elapsed 17.65 sec.


It looks like the 'Tuples' number corresponds to the number of rows, and
'Deleted' refers to the number of rows that were DELETEd, so vacuuming
basically reclaims the gaps in the indexes caused by deletions.

Does the above 'rate of table modification' warrant nightly VACUUMing,
or should I relax and do it weekly or even monthly?

Thanks,
Otis



--- ogjunk-pgjedan@yahoo.com wrote:

> Hello,
>
> --- Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> > <ogjunk-pgjedan@yahoo.com> writes:
> > > VACUUMing this DB takes about 30 minutes, and during that time
> the
> > DB
> > > is pretty unresponsive, although the PG process is not using a
> lot
> > of
> > > CPU (load ~ 1) nor memory (~20MB for the VACUUM process).
> >
> > How big is the DB physically ("du $PGDATA" results)?
>
> 4.2 GB:
>
> # du -h ~postgres/data
> 3.6M    /var/lib/pgsql/data/base/1
> 3.6M    /var/lib/pgsql/data/base/16975
> 4.0K    /var/lib/pgsql/data/base/16976/pgsql_tmp
> 4.1G    /var/lib/pgsql/data/base/16976
> 4.1G    /var/lib/pgsql/data/base
> 152K    /var/lib/pgsql/data/global
> 129M    /var/lib/pgsql/data/pg_xlog
> 1.1M    /var/lib/pgsql/data/pg_clog
> 4.2G    /var/lib/pgsql/data
>
> > If you've been lax
> > about vacuuming or not had your FSM parameters set high enough,
> there
> > could be a whole lot of dead space for VACUUM to scan through.
>
> I've been vacuuming every night, like a good DBwife.
>
> > If so,
> > VACUUM FULL or possibly CLUSTER would be the best way to re-compact
> > the
> > tables.  (VACUUM VERBOSE on your larger tables would be another way
> > to
> > investigate this.)
>
> I will try VACUUM VERBOSE on the biggest (and most active) table
> tonight and report the findings.
>
> > The other possibility is that you have a seriously slow disk drive
> > :-(
>
> It looks like I have an ATA-6 drive with only 2MB cache, and the
> following throughput:
>
> # /sbin/hdparm -tT /dev/hda
>
>
>
> /dev/hda:
>  Timing buffer-cache reads:   128 MB in  0.56 seconds =228.57 MB/sec
>  Timing buffered disk reads:  64 MB in  1.18 seconds = 54.24 MB/sec
>
> Not SCSI, not RAID, but not the slowest HDD on the continent.
>
> Thanks,
> Otis
>
>


Re: VACUUMing for 30 minutes

From: Tom Lane
Date:
<ogjunk-pgjedan@yahoo.com> writes:
> Now, I run VACUUM religiously every night, across all tables, but maybe
> that's an overkill for th number of updates and inserts in this DB.

Or maybe it's not enough?  How many updates/deletes do you do in an
average day?

> INFO:  Pages 303359: Changed 23, Empty 0; Tup 114182: Vac 47, Keep 273,
> UnUsed 11798506.

You've got 303359 pages holding only 114182 tuples, which means this
table is *at least* two-thirds empty, and probably a lot more; there's
no way to tell from this output what the average tuple size is, but
I bet it's a lot less than a page.  The indexes look pretty bloated too.

I would recommend CLUSTERing the table on one index or another as the
easiest way of cleaning it up.  A somewhat faster way would be to drop
the indexes, VACUUM FULL, re-make the indexes, but there's more chance
of mistakes that way.
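
A sketch using the names from your VACUUM VERBOSE output (the 7.3 syntax is
CLUSTER indexname ON tablename; it takes an exclusive lock while it rewrites
the table in index order):

CLUSTER pk_url_id ON url;   -- rebuild public.url compacted, ordered by pk_url_id
ANALYZE url;                -- refresh planner statistics afterwards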

Once you've got the tables de-bloated, take another look at your FSM
settings; it's a good bet they are not high enough for your database
size.
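
For example (illustrative values only; size them from what VACUUM VERBOSE
reports across the whole cluster):

max_fsm_pages = 500000        # pages with reclaimable space, cluster-wide
max_fsm_relations = 1000      # relations tracked in the free space map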

            regards, tom lane