Обсуждение: VACUUM degrades performance significantly. Database becomes unusable!

Поиск
Список
Период
Сортировка

VACUUM degrades performance significantly. Database becomes unusable!

От
"Stephen"
Дата:
Hello,

Is it normal for plain VACUUM on large table to degrade performance by over
9 times? My database becomes unusable when VACUUM runs. From reading
newsgroups, I thought VACUUM should only slow down by 10% to 15%. Other MVCC
databases like MySQL InnoDB can even VACUUM discretely (runs internally). Is
it my Linux system or is it PostgreSQL?

The database is mostly read-only. There are 133,000 rows and each row is
about 2.5kB in size (mostly due to the bytea column holding a binary image).
The long row causes system to TOAST the table. VACUUM takes 5m20s to
complete. I repeatedly ran the following tests while system is idling:


In normal operation:
====================
tsdb=# explain analyze select * from table1 where id =
'33a4e9b6eae09634f4ff3e6fa9280f6e';
                                                           QUERY PLAN
----------------------------------------------------------------------------
----------------------------------------------------
 Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=1 width=346)
(actual time=25.30..25.31 rows=1 loops=1)
   Index Cond: (id = '33a4e9b6eae09634f4ff3e6fa9280f6e'::character varying)
 Total runtime: 25.52 msec
(3 rows)

When VACUUM runs:
=================
tsdb=# explain analyze select * from table1 where id =
'336139b47b7faf09fc4d4f03680a4ce5';
                                                             QUERY PLAN
----------------------------------------------------------------------------
--------------------------------------------------------
 Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=1 width=346)
(actual time=2290.07..2290.10 rows=1 loops=1)
   Index Cond: (id = '336139b47b7faf09fc4d4f03680a4ce5'::character varying)
 Total runtime: 2290.22 msec
(3 rows)


VACUUM output:
==============
tsdb=# VACUUM VERBOSE table1;
INFO:  --Relation public.table1--
INFO:  Pages 5887: Changed 0, Empty 0; Tup 132672: Vac 0, Keep 0, UnUsed
144.
        Total CPU 0.28s/0.01u sec elapsed 36.08 sec.
INFO:  --Relation pg_toast.pg_toast_12437088--
INFO:  Pages 40495: Changed 0, Empty 0; Tup 197587: Vac 0, Keep 0, UnUsed
235.
        Total CPU 1.73s/0.20u sec elapsed 233.91 sec.
VACUUM



vmstat while VACUUM'ing:
========================
   procs                      memory      swap          io     system
cpu
 r  b  w   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy
id
 0  1  1  74420   6520  30616 405128    0    0  1280     0  287   487  0  1
99
 0  1  0  74420   6520  30620 405168    0    0  1196     0  271   436  0  0
100
 0  1  1  74420   6520  30620 405120    0    0  1496     4  289   491  0  3
97
 0  1  1  74420   6520  30620 405208    0    0  1280     0  268   466  0  0
100
 1  0  1  74420   6520  30620 405208    0    0  1280     0  288   482  0  1
99
 1  0  1  74420   6520  30632 405200    0    0  1416     8  277   441  1  2
97
 3  1  1  74416   6520  30632 405196    4    0  1284     0  284   473  0  3
97



PostgreSQL configuration (the only changes made):
=================================================
max_connections = 1024
shared_buffers = 2800
sort_mem = 8192
vacuum_mem = 8192
effective_cache_size = 32000


System:
=======
Hardware: AMD 1.2GHz Athlon 512MB SDRAM
OS: Redhat Linux 9.0 (kernel 2.4.20-8)
FS: EXT3 with Journalling mounted with noatime, UDMA5
Disk1 (Linux): IDE 20GB 7200 RPM Western Digital
Disk2 (PostgreSQL): IDE 120GB 7200 RPM Western Digital
PostgreSQL: 7.3.4


hdparm:
=======
/dev/hda: (Linux partition)
 multcount    = 16 (on)
 IO_support   =  1 (32-bit)
 unmaskirq    =  1 (on)
 using_dma    =  1 (on)
 keepsettings =  0 (off)
 readonly     =  0 (off)
 readahead    =  8 (on)
 geometry     = 2498/255/63, sectors = 40132503, start = 0

/dev/hdc: (PostgreSQL partition)
 multcount    = 16 (on)
 IO_support   =  1 (32-bit)
 unmaskirq    =  1 (on)
 using_dma    =  1 (on)
 keepsettings =  1 (on)
 readonly     =  0 (off)
 readahead    =  8 (on)
 geometry     = 232581/16/63, sectors = 234441648, start = 0


Schema:
=======
CREATE TABLE table1 (
   id              varchar(32)     DEFAULT ''                      NOT NULL,
   colname1          varchar(10)     DEFAULT 'http'                  NOT
NULL,
   colname2      varchar(300)    DEFAULT ''                      NOT NULL,
   colname3             varchar(5)      DEFAULT ''                      NOT
NULL,
   colname4         varchar(300)    DEFAULT ''                      NOT
NULL,
   colname5      varchar(300)    DEFAULT ''                      NOT NULL,
   colname6             integer         DEFAULT 0                       NOT
NULL,
   colname7           integer         DEFAULT 0                       NOT
NULL,
   colname8     integer         DEFAULT 200                     NOT NULL,
   colname9          varchar(10)     DEFAULT ''                      NOT
NULL,
   colname10      varchar(10)     DEFAULT ''                      NOT NULL,
   colname11          varchar(100)    DEFAULT ''                      NOT
NULL,
   colname12    varchar(100)    DEFAULT ''                      NOT NULL,
   colname13     varchar(100)    DEFAULT ''                      NOT NULL,
   colname14  varchar(20)     DEFAULT ''                      NOT NULL,
   colname15         integer         DEFAULT 640                     NOT
NULL,
   colname16        integer         DEFAULT 480                     NOT
NULL,
   colname17            integer         DEFAULT 120                     NOT
NULL,
   colname18           integer         DEFAULT 90                      NOT
NULL,
   colname19        timestamp       DEFAULT CURRENT_TIMESTAMP       NOT
NULL,
   colname20         timestamp       DEFAULT CURRENT_TIMESTAMP       NOT
NULL,
   colname21   integer         DEFAULT 0                       NOT NULL,
   colname22      integer         DEFAULT 0                       NOT NULL,
   colname23  timestamp       DEFAULT CURRENT_TIMESTAMP       NOT NULL,
   colname24             integer         DEFAULT 0                       NOT
NULL,
   colname25       integer         DEFAULT 0                       NOT NULL,
   colname26          varchar(10)     DEFAULT ''                      NOT
NULL,
   colname28   varchar(10)     DEFAULT ''                      NOT NULL,
   colname29        varchar(10)     DEFAULT 'jpeg'                  NOT
NULL,
   colname30   varchar(20)     DEFAULT ''                      NOT NULL,
   colname31       bytea                                                   ,
   PRIMARY KEY (id)
) WITHOUT OIDS

CREATE UNIQUE INDEX table1_idx_1 ON table1 (colname1, colname2, colname3,
colname4, colname5)

Sigh, :-(

Stephen




Re: VACUUM degrades performance significantly. Database becomes unusable!

От
Tom Lane
Дата:
"Stephen" <jleelim@xxxxxx.com> writes:
> Is it normal for plain VACUUM on large table to degrade performance by over
> 9 times? My database becomes unusable when VACUUM runs. From reading
> newsgroups, I thought VACUUM should only slow down by 10% to 15%.

We have heard reports of very significant slowdowns from people who were
already nearly saturating their disk I/O bandwidth, and then VACUUM
pushed their systems over the knee of the response curve.  You haven't
said anything about what was happening in your system besides the VACUUM
and the test query, but I wonder how busy the disk drive was already.

Another possibility is that you've got the disk subsystem misconfigured
somehow, although your hdparm output looks okay to my not-expert eyes.

Possibly you should think about SCSI instead of IDE disk drives.
Consumer-grade IDE drives seem to be not very good about servicing
parallel loads.  As an example, running "pgbench" with 10 client threads
on a fairly new Dell PC (P4, some IDE drive or other), I see about a 3x
slowdown in reported transactions-per-second when a background VACUUM
is running.  The same test on my trusty old built-like-a-tank HP server
(with fast-for-its-time SCSI drives) shows only a 20% slowdown.  I
attribute this to the SCSI drive being better able to handle concurrent
requests.

            regards, tom lane

Re: VACUUM degrades performance significantly. Database

От
"scott.marlowe"
Дата:
It sounds like you might be I/O bound.  if you drop the 5 way unique index
for a test, how do the vacuum and parallel select run?

On Wed, 15 Oct 2003, Stephen wrote:

> Hello,
>
> Is it normal for plain VACUUM on large table to degrade performance by over
> 9 times? My database becomes unusable when VACUUM runs. From reading
> newsgroups, I thought VACUUM should only slow down by 10% to 15%. Other MVCC
> databases like MySQL InnoDB can even VACUUM discretely (runs internally). Is
> it my Linux system or is it PostgreSQL?
>
> The database is mostly read-only. There are 133,000 rows and each row is
> about 2.5kB in size (mostly due to the bytea column holding a binary image).
> The long row causes system to TOAST the table. VACUUM takes 5m20s to
> complete. I repeatedly ran the following tests while system is idling:
>
>
> In normal operation:
> ====================
> tsdb=# explain analyze select * from table1 where id =
> '33a4e9b6eae09634f4ff3e6fa9280f6e';
>                                                            QUERY PLAN
> ----------------------------------------------------------------------------
> ----------------------------------------------------
>  Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=1 width=346)
> (actual time=25.30..25.31 rows=1 loops=1)
>    Index Cond: (id = '33a4e9b6eae09634f4ff3e6fa9280f6e'::character varying)
>  Total runtime: 25.52 msec
> (3 rows)
>
> When VACUUM runs:
> =================
> tsdb=# explain analyze select * from table1 where id =
> '336139b47b7faf09fc4d4f03680a4ce5';
>                                                              QUERY PLAN
> ----------------------------------------------------------------------------
> --------------------------------------------------------
>  Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=1 width=346)
> (actual time=2290.07..2290.10 rows=1 loops=1)
>    Index Cond: (id = '336139b47b7faf09fc4d4f03680a4ce5'::character varying)
>  Total runtime: 2290.22 msec
> (3 rows)
>
>
> VACUUM output:
> ==============
> tsdb=# VACUUM VERBOSE table1;
> INFO:  --Relation public.table1--
> INFO:  Pages 5887: Changed 0, Empty 0; Tup 132672: Vac 0, Keep 0, UnUsed
> 144.
>         Total CPU 0.28s/0.01u sec elapsed 36.08 sec.
> INFO:  --Relation pg_toast.pg_toast_12437088--
> INFO:  Pages 40495: Changed 0, Empty 0; Tup 197587: Vac 0, Keep 0, UnUsed
> 235.
>         Total CPU 1.73s/0.20u sec elapsed 233.91 sec.
> VACUUM
>
>
>
> vmstat while VACUUM'ing:
> ========================
>    procs                      memory      swap          io     system
> cpu
>  r  b  w   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy
> id
>  0  1  1  74420   6520  30616 405128    0    0  1280     0  287   487  0  1
> 99
>  0  1  0  74420   6520  30620 405168    0    0  1196     0  271   436  0  0
> 100
>  0  1  1  74420   6520  30620 405120    0    0  1496     4  289   491  0  3
> 97
>  0  1  1  74420   6520  30620 405208    0    0  1280     0  268   466  0  0
> 100
>  1  0  1  74420   6520  30620 405208    0    0  1280     0  288   482  0  1
> 99
>  1  0  1  74420   6520  30632 405200    0    0  1416     8  277   441  1  2
> 97
>  3  1  1  74416   6520  30632 405196    4    0  1284     0  284   473  0  3
> 97
>
>
>
> PostgreSQL configuration (the only changes made):
> =================================================
> max_connections = 1024
> shared_buffers = 2800
> sort_mem = 8192
> vacuum_mem = 8192
> effective_cache_size = 32000
>
>
> System:
> =======
> Hardware: AMD 1.2GHz Athlon 512MB SDRAM
> OS: Redhat Linux 9.0 (kernel 2.4.20-8)
> FS: EXT3 with Journalling mounted with noatime, UDMA5
> Disk1 (Linux): IDE 20GB 7200 RPM Western Digital
> Disk2 (PostgreSQL): IDE 120GB 7200 RPM Western Digital
> PostgreSQL: 7.3.4
>
>
> hdparm:
> =======
> /dev/hda: (Linux partition)
>  multcount    = 16 (on)
>  IO_support   =  1 (32-bit)
>  unmaskirq    =  1 (on)
>  using_dma    =  1 (on)
>  keepsettings =  0 (off)
>  readonly     =  0 (off)
>  readahead    =  8 (on)
>  geometry     = 2498/255/63, sectors = 40132503, start = 0
>
> /dev/hdc: (PostgreSQL partition)
>  multcount    = 16 (on)
>  IO_support   =  1 (32-bit)
>  unmaskirq    =  1 (on)
>  using_dma    =  1 (on)
>  keepsettings =  1 (on)
>  readonly     =  0 (off)
>  readahead    =  8 (on)
>  geometry     = 232581/16/63, sectors = 234441648, start = 0
>
>
> Schema:
> =======
> CREATE TABLE table1 (
>    id              varchar(32)     DEFAULT ''                      NOT NULL,
>    colname1          varchar(10)     DEFAULT 'http'                  NOT
> NULL,
>    colname2      varchar(300)    DEFAULT ''                      NOT NULL,
>    colname3             varchar(5)      DEFAULT ''                      NOT
> NULL,
>    colname4         varchar(300)    DEFAULT ''                      NOT
> NULL,
>    colname5      varchar(300)    DEFAULT ''                      NOT NULL,
>    colname6             integer         DEFAULT 0                       NOT
> NULL,
>    colname7           integer         DEFAULT 0                       NOT
> NULL,
>    colname8     integer         DEFAULT 200                     NOT NULL,
>    colname9          varchar(10)     DEFAULT ''                      NOT
> NULL,
>    colname10      varchar(10)     DEFAULT ''                      NOT NULL,
>    colname11          varchar(100)    DEFAULT ''                      NOT
> NULL,
>    colname12    varchar(100)    DEFAULT ''                      NOT NULL,
>    colname13     varchar(100)    DEFAULT ''                      NOT NULL,
>    colname14  varchar(20)     DEFAULT ''                      NOT NULL,
>    colname15         integer         DEFAULT 640                     NOT
> NULL,
>    colname16        integer         DEFAULT 480                     NOT
> NULL,
>    colname17            integer         DEFAULT 120                     NOT
> NULL,
>    colname18           integer         DEFAULT 90                      NOT
> NULL,
>    colname19        timestamp       DEFAULT CURRENT_TIMESTAMP       NOT
> NULL,
>    colname20         timestamp       DEFAULT CURRENT_TIMESTAMP       NOT
> NULL,
>    colname21   integer         DEFAULT 0                       NOT NULL,
>    colname22      integer         DEFAULT 0                       NOT NULL,
>    colname23  timestamp       DEFAULT CURRENT_TIMESTAMP       NOT NULL,
>    colname24             integer         DEFAULT 0                       NOT
> NULL,
>    colname25       integer         DEFAULT 0                       NOT NULL,
>    colname26          varchar(10)     DEFAULT ''                      NOT
> NULL,
>    colname28   varchar(10)     DEFAULT ''                      NOT NULL,
>    colname29        varchar(10)     DEFAULT 'jpeg'                  NOT
> NULL,
>    colname30   varchar(20)     DEFAULT ''                      NOT NULL,
>    colname31       bytea                                                   ,
>    PRIMARY KEY (id)
> ) WITHOUT OIDS
>
> CREATE UNIQUE INDEX table1_idx_1 ON table1 (colname1, colname2, colname3,
> colname4, colname5)
>
> Sigh, :-(
>
> Stephen
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
>                http://archives.postgresql.org
>


Re: VACUUM degrades performance significantly. Database becomes unusable!

От
"Stephen"
Дата:
The system is actually idling when I ran the tests (load average: 0.01,
0.02, 0.00). When VACUUM runs, load average increases quickly (load average:
1.77, 0.60, 0.21) just by running psql on command line and issuing 2
queries. I've been running PostgreSQL for over 3 years now and IIRC VACUUM
has always been a show-stopper with or without the FULL attribute on my
system (from linux 2.2 to linux 2.4, and from PostgreSQL 7.1 to PostgreSQL
7.3). The only thing that didn't change is the IDE drive. The problem is not
unique to this system, I have 2 machines with the same configuration and
VACUUM slows down both databases considerably. Unfortunately, I prefer to
implement a system on cheap IDE disks a la Google because my database is
expected to hit terabytes and SCSI cost becomes prohibitive.

Is anyone else getting performance degradation with VACUUM on Linux? I'm
thinking maybe Linux has a bad scheduler and high IO latency that other OSes
(eg. FreeBSD) do not suffer from. Although, someone did tell me that Redhat
9.0 Linux 2.4.20-8 has parts of the low latency patch applied but I
certainly don't feel the improvement on PostgreSQL.



Re: VACUUM degrades performance significantly. Database

От
"Stephen"
Дата:
Scott,

I dropped the 5 way unique index and the VACUUM improved slightly. I ran
VACUUM, ANALYZE, VACUUM and queries repeatedly. The max response time seem
to have reduced to 1700 msec from 2300 msec. The higher load and vmstat
during VACUUM remained the same. It's still not enough to justify dropping
the index for my purposes.

tsdb=# explain analyze select * from table1 where id =
'3305b141837f065d673aa09cf382d331';
                                                             QUERY PLAN
----------------------------------------------------------------------------
--------------------------------------------------------
 Index Scan using table1_pkey on table1 (cost=0.00..6.01 rows=1 width=346)
(actual time=1762.34..1762.37 rows=1 loops=1)
   Index Cond: (id = '3305b141837f065d673aa09cf382d331'::character varying)
 Total runtime: 1762.50 msec
(3 rows)

Regards,

Stephen

""scott.marlowe"" <scott.marlowe@ihs.com> wrote in message
news:Pine.LNX.4.33.0310151150580.23393-100000@css120.ihs.com...
> It sounds like you might be I/O bound.  if you drop the 5 way unique index
> for a test, how do the vacuum and parallel select run?
>
> On Wed, 15 Oct 2003, Stephen wrote:
>
> > Hello,
> >
> > Is it normal for plain VACUUM on large table to degrade performance by
over
> > 9 times? My database becomes unusable when VACUUM runs. From reading
> > newsgroups, I thought VACUUM should only slow down by 10% to 15%. Other
MVCC
> > databases like MySQL InnoDB can even VACUUM discretely (runs
internally). Is
> > it my Linux system or is it PostgreSQL?
> >
> > The database is mostly read-only. There are 133,000 rows and each row is
> > about 2.5kB in size (mostly due to the bytea column holding a binary
image).
> > The long row causes system to TOAST the table. VACUUM takes 5m20s to
> > complete. I repeatedly ran the following tests while system is idling:
> >
> >
> > In normal operation:
> > ====================
> > tsdb=# explain analyze select * from table1 where id =
> > '33a4e9b6eae09634f4ff3e6fa9280f6e';
> >                                                            QUERY PLAN
>
> --------------------------------------------------------------------------
--
> > ----------------------------------------------------
> >  Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=1
width=346)
> > (actual time=25.30..25.31 rows=1 loops=1)
> >    Index Cond: (id = '33a4e9b6eae09634f4ff3e6fa9280f6e'::character
varying)
> >  Total runtime: 25.52 msec
> > (3 rows)
> >
> > When VACUUM runs:
> > =================
> > tsdb=# explain analyze select * from table1 where id =
> > '336139b47b7faf09fc4d4f03680a4ce5';
> >                                                              QUERY PLAN
>
> --------------------------------------------------------------------------
--
> > --------------------------------------------------------
> >  Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=1
width=346)
> > (actual time=2290.07..2290.10 rows=1 loops=1)
> >    Index Cond: (id = '336139b47b7faf09fc4d4f03680a4ce5'::character
varying)
> >  Total runtime: 2290.22 msec
> > (3 rows)
> >
> >
> > VACUUM output:
> > ==============
> > tsdb=# VACUUM VERBOSE table1;
> > INFO:  --Relation public.table1--
> > INFO:  Pages 5887: Changed 0, Empty 0; Tup 132672: Vac 0, Keep 0, UnUsed
> > 144.
> >         Total CPU 0.28s/0.01u sec elapsed 36.08 sec.
> > INFO:  --Relation pg_toast.pg_toast_12437088--
> > INFO:  Pages 40495: Changed 0, Empty 0; Tup 197587: Vac 0, Keep 0,
UnUsed
> > 235.
> >         Total CPU 1.73s/0.20u sec elapsed 233.91 sec.
> > VACUUM
> >
> >
> >
> > vmstat while VACUUM'ing:
> > ========================
> >    procs                      memory      swap          io     system
> > cpu
> >  r  b  w   swpd   free   buff  cache   si   so    bi    bo   in    cs us
sy
> > id
> >  0  1  1  74420   6520  30616 405128    0    0  1280     0  287   487  0
1
> > 99
> >  0  1  0  74420   6520  30620 405168    0    0  1196     0  271   436  0
0
> > 100
> >  0  1  1  74420   6520  30620 405120    0    0  1496     4  289   491  0
3
> > 97
> >  0  1  1  74420   6520  30620 405208    0    0  1280     0  268   466  0
0
> > 100
> >  1  0  1  74420   6520  30620 405208    0    0  1280     0  288   482  0
1
> > 99
> >  1  0  1  74420   6520  30632 405200    0    0  1416     8  277   441  1
2
> > 97
> >  3  1  1  74416   6520  30632 405196    4    0  1284     0  284   473  0
3
> > 97
> >
> >
> >
> > PostgreSQL configuration (the only changes made):
> > =================================================
> > max_connections = 1024
> > shared_buffers = 2800
> > sort_mem = 8192
> > vacuum_mem = 8192
> > effective_cache_size = 32000
> >
> >
> > System:
> > =======
> > Hardware: AMD 1.2GHz Athlon 512MB SDRAM
> > OS: Redhat Linux 9.0 (kernel 2.4.20-8)
> > FS: EXT3 with Journalling mounted with noatime, UDMA5
> > Disk1 (Linux): IDE 20GB 7200 RPM Western Digital
> > Disk2 (PostgreSQL): IDE 120GB 7200 RPM Western Digital
> > PostgreSQL: 7.3.4
> >
> >
> > hdparm:
> > =======
> > /dev/hda: (Linux partition)
> >  multcount    = 16 (on)
> >  IO_support   =  1 (32-bit)
> >  unmaskirq    =  1 (on)
> >  using_dma    =  1 (on)
> >  keepsettings =  0 (off)
> >  readonly     =  0 (off)
> >  readahead    =  8 (on)
> >  geometry     = 2498/255/63, sectors = 40132503, start = 0
> >
> > /dev/hdc: (PostgreSQL partition)
> >  multcount    = 16 (on)
> >  IO_support   =  1 (32-bit)
> >  unmaskirq    =  1 (on)
> >  using_dma    =  1 (on)
> >  keepsettings =  1 (on)
> >  readonly     =  0 (off)
> >  readahead    =  8 (on)
> >  geometry     = 232581/16/63, sectors = 234441648, start = 0
> >
> >
> > Schema:
> > =======
> > CREATE TABLE table1 (
> >    id              varchar(32)     DEFAULT ''                      NOT
NULL,
> >    colname1          varchar(10)     DEFAULT 'http'                  NOT
> > NULL,
> >    colname2      varchar(300)    DEFAULT ''                      NOT
NULL,
> >    colname3             varchar(5)      DEFAULT ''
NOT
> > NULL,
> >    colname4         varchar(300)    DEFAULT ''                      NOT
> > NULL,
> >    colname5      varchar(300)    DEFAULT ''                      NOT
NULL,
> >    colname6             integer         DEFAULT 0
NOT
> > NULL,
> >    colname7           integer         DEFAULT 0
NOT
> > NULL,
> >    colname8     integer         DEFAULT 200                     NOT
NULL,
> >    colname9          varchar(10)     DEFAULT ''                      NOT
> > NULL,
> >    colname10      varchar(10)     DEFAULT ''                      NOT
NULL,
> >    colname11          varchar(100)    DEFAULT ''
NOT
> > NULL,
> >    colname12    varchar(100)    DEFAULT ''                      NOT
NULL,
> >    colname13     varchar(100)    DEFAULT ''                      NOT
NULL,
> >    colname14  varchar(20)     DEFAULT ''                      NOT NULL,
> >    colname15         integer         DEFAULT 640                     NOT
> > NULL,
> >    colname16        integer         DEFAULT 480                     NOT
> > NULL,
> >    colname17            integer         DEFAULT 120
NOT
> > NULL,
> >    colname18           integer         DEFAULT 90
NOT
> > NULL,
> >    colname19        timestamp       DEFAULT CURRENT_TIMESTAMP       NOT
> > NULL,
> >    colname20         timestamp       DEFAULT CURRENT_TIMESTAMP       NOT
> > NULL,
> >    colname21   integer         DEFAULT 0                       NOT NULL,
> >    colname22      integer         DEFAULT 0                       NOT
NULL,
> >    colname23  timestamp       DEFAULT CURRENT_TIMESTAMP       NOT NULL,
> >    colname24             integer         DEFAULT 0
NOT
> > NULL,
> >    colname25       integer         DEFAULT 0                       NOT
NULL,
> >    colname26          varchar(10)     DEFAULT ''
NOT
> > NULL,
> >    colname28   varchar(10)     DEFAULT ''                      NOT NULL,
> >    colname29        varchar(10)     DEFAULT 'jpeg'                  NOT
> > NULL,
> >    colname30   varchar(20)     DEFAULT ''                      NOT NULL,
> >    colname31       bytea
,
> >    PRIMARY KEY (id)
> > ) WITHOUT OIDS
> >
> > CREATE UNIQUE INDEX table1_idx_1 ON table1 (colname1, colname2,
colname3,
> > colname4, colname5)
> >
> > Sigh, :-(
> >
> > Stephen
> >
> >
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 6: Have you searched our list archives?
> >
> >                http://archives.postgresql.org
> >
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>



Re: VACUUM degrades performance significantly. Database

От
"Matthew T. O'Connor"
Дата:
On Wed, 2003-10-15 at 12:57, Tom Lane wrote:
> "Stephen" <jleelim@xxxxxx.com> writes:
> > Is it normal for plain VACUUM on large table to degrade performance by over
> > 9 times? My database becomes unusable when VACUUM runs. From reading
> > newsgroups, I thought VACUUM should only slow down by 10% to 15%.
>
> We have heard reports of very significant slowdowns from people who were
> already nearly saturating their disk I/O bandwidth, and then VACUUM
> pushed their systems over the knee of the response curve.

Makes sense but, wouldn't vacuum saturate the I/O bandwidth by
definition?  Especially with modern CPU's when vacuum runs it's going to
be reading from disk as fast as the disk can possibly supply the data.
For this reason, I think the suggestion you made recently of putting in
a small delay in the main vacuum loop might be helpful.

I also acknowledge that SCSI makes a big difference here, but LOTS of
people run PG on cheap linux boxes with IDE drives, so if there is
something we can do to help this setup, it would be "a good thing" IMHO.


Re: VACUUM degrades performance significantly. Database becomes

От
Vivek Khera
Дата:
>>>>> "S" == Stephen  <jleelim@xxxxxx.com> writes:

S> The system is actually idling when I ran the tests (load average: 0.01,
S> 0.02, 0.00). When VACUUM runs, load average increases quickly (load average:
S> 1.77, 0.60, 0.21) just by running psql on command line and issuing 2
S> queries. I've been running PostgreSQL for over 3 years now and IIRC VACUUM

Your CPU usage is irrelevent here.  What's your *disk* utilization.
Does linux have systat and/or iostat to tell you how busy your disks
are?

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: khera@kciLink.com       Rockville, MD       +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/

Re: VACUUM degrades performance significantly. Database

От
Shridhar Daithankar
Дата:
Vivek Khera wrote:

>>>>>>"S" == Stephen  <jleelim@xxxxxx.com> writes:
>
>
> S> The system is actually idling when I ran the tests (load average: 0.01,
> S> 0.02, 0.00). When VACUUM runs, load average increases quickly (load average:
> S> 1.77, 0.60, 0.21) just by running psql on command line and issuing 2
> S> queries. I've been running PostgreSQL for over 3 years now and IIRC VACUUM
>
> Your CPU usage is irrelevent here.  What's your *disk* utilization.
> Does linux have systat and/or iostat to tell you how busy your disks
> are?

It has a rather verbose vmstat which gives everything.

  Shridhar


Re: VACUUM degrades performance significantly. Database becomes

От
Greg Stark
Дата:
Vivek Khera <khera@kcilink.com> writes:

> Your CPU usage is irrelevent here.  What's your *disk* utilization.
> Does linux have systat and/or iostat to tell you how busy your disks
> are?

I normally look at the bi/bo columns of "vmstat 1".

There does appear to be an iostat command, on debian in the sysstat package.
No idea how good it is.

--
greg

Re: VACUUM degrades performance significantly. Database becomes unusable!

От
"Stephen"
Дата:
Good news,

I partially fixed the problem on Linux 2.4. It appears the responsiveness
can be improved significantly by tuning the disk IO elevator in Linux using
"elvtune" in util-linux. The elevator in Linux is used to re-order
read/write requests to reduce disk seeks by ordering requests according to
disk sectors. Unfortunately, the elevator in kernel 2.4 is not very smart
(or flexible I should say depending on your needs) and can starve a
read/write request for a long time if not properly tuned.

See my older thread I wrote long time ago. Back then I didn't have too much
time to test different values especially for VACUUM:

http://groups.google.com/groups?q=linux+responsiveness+stephen+postgresql&hl
=en&lr=&ie=UTF-8&oe=UTF-8&selm=F92Znj0TrJIPT6nhdBf00021ae6%40hotmail.com&rnu
m=1

See also:

http://strasbourg.linuxfr.org/jl3/features-2.3-1.html


Below are the results using different elvtune values running repeatedly when
VACUUM'ing.

elvtune -r 2048 -w 8192 /dev/hdc (default Redhat 9):
====================================================

tsdb=# explain analyze select * from table1 where id =
'336139b47b7faf09fc4d4f03680a4ce5';
                                                             QUERY PLAN
----------------------------------------------------------------------------
--------------------------------------------------------
 Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=1 width=346)
(actual time=2290.07..2290.10 rows=1 loops=1)
   Index Cond: (id = '336139b47b7faf09fc4d4f03680a4ce5'::character varying)
 Total runtime: 2290.22 msec
(3 rows)


elvtune -r 128 -w 8192 /dev/hdc:
================================

tsdb=# explain analyze select * from table1 where id =
'008ab286d725d2ea0b3269c89fc01ce2';
                                                            QUERY PLAN
----------------------------------------------------------------------------
------------------------------------------------------
 Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=1 width=346)
(actual time=450.29..450.32 rows=1 loops=1)
   Index Cond: (id = '008ab286d725d2ea0b3269c89fc01ce2'::character varying)
 Total runtime: 450.46 msec
(3 rows)


elvtune -r 64 -w 8192 /dev/hdc:
===============================

tsdb=# explain analyze select * from table1 where id =
'0078997ac809877c1a0d1f76af753608';
                                                            QUERY PLAN
----------------------------------------------------------------------------
------------------------------------------------------
 Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=1 width=346)
(actual time=206.00..206.01 rows=1 loops=1)
   Index Cond: (id = '0078997ac809877c1a0d1f76af753608'::character varying)
 Total runtime: 206.14 msec
(3 rows)


elvtune -r 32 -w 8192 /dev/hdc:
===============================

tsdb=# explain analyze select * from table1 where id =
'00c45490c9f24858c17d7dfb98c5def5';
                                                            QUERY PLAN
----------------------------------------------------------------------------
------------------------------------------------------
 Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=1 width=346)
(actual time=210.61..210.62 rows=1 loops=1)
   Index Cond: (id = '00c45490c9f24858c17d7dfb98c5def5'::character varying)
 Total runtime: 210.75 msec
(3 rows)


elvtune -r 8 -w 8192 /dev/hdc:
==============================

tsdb=# explain analyze select * from table1 where id =
'006ea95ef6b9b8f0ddcb1f33c40190ec';
                                                            QUERY PLAN
----------------------------------------------------------------------------
------------------------------------------------------
 Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=1 width=346)
(actual time=338.04..338.05 rows=1 loops=1)
   Index Cond: (id = '006ea95ef6b9b8f0ddcb1f33c40190ec'::character varying)
 Total runtime: 338.18 msec
(3 rows)


elvtune -r 1 -w 8192 /dev/hdc:
==============================

tsdb=# explain analyze select * from table1 where id =
'0025a7a9182d5456474a72f773433c01';
                                                            QUERY PLAN
----------------------------------------------------------------------------
------------------------------------------------------
 Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=1 width=346)
(actual time=390.41..390.42 rows=1 loops=1)
   Index Cond: (id = '0025a7a9182d5456474a72f773433c01'::character varying)
 Total runtime: 390.55 msec
(3 rows)

elvtune -r 0 -w 8192 /dev/hdc:
==============================

tsdb=# explain analyze select * from table1 where id =
'0023783eda7e2a5f434e55a66c3a0a11';
                                                            QUERY PLAN
----------------------------------------------------------------------------
------------------------------------------------------
 Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=1 width=346)
(actual time=246.29..246.32 rows=1 loops=1)
   Index Cond: (id = '0023783eda7e2a5f434e55a66c3a0a11'::character varying)
 Total runtime: 246.44 msec
(3 rows)

tsdb=# explain analyze select * from table1 where id =
'001eacb36161ac6a1f860bd391dce5c2';
                                                           QUERY PLAN
----------------------------------------------------------------------------
----------------------------------------------------
 Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=1 width=346)
(actual time=14.11..14.12 rows=1 loops=1)
   Index Cond: (id = '001eacb36161ac6a1f860bd391dce5c2'::character varying)
 Total runtime: 14.25 msec
(3 rows)



Remarks:
========

In all cases, VACUUM completed in the same amount of time, process load
average is reduced slightly and vmstat IO is the same.

It turns out that as the -r read value in elvtune is made smaller, the more
responsive the read becomes up to a certain point when the disk needs to do
too many unoptimized seeks. A too small read value can cause the response
time to fluctuate more than a large value.

Understandably, the elvtune values are quite specific depending on what you
do. In my case, where the table is mostly read-only, "elevtune -r 64 -w 8192
/dev/hdc" works best. It should also depend on your expected load, purpose,
disk type, size and settings.

VACUUM is still disk IO intensive. Even with "elvtune -r 64 -w 8192
/dev/hdc" and VACUUM, the response time has gone down to 200 msec from 2200
msec (10 times factor), is still high compared to normal queries at 25 msec.
VACUUM needs to clamped down much more!

Regards,

Stephen