Обсуждение: Copy performance issues

Поиск
Список
Период
Сортировка

Copy performance issues

От
Saadat Anwar
Дата:
I am having severe COPY performance issues after adding indices. What used to take a few minutes (without indices) now takes several hours (with indices). I've tried to tweak the database configuration (based on Postgres documentation and forums), but it hasn't helped as yet. Perhaps, I haven't increased the limits sufficiently. Dropping and recreating indices may not be an option due to a long time it takes to rebuild all indices.

I'll appreciate someone looking at my configuration and giving me a few ideas on how to increase the copy performance.

Thanks.
Saadat.

Table structure:
===========
table C:
           Table "public.C"
  Column  |       Type       | Modifiers
----------+------------------+-----------
 sclk     | double precision | not null
 chan     | smallint         | not null
 det      | smallint         | not null
 x        | real             | not null
 y        | real             | not null
 z        | real             | not null
 r        | real             |
 t        | real             |
 lat      | real             |
 lon      | real             |
 a        | real             |
 b        | real             |
 c        | real             |
 time     | real             |
 qa       | smallint         | not null
 qb       | smallint         | not null
 qc       | smallint         | not null
Indexes:
    "C_pkey" PRIMARY KEY, btree (sclk, chan, det)


partitioned into 19 sub-tables covering lat bands. For example:

sub-table C0:
   Inherits: C
   Check constraints:
       "C0_lat_check" CHECK (lat >= (-10::real) AND lat < 0::real)
   Indexes:
       "C0_pkey" PRIMARY KEY, btree (sclk, chan, det)
       "C0_lat" btree (lat)
       "C0_time" btree (time)
       "C0_lon" btree (lon)

sub-table C1:
   Inherits: C
   Check constraints:
       "C1_lat_check" CHECK (lat >= (-20::real) AND lat < -10::real)
   Indexes:
       "C1_pkey" PRIMARY KEY, btree (sclk, chan, det)
       "C1_lat" btree (lat)
       "C1_time" btree (time)
       "C1_lon" btree (lon)

The partitions C?s are ~30G (328,000,000 rows) each except one, which is ~65G (909,000,000 rows). There are no rows in umbrella table C from which C1, C2, ..., C19 inherit. The data is partitioned in C1, C2, ..., C19 in order to promote better access. Most people will access the data in C by specifying a lat range. Also, C?s can become quite large over time.

The COPY operation copies one file per partition, for each of the 19 partitions. Each file is between 300,000 - 600,000 records.


System configuration:
================
1. RHEL5 x86_64
2. 32G RAM
3. 8T RAID5 partition for database on a Dell PERC 5/E controller
   (I understand that I'll never get fast inserts/updates on it based on
    http://wiki.postgresql.org/wiki/SlowQueryQuestions but cannot change
    to a RAID0+1 for now).
    Database's filesystem is ext4 on LVM on RAID5.
4. Postgres 8.4.2
    shared_buffers = 10GB
    temp_buffers = 16MB
    work_mem = 2GB
    maintenance_work_mem = 256MB
    max_files_per_process = 1000
    effective_io_concurrency = 3
    wal_buffers = 8MB
    checkpoint_segments = 40
    enable_seqscan = off
    effective_cache_size = 16GB
5. analyze verbose; ran on the database before copy operation

Bonnie++ output:
=============
Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
dbtest    64240M 78829  99 266172  42 47904   6 58410  72 116247   9 767.9   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                256 16229  98 371704  99 20258  36 16115  97 445680  99 17966  36
dbtest,64240M,78829,99,266172,42,47904,6,58410,72,116247,9,767.9,1,256,16229,98,371704,99,20258,36,16115,97,445680,99,17966,36


Re: Copy performance issues

От
Tom Lane
Дата:
Saadat Anwar <sanwar@asu.edu> writes:
> I am having severe COPY performance issues after adding indices. What used
> to take a few minutes (without indices) now takes several hours (with
> indices). I've tried to tweak the database configuration (based on Postgres
> documentation and forums), but it hasn't helped as yet. Perhaps, I haven't
> increased the limits sufficiently. Dropping and recreating indices may not
> be an option due to a long time it takes to rebuild all indices.

I suspect your problem is basically that the index updates require a
working set larger than available RAM, so the machine spends all its
time shuffling index pages in and out.  Can you reorder the input so
that there's more locality of reference in the index values?

Also, my first reaction to that schema is to wonder whether the lat/lon
indexes are worth anything.  What sort of queries are you using them
for, and have you considered an rtree/gist index instead?

            regards, tom lane

Re: Copy performance issues

От
s anwar
Дата:
On Wed, Aug 18, 2010 at 3:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Saadat Anwar <sanwar@asu.edu> writes:
> I am having severe COPY performance issues after adding indices. What used
> to take a few minutes (without indices) now takes several hours (with
> indices). I've tried to tweak the database configuration (based on Postgres
> documentation and forums), but it hasn't helped as yet. Perhaps, I haven't
> increased the limits sufficiently. Dropping and recreating indices may not
> be an option due to a long time it takes to rebuild all indices.

I suspect your problem is basically that the index updates require a
working set larger than available RAM, so the machine spends all its
time shuffling index pages in and out.  Can you reorder the input so
that there's more locality of reference in the index values?

I can potentially reorder the data so that it has locality of reference w.r.t. one index, but not all. Or did I not interpret your response correctly?

Also, my first reaction to that schema is to wonder whether the lat/lon
indexes are worth anything.  What sort of queries are you using them
for, and have you considered an rtree/gist index instead?

I always assumed that the btree indices on individual fields were smaller and more efficient as compared to the rtree/gist indices. Is that not the case? And since the users did not need points and point-queries, I decided in the favor of indexing individual fields.
 
                       regards, tom lane


Thanks.
Saadat.

Re: Copy performance issues

От
Sergey Konoplev
Дата:
Hi,

Try to split your data in small batches. It helped me in a similar
situation recently. I was loading about a million rows into the table
highly populated with indexes and different triggers and the batch
size was 100 (using COPY). The only thing I did with DDL is droped FKs
and recreated them after.

BTW question to gurus - why and in what cases small batch loading
could theoretically be faster then huge one if there is no another
load on the database but this?


On 18 August 2010 20:25, Saadat Anwar <sanwar@asu.edu> wrote:
> I am having severe COPY performance issues after adding indices. What used
> to take a few minutes (without indices) now takes several hours (with
> indices). I've tried to tweak the database configuration (based on Postgres
> documentation and forums), but it hasn't helped as yet. Perhaps, I haven't
> increased the limits sufficiently. Dropping and recreating indices may not
> be an option due to a long time it takes to rebuild all indices.
>
> I'll appreciate someone looking at my configuration and giving me a few
> ideas on how to increase the copy performance.
>
> Thanks.
> Saadat.
>
> Table structure:
> ===========
> table C:
>            Table "public.C"
>   Column  |       Type       | Modifiers
> ----------+------------------+-----------
>  sclk     | double precision | not null
>  chan     | smallint         | not null
>  det      | smallint         | not null
>  x        | real             | not null
>  y        | real             | not null
>  z        | real             | not null
>  r        | real             |
>  t        | real             |
>  lat      | real             |
>  lon      | real             |
>  a        | real             |
>  b        | real             |
>  c        | real             |
>  time     | real             |
>  qa       | smallint         | not null
>  qb       | smallint         | not null
>  qc       | smallint         | not null
> Indexes:
>     "C_pkey" PRIMARY KEY, btree (sclk, chan, det)
>
>
> partitioned into 19 sub-tables covering lat bands. For example:
>
> sub-table C0:
>    Inherits: C
>    Check constraints:
>        "C0_lat_check" CHECK (lat >= (-10::real) AND lat < 0::real)
>    Indexes:
>        "C0_pkey" PRIMARY KEY, btree (sclk, chan, det)
>        "C0_lat" btree (lat)
>        "C0_time" btree (time)
>        "C0_lon" btree (lon)
>
> sub-table C1:
>    Inherits: C
>    Check constraints:
>        "C1_lat_check" CHECK (lat >= (-20::real) AND lat < -10::real)
>    Indexes:
>        "C1_pkey" PRIMARY KEY, btree (sclk, chan, det)
>        "C1_lat" btree (lat)
>        "C1_time" btree (time)
>        "C1_lon" btree (lon)
>
> The partitions C?s are ~30G (328,000,000 rows) each except one, which is
> ~65G (909,000,000 rows). There are no rows in umbrella table C from which
> C1, C2, ..., C19 inherit. The data is partitioned in C1, C2, ..., C19 in
> order to promote better access. Most people will access the data in C by
> specifying a lat range. Also, C?s can become quite large over time.
>
> The COPY operation copies one file per partition, for each of the 19
> partitions. Each file is between 300,000 - 600,000 records.
>
>
> System configuration:
> ================
> 1. RHEL5 x86_64
> 2. 32G RAM
> 3. 8T RAID5 partition for database on a Dell PERC 5/E controller
>    (I understand that I'll never get fast inserts/updates on it based on
>     http://wiki.postgresql.org/wiki/SlowQueryQuestions but cannot change
>     to a RAID0+1 for now).
>     Database's filesystem is ext4 on LVM on RAID5.
> 4. Postgres 8.4.2
>     shared_buffers = 10GB
>     temp_buffers = 16MB
>     work_mem = 2GB
>     maintenance_work_mem = 256MB
>     max_files_per_process = 1000
>     effective_io_concurrency = 3
>     wal_buffers = 8MB
>     checkpoint_segments = 40
>     enable_seqscan = off
>     effective_cache_size = 16GB
> 5. analyze verbose; ran on the database before copy operation
>
> Bonnie++ output:
> =============
> Version  1.03       ------Sequential Output------ --Sequential Input-
> --Random-
>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block--
> --Seeks--
> Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec
> %CP
> dbtest    64240M 78829  99 266172  42 47904   6 58410  72 116247   9 767.9
> 1
>                     ------Sequential Create------ --------Random
> Create--------
>                     -Create-- --Read--- -Delete-- -Create-- --Read---
> -Delete--
>               files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec
> %CP
>                 256 16229  98 371704  99 20258  36 16115  97 445680  99
> 17966  36
>
dbtest,64240M,78829,99,266172,42,47904,6,58410,72,116247,9,767.9,1,256,16229,98,371704,99,20258,36,16115,97,445680,99,17966,36
>
>
>



--
Sergey Konoplev

Blog: http://gray-hemp.blogspot.com /
Linkedin: http://ru.linkedin.com/in/grayhemp /
JID/GTalk: gray.ru@gmail.com / Skype: gray-hemp / ICQ: 29353802