Re: very, very slow performance

Поиск
Список
Период
Сортировка
От Tena Sakai
Тема Re: very, very slow performance
Дата
Msg-id FE44E0D7EAD2ED4BB2165071DB8E328C0378F76C@egcrc-ex01.egcrc.org
обсуждение исходный текст
Ответ на very, very slow performance  ("Tena Sakai" <tsakai@gallo.ucsf.edu>)
Список pgsql-admin

Hi Emanuel,

> Do you tried partitioned tables? diferent tablespaces?
> set the storage external for more important columns?
> what kind of indexes do you have?

I think the table design is ok.
I don't know what "set the storage external" means.
Maybe you can explain it to me in lay-person's terms?
I think indices are ok, but I welcome your comments.

  canon=# \d genotype
                                              Table "public.genotype"
       Column      |            Type             |                           Modifiers                          
  -----------------+-----------------------------+---------------------------------------------------------------
   genotypeid      | integer                     | not null default nextval('genotype_genotypeid_seq'::regclass)
   subjectid       | integer                     |
   markerid        | integer                     |
   allele1id       | integer                     |
   allele2id       | integer                     |
   datecreated     | timestamp without time zone | not null
   datereplaced    | timestamp without time zone | not null
   ignore          | character(1)                | not null default 'N'::bpchar
   inconsistent    | character(1)                | not null default 'N'::bpchar
   sourcetablename | character varying           | not null
   sourceid        | character varying           | not null
  Indexes:
      "genotype_pkey" PRIMARY KEY, btree (genotypeid, datecreated)
      "genotype_genotypeid_idx" btree (genotypeid)
      "genotype_markerid_idx" btree (markerid)
      "genotype_source_idx" btree (sourceid, sourcetablename)
      "genotype_subjectid_idx" btree (subjectid)

  canon=# \d allele
                                           Table "public.allele"
      Column    |            Type             |                         Modifiers                        
  --------------+-----------------------------+-----------------------------------------------------------
   alleleid     | integer                     | not null default nextval('allele_alleleid_seq'::regclass)
   markerid     | integer                     |
  value        | character varying           | not null
   datecreated  | timestamp without time zone | not null
   datereplaced | timestamp without time zone | not null
  Indexes:
      "allele_pkey" PRIMARY KEY, btree (alleleid, datecreated)
      "allele_markerid_idx" btree (markerid)

> try to run explain analyze for those querys, then
> post the results.

Yes, it is running for last 13+ hours and I have no
idea how much longer it might take.

> I recommend follow this thread:
> http://archives.postgresql.org/pgsql-hackers/2009-02/msg00718.php

That's pretty interesting stuff.  I need to experiment tweaking
parameters Joshua mentions.  Alas, now is not a good time to do
so.

Regards,

Tena Sakai

-----Original Message-----
From: pgsql-admin-owner@postgresql.org on behalf of Emanuel Calvo Franco
Sent: Fri 2/20/2009 4:10 AM
To: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] very, very slow performance

2009/2/20 Tena Sakai <tsakai@gallo.ucsf.edu>:
> Hi Everybody,
>
> Here's the query (please read them in fixed-size font, if
> you can):
>
>  select subjectid, genotype.markerid, a1.value as allele1,
>         a2.value as allele2, genotype.dateCreated,
>         genotype.dateReplaced, genotype.ignore,
>         genotype.inconsistent
>    from genotype, allele a1, allele a2
>   where
>         allele1id = a1.alleleid
>     and
>         allele2id = a2.alleleid;
>
> Genotype table mentioned above has about 600,000,000+ rows.  As
> I mentioned, there are 20 more of them running concurrently.
> 3 other jobs look like:
>

Do you tried partitioned tables? diferent tablespaces? set the storage
external for
more important columns? what kind of indexes do you have?

>  SELECT a.markerid,a.type,a.localname,b.ncbibuild,
>         a.chromosome,a.geneticposition,b.physicalposition,
>         a.strand,a.stdflanks,a.maxflanks,a.datecreated,
>         a.datereplaced,a.sourcetablename,a.sourceid,
>         b.dbsnprsid,a.ignore,a.gene
>    FROM public.marker a, public.snpposition b
>   WHERE
>         a.ignore= 'N'
>     AND a.datecreated <= (timestamp'Wed Oct 29 09:35:54.266 2008')
>     AND a.datereplaced > (timestamp'Wed Oct 29 09:35:54.266 2008')
>     AND a.localname IN
> ('RS10757474','RS7859598','RS6148','RS9792663','RS1541125',
>                          'RS10511446','RS10814410','RS12338622','RS875587',
>                          'RS1590979', 'RS748786','RS958505','RS12352961',
>                           and on and on and on...);
>
>
>   insert into summarystats
>                     select 'Marker by Chromosomes', chromosome,
> sourcetablename,
>                             count(*), null, to_timestamp('2009-02-18
> 20:29:40.125',
>                             'yyyy-mm-dd hh:mi:ss.ms')
>                       from marker
>                      where ignore = 'N'
>                        and datereplaced = '3000-01-01 12:00:00.000'
>                        and exists (select 1
>                                      from genotype
>                                     where genotype.markerid =
> marker.markerid
>                                       and genotype.ignore = 'N'
>                                       and genotype.datereplaced =
> '3000-01-01 12:00:00.000')
>                      group by chromosome, sourcetablename;
>
>   COPY public.genotype (genotypeid, subjectid, markerid,
>                         allele1id, allele2id, datecreated, datereplaced,
>                         ignore, inconsistent, sourcetablename, sourceid)
>     TO stdout;
>

Like Scott saids, try to run explain analyze for those querys, then
post the results.


>
> Can anybody suggest anything that I can do to gain speed?
> Any help is much appreciated.
>

I recommend follow this thread:
http://archives.postgresql.org/pgsql-hackers/2009-02/msg00718.php


> Regards,
>
> Tena Sakai
> tsakai@gallo.ucsf.edu
>
>
>



--
      Emanuel Calvo Franco
        Sumate al ARPUG !
      (www.postgres-arg.org -
         www.arpug.com.ar)
    ArPUG / AOSUG Member
   Postgresql Support & Admin

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

В списке pgsql-admin по дате отправления:

Предыдущее
От: Scott Marlowe
Дата:
Сообщение: Re: Crash with data corruption under Windows
Следующее
От: Emanuel Calvo Franco
Дата:
Сообщение: Fwd: very, very slow performance