Re: Query taking long time

From: Venkata Balaji Nagothi
Subject: Re: Query taking long time
Date:
Msg-id: CAHBAh5uskrpvnB7OQBt-GdDCf2oeHj1SFDMzMt8V4bqwQGACxQ@mail.gmail.com
Whole thread · Raw
In response to: Re: Query taking long time  ("acanada" <acanada@cnio.es>)
Responses: Re: Query taking long time  ("acanada" <acanada@cnio.es>)
List: pgsql-performance
On Wed, Mar 19, 2014 at 10:09 PM, acanada <acanada@cnio.es> wrote:

Hello,

First of all, I'd like to thank all of you for taking the time to help me with this. Thank you very much.

I migrated the database to the new server, with 32 Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz processors and 60GB of RAM.
Evgeny pointed out that the disks I am using are not fast enough (for data: 00:1f.2 RAID bus controller: Intel Corporation C600/X79 series chipset SATA RAID Controller (rev 05); for logging, a SAS disk, but with only 240GB available, while the database is 365GB...). I cannot change the locations of data and log, since there's not enough space for the data on the SAS disk. Sadly, this is a problem that I cannot solve any time soon...

The migration has really improved performance.
I paste the before and after (the migration) EXPLAIN ANALYZE, with BUFFERS where applicable, given the different server versions:

BEFORE:
explain analyze select * from (select * from entity2document2  where name='Acetaminophen' ) as a  order by a.hepval;
                                                                  QUERY PLAN                                                                  
----------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=18015.66..18027.15 rows=4595 width=139) (actual time=39755.942..39756.246 rows=2845 loops=1)
   Sort Key: entity2document2.hepval
   Sort Method:  quicksort  Memory: 578kB
   ->  Bitmap Heap Scan on entity2document2  (cost=116.92..17736.15 rows=4595 width=139) (actual time=45.682..39751.255 rows=2845 loops=1)
         Recheck Cond: ((name)::text = 'Acetaminophen'::text)
         ->  Bitmap Index Scan on entity2document2_name  (cost=0.00..115.77 rows=4595 width=0) (actual time=45.124..45.124 rows=2845 loops=1)
               Index Cond: ((name)::text = 'Acetaminophen'::text)
 Total runtime: 39756.507 ms

 AFTER:
 explain (analyze,buffers) select * from (select * from entity2document2  where name='Acetaminophen' ) as a  order by a.hepval;
                                                                   QUERY PLAN                                                                   
------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=18434.76..18446.51 rows=4701 width=131) (actual time=9196.634..9196.909 rows=2845 loops=1)
   Sort Key: entity2document2.hepval
   Sort Method: quicksort  Memory: 604kB
   Buffers: shared hit=4 read=1725
   ->  Bitmap Heap Scan on entity2document2  (cost=105.00..18148.03 rows=4701 width=131) (actual time=38.668..9190.318 rows=2845 loops=1)
         Recheck Cond: ((name)::text = 'Acetaminophen'::text)
         Buffers: shared hit=4 read=1725
         ->  Bitmap Index Scan on entity2documentnew_name  (cost=0.00..103.82 rows=4701 width=0) (actual time=30.905..30.905 rows=2845 loops=1)
               Index Cond: ((name)::text = 'Acetaminophen'::text)
               Buffers: shared hit=1 read=14
 Total runtime: 9197.186 ms

The improvement is definitely good!
This is the table that I'm using:
\d+ entity2document2;
                                    Table "public.entity2document2"
      Column      |              Type              | Modifiers | Storage  | Stats target | Description 
------------------+--------------------------------+-----------+----------+--------------+-------------
 id               | integer                        | not null  | plain    |              | 
 document_id      | integer                        |           | plain    |              | 
 name             | character varying(255)         | not null  | extended |              | 
 qualifier        | character varying(255)         | not null  | extended |              | 
 tagMethod        | character varying(255)         |           | extended |              | 
 created          | timestamp(0) without time zone | not null  | plain    |              | 
 updated          | timestamp(0) without time zone |           | plain    |              | 
 curation         | integer                        |           | plain    |              | 
 hepval           | double precision               |           | plain    |              | 
 cardval          | double precision               |           | plain    |              | 
 nephval          | double precision               |           | plain    |              | 
 phosval          | double precision               |           | plain    |              | 
 patternCount     | double precision               |           | plain    |              | 
 ruleScore        | double precision               |           | plain    |              | 
 hepTermNormScore | double precision               |           | plain    |              | 
 hepTermVarScore  | double precision               |           | plain    |              | 
 svmConfidence    | double precision               |           | plain    |              | 
Indexes:
    "ent_pkey" PRIMARY KEY, btree (id)
    "ent_cardval" btree (cardval)
    "ent_document_id" btree (document_id)
    "ent_heptermnormscore" btree ("hepTermNormScore")
    "ent_heptermvarscore" btree ("hepTermVarScore")
    "ent_hepval" btree (hepval)
    "ent_name" btree (name)
    "ent_nephval" btree (nephval)
    "ent_patterncount" btree ("patternCount")
    "ent_phosval" btree (phosval)
    "ent_qualifier" btree (qualifier)
    "ent_qualifier_name" btree (qualifier, name)
    "ent_rulescore" btree ("ruleScore")
    "ent_svm_confidence_index" btree ("svmConfidence")

And these are my current settings:

            name            |  current_setting   |        source        
----------------------------+--------------------+----------------------
 application_name           | psql               | client
 client_encoding            | UTF8               | client
 DateStyle                  | ISO, MDY           | configuration file
 default_text_search_config | pg_catalog.english | configuration file
 effective_cache_size       | 45000MB            | configuration file
 lc_messages                | en_US.UTF-8        | configuration file
 lc_monetary                | en_US.UTF-8        | configuration file
 lc_numeric                 | en_US.UTF-8        | configuration file
 lc_time                    | en_US.UTF-8        | configuration file
 listen_addresses           | *                  | configuration file
 log_timezone               | Europe/Madrid      | configuration file
 logging_collector          | on                 | configuration file
 maintenance_work_mem       | 4000MB             | configuration file
 max_connections            | 100                | configuration file
 max_stack_depth            | 2MB                | environment variable
 shared_buffers             | 10000MB            | configuration file
 TimeZone                   | Europe/Madrid      | configuration file
 work_mem                   | 32MB               | configuration file

The size of the table is 41 GB, and some statistics:

      relname      | rows_in_bytes |  num_rows   | number_of_indexes | unique | single_column | multi_column 
-------------------+---------------+-------------+-------------------+--------+---------------+--------------
 entity2document2  | 89 MB         | 9.33479e+07 |                14 | Y      |            13 |            1


I'm running CLUSTER on the table right now, using the name+hepval composite index as Venkata suggested, and I'll post back whether it works.
Anyway, even though the improvement is significant, I'd like to increase performance further. When the number of rows returned is high, performance degrades too much.
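For reference, the CLUSTER step described above could be sketched as follows. This is a minimal sketch: the composite index name is an assumption, since the name+hepval index does not appear in the \d+ output shown earlier.

```sql
-- Assumed composite index on (name, hepval); not shown in the \d+ output above.
CREATE INDEX ent_name_hepval ON entity2document2 (name, hepval);

-- Physically reorder the table to match the index, so that rows with the
-- same name are stored contiguously on disk, reducing random heap reads
-- during the Bitmap Heap Scan.
CLUSTER entity2document2 USING ent_name_hepval;

-- CLUSTER rewrites the table (and takes an ACCESS EXCLUSIVE lock), so
-- refresh planner statistics afterwards.
ANALYZE entity2document2;
```

Note that CLUSTER is a one-time reordering: rows inserted or updated afterwards are not kept in index order, so the operation may need to be repeated periodically.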

Sorry, I have not been following this thread for some time now.

The hardware configuration is better now. You were running on 8.3.x; can you please let us know which version of Postgres this is?

Did you collect fresh statistics and perform a VACUUM after the migration?
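For example, statistics and dead-tuple cleanup after a migration can be handled in one pass (a generic sketch, not specific to this schema):

```sql
-- Reclaim dead tuples and refresh planner statistics for the table in
-- question; VERBOSE reports per-table progress.
VACUUM (ANALYZE, VERBOSE) entity2document2;

-- Or, for the whole database after a migration:
VACUUM ANALYZE;
```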

Can you get us the EXPLAIN plan for "select * from entity2document2 where name='Acetaminophen';"?

Venkata Balaji N

Sr. Database Administrator
Fujitsu Australia
