Re: Query taking long time

Поиск

Список

Период

Сортировка

От	acanada
Тема	Re: Query taking long time
Дата	6 марта 2014 г. 10:47:59
Msg-id	42391040-3FF8-4391-B248-ADFDBA8AB586@cnio.es обсуждение исходный текст
Ответ на	Query taking long time ("acanada" <acanada@cnio.es>)
Список	pgsql-performance

Дерево обсуждения

Hi Venkata,

Increasing the work_mem doesn't improve results. After raising it to 1GB:

limtox=> explain analyze select * from entity_compounddict2document where name='Troglitazone' order by hepval;

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------

Sort (cost=11083.47..11090.54 rows=2828 width=133) (actual time=19679.354..19679.494 rows=1283 loops=1)

Sort Key: hepval

Sort Method: quicksort Memory: 238kB

-> Bitmap Heap Scan on entity_compounddict2document (cost=73.87..10921.34 rows=2828 width=133) (actual time=93.926..19677.110 rows=1283 loops=1)

Recheck Cond: ((name)::text = 'Troglitazone'::text)

-> Bitmap Index Scan on entity_compound2document_name (cost=0.00..73.16 rows=2828 width=0) (actual time=78.005..78.005 rows=1283 loops=1)

Index Cond: ((name)::text = 'Troglitazone'::text)

Total runtime: 19679.680 ms

There are not temp files in the data_directory... I have set to 1MB log_temp_files and did this query again but there's nothing related to tmp files in it...

I cannot see "temp_files" column in pg_stat_database view (using 8.3 version) :-(

Thanks for your help. Regards,

Andrés

El Mar 6, 2014, a las 2:36 AM, Venkata Balaji Nagothi escribió:

Hi Andres,

Sorting cost is high !!

This query must be going for a disk sort, do you see temp files getting created in pg_stat_tmp directory in the data directory ?

Or you can enable "log_temp_files" to probably 1 MB or so. This will log the information about temp files in the log file in pg_log directory.

Or you can see "temp_files" column in pg_stat_database view before and after the execution of the query.

How much work_mem did you give ? you can increase the work_mem to probably 1 GB at session level and run the query. That might have different results.

Dropping of other Indexes will not effect this query's performance. I recommended them to be dropped to avoid maintenance and performance over-head in future.

Venkata Balaji N

Sr. Database Administrator
Fujitsu Australia

On Wed, Mar 5, 2014 at 10:14 PM, acanada <acanada@cnio.es> wrote:
Hi,

This are the parameters of the server:

SELECT name, current_setting(name), source
FROM pg_settings
WHERE source NOT IN ('default', 'override');

name | current_setting | source
----------------------------+--------------------+----------------------
client_encoding | UTF8 | client
DateStyle | ISO, DMY | configuration file
default_statistics_target | 100 | configuration file
default_text_search_config | pg_catalog.spanish | configuration file
effective_cache_size | 7500MB | configuration file
lc_messages | es_ES.UTF-8 | configuration file
lc_monetary | es_ES.UTF-8 | configuration file
lc_numeric | C | configuration file
lc_time | es_ES.UTF-8 | configuration file
listen_addresses | * | configuration file
log_line_prefix | %t | configuration file
log_timezone | localtime | command line
maintenance_work_mem | 2000MB | configuration file
max_connections | 100 | configuration file
max_fsm_pages | 63217760 | configuration file
max_stack_depth | 2MB | environment variable
port | 5432 | configuration file
shared_buffers | 1500MB | configuration file
ssl | on | configuration file
tcp_keepalives_count | 9 | configuration file
tcp_keepalives_idle | 7200 | configuration file
tcp_keepalives_interval | 75 | configuration file
TimeZone | localtime | command line
timezone_abbreviations | Default | command line
work_mem | 50MB | configuration file

The server has 2 processors quadcore, 10GB of RAM and data is located in a fiber disk of 2TB.
Changing work_mem parameter seems to have no effect on the perform.

Now I have a curious situation. I created a new table, the one we are query against. This table, entity_compounddict2document has less rows aprox. 50M vs. the original table entity2document2 that has 94M rows.
Well, after dropping indexes not already in use, both tables have the same performance with this query...

explain analyze select * from entity_compounddict2document where name='Troglitazone' order by hepval;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=11083.47..11090.54 rows=2828 width=133) (actual time=19708.019..19708.136 rows=1283 loops=1)
Sort Key: hepval
Sort Method: quicksort Memory: 238kB
-> Bitmap Heap Scan on entity_compounddict2document (cost=73.87..10921.34 rows=2828 width=133) (actual time=44.292..19705.954 rows=1283 loops=1)
Recheck Cond: ((name)::text = 'Troglitazone'::text)
-> Bitmap Index Scan on entity_compound2document_name (cost=0.00..73.16 rows=2828 width=0) (actual time=28.159..28.159 rows=1283 loops=1)
Index Cond: ((name)::text = 'Troglitazone'::text)
Total runtime: 19708.275 ms
(8 rows)

explain analyze select * from entity2document2 where name='Troglitazone' order by hepval;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=18237.75..18249.38 rows=4653 width=123) (actual time=18945.732..18945.869 rows=1283 loops=1)
Sort Key: hepval
Sort Method: quicksort Memory: 238kB
-> Bitmap Heap Scan on entity2document2 (cost=117.37..17954.29 rows=4653 width=123) (actual time=41.703..18943.720 rows=1283 loops=1)
Recheck Cond: ((name)::text = 'Troglitazone'::text)
-> Bitmap Index Scan on entity2document2_name (cost=0.00..116.20 rows=4653 width=0) (actual time=28.703..28.703 rows=1283 loops=1)
Index Cond: ((name)::text = 'Troglitazone'::text)
Total runtime: 18945.991 ms
(8 rows)

Description of the tables are:

limtox=> \d+ entity_compounddict2document;
Table "public.entity_compounddict2document"
Column | Type | Modifiers | Storage | Description
------------------+--------------------------------+-----------+----------+-------------
id | integer | not null | plain |
document_id | integer | | plain |
name | character varying(255) | | extended |
qualifier | character varying(255) | | extended |
tagMethod | character varying(255) | | extended |
created | timestamp(0) without time zone | | plain |
updated | timestamp(0) without time zone | | plain |
curation | integer | | plain |
hepval | double precision | | plain |
cardval | double precision | | plain |
nephval | double precision | | plain |
phosval | double precision | | plain |
patternCount | double precision | | plain |
ruleScore | double precision | | plain |
hepTermNormScore | double precision | | plain |
hepTermVarScore | double precision | | plain |
Indexes:
"entity_compounddict2document_pkey" PRIMARY KEY, btree (id)
"entity_compound2document_name" btree (name)
"entity_compound2document_nephval" btree (nephval)
Has OIDs: no

limtox=> \d+ entity2document2;
Table "public.entity2document2"
Column | Type | Modifiers | Storage | Description
------------------+--------------------------------+-----------+----------+-------------
id | integer | not null | plain |
document_id | integer | | plain |
name | character varying(255) | | extended |
qualifier | character varying(255) | | extended |
tagMethod | character varying(255) | | extended |
created | timestamp(0) without time zone | | plain |
updated | timestamp(0) without time zone | | plain |
curation | integer | | plain |
hepval | double precision | | plain |
cardval | double precision | | plain |
nephval | double precision | | plain |
phosval | double precision | | plain |
patternCount | double precision | | plain |
ruleScore | double precision | | plain |
hepTermNormScore | double precision | | plain |
hepTermVarScore | double precision | | plain |
Indexes:
"entity2document2_pkey" PRIMARY KEY, btree (id)
"entity2document2_hepval_index" btree (hepval)
"entity2document2_name" btree (name)
"entity2document2_qualifier_name_hepval" btree (qualifier, name)
"entity2document_qualifier_index" btree (qualifier)
Has OIDs: no

I really appreciate your help!!

Regards,
Andrés

El Mar 5, 2014, a las 12:35 AM, Venkata Balaji Nagothi escribió:

After looking at the distinct values, yes the composite Index on "name" and "hepval" is not recommended. That would worsen - its expected.

We need to look for other possible work around. Please drop off the above Index. Let me see if i can drill further into this.

Meanwhile - can you help us know the memory parameters (work_mem, temp_buffers etc) set ?

Do you have any other processes effecting this query's performance ?

Any info about your Disk, RAM, CPU would also help.

Regards,
Venkata Balaji N

Fujitsu Australia

Venkata Balaji N

Sr. Database Administrator
Fujitsu Australia

On Tue, Mar 4, 2014 at 10:23 PM, acanada <acanada@cnio.es> wrote:
Hello,

I don't know if this helps to figure out what is the problem but after adding the multicolumn index on name and hepval, the performance is even worse (¿?). Ten times worse...

explain analyze select * from (select * from entity_compounddict2document where name='progesterone') as a order by a.hepval;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=422746.18..423143.94 rows=159104 width=133) (actual time=95769.674..95797.943 rows=138165 loops=1)
Sort Key: entity_compounddict2document.hepval
Sort Method: quicksort Memory: 25622kB
-> Bitmap Heap Scan on entity_compounddict2document (cost=3501.01..408999.90 rows=159104 width=133) (actual time=70.789..95519.258 rows=138165 loops=1)
Recheck Cond: ((name)::text = 'progesterone'::text)
-> Bitmap Index Scan on entity_compound2document_name (cost=0.00..3461.23 rows=159104 width=0) (actual time=35.174..35.174 rows=138165 loops=1)
Index Cond: ((name)::text = 'progesterone'::text)
Total runtime: 95811.838 ms
(8 rows)

Any ideas please?

Thank you
Andrés.

El Mar 4, 2014, a las 12:28 AM, Venkata Balaji Nagothi escribió:

On Mon, Mar 3, 2014 at 9:17 PM, acanada <acanada@cnio.es> wrote:
Hello,

Thankyou for your answer.
I have made more changes than a simple re-indexing recently. I have moved the sorting field to the table in order to avoid the join clause. Now the schema is very simple. The query only implies one table:

x=> \d+ entity_compounddict2document;
Table "public.entity_compounddict2document"
Column | Type | Modifiers | Storage | Description
------------------+--------------------------------+-----------+----------+-------------
id | integer | not null | plain |
document_id | integer | | plain |
name | character varying(255) | | extended |
qualifier | character varying(255) | | extended |
tagMethod | character varying(255) | | extended |
created | timestamp(0) without time zone | | plain |
updated | timestamp(0) without time zone | | plain |
curation | integer | | plain |
hepval | double precision | | plain |
cardval | double precision | | plain |
nephval | double precision | | plain |
phosval | double precision | | plain |
patternCount | double precision | | plain |
ruleScore | double precision | | plain |
hepTermNormScore | double precision | | plain |
hepTermVarScore | double precision | | plain |
Indexes:
"entity_compounddict2document_pkey" PRIMARY KEY, btree (id)
"entity_compound2document_cardval" btree (cardval)
"entity_compound2document_heptermnormscore" btree ("hepTermNormScore")
"entity_compound2document_heptermvarscore" btree ("hepTermVarScore")
"entity_compound2document_hepval" btree (hepval)
"entity_compound2document_name" btree (name)
"entity_compound2document_nephval" btree (nephval)
"entity_compound2document_patterncount" btree ("patternCount")
"entity_compound2document_phosval" btree (phosval)
"entity_compound2document_rulescore" btree ("ruleScore")
Has OIDs: no

tablename | indexname | num_rows | table_size | index_size | unique | number_of_scans | tuples_read | tuples_fetched
entity_compounddict2document | entity_compound2document_cardval | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | 0 | 0
entity_compounddict2document | entity_compound2document_heptermnormscore | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | 0 | 0
entity_compounddict2document | entity_compound2document_heptermvarscore | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | 0 | 0
entity_compounddict2document | entity_compound2document_hepval | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | 0 | 0
entity_compounddict2document | entity_compound2document_name | 5.42452e+07 | 6763 MB | 1505 MB | Y | 24 | 178680 | 0
entity_compounddict2document | entity_compound2document_nephval | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | 0 | 0
entity_compounddict2document | entity_compound2document_patterncount | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | 0 | 0
entity_compounddict2document | entity_compound2document_phosval | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | 0 | 0
entity_compounddict2document | entity_compound2document_rulescore | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | 0 | 0
entity_compounddict2document | entity_compounddict2document_pkey | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | 0 | 0

The table has aprox. 54,000,000 rows
There are no NULLs in hepval field and pg_settings haven't changed. I also have done "analyze" to this table.

I have simplified the query and added the last advise that you told me:

Query:

explain analyze select * from (select * from entity_compounddict2document where name='ranitidine') as a order by a.hepval;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=11060.50..11067.55 rows=2822 width=133) (actual time=32715.097..32716.488 rows=13512 loops=1)
Sort Key: entity_compounddict2document.hepval
Sort Method: quicksort Memory: 2301kB
-> Bitmap Heap Scan on entity_compounddict2document (cost=73.82..10898.76 rows=2822 width=133) (actual time=6.034..32695.483 rows=13512 loops=1)
Recheck Cond: ((name)::text = 'ranitidine'::text)
-> Bitmap Index Scan on entity_compound2document_name (cost=0.00..73.12 rows=2822 width=0) (actual time=3.221..3.221 rows=13512 loops=1)
Index Cond: ((name)::text = 'ranitidine'::text)
Total runtime: 32717.548 ms

Another query:

explain analyze select * from (select * from entity_compounddict2document where name='progesterone' ) as a order by a.hepval;

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=367879.25..368209.24 rows=131997 width=133) (actual time=9262.887..9287.046 rows=138165 loops=1)
Sort Key: entity_compounddict2document.hepval
Sort Method: quicksort Memory: 25622kB
-> Bitmap Heap Scan on entity_compounddict2document (cost=2906.93..356652.81 rows=131997 width=133) (actual time=76.316..9038.485 rows=138165 loops=1)
Recheck Cond: ((name)::text = 'progesterone'::text)
-> Bitmap Index Scan on entity_compound2document_name (cost=0.00..2873.93 rows=131997 width=0) (actual time=40.913..40.913 rows=138165 loops=1)
Index Cond: ((name)::text = 'progesterone'::text)
Total runtime: 9296.815 ms

It has improved (I supose because of the lack of the join table) but still taking a lot of time... Anything I can do??

Any help would be very appreciated. Thank you very much.

Good to know performance has increased.

"entity_compounddict2document" table goes through high INSERTS ?

Can you help us know if the "helpval" column and "name" column have high duplicate values ? "n_distinct" value from pg_stats table would have that info.

Below could be a possible workaround -

As mentioned earlier in this email, a composite Index on name and hepval column might help. If the table does not go through lot of INSERTS, then consider performing a CLUSTER on the table using the same INDEX.

Other recommendations -

Please drop off all the Non-primary key Indexes which have 0 scans / hits. This would harm the DB and the DB server whilst maintenance and DML operations.

Regards,
Venkata Balaji N

Fujitsu Australia

**NOTA DE CONFIDENCIALIDAD** Este correo electrónico, y en su caso los ficheros adjuntos, pueden contener información protegida para el uso exclusivo de su destinatario. Se prohíbe la distribución, reproducción o cualquier otro tipo de transmisión por parte de otra persona que no sea el destinatario. Si usted recibe por error este correo, se ruega comunicarlo al remitente y borrar el mensaje recibido.

**CONFIDENTIALITY NOTICE** This email communication and any attachments may contain confidential and privileged information for the sole use of the designated recipient named above. Distribution, reproduction or any other use of this transmission by any party other than the intended recipient is prohibited. If you are not the intended recipient please contact the sender and delete all copies.

**NOTA DE CONFIDENCIALIDAD** Este correo electrónico, y en su caso los ficheros adjuntos, pueden contener información protegida para el uso exclusivo de su destinatario. Se prohíbe la distribución, reproducción o cualquier otro tipo de transmisión por parte de otra persona que no sea el destinatario. Si usted recibe por error este correo, se ruega comunicarlo al remitente y borrar el mensaje recibido.

**CONFIDENTIALITY NOTICE** This email communication and any attachments may contain confidential and privileged information for the sole use of the designated recipient named above. Distribution, reproduction or any other use of this transmission by any party other than the intended recipient is prohibited. If you are not the intended recipient please contact the sender and delete all copies.

**NOTA DE CONFIDENCIALIDAD** Este correo electrónico, y en su caso los ficheros adjuntos, pueden contener información protegida para el uso exclusivo de su destinatario. Se prohíbe la distribución, reproducción o cualquier otro tipo de transmisión por parte de otra persona que no sea el destinatario. Si usted recibe por error este correo, se ruega comunicarlo al remitente y borrar el mensaje recibido.

**CONFIDENTIALITY NOTICE** This email communication and any attachments may contain confidential and privileged information for the sole use of the designated recipient named above. Distribution, reproduction or any other use of this transmission by any party other than the intended recipient is prohibited. If you are not the intended recipient please contact the sender and delete all copies.

В списке pgsql-performance по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Query taking long time