Обсуждение: Performance Query

Поиск
Список
Период
Сортировка

Performance Query

От
Abu Mushayeed
Дата:
1. The function:
 
SELECT a.birth_date FROM (
 SELECT  indiv_fkey, birth_dt as birth_date, intern_last_update::date as last_update, 'fed' as source
    
  FROM cdm.cdm_fedcustomer
  WHERE birth_dt IS NOT NULL
  AND indiv_fkey = $1
  UNION
  SELECT  indiv_fkey, birthdate as birth_date, last_update::date as last_update, 'reg' as source
  FROM cdm.cdm_reg_customer
  WHERE birthdate IS NOT NULL
  AND indiv_fkey = $1
ORDER BY source asc, last_update desc limit 1
)   a
 
2. The query:
 
INSERT INTO  indiv_mast.staging_birthdate
        SELECT * FROM (
        SELECT im.indiv_key,   indiv_mast.getbest_bday(im.indiv_key::integer) AS birth_date
        FROM indiv_mast.indiv_mast Im
        WHERE im.indiv_key >= 2000000 AND im.indiv_key < 4000000
        ) b
        WHERE b.birth_date IS NOT NULL
        ;
 
3. The query plan:
 
Bitmap Heap Scan on indiv_mast im  (cost=28700.91..2098919.14 rows=1937250 width=8)
  Recheck Cond: ((indiv_key >= 2000000) AND (indiv_key < 4000000))
  Filter: (indiv_mast.getbest_bday((indiv_key)::integer) IS NOT NULL)
  ->  Bitmap Index Scan on indiv_mast_pkey_idx  (cost=0.00..28700.91 rows=1946985 width=0)
        Index Cond: ((indiv_key >= 2000000) AND (indiv_key < 4000000))
4. Number of records in the tables:
 
indiv_mast.indiv_mast : 15Million
cdm.cdm_fedcustomer: 18Million
cdm.cdm_reg_customer: 9 Million
 
The query (2) runs for hours. It started at 2:00Am last night and it is still running (6:00Am).
 
Some of the postgresql.conf file parameters are below:
 
shared_buffers = 20000 #60000
work_mem = 65536 #131072 #65536
maintenance_work_mem = 524288 #131072
max_fsm_pages = 8000000
max_fsm_relations = 32768
wal_buffers = 128
checkpoint_segments = 256               # in logfile segments, min 1, 16MB each
checkpoint_timeout = 3600
checkpoint_warning = 300
effective_cache_size = 20000
random_page_cost = 2        # (same)
 
I really do not know how to find out what the query is waiting on, unlike oracle db provides some of the information through its dynamic performance views.
 
Please help in understanding how I can find out what the system is waiting for or why is it taking the query so long.
 
I will really appreciate some help.
 
Thanks
Abu


Sucker-punch spam with award-winning protection.
Try the free Yahoo! Mail Beta.

Re: Performance Query

От
Josh Berkus
Дата:
Abu,

>   I really do not know how to find out what the query is waiting on,
> unlike oracle db provides some of the information through its dynamic
> performance views.

Yeah, we don't have that yet.

>   Please help in understanding how I can find out what the system is
> waiting for or why is it taking the query so long.

First guess would be I/O bound.  The planner, at least, thinks you're
inserting 2 million records.  What kind of disk support do you have?


--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

Re: Performance Query

От
Tom Lane
Дата:
Josh Berkus <josh@agliodbs.com> writes:
>> Please help in understanding how I can find out what the system is
>> waiting for or why is it taking the query so long.

> First guess would be I/O bound.  The planner, at least, thinks you're
> inserting 2 million records.  What kind of disk support do you have?

I don't see any need to guess.  iostat or vmstat or local equivalent
will show you quick enough if you are maxing out the disk or the CPU.

It seems at least somewhat possible that the thing is blocked on a lock,
in which case the pg_locks view would tell you about it.

            regards, tom lane