Re: Performance Issue (Not using Index when joining two tables).

Поиск
Список
Период
Сортировка
От Gopisetty, Ramesh
Тема Re: Performance Issue (Not using Index when joining two tables).
Дата
Msg-id BL0PR11MB31535607A1D9AAB6015D67CBEE230@BL0PR11MB3153.namprd11.prod.outlook.com
обсуждение исходный текст
Ответ на Re: Performance Issue (Not using Index when joining two tables).  (Michael Lewis <mlewis@entrata.com>)
Ответы Re: Performance Issue (Not using Index when joining two tables).  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-performance

Hi,

Thanks for looking into the problem/issue.    Let me give more details about the functions...    Yes,  we are using row level security. 

Actually, we have converted an Oracle VPD database (Virtual Private Databases -  In short row level security)  into postgresql.   We have several functions available to filter or to provide the row level security.

f_sel_policy_ty_static;  f_sel_policy_all  filters the tables where the vpd_key is provided initially.
f_sel_policy_prod_locale  filters the table where the prod_locale_code is provided initially.

Before running any queries in the database, we will set the context settings/row level security  based on the function below.. 

 CALLvpd_filter(vpd_key=>'XXXX',mod_user=>'XXXXX',user_locale=>'en_XX',prod_locale=>'XX');

This will set the context variables and provide row level security.   All the tables in our database consists of vpd_key which is a filter for to run the queries for a given client. 

The tables mentioned below chr_emp_position and chr_simple_val consists of many rows and the functions filter them based on the vpd_key and prod_user_locale_code.
Once after providing the row level security we executed the query joining the tables..    And where the index is not being utlitized/ the query runs slower i.e., greater than 8seconds.

The normal structure of the tables will be like this.. 

chr_emp_position  --- has columns  vpd_key,oid, home_Dept_oid, eff_date, start_Date,.....etc.,    (almost having 200+ columns).   -- primary key is  vpd_key and oid.
chr_simple_Val   --- has columns   vpd_key, oid , category, description..et.c,     (almost has around 70 columns).    (primary key is  vpd_key and oid)

The rows mentioned below are after setting the row level security on those tables .. 

i.e,  after executing the function 

 CALL vpd_filter(spv_vpd_key=>'XXXX',spv_mod_usr=>'XXXXX',spv_user_locale=>'en_XX',spv_prod_locale=>'XX');


pgwfc01q=> select count(*) from chr_simple_val;
 count
-------
 13158
(1 row)

pgwfc01q=> select count(*) from chr_emp_position;
 count
-------
   228
(1 row)


The primary key for the table chr_Simple_val  contains OID.   Still not using the index.

I'm sharing the explain plan over here..

pgwfc01q=> explain analyze select cep.HOME_DEPT_OID,ctc.oid,ctc.category,ctc.code from chr_emp_position cep inner join chr_Simple_Val ctc on ctc.oid=cep.HOME_DEPT_OID;
                                                                                                             QUERY P
LAN
--------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=49299.91..51848.83 rows=651 width=42) (actual time=3512.692..3797.583 rows=228 loops=1)
   Hash Cond: ((cep.home_dept_oid)::text = (ctc.oid)::text)
   ->  Seq Scan on chr_emp_position cep  (cost=0.00..2437.77 rows=436 width=11) (actual time=44.713..329.435 rows=22
8 loops=1)
         Filter: ((("current_user"())::text <> ANY ('{wfnadmin,skipvpd}'::text[])) AND f_sel_policy_all(vpd_key, 'CH
R_EMP_POSITION'::character varying) AND f_sel_policy_prod_locale((cep.*)::character varying, prod_locale_code))
         Rows Removed by Filter: 3695
   ->  Hash  (cost=49176.40..49176.40 rows=9881 width=31) (actual time=3467.907..3467.908 rows=13158 loops=1)
         Buckets: 16384  Batches: 1  Memory Usage: 1031kB
         ->  Seq Scan on chr_simple_val ctc  (cost=0.00..49176.40 rows=9881 width=31) (actual time=2.191..3460.929 r
ows=13158 loops=1)
               Filter: ((("current_user"())::text <> ANY ('{wfnadmin,skipvpd}'::text[])) AND f_sel_policy_ty_static(
vpd_key) AND f_sel_policy_prod_locale((ctc.*)::character varying, prod_locale_code))
Planning Time: 0.297 ms
 Execution Time: 3797.768 ms
(12 rows)


If i don't set the context and run as a root user the explain plan is as below..   And it executes in milliseconds even without the index having the full table scan.  

  1.  I'm not sure if my filters are time consuming.  Most of the queries works except few.  We hadn't seen the problem in Oracle.  I'm not comparing between Oracle and Postgres here.   I see both are two different flavors. but trying to get my query runs less than 8seconds.
  2.  I'm not sure why the index on chr_simple_val is not being used here  vpd_key,oid.   I'm confident if it uses index, it will/might  be faster as it is looking for 2 or  3 home departments based on oid.
  3.  I'm not sure why even having the full scan it worked for the root user.  
  4.  I'm not sure why the bitmap heap scan was not followed after setting the row level security.   How to make the bitmap heap scan on chr_emp_position as i observed here.

fyi.,

Running as a root user.

pgwfc01q=> explain analyze select cep.HOME_DEPT_OID,ctc.oid,ctc.category,ctc.code from xxxx.chr_emp_position cep inner join wfnsch001.chr_Simple_Val ctc on ctc.oid=cep.HOME_DEPT_OID and (ctc.vpd_key='COMMON' or ctc.vpd_key=cep.vpd_key) and cep.vpd_key='xxxxxxxxxx';

QUERY PLAN

--------------------------------------------------------------------------------------------------------------------
-------------------
 Hash Join  (cost=5503.95..6742.82 rows=453 width=42) (actual time=131.241..154.201 rows=228 loops=1)
   Hash Cond: ((cep.home_dept_oid)::text = (ctc.oid)::text)
   Join Filter: (((ctc.vpd_key)::text = 'NG_COMMON'::text) OR ((ctc.vpd_key)::text = (cep.vpd_key)::text))
   Rows Removed by Join Filter: 19770
   ->  Bitmap Heap Scan on chr_emp_position cep  (cost=10.05..362.25 rows=228 width=28) (actual time=0.056..0.253 ro
ws=228 loops=1)
         Recheck Cond: ((vpd_key)::text = 'xxxxxxxxxx'::text)
         Heap Blocks: exact=26
         ->  Bitmap Index Scan on uq1_chr_emp_position  (cost=0.00..9.99 rows=228 width=0) (actual time=0.041..0.041
 rows=228 loops=1)
               Index Cond: ((vpd_key)::text = 'xxxxxxxxxx'::text)
   ->  Hash  (cost=3600.29..3600.29 rows=88929 width=48) (actual time=130.826..130.826 rows=88929 loops=1)
         Buckets: 65536 (originally 65536)  Batches: 4 (originally 2)  Memory Usage: 3585kB
         ->  Seq Scan on chr_simple_val ctc  (cost=0.00..3600.29 rows=88929 width=48) (actual time=0.005..33.356 row
s=88929 loops=1)
 Planning Time: 3.977 ms
 Execution Time: 154.535 ms
(14 rows)

pgwfc01q=> select count(*) from wfnsch001.chr_emp_position;
 count
-------
  3923
(1 row)

pgwfc01q=> select count(*) from wfnsch001.chr_Simple_Val;
 count
-------
 88929
(1 row)

 

I'm not sure if i'm thinking in the right way or not. (As of safety purpose, i have  rebuilded indexes, analyzed, did vaccum on those tables).   Sorry for the lengthy email and i'm trying to explain my best on this.

Thank you.

Regards,
Ramesh G



From: Michael Lewis <mlewis@entrata.com>
Sent: Sunday, September 13, 2020 10:51 PM
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Tomas Vondra <tomas.vondra@2ndquadrant.com>; Gopisetty, Ramesh <rameshg2@illinois.edu>; pgsql-performance@lists.postgresql.org <pgsql-performance@lists.postgresql.org>
Subject: Re: Performance Issue (Not using Index when joining two tables).
 
Filter: ((("current_user"())::text <> ANY ('{wfnadmin,skipvpd}'::text[])) AND f_sel_policy_all(vpd_key, 'CH
R_EMP_POSITION'::character varying) AND f_sel_policy_prod_locale((cep.*)::character varying, prod_locale_code))

This looks like some stuff for row level security perhaps. My understanding is limited, but perhaps those restrictions are influencing the planners access or reliance on stats.

Also, it would seem like you need the entire table since you don't have an explicit where clause. Why would scanning an index and then also visiting every row in the table be faster than just going directly to the table?

В списке pgsql-performance по дате отправления:

Предыдущее
От: Michael Lewis
Дата:
Сообщение: Re: Performance Issue (Not using Index when joining two tables).
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Performance Issue (Not using Index when joining two tables).