Обсуждение: Postgres not using index on views
Hi,
I am seeing a performance problem with postgresql v 11.7 on views, and I am wondering if anyone can tell me why or has any suggestion.
A table is created as:
CREATE TABLE "FBNK_CUSTOMER" (RECID VARCHAR(255) NOT NULL PRIMARY KEY, XMLRECORD VARCHAR)
And contains only 180 rows.
Doing an explain plan on the view created over this gives:
EXPLAIN ANALYZE
select RECID from "V_FBNK_CUSTOMER"
Subquery Scan on "V_FBNK_CUSTOMER" (cost=0.00..19014.60 rows=180 width=7) (actual time=459.601..78642.189 rows=180 loops=1)
-> Seq Scan on "FBNK_CUSTOMER" a (cost=0.00..19012.80 rows=180 width=14575) (actual time=459.600..78641.950 rows=180 loops=1)
Planning Time: 0.679 ms
Execution Time: 78642.616 ms
Yet an Explain plan on the underlying table( on select RECID from “FBNK_CUSTOMER”) gives:
Seq Scan on "FBNK_CUSTOMER" (cost=0.00..22.80 rows=180 width=7) (actual time=0.004..0.272 rows=180 loops=1)
Planning Time: 0.031 ms
Execution Time: 0.288 ms
So you can see that postgresql is not using the primary key index for RECID. THIS IS NOT THE CASE FOR ORACLE where the primary key index is used in the explain plan
The view is created similar to the following where extractValueJS is a stored procedure that extracts a value from the VARCHAR XMLRECORD column.
CREATE VIEW "V_FBNK_CUSTOMER" as
SELECT a.RECID, a.XMLRECORD "THE_RECORD"
,a.RECID "CUSTOMER_CODE"
,a.RECID "CUSTOMER_NO"
,extractValueJS(a.XMLRECORD, 1, 0) "MNEMONIC"
,extractValueJS(a.XMLRECORD, 2, 0) "SHORT_NAME"
,extractValueJS(a.XMLRECORD, 2, -1) "SHORT_NAME_2"
, etc
, extractValueJS(a.XMLRECORD, 179, 9) "TESTER"
FROM
"FBNK_CUSTOMER" a
As well, the problem gets worse as columns are added to the view, irrespective of the SELECTION columns and it seems to perform some activity behind.
Creating an empty view,
CREATE VIEW "V_FBNK_CUSTOMER_TEST" as
SELECT a.RECID, a.XMLRECORD "THE_RECORD"
,a.RECID "CUSTOMER_CODE"
,a.RECID "CUSTOMER_NO"
FROM
"FBNK_CUSTOMER" a ------------- > 3 ms select RECID from "V_FBNK_CUSTOMER_TEST"
CREATE VIEW "V_FBNK_CUSTOMER_TEST" as
SELECT a.RECID, a.XMLRECORD "THE_RECORD"
,a.RECID "CUSTOMER_CODE"
,a.RECID "CUSTOMER_NO"
,extractValueJS(a.XMLRECORD, 1, 0) "MNEMONIC"
FROM
"FBNK_CUSTOMER" a ----------------à 54 ms select RECID from "V_FBNK_CUSTOMER_TEST"
CREATE VIEW "V_FBNK_CUSTOMER_TEST" as
SELECT a.RECID, a.XMLRECORD "THE_RECORD"
,a.RECID "CUSTOMER_CODE"
,a.RECID "CUSTOMER_NO"
,extractValueJS(a.XMLRECORD, 1, 0) "MNEMONIC"
,extractValueJS(a.XMLRECORD, 2, 0) "SHORT_NAME"
FROM
"FBNK_CUSTOMER" a ----------------------à 118 ms select RECID from "V_FBNK_CUSTOMER_TEST"
The following query takes an extremely long time for only 180 rows, and what this means is that we would have to index anything appearing in the where clause for every table in order to use views because the views seem not to consider the select clause. Why is that and does anyone know a way around this?
SELECT RECID FROM "V_FBNK_CUSTOMER" WHERE "TESTER" = '5.00' ORDER BY RECID
Sort (cost=19015.06..19015.06 rows=1 width=7) (actual time=102172.500..102172.501 rows=1 loops=1)
Sort Key: "V_FBNK_CUSTOMER".recid
Sort Method: quicksort Memory: 25kB
-> Subquery Scan on "V_FBNK_CUSTOMER" (cost=0.00..19015.05 rows=1 width=7) (actual time=91242.866..102172.474 rows=1 loops=1)
Filter: (("V_FBNK_CUSTOMER"."TESTER")::text = '5.00'::text)
Rows Removed by Filter: 179
-> Seq Scan on "FBNK_CUSTOMER" a (cost=0.00..19012.80 rows=180 width=14575) (actual time=613.455..102172.175 rows=180 loops=1)
Planning Time: 1.674 ms
Execution Time: 102174.015 ms
The information in this e-mail and any attachments is confidential and may be legally privileged. It is intended solely for the addressee or addressees. Any use or disclosure of the contents of this e-mail/attachments by a not intended recipient is unauthorized and may be unlawful. If you have received this e-mail in error please notify the sender. Please note that any views or opinions presented in this e-mail are solely those of the author and do not necessarily represent those of TEMENOS. We recommend that you check this e-mail and any attachments against viruses. TEMENOS accepts no liability for any damage caused by any malicious code or virus transmitted by this e-mail.
On Mon, Apr 06, 2020 at 02:19:59PM +0000, Rick Vincent wrote: > I am seeing a performance problem with postgresql v 11.7 on views, and I am wondering if anyone can tell me why or hasany suggestion. > > A table is created as: > > CREATE TABLE "FBNK_CUSTOMER" (RECID VARCHAR(255) NOT NULL PRIMARY KEY, XMLRECORD VARCHAR) > > And contains only 180 rows. > > Doing an explain plan on the view created over this gives: > > EXPLAIN ANALYZE > select RECID from "V_FBNK_CUSTOMER" > > > Subquery Scan on "V_FBNK_CUSTOMER" (cost=0.00..19014.60 rows=180 width=7) (actual time=459.601..78642.189 rows=180 loops=1) > -> Seq Scan on "FBNK_CUSTOMER" a (cost=0.00..19012.80 rows=180 width=14575) (actual time=459.600..78641.950 rows=180loops=1) > > Yet an Explain plan on the underlying table( on select RECID from "FBNK_CUSTOMER") gives: > > Seq Scan on "FBNK_CUSTOMER" (cost=0.00..22.80 rows=180 width=7) (actual time=0.004..0.272 rows=180 loops=1) It still did a seq scan on the table, so I'm not sure what this has to do with index scans ? > The following query takes an extremely long time for only 180 rows, and what this means is that we would have to indexanything appearing in the where clause for every table in order to use views because the views seem not to considerthe select clause. Why is that and does anyone know a way around this? Is there a reason why you don't store the extracted value in its own column ? And maybe keep it up to date using an insert/update trigger on the xmlrecord column. -- Justin
Justin Pryzby <pryzby@telsasoft.com> writes: > On Mon, Apr 06, 2020 at 02:19:59PM +0000, Rick Vincent wrote: >> The following query takes an extremely long time for only 180 rows, and what this means is that we would have to indexanything appearing in the where clause for every table in order to use views because the views seem not to considerthe select clause. Why is that and does anyone know a way around this? > Is there a reason why you don't store the extracted value in its own column ? The planner seems to be quite well aware that the slower query is going to be slower, since the estimated costs are much higher. Since it's not choosing to optimize into a faster form, I wonder whether it's constrained by semantic requirements. In particular, I'm suspicious that some of those functions you have in the view are marked "volatile", preventing them from being optimized away. Beyond that guess, though, there's really not enough info here to say. The info we usually ask for to debug slow-query problems is explained at https://wiki.postgresql.org/wiki/Slow_Query_Questions regards, tom lane
On Mon, 2020-04-06 at 14:19 +0000, Rick Vincent wrote: > I am seeing a performance problem with postgresql v 11.7 on views, and I am wondering if > anyone can tell me why or has any suggestion. Your account is somewhat confused - too many questions rolled into one rant, I would say. There are two points that may clear up the case: - If you have no WHERE clause, a sequential scan of the table is usually the best way to do it. The exception is an index only scan if the index contains all that is required, but in PostgreSQL you need a recently VACUUMed table for that. - The expensive part in your view is the "extractValueJS" function. Try to tune that for better performance. If any of your problems are not explained by that, please say so. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Hi Justin, You said, " Is there a reason why you don't store the extracted value in its own column ?" RV>> It simply is the way the application stores the data. For Oracle we are storing in XML and JSON format, for postgres,due do limitations of XML api, we are storing in VARCHAR. We can't break it out into columns very easily becauseof the legacy application. You said, "It still did a seq scan on the table, so I'm not sure what this has to do with index scans ?" RV>> On Oracle it will use the primary key index because it detects that all of the columns in the select clause are indexable. With Postgres, it might be doing a seq scan but on a 180 rows, a select on the underlying table is many timesfaster than the same select on the view. It seems all of the view columns are being triggered which makes it incrediblyslow. Thanks, Rick -----Original Message----- From: Justin Pryzby <pryzby@telsasoft.com> Sent: Tuesday, April 7, 2020 6:59 AM To: Rick Vincent <rvincent@temenos.com> Cc: pgsql-performance@postgresql.org; Manoj Kumar <manojkumar@temenos.com>; Herve Aubert <haubert@temenos.com> Subject: Re: Postgres not using index on views On Mon, Apr 06, 2020 at 02:19:59PM +0000, Rick Vincent wrote: > I am seeing a performance problem with postgresql v 11.7 on views, and I am wondering if anyone can tell me why or hasany suggestion. > > A table is created as: > > CREATE TABLE "FBNK_CUSTOMER" (RECID VARCHAR(255) NOT NULL PRIMARY KEY, > XMLRECORD VARCHAR) > > And contains only 180 rows. > > Doing an explain plan on the view created over this gives: > > EXPLAIN ANALYZE > select RECID from "V_FBNK_CUSTOMER" > > > Subquery Scan on "V_FBNK_CUSTOMER" (cost=0.00..19014.60 rows=180 width=7) (actual time=459.601..78642.189 rows=180 loops=1) > -> Seq Scan on "FBNK_CUSTOMER" a (cost=0.00..19012.80 rows=180 > width=14575) (actual time=459.600..78641.950 rows=180 loops=1) > > Yet an Explain plan on the underlying table( on select RECID from "FBNK_CUSTOMER") gives: > > Seq Scan on "FBNK_CUSTOMER" (cost=0.00..22.80 rows=180 width=7) > (actual time=0.004..0.272 rows=180 loops=1) It still did a seq scan on the table, so I'm not sure what this has to do with index scans ? > The following query takes an extremely long time for only 180 rows, and what this means is that we would have to indexanything appearing in the where clause for every table in order to use views because the views seem not to considerthe select clause. Why is that and does anyone know a way around this? Is there a reason why you don't store the extracted value in its own column ? And maybe keep it up to date using an insert/update trigger on the xmlrecord column. -- Justin The information in this e-mail and any attachments is confidential and may be legally privileged. It is intended solely forthe addressee or addressees. Any use or disclosure of the contents of this e-mail/attachments by a not intended recipientis unauthorized and may be unlawful. If you have received this e-mail in error please notify the sender. Pleasenote that any views or opinions presented in this e-mail are solely those of the author and do not necessarily representthose of TEMENOS. We recommend that you check this e-mail and any attachments against viruses. TEMENOS accepts noliability for any damage caused by any malicious code or virus transmitted by this e-mail.
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Tuesday, April 7, 2020 7:09 AM
To: Justin Pryzby <pryzby@telsasoft.com>
Cc: Rick Vincent <rvincent@temenos.com>; pgsql-performance@postgresql.org; Manoj Kumar <manojkumar@temenos.com>; Herve Aubert <haubert@temenos.com>
Subject: Re: Postgres not using index on views
The information in this e-mail and any attachments is confidential and may be legally privileged. It is intended solely for the addressee or addressees. Any use or disclosure of the contents of this e-mail/attachments by a not intended recipient is unauthorized and may be unlawful. If you have received this e-mail in error please notify the sender. Please note that any views or opinions presented in this e-mail are solely those of the author and do not necessarily represent those of TEMENOS. We recommend that you check this e-mail and any attachments against viruses. TEMENOS accepts no liability for any damage caused by any malicious code or virus transmitted by this e-mail.
Rick Vincent schrieb am 07.04.2020 um 11:08: > The function is defined as below, so no use of VOLATILE. If you don't specify anything, the default is VOLATILE. So your function *is* volatile. > CREATE OR REPLACE FUNCTION extractValueJS (sVar text, nfm INTEGER, nvm INTEGER) > RETURNS VARCHAR as $$ > declare > sRet text := ''; > nSize int := 0; > retVal int := 0; > cVar text[] := regexp_split_to_array(sVar,''); > idx int := 1; > nStart int := 0; > nEnd int := 0; > begin > etc... > return sRet; > end; > $$ LANGUAGE plpgsql; You haven't shown us your actual code, but if you can turn that into a "language sql" function (defined as immutable, orat least stable), I would expect it to be way more efficient. Thomas
> RV>> It simply is the way the application stores the data. For Oracle > we are storing in XML and JSON format, for postgres, due do > limitations of XML api, we are storing in VARCHAR. Why not use JSON in Postgres then? Postgres' JSON functions are at least as powerful as Oracle's (if not better in a lot of areas). Would be interesting to see what XML function/feature from Oracle you can't replicate/migrate to Postgres. Another option might be to upgrade to Postgres 12 and define those columns as generated columns as part of the table, ratherthan a view. Then you only pay the performance penalty of the extracValueJS() function when you update the table, not for every select. Thomas
From: Rick Vincent
Sent: Tuesday, April 7, 2020 11:08 AM
To: 'Tom Lane' <tgl@sss.pgh.pa.us>; Justin Pryzby <pryzby@telsasoft.com>
Cc: pgsql-performance@postgresql.org; Manoj Kumar <manojkumar@temenos.com>; Herve Aubert <haubert@temenos.com>
Subject: RE: Postgres not using index on views
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Tuesday, April 7, 2020 7:09 AM
To: Justin Pryzby <pryzby@telsasoft.com>
Cc: Rick Vincent <rvincent@temenos.com>; pgsql-performance@postgresql.org; Manoj Kumar <manojkumar@temenos.com>; Herve Aubert <haubert@temenos.com>
Subject: Re: Postgres not using index on views
The information in this e-mail and any attachments is confidential and may be legally privileged. It is intended solely for the addressee or addressees. Any use or disclosure of the contents of this e-mail/attachments by a not intended recipient is unauthorized and may be unlawful. If you have received this e-mail in error please notify the sender. Please note that any views or opinions presented in this e-mail are solely those of the author and do not necessarily represent those of TEMENOS. We recommend that you check this e-mail and any attachments against viruses. TEMENOS accepts no liability for any damage caused by any malicious code or virus transmitted by this e-mail.
Hi,I was wondering if anyone can explain the below problem. Should a bug be logged for this?Kind regards,Rick_____________________________________________
From: Rick Vincent
Sent: Tuesday, April 7, 2020 11:08 AM
To: 'Tom Lane' <tgl@sss.pgh.pa.us>; Justin Pryzby <pryzby@telsasoft.com>
Cc: pgsql-performance@postgresql.org; Manoj Kumar <manojkumar@temenos.com>; Herve Aubert <haubert@temenos.com>
Subject: RE: Postgres not using index on viewsHi Tom,The function is defined as below, so no use of VOLATILE. Let me know if you need any other information. I am hoping the below will further clarify the issue.
Hi David,
Oh, okay…I missed that implied part. Will try it and post back.
Thanks,
Rick
From: David G. Johnston <david.g.johnston@gmail.com>
Sent: Friday, April 17, 2020 4:55 PM
To: Rick Vincent <rvincent@temenos.com>
Cc: Tom Lane <tgl@sss.pgh.pa.us>; Justin Pryzby <pryzby@telsasoft.com>; pgsql-performance@postgresql.org; Manoj Kumar <manojkumar@temenos.com>; Herve Aubert <haubert@temenos.com>
Subject: Postgres not using index on views
On Friday, April 17, 2020, Rick Vincent <rvincent@temenos.com> wrote:
Hi,
I was wondering if anyone can explain the below problem. Should a bug be logged for this?
Kind regards,
Rick
_____________________________________________
From: Rick Vincent
Sent: Tuesday, April 7, 2020 11:08 AM
To: 'Tom Lane' <tgl@sss.pgh.pa.us>; Justin Pryzby <pryzby@telsasoft.com>
Cc: pgsql-performance@postgresql.org; Manoj Kumar <manojkumar@temenos.com>; Herve Aubert <haubert@temenos.com>
Subject: RE: Postgres not using index on views
Hi Tom,
The function is defined as below, so no use of VOLATILE. Let me know if you need any other information. I am hoping the below will further clarify the issue.
IIUC as Tom wrote you have volatile functions (implied/default as Thomas wrote) attached to view column outputs and the planner will not optimize those away.
Mark your function immutable (assuming it is) and retry your experiment with the where clause query.
David J.
The information in this e-mail and any attachments is confidential and may be legally privileged. It is intended solely for the addressee or addressees. Any use or disclosure of the contents of this e-mail/attachments by a not intended recipient is unauthorized and may be unlawful. If you have received this e-mail in error please notify the sender. Please note that any views or opinions presented in this e-mail are solely those of the author and do not necessarily represent those of TEMENOS. We recommend that you check this e-mail and any attachments against viruses. TEMENOS accepts no liability for any damage caused by any malicious code or virus transmitted by this e-mail.
Dear all,
Changing the function signature to IMMUTABLE worked like a dream. No issue now. Sorry for my confusion on VOLATILE being created as the default. Thanks to everyone for your help!
Kind regards,
Rick Vincent
From: David G. Johnston <david.g.johnston@gmail.com>
Sent: Friday, April 17, 2020 4:55 PM
To: Rick Vincent <rvincent@temenos.com>
Cc: Tom Lane <tgl@sss.pgh.pa.us>; Justin Pryzby <pryzby@telsasoft.com>; pgsql-performance@postgresql.org; Manoj Kumar <manojkumar@temenos.com>; Herve Aubert <haubert@temenos.com>
Subject: Postgres not using index on views
On Friday, April 17, 2020, Rick Vincent <rvincent@temenos.com> wrote:
Hi,
I was wondering if anyone can explain the below problem. Should a bug be logged for this?
Kind regards,
Rick
_____________________________________________
From: Rick Vincent
Sent: Tuesday, April 7, 2020 11:08 AM
To: 'Tom Lane' <tgl@sss.pgh.pa.us>; Justin Pryzby <pryzby@telsasoft.com>
Cc: pgsql-performance@postgresql.org; Manoj Kumar <manojkumar@temenos.com>; Herve Aubert <haubert@temenos.com>
Subject: RE: Postgres not using index on views
Hi Tom,
The function is defined as below, so no use of VOLATILE. Let me know if you need any other information. I am hoping the below will further clarify the issue.
IIUC as Tom wrote you have volatile functions (implied/default as Thomas wrote) attached to view column outputs and the planner will not optimize those away.
Mark your function immutable (assuming it is) and retry your experiment with the where clause query.
David J.
The information in this e-mail and any attachments is confidential and may be legally privileged. It is intended solely for the addressee or addressees. Any use or disclosure of the contents of this e-mail/attachments by a not intended recipient is unauthorized and may be unlawful. If you have received this e-mail in error please notify the sender. Please note that any views or opinions presented in this e-mail are solely those of the author and do not necessarily represent those of TEMENOS. We recommend that you check this e-mail and any attachments against viruses. TEMENOS accepts no liability for any damage caused by any malicious code or virus transmitted by this e-mail.