Обсуждение: Terrible plan choice for view with distinct on clause
Hey all, first off, Postgres version 9.4.4 (also tested on 9.5 beta).
I have been having a pretty hard time getting a view of mine to play nice with any other queries I need it for.
I have a few tables you'd need to know about to understand why i'm doing what i'm doing.
First thing is we have a "contract_item", it can have either a product or a grouping of products (only one or the other, not both) on it, and a few extra attributes.
The groupings are defined in a hierarchy, and they can have many of these groupings or products on a single "contract", but only unique records per product or grouping.
Now when it comes down to finding what the extra attributes that are valid on a contract are for the specific product you're looking for, the most relevant is the product being on the contract directly. If the product doesn't exist, we choose the lowest level grouping that is defined on the contract that contains that product.
The view I am having trouble with is able to push down it's where clause when the id's are directly specified like so:
SELECT *
FROM contract_product cp
WHERE cp.contract_id = '16d6df05-d8a0-4ec9-ae39-f4d8e13da597'
AND cp.product_id = '00c117d7-6451-4842-b17b-baa44baa375f';
But the where clause or join conditions are not pushed down in these cases (which is how I need to use the view):
SELECT *
FROM contract_product cp
WHERE EXISTS (
SELECT 1
WHERE cp.contract_id = '16d6df05-d8a0-4ec9-ae39-f4d8e13da597'
AND cp.product_id = '00c117d7-6451-4842-b17b-baa44baa375f'
);
or
SELECT *
FROM contract_product cp
INNER JOIN (
SELECT '16d6df05-d8a0-4ec9-ae39-f4d8e13da597'::uuid as contract_id, '00c117d7-6451-4842-b17b-baa44baa375f'::uuid as product_id
) p
ON cp.contract_id = p.contract_id
AND cp.product_id = p.product_id;
The definition of the view i'm having trouble with:
CREATE OR REPLACE VIEW contract_product AS
SELECT DISTINCT ON (ci.contract_id, p.product_id)
ci.contract_item_id,
ci.contract_id,
p.product_id,
ci.uom_type_id,
ci.rebate_direct_rate,
ci.decimal_model,
ci.rebate_deviated_value,
ci.rebate_deviated_type
FROM contract_item ci
LEFT JOIN grouping_hierarchy gh
ON gh.original_grouping_id = ci.grouping_id
AND NOT (EXISTS (
SELECT 1
FROM contract_item cig
WHERE cig.contract_id = ci.contract_id
AND gh.grouping_id = cig.grouping_id
AND cig.grouping_id <> ci.grouping_id)
)
LEFT JOIN product_grouping pg
ON pg.grouping_id = gh.grouping_id
AND NOT (EXISTS (
SELECT 1
FROM contract_item cip
WHERE cip.contract_id = ci.contract_id
AND pg.product_id = cip.product_id)
)
JOIN product p
ON p.product_id = COALESCE(ci.product_id, pg.product_id)
ORDER BY ci.contract_id, p.product_id, gh.level;
That view references another view to make it easy to find the correct level in my hierarchy, so here is the definition for that:
CREATE OR REPLACE VIEW grouping_hierarchy AS
WITH RECURSIVE groupings_list(original_grouping_id, parent_grouping_id, grouping_id) AS (
SELECT pg.grouping_id AS original_grouping_id,
pg.parent_grouping_id,
pg.grouping_id,
0 AS level
FROM grouping pg
UNION ALL
SELECT gl.original_grouping_id,
cg.parent_grouping_id,
cg.grouping_id,
gl.level + 1
FROM groupings_list gl
JOIN grouping cg ON cg.parent_grouping_id = gl.grouping_id
WHERE cg.active_ind = true
)
SELECT groupings_list.original_grouping_id,
groupings_list.parent_grouping_id,
groupings_list.grouping_id,
groupings_list.level
FROM groupings_list;
And here are the query plans (in order) for those three queries:
Any help would be greatly appreciated on how to speed this up, or if i'm doing something Postgres just doesn't like and what an alternative method would be.
Thanks,
-Adam
Adam Brusselback <adambrusselback@gmail.com> writes: > The view I am having trouble with is able to push down it's where clause > when the id's are directly specified like so: > SELECT * > FROM contract_product cp > WHERE cp.contract_id = '16d6df05-d8a0-4ec9-ae39-f4d8e13da597' > AND cp.product_id = '00c117d7-6451-4842-b17b-baa44baa375f'; > But the where clause or join conditions are not pushed down in these cases > (which is how I need to use the view): > SELECT * > FROM contract_product cp > WHERE EXISTS ( > SELECT 1 > WHERE cp.contract_id = '16d6df05-d8a0-4ec9-ae39-f4d8e13da597' > AND cp.product_id = '00c117d7-6451-4842-b17b-baa44baa375f' > ); This plea for help would be more convincing if you could explain *why* you needed to do that. As is, it sure looks like "Doctor, it hurts when I do this". What about that construction isn't just silly? (And if you say "it's produced by an ORM I have no control over", I'm going to say "your ORM was evidently written by blithering idiots, and you should not have any faith in it".) Having said that, the reason nothing good happens is that convert_EXISTS_sublink_to_join() punts on subqueries that have an empty FROM clause, as well as some other corner cases that I did not care to analyze carefully at the time. Just looking at this example, it seems like if the SELECT list is trivial then we could simply replace the EXISTS clause in toto with the contents of the lower WHERE clause, thereby undoing the silliness of the query author. I don't think this could be handled directly in convert_EXISTS_sublink_to_join(), because it's defined to return a JoinExpr which would not apply in such a case. But possibly it could be dealt with in make_subplan() without too much overhead. I'm not feeling motivated to work on this myself, absent a more convincing explanation of why we should expend any effort to support this query pattern. But if anyone else is, have at it. regards, tom lane
No ORM, just me.
Was somewhat similar to something I had seen done at an old job, but they used SQL Server and that type of query worked fine there.
There were a couple business cases that had to be satisfied, which is why I went the way I did:
The first was "allow products to be grouped together, and those groups be placed in a hierarchy. All of the products in child groupings are valid for the parent of the grouping. Products should be able to be added and removed from this group at any point."
The next was "allow a user to add a product, or a group of products to a contract and set pricing information, product is the most definitive and overrides any groupings the product may be in, and the lowest level of the grouping hierarchy should be used if the product is not directly on the contract."
The last was "adding and removing products from a group should immediately take effect to make those products valid or invalid on any contracts that grouping is a part of."
Now I am not going to say I love this design, I actually am not a fan of it at all. I just couldn't think of any other design pattern that would meet those business requirements. I was hoping to create a view to make working with the final result the rules specified above easy when you want to know what pricing is valid for a specific product on a contract.
So that is the "why" at least.
On Thu, Dec 17, 2015 at 12:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Adam Brusselback <adambrusselback@gmail.com> writes:
> The view I am having trouble with is able to push down it's where clause
> when the id's are directly specified like so:
> SELECT *
> FROM contract_product cp
> WHERE cp.contract_id = '16d6df05-d8a0-4ec9-ae39-f4d8e13da597'
> AND cp.product_id = '00c117d7-6451-4842-b17b-baa44baa375f';
> But the where clause or join conditions are not pushed down in these cases
> (which is how I need to use the view):
> SELECT *
> FROM contract_product cp
> WHERE EXISTS (
> SELECT 1
> WHERE cp.contract_id = '16d6df05-d8a0-4ec9-ae39-f4d8e13da597'
> AND cp.product_id = '00c117d7-6451-4842-b17b-baa44baa375f'
> );
This plea for help would be more convincing if you could explain *why*
you needed to do that. As is, it sure looks like "Doctor, it hurts when
I do this". What about that construction isn't just silly?
(And if you say "it's produced by an ORM I have no control over", I'm
going to say "your ORM was evidently written by blithering idiots, and
you should not have any faith in it".)
Having said that, the reason nothing good happens is that
convert_EXISTS_sublink_to_join() punts on subqueries that have an empty
FROM clause, as well as some other corner cases that I did not care to
analyze carefully at the time. Just looking at this example, it seems
like if the SELECT list is trivial then we could simply replace the EXISTS
clause in toto with the contents of the lower WHERE clause, thereby
undoing the silliness of the query author. I don't think this could be
handled directly in convert_EXISTS_sublink_to_join(), because it's defined
to return a JoinExpr which would not apply in such a case. But possibly
it could be dealt with in make_subplan() without too much overhead. I'm
not feeling motivated to work on this myself, absent a more convincing
explanation of why we should expend any effort to support this query
pattern. But if anyone else is, have at it.
regards, tom lane
Also, sorry if I wasn't clear. Those two example queries above that performed badly were not exact queries that I would use, they were just simple examples that performed identically to something like this (or the exists version of the same query):
SELECT cp.*
FROM contract_product cp
INNER JOIN claim_product clp
ON cp.contract_id = clp.contract_id
WHERE clp.claim_id = 'whatever';
On Thu, Dec 17, 2015 at 1:08 PM, Adam Brusselback <adambrusselback@gmail.com> wrote:
No ORM, just me.Was somewhat similar to something I had seen done at an old job, but they used SQL Server and that type of query worked fine there.There were a couple business cases that had to be satisfied, which is why I went the way I did:The first was "allow products to be grouped together, and those groups be placed in a hierarchy. All of the products in child groupings are valid for the parent of the grouping. Products should be able to be added and removed from this group at any point."The next was "allow a user to add a product, or a group of products to a contract and set pricing information, product is the most definitive and overrides any groupings the product may be in, and the lowest level of the grouping hierarchy should be used if the product is not directly on the contract."The last was "adding and removing products from a group should immediately take effect to make those products valid or invalid on any contracts that grouping is a part of."Now I am not going to say I love this design, I actually am not a fan of it at all. I just couldn't think of any other design pattern that would meet those business requirements. I was hoping to create a view to make working with the final result the rules specified above easy when you want to know what pricing is valid for a specific product on a contract.So that is the "why" at least.On Thu, Dec 17, 2015 at 12:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:Adam Brusselback <adambrusselback@gmail.com> writes:
> The view I am having trouble with is able to push down it's where clause
> when the id's are directly specified like so:
> SELECT *
> FROM contract_product cp
> WHERE cp.contract_id = '16d6df05-d8a0-4ec9-ae39-f4d8e13da597'
> AND cp.product_id = '00c117d7-6451-4842-b17b-baa44baa375f';
> But the where clause or join conditions are not pushed down in these cases
> (which is how I need to use the view):
> SELECT *
> FROM contract_product cp
> WHERE EXISTS (
> SELECT 1
> WHERE cp.contract_id = '16d6df05-d8a0-4ec9-ae39-f4d8e13da597'
> AND cp.product_id = '00c117d7-6451-4842-b17b-baa44baa375f'
> );
This plea for help would be more convincing if you could explain *why*
you needed to do that. As is, it sure looks like "Doctor, it hurts when
I do this". What about that construction isn't just silly?
(And if you say "it's produced by an ORM I have no control over", I'm
going to say "your ORM was evidently written by blithering idiots, and
you should not have any faith in it".)
Having said that, the reason nothing good happens is that
convert_EXISTS_sublink_to_join() punts on subqueries that have an empty
FROM clause, as well as some other corner cases that I did not care to
analyze carefully at the time. Just looking at this example, it seems
like if the SELECT list is trivial then we could simply replace the EXISTS
clause in toto with the contents of the lower WHERE clause, thereby
undoing the silliness of the query author. I don't think this could be
handled directly in convert_EXISTS_sublink_to_join(), because it's defined
to return a JoinExpr which would not apply in such a case. But possibly
it could be dealt with in make_subplan() without too much overhead. I'm
not feeling motivated to work on this myself, absent a more convincing
explanation of why we should expend any effort to support this query
pattern. But if anyone else is, have at it.
regards, tom lane