Обсуждение: Terrible plan choice for view with distinct on clause

Поиск
Список
Период
Сортировка

Terrible plan choice for view with distinct on clause

От
Adam Brusselback
Дата:
Hey all, first off, Postgres version 9.4.4 (also tested on 9.5 beta).

I have been having a pretty hard time getting a view of mine to play nice with any other queries I need it for.

I have a few tables you'd need to know about to understand why i'm doing what i'm doing.

First thing is we have a "contract_item", it can have either a product or a grouping of products (only one or the other, not both) on it, and a few extra attributes.

The groupings are defined in a hierarchy, and they can have many of these groupings or products on a single "contract", but only unique records per product or grouping.

Now when it comes down to finding what the extra attributes that are valid on a contract are for the specific product you're looking for, the most relevant is the product being on the contract directly. If the product doesn't exist, we choose the lowest level grouping that is defined on the contract that contains that product.

The view I am having trouble with is able to push down it's where clause when the id's are directly specified like so:
SELECT *
FROM contract_product cp
WHERE cp.contract_id = '16d6df05-d8a0-4ec9-ae39-f4d8e13da597'
AND cp.product_id = '00c117d7-6451-4842-b17b-baa44baa375f';

But the where clause or join conditions are not pushed down in these cases (which is how I need to use the view): 
SELECT *
FROM contract_product cp
WHERE EXISTS (
SELECT 1
WHERE cp.contract_id = '16d6df05-d8a0-4ec9-ae39-f4d8e13da597'
AND cp.product_id = '00c117d7-6451-4842-b17b-baa44baa375f'
);

or 

SELECT *
FROM contract_product cp
INNER JOIN (
SELECT '16d6df05-d8a0-4ec9-ae39-f4d8e13da597'::uuid as contract_id, '00c117d7-6451-4842-b17b-baa44baa375f'::uuid as product_id
) p
ON cp.contract_id = p.contract_id
AND cp.product_id = p.product_id;


The definition of the view i'm having trouble with:
CREATE OR REPLACE VIEW contract_product AS 
SELECT DISTINCT ON (ci.contract_id, p.product_id) 
  ci.contract_item_id,
  ci.contract_id,
  p.product_id,
  ci.uom_type_id,
  ci.rebate_direct_rate,
  ci.decimal_model,
  ci.rebate_deviated_value,
  ci.rebate_deviated_type
FROM contract_item ci
LEFT JOIN grouping_hierarchy gh
  ON gh.original_grouping_id = ci.grouping_id
  AND NOT (EXISTS (
  SELECT 1
  FROM contract_item cig
  WHERE cig.contract_id = ci.contract_id
  AND gh.grouping_id = cig.grouping_id
  AND cig.grouping_id <> ci.grouping_id)
  )
LEFT JOIN product_grouping pg
  ON pg.grouping_id = gh.grouping_id
  AND NOT (EXISTS (
  SELECT 1
  FROM contract_item cip
  WHERE cip.contract_id = ci.contract_id
  AND pg.product_id = cip.product_id)
  )
JOIN product p
  ON p.product_id = COALESCE(ci.product_id, pg.product_id)
ORDER BY ci.contract_id, p.product_id, gh.level;

That view references another view to make it easy to find the correct level in my hierarchy, so here is the definition for that:
CREATE OR REPLACE VIEW grouping_hierarchy AS 
 WITH RECURSIVE groupings_list(original_grouping_id, parent_grouping_id, grouping_id) AS (
         SELECT pg.grouping_id AS original_grouping_id,
            pg.parent_grouping_id,
            pg.grouping_id,
            0 AS level
           FROM grouping pg
        UNION ALL
         SELECT gl.original_grouping_id,
            cg.parent_grouping_id,
            cg.grouping_id,
            gl.level + 1
           FROM groupings_list gl
             JOIN grouping cg ON cg.parent_grouping_id = gl.grouping_id
          WHERE cg.active_ind = true
        )
 SELECT groupings_list.original_grouping_id,
    groupings_list.parent_grouping_id,
    groupings_list.grouping_id,
    groupings_list.level
   FROM groupings_list;


And here are the query plans (in order) for those three queries:

Any help would be greatly appreciated on how to speed this up, or if i'm doing something Postgres just doesn't like and what an alternative method would be.

Thanks,
-Adam

Re: Terrible plan choice for view with distinct on clause

От
Tom Lane
Дата:
Adam Brusselback <adambrusselback@gmail.com> writes:
> The view I am having trouble with is able to push down it's where clause
> when the id's are directly specified like so:
> SELECT *
> FROM contract_product cp
> WHERE cp.contract_id = '16d6df05-d8a0-4ec9-ae39-f4d8e13da597'
> AND cp.product_id = '00c117d7-6451-4842-b17b-baa44baa375f';

> But the where clause or join conditions are not pushed down in these cases
> (which is how I need to use the view):
> SELECT *
> FROM contract_product cp
> WHERE EXISTS (
> SELECT 1
> WHERE cp.contract_id = '16d6df05-d8a0-4ec9-ae39-f4d8e13da597'
> AND cp.product_id = '00c117d7-6451-4842-b17b-baa44baa375f'
> );

This plea for help would be more convincing if you could explain *why*
you needed to do that.  As is, it sure looks like "Doctor, it hurts when
I do this".  What about that construction isn't just silly?

(And if you say "it's produced by an ORM I have no control over", I'm
going to say "your ORM was evidently written by blithering idiots, and
you should not have any faith in it".)

Having said that, the reason nothing good happens is that
convert_EXISTS_sublink_to_join() punts on subqueries that have an empty
FROM clause, as well as some other corner cases that I did not care to
analyze carefully at the time.  Just looking at this example, it seems
like if the SELECT list is trivial then we could simply replace the EXISTS
clause in toto with the contents of the lower WHERE clause, thereby
undoing the silliness of the query author.  I don't think this could be
handled directly in convert_EXISTS_sublink_to_join(), because it's defined
to return a JoinExpr which would not apply in such a case.  But possibly
it could be dealt with in make_subplan() without too much overhead.  I'm
not feeling motivated to work on this myself, absent a more convincing
explanation of why we should expend any effort to support this query
pattern.  But if anyone else is, have at it.

            regards, tom lane


Re: Terrible plan choice for view with distinct on clause

От
Adam Brusselback
Дата:
No ORM, just me.  
Was somewhat similar to something I had seen done at an old job, but they used SQL Server and that type of query worked fine there.

There were a couple business cases that had to be satisfied, which is why I went the way I did:
The first was "allow products to be grouped together, and those groups be placed in a hierarchy. All of the products in child groupings are valid for the parent of the grouping. Products should be able to be added and removed from this group at any point."
The next was "allow a user to add a product, or a group of products to a contract and set pricing information, product is the most definitive and overrides any groupings the product may be in, and the lowest level of the grouping hierarchy should be used if the product is not directly on the contract."
The last was "adding and removing products from a group should immediately take effect to make those products valid or invalid on any contracts that grouping is a part of."

Now I am not going to say I love this design, I actually am not a fan of it at all. I just couldn't think of any other design pattern that would meet those business requirements. I was hoping to create a view to make working with the final result the rules specified above easy when you want to know what pricing is valid for a specific product on a contract.

So that is the "why" at least.

On Thu, Dec 17, 2015 at 12:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Adam Brusselback <adambrusselback@gmail.com> writes:
> The view I am having trouble with is able to push down it's where clause
> when the id's are directly specified like so:
> SELECT *
> FROM contract_product cp
> WHERE cp.contract_id = '16d6df05-d8a0-4ec9-ae39-f4d8e13da597'
> AND cp.product_id = '00c117d7-6451-4842-b17b-baa44baa375f';

> But the where clause or join conditions are not pushed down in these cases
> (which is how I need to use the view):
> SELECT *
> FROM contract_product cp
> WHERE EXISTS (
> SELECT 1
> WHERE cp.contract_id = '16d6df05-d8a0-4ec9-ae39-f4d8e13da597'
> AND cp.product_id = '00c117d7-6451-4842-b17b-baa44baa375f'
> );

This plea for help would be more convincing if you could explain *why*
you needed to do that.  As is, it sure looks like "Doctor, it hurts when
I do this".  What about that construction isn't just silly?

(And if you say "it's produced by an ORM I have no control over", I'm
going to say "your ORM was evidently written by blithering idiots, and
you should not have any faith in it".)

Having said that, the reason nothing good happens is that
convert_EXISTS_sublink_to_join() punts on subqueries that have an empty
FROM clause, as well as some other corner cases that I did not care to
analyze carefully at the time.  Just looking at this example, it seems
like if the SELECT list is trivial then we could simply replace the EXISTS
clause in toto with the contents of the lower WHERE clause, thereby
undoing the silliness of the query author.  I don't think this could be
handled directly in convert_EXISTS_sublink_to_join(), because it's defined
to return a JoinExpr which would not apply in such a case.  But possibly
it could be dealt with in make_subplan() without too much overhead.  I'm
not feeling motivated to work on this myself, absent a more convincing
explanation of why we should expend any effort to support this query
pattern.  But if anyone else is, have at it.

                        regards, tom lane

Re: Terrible plan choice for view with distinct on clause

От
Adam Brusselback
Дата:
Also, sorry if I wasn't clear. Those two example queries above that performed badly were not exact queries that I would use, they were just simple examples that performed identically to something like this (or the exists version of the same query):

SELECT cp.*
FROM contract_product cp
INNER JOIN claim_product clp
ON cp.contract_id = clp.contract_id
WHERE clp.claim_id = 'whatever';


On Thu, Dec 17, 2015 at 1:08 PM, Adam Brusselback <adambrusselback@gmail.com> wrote:
No ORM, just me.  
Was somewhat similar to something I had seen done at an old job, but they used SQL Server and that type of query worked fine there.

There were a couple business cases that had to be satisfied, which is why I went the way I did:
The first was "allow products to be grouped together, and those groups be placed in a hierarchy. All of the products in child groupings are valid for the parent of the grouping. Products should be able to be added and removed from this group at any point."
The next was "allow a user to add a product, or a group of products to a contract and set pricing information, product is the most definitive and overrides any groupings the product may be in, and the lowest level of the grouping hierarchy should be used if the product is not directly on the contract."
The last was "adding and removing products from a group should immediately take effect to make those products valid or invalid on any contracts that grouping is a part of."

Now I am not going to say I love this design, I actually am not a fan of it at all. I just couldn't think of any other design pattern that would meet those business requirements. I was hoping to create a view to make working with the final result the rules specified above easy when you want to know what pricing is valid for a specific product on a contract.

So that is the "why" at least.

On Thu, Dec 17, 2015 at 12:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Adam Brusselback <adambrusselback@gmail.com> writes:
> The view I am having trouble with is able to push down it's where clause
> when the id's are directly specified like so:
> SELECT *
> FROM contract_product cp
> WHERE cp.contract_id = '16d6df05-d8a0-4ec9-ae39-f4d8e13da597'
> AND cp.product_id = '00c117d7-6451-4842-b17b-baa44baa375f';

> But the where clause or join conditions are not pushed down in these cases
> (which is how I need to use the view):
> SELECT *
> FROM contract_product cp
> WHERE EXISTS (
> SELECT 1
> WHERE cp.contract_id = '16d6df05-d8a0-4ec9-ae39-f4d8e13da597'
> AND cp.product_id = '00c117d7-6451-4842-b17b-baa44baa375f'
> );

This plea for help would be more convincing if you could explain *why*
you needed to do that.  As is, it sure looks like "Doctor, it hurts when
I do this".  What about that construction isn't just silly?

(And if you say "it's produced by an ORM I have no control over", I'm
going to say "your ORM was evidently written by blithering idiots, and
you should not have any faith in it".)

Having said that, the reason nothing good happens is that
convert_EXISTS_sublink_to_join() punts on subqueries that have an empty
FROM clause, as well as some other corner cases that I did not care to
analyze carefully at the time.  Just looking at this example, it seems
like if the SELECT list is trivial then we could simply replace the EXISTS
clause in toto with the contents of the lower WHERE clause, thereby
undoing the silliness of the query author.  I don't think this could be
handled directly in convert_EXISTS_sublink_to_join(), because it's defined
to return a JoinExpr which would not apply in such a case.  But possibly
it could be dealt with in make_subplan() without too much overhead.  I'm
not feeling motivated to work on this myself, absent a more convincing
explanation of why we should expend any effort to support this query
pattern.  But if anyone else is, have at it.

                        regards, tom lane