Re: Plan to support predicate push-down into subqueries with aggregates?

Поиск
Список
Период
Сортировка
От David G. Johnston
Тема Re: Plan to support predicate push-down into subqueries with aggregates?
Дата
Msg-id CAKFQuwbfSUdCGPDsJJcd4_M15MiU4sUzNWNoPnA27+ZjZ=+fYQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Plan to support predicate push-down into subqueries with aggregates?  (Adam Brusselback <adambrusselback@gmail.com>)
Ответы Re: Plan to support predicate push-down into subqueries with aggregates?  (Adam Brusselback <adambrusselback@gmail.com>)
Список pgsql-general
We bottom (or inline) post on these lists.

On Tue, Mar 8, 2016 at 3:37 PM, Adam Brusselback <adambrusselback@gmail.com> wrote:
On Tue, Mar 8, 2016 at 5:17 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Adam Brusselback <adambrusselback@gmail.com> writes:
> I was wondering if there were any plans to support predicate push-down
> optimization for subqueries (and views) with aggregates?

Right offhand I would say that that's a complete mischaracterization
of the problem.  I've not tried to run your test case, but you say

> --Quick, is able to push down because it's a simple equality check
> SELECT *
> FROM header
> INNER JOIN header_total
> USING (header_id)
> WHERE header.header_id = 26;
>
> --Slow, no pushdown
> SELECT *
> FROM header
> INNER JOIN header_total
> USING (header_id)
> WHERE header.header_id < 200;
>
> --Slow, no pushdown
> SELECT *
> FROM header
> INNER JOIN header_total
> USING (header_id)
> WHERE header.description like '%5%';

There's no preference for equalities over other kinds of predicates
as far as subquery pushdown is concerned.  I think what your real
problem is is that in the first case, the system will derive the
additional condition "header_total.header_id = 26", while in the
second case it will not deduce "header_total.header_id < 200".
That's because the machinery for deducing such implied constraints
works only with equalities.  That's not very likely to change anytime
soon, and even if it did, the inference would only extend to operators
that are members of the same btree family as the join equality operator.
Your example with a LIKE clause is always going to be out in the cold,
because there is no principled basis for the planner to decide that
"a = b" means that "a LIKE x" and "b LIKE x" will give the same result.
It hasn't got enough information about the behavior of LIKE to know
if that's safe or not.  (It does, on the other hand, know very well that
SQL equality operators don't necessarily guarantee bitwise identity.)

So I'd suggest just modifying your queries to write out both constraints
explicitly.

Thanks Tom, appreciate the reply.

Sorry if I didn't call it the correct thing. I just know that with trying to encapsulate this aggregate logic in a view, I am unable to use that view in a query that I know is only going to touch a subset of the data without incurring a performance hit from the view doing seq scans on all of the rows in the detail_1 and detail_2 tables, and then throwing out 99% of the results when the filter is applied.

I had initially started creating functions that would take an array of ids as a parameter, and manually push them down in the subqueries.  That got really really messy though, and we moved away from doing that to having the aggregates eagerly materialized to a table with triggers.


Are there any other options for making this type of query faster?  It could be that I just am totally missing a better way to do this.  I do really want to be able to contain that logic within a view of some sort though, as a bunch of other stuff is built on top of that.  Having to push that aggregate query into all of those other queries would be hell.


​Maybe something like:

CREATE TYPE header_total_item (
     amount numeric,
     count    integer
)​;

CREATE VIEW AS
SELECT header_id, header_amount
COALESCE(
(SELECT ROW(sum(rate * quantity), count(detail_1_id))::header_total_item FROM detail_1 WHERE detail_1.header_id = header.header_id)
ROW(0.000,0)::header_total_item
) AS detail_1_total_item
FROM header;

​​Doesn't solve the "balance" column without going into a sub-query...which I suspect is possible but don't have time to test.

I do understand the question as to why the view cannot be re-written as:

I don't have time to get into this deeper right now.  The main question is basically how can you force a parameterized nested join on the LEFT JOIN sub-queries given that the system has equality joins between them and header and you've supposedly provided a very selective predicate for the rows being returned by header.

David J.

В списке pgsql-general по дате отправления:

Предыдущее
От: Adam Brusselback
Дата:
Сообщение: Re: Plan to support predicate push-down into subqueries with aggregates?
Следующее
От: "Thiemo Kellner, NHC Barhufpflege"
Дата:
Сообщение: Re: Logger into table and/or to cli