Re: BUG #15160: planner overestimates number of rows in join when there are more than 200 rows coming from CTE

The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           not tested
Documentation:            not tested

This patch applies cleanly and works for the case described in the original
email. All existing regression tests pass with the addition of the explain plan
update included in the patch.
 
I could not devise an example in which the previous method of calculating
selectivity would have produced a better estimate. However, one question I have
after thinking through the optimization is the following:
 
This new selectivity calculation (targeted at semi-joins, though it is in eqjoinsel) is:
    selectivity = Min(semi-join selectivity, ntuples inner * inner-join selectivity);
 
For a join in which you do not have access to MCVs and assuming no NULLs, the
inner-join selectivity used to compare to the calculated semi-join selectivity*
is:
 
    selectivity = ntuples2 * 1 / max(nd1, nd2) = ntuples2 / max(nd1, nd2)
 
    if nd1 <= nd2:
        selectivity = ntuples2 / nd2
    else:
        selectivity = ntuples2 / nd1
 
To summarize:
 
Selectivity Type                  |  if nd1 <= nd2 |   if nd1 > nd2 |
----------------------------------|----------------|-----------------
inner-join selectivity * ntuples2 | ntuples2 / nd2 | ntuples2 / nd1 |
semi-join selectivity             |              1 |      nd2 / nd1 |
 
Notice that ntuples2 >= nd2, so no matter what nd1 and nd2 are:
 
    inner-join selectivity * ntuples2 >= semi-join selectivity
 
So it seems like, unless you are missing NDVs for one side of the join, the
inner-join-based bound can never be less than the semi-join selectivity. If
this is true, why not use the default NDVs number in the semi-join selectivity
calculation?
 
* based on these summaries of the formulas for calculating selectivity
 
    inner-join selectivity when you don't have access to MCVs and assuming no NULLs is
        ~ 1 / max(nd1,nd2)
        if either nd1 or nd2 was default, nd[1,2] = min(ntuples[1,2], 200)
 
    semi-join selectivity when you don't have access to MCVs and assuming no NULLs is
        0.5 when either nd1 or nd2 is default
        when neither are default
            if nd2 < 0 || nd1 <= nd2
                1
            else
                nd2/nd1
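 
To make the comparison concrete, here is a small standalone program (toy
numbers only; it encodes just the simplified formulas above, not the actual
selfuncs.c logic) that checks the inequality over a range of nd1 values:

    #include <stdio.h>

    /* ntuples2 * inner-join selectivity, per the summary above */
    static double
    inner_bound(double ntuples2, double nd1, double nd2)
    {
        return ntuples2 / (nd1 > nd2 ? nd1 : nd2);
    }

    /* semi-join selectivity, per the summary above (non-default NDVs) */
    static double
    semi_selec(double nd1, double nd2)
    {
        return (nd1 <= nd2) ? 1.0 : nd2 / nd1;
    }

    int
    main(void)
    {
        double      ntuples2 = 1000;
        double      nd2 = 100;      /* ntuples2 >= nd2 by definition */
        double      nd1;

        for (nd1 = 10; nd1 <= 100000; nd1 *= 10)
            printf("nd1 = %6.0f: bound = %7.2f, semi = %5.3f\n",
                   nd1, inner_bound(ntuples2, nd1, nd2),
                   semi_selec(nd1, nd2));
        return 0;
    }

The bound column never drops below the semi column, which is what prompted the
question above.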
 
If there is a reason to keep the existing formula, then I have an additional
question about the proposed selectivity calculation:
 
    selec = Min(selec, nd2 * selec_inner);
 
When would it be incorrect to instead multiply by inner side NDVs?
 
Besides the actual logic of the code added, Ekta and I did a code review and
had some feedback on the structure and clarity of the code and comments.
 
In the function eqjoinsel_semi, on line 2759 of the patched, rebased code,
could you not move the else condition:
 
    uncertainfrac = 0.5;
 
Up to the top of the if statement which starts on line 2663:
 
    if (have_mcvs1 && have_mcvs2 && OidIsValid(opfuncoid))
 
It seems like you already know the values of isdefault1 and isdefault2 and
do not modify them further, so you could exit faster in this case rather
than looping through all the MCVs.
 
For the function eqjoinsel_inner, why pass in vardata1 and vardata2, as they
appear not to be used? Neither are the isdefault flags.
 
This is in the existing code, however, I thought I would ask here:
 
In eqjoinsel_semi, on line 2691 of the patched, rebased code, why is this the
min of the number of MCVs captured and the distinct values? It seems like if
clamping resulted in an NDV estimate that is too low (i.e., impossibly low,
since the number of distinct values cannot be less than the number of MCVs),
then you should bump it up to at least the number of MCVs:
 
    clamped_nvalues2 = Min(sslot2->nvalues, nd2);
 
I also found the new comment added above the new selectivity calculation to be
a little bit confusing:
            /*
             * We should never estimate the output of a semijoin to be more
             * rows than the equivalent inner join; it's obviously impossible
             * for that to happen.  The former is N1 * Psemi while the latter
             * is N1 * N2 * Pinner, so we may clamp Psemi <= N2 * Pinner.
             * Doing this is worthwhile because of the shakier estimation
             * rules we use in eqjoinsel_semi, particularly in cases where it
             * has to punt entirely.
             */
            selec = Min(selec, inner_rel->rows * selec_inner);

After re-reading it several times, I understood what it was doing; however,
it would be ideal if somehow the relationship between selectivity and
cardinality were more clear.
 
I don't have any great ideas for additional wording; however, maybe it would
help to clarify that in order to clamp the cardinality correctly, we must clamp
the selectivity using this formula. Basically, specify that we are not clamping
the selectivity of the semi-join to the selectivity of the inner join, but
rather that we are clamping the cardinality of the semi-join to consider only
the matching rows of the inner side (also, if that sentence is actually not
correct, then some description that avoids confusion like that would be
helpful).

The new status of this patch is: Waiting on Author

Melanie Plageman <melanieplageman@gmail.com> writes:
> This patch applies cleanly and works for the case described in the original
> email. All existing regression tests pass with the addition of the explain plan
> update included in the patch.

Thanks for reviewing!

> I could not devise an example in which the previous method of calculating
> selectivity would have produced a better estimate. However, one question I have
> after thinking through the optimization is the following:
> ...
> To summarize:
> Selectivity Type                  |  if nd1 <= nd2 |   if nd1 > nd2 |
> ----------------------------------|----------------|-----------------
> inner-join selectivity * ntuples2 | ntuples2 / nd2 | ntuples2 / nd1 |
> semi-join selectivity             |              1 |      nd2 / nd1 |

Um, mumble.  Those functions could be using different values of nd2
thanks to the clamping logic near the head of eqjoinsel_semi, so I'm
not sure that the comparison you're making really holds.

> ... why not use the default NDVs number in the semi-join selectivity
> calculation?

Practical experience says that it doesn't work very well; see the thread
I referred to before,
https://www.postgresql.org/message-id/flat/201104112029.14738.uwe@oss4u.com
particularly my comment

::  While I don't have your specific example to try, I did some
::  experimenting with queries of this form, and I noticed that 8.4's
::  heuristic in eqjoinsel_semi() was going completely nuts and estimating
::  that all rows in the lefthand side have join partners (thus, no rows out
::  of the antijoin).  This is because it has stats for one side of the
::  comparison operator but not the other side (the one with the
::  sub-select).  But it's taking the totally-made-up ndistinct estimate for
::  the sub-select at face value.  It needs to be a bit warier I think.

That experience is what led to the "isdefault" checks that exist now
in eqjoinsel_semi.  I don't think that applying a clamp based on
eqjoinsel_inner is sufficient reason to remove those sanity checks.
In particular, even if the clamp removes the possibility of the semijoin
estimate being too high, it doesn't do anything to prevent a too-low
estimate due to using baseless numbers.

> If there is a reason to keep the existing formula, then I have an additional
> question about the proposed selectivity calculation:
>     selec = Min(selec, nd2 * selec_inner);
> When would it be incorrect to instead multiply by inner side NDVs?

I'm confused ... isn't that exactly what this is doing?

> In the function eqjoinsel_semi, on line 2759 of the patched, rebased code,
> could you not move the else condition:
>     uncertainfrac = 0.5;
> Up to the top of the if statement which starts on line 2663:
>     if (have_mcvs1 && have_mcvs2 && OidIsValid(opfuncoid))
> It seems like you already know the values of isdefault1 and isdefault2 and
> do not modify them further, so you could exit faster in this case rather
> than looping through all the MCVs.

Not really, because we still need to know matchfreq1, so we still have
to do all the comparisons.  It's true that nmatches won't be used in
this case, but I don't see that we can get any meaningful savings by
not computing that.

> For the function eqjoinsel_inner, why pass in vardata1 and vardata2, as they
> appear not to be used? Neither are the isdefault flags.

It just seemed saner to keep the parameter lists similar for
eqjoinsel_inner and eqjoinsel_semi (a judgment call, I admit).
In practice, since there's only one call site for eqjoinsel_inner,
I'd expect it to get inlined so that there's no runtime cost anyway.

> This is in the existing code, however, I thought I would ask here:
> In eqjoinsel_semi, on line 2691 of the patched, rebased code, why is this the
> min of the number of MCVs captured and the distinct values? It seems like if
> clamping resulted in an NDV estimate that is too low (i.e., impossibly low,
> since the number of distinct values cannot be less than the number of MCVs),
> then you should bump it up to at least the number of MCVs:
>     clamped_nvalues2 = Min(sslot2->nvalues, nd2);

No, because sslot2->nvalues is the number of distinct values that ANALYZE
found in the whole table.  If nd2 got clamped to less than that, it's
because we have a WHERE clause that is filtering the table down to fewer
rows than there are distinct values in the table, so we can be sure
(at least, up to the reliability of the WHERE estimate) that not all of
the recorded MCVs are going to be present in the rows being joined.
We don't know which ones will be present, but it seems like a reasonable
bet that the most common ones will be present.  Since the list is already
ordered by decreasing frequency, just taking the first N of them gets us
that.  You could imagine trying to refine that, say by reducing the
number further to allow for some of the join input rows containing
non-MCVs, but I don't see an argument for increasing it.
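 
To illustrate (toy data only, not the selfuncs.c code, which also uses the
frequencies themselves): with the list already ordered by decreasing
frequency, the clamp simply truncates it.

    #include <stdio.h>

    int
    main(void)
    {
        /* MCV frequencies as ANALYZE stores them: descending order */
        double      mcv_freqs[] = {0.30, 0.20, 0.10, 0.05, 0.02};
        int         nvalues = 5;    /* think sslot2->nvalues */
        double      nd2 = 3;        /* NDV estimate clamped by a WHERE */
        int         clamped = (nvalues < (int) nd2) ? nvalues : (int) nd2;
        int         i;

        /* bet that the most common values survive the filtering */
        for (i = 0; i < clamped; i++)
            printf("keep MCV %d (freq %.2f)\n", i + 1, mcv_freqs[i]);
        return 0;
    }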

> I also found the new comment added above the new selectivity calculation to be
> a little bit confusing:
>             /*
>              * We should never estimate the output of a semijoin to be more
>              * rows than the equivalent inner join; it's obviously impossible
>              * for that to happen.  The former is N1 * Psemi while the latter
>              * is N1 * N2 * Pinner, so we may clamp Psemi <= N2 * Pinner.
>              * Doing this is worthwhile because of the shakier estimation
>              * rules we use in eqjoinsel_semi, particularly in cases where it
>              * has to punt entirely.
>              */
>             selec = Min(selec, inner_rel->rows * selec_inner);

> After re-reading it several times, I understood what it was doing; however,
> it would be ideal if somehow the relationship between selectivity and
> cardinality were more clear.

Hm.  Maybe the "Psemi" and "Pinner" notation is not helpful ... would
"Ssemi" and "Sinner" be better?

            regards, tom lane


Thanks for the quick responses. I've put some inline follow-up questions.

On a separate note, I had one additional code clarity feedback. I felt that
eqjoinsel could be reorganized a bit for readability/clarity for the reader.
For example, eqjoinsel_inner uses only the AttStatsSlots up until here and then
suddenly uses the original stats object and the ndvs which we passed in:

    else
    {
        ...
        double        nullfrac1 = stats1 ? stats1->stanullfrac : 0.0;
        double        nullfrac2 = stats2 ? stats2->stanullfrac : 0.0;
 
        selec = (1.0 - nullfrac1) * (1.0 - nullfrac2);
        if (nd1 > nd2)
            selec /= nd1;
        else
            selec /= nd2;
    }
 
It would make the process of calculating selectivity for an equijoin more clear
to the reader if the nullfraction calculation was pulled out into the main
eqjoinsel function.
 
Having a clear set of steps in eqjoinsel would be helpful. Basically, my
understanding of the overall steps is the following:
 
    1) get NDVs
    2) get nullfrac
    3) get MCVs
    4) calculate selectivity
 
Based on this assumption, I've attached a patch with a rough idea for an
alternative structure that I think would be more clear to the reader.
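 
To make that concrete without reproducing the whole patch, here is a purely
illustrative sketch (stub types and invented names; this is neither
PostgreSQL code nor the attached patch) of the shape I have in mind:

    #include <stdbool.h>

    /* everything the caller gathers up front, steps 1-3 */
    typedef struct GatheredStats
    {
        double      nd;         /* 1) number of distinct values */
        bool        isdefault;  /* was nd just a default guess? */
        double      nullfrac;   /* 2) fraction of NULLs */
        bool        have_mcvs;  /* 3) is an MCV list available? */
    } GatheredStats;

    /* step 4: all the selectivity math, with no further lookups */
    static double
    calc_eqjoin_selec(const GatheredStats *v1, const GatheredStats *v2)
    {
        double      selec = (1.0 - v1->nullfrac) * (1.0 - v2->nullfrac);

        selec /= (v1->nd > v2->nd) ? v1->nd : v2->nd;
        return selec;
    }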
 
> I could not devise an example in which the previous method of calculating
> selectivity would have produced a better estimate. However, one question I have
> after thinking through the optimization is the following:
> ...
> To summarize:
> Selectivity Type                  |  if nd1 <= nd2 |   if nd1 > nd2 |
> ----------------------------------|----------------|-----------------
> inner-join selectivity * ntuples2 | ntuples2 / nd2 | ntuples2 / nd1 |
> semi-join selectivity             |              1 |      nd2 / nd1 |

Um, mumble.  Those functions could be using different values of nd2
thanks to the clamping logic near the head of eqjoinsel_semi, so I'm
not sure that the comparison you're making really holds.

That's a good point. Taking another look at that clamping logic, I realized
that I don't really understand why that clamping would be done for a semi-join
and not for an inner join. It seems like for an inner join it is also true that
nd1 cannot be greater than the outer rel's estimated tuples and nd2 cannot be
greater than the inner rel's estimated tuples.

Also, I don't understand when vardata2->rel->rows and inner_rel->rows would be
different. I thought the point of doing this clamping was that, if you have a
restriction like the predicate in this subquery, select * from foo where a in
(select b from bar where b > 10), then your row estimate for bar and your row
estimate for the rows out of that subquery would be different. However, I
looked at the RelOptInfos for vardata2->rel and inner_rel for this query, and
it seems like they reference the same relation and have the same rows
estimate, so I'm confused about when the rows would be different.

> If there is a reason to keep the existing formula, then I have an additional
> question about the proposed selectivity calculation:
>     selec = Min(selec, nd2 * selec_inner);
> When would it be incorrect to instead multiply by inner side NDVs?

I'm confused ... isn't that exactly what this is doing?

Sorry, typo, I was asking why
selec = Min(selec, nd2 * selec_inner);
could not be used instead of what is in the patch
selec = Min(selec, inner_rel->rows * selec_inner);

Thanks,
Melanie
Melanie Plageman <melanieplageman@gmail.com> writes:
> On a separate note, I had one additional code clarity feedback. I felt that
> eqjoinsel could be reorganized a bit for readability/clarity for the reader.
> ...
> Based on this assumption, I've attached a patch with a rough idea for an
> alternative structure that I think would be more clear to the reader.

Hmm, that doesn't really seem like an improvement to me.  As things stand,
all the actual calculations are in eqjoinsel_inner/semi; eqjoinsel itself
is only responsible for some preliminary information lookup that's needed
in all cases.  My patch expands the amount of "preliminary information"
but doesn't fundamentally change that division of responsibility.  It
seems like what you want to do here does change that, and I don't see
the value of breaking down the division.  I also don't like the fact
that we might calculate a value that won't be used; admittedly, it's a
pretty cheap calculation so that doesn't matter much, but by the same
token we'd not be saving a lot of code by moving it.

> That's a good point. Taking another look at that clamping logic, I
> realized that I don't really understand why that clamping would be done
> for a semi-join and not for an inner join. It seems like for an inner
> join it is also true that nd1 cannot be greater than the outer rel's
> estimated tuples and nd2 cannot be greater than the inner rel's estimated
> tuples.

The main way that eqjoinsel_inner uses those values is this bit:

         * We do not have MCV lists for both sides.  Estimate the join
         * selectivity as MIN(1/nd1,1/nd2)*(1-nullfrac1)*(1-nullfrac2). This
         * is plausible if we assume that the join operator is strict and the
         * non-null values are about equally distributed: a given non-null
         * tuple of rel1 will join to either zero or N2*(1-nullfrac2)/nd2 rows
         * of rel2, so total join rows are at most
         * N1*(1-nullfrac1)*N2*(1-nullfrac2)/nd2 giving a join selectivity of
         * not more than (1-nullfrac1)*(1-nullfrac2)/nd2.

In the expression N2*(1-nullfrac2)/nd2, all three values are meant to be
measured across the whole of the rel2 relation; if we were to decrease nd2
to reflect the effects of earlier filtering, we'd get an incorrect
selectivity.  The same applies to eqjoinsel_semi's calculations about the
outer rel, but *not* to the inner rel, as explained here:

     * We clamp nd2 to be not more than what we estimate the inner relation's
     * size to be.  This is intuitively somewhat reasonable since obviously
     * there can't be more than that many distinct values coming from the
     * inner rel.  The reason for the asymmetry (ie, that we don't clamp nd1
     * likewise) is that this is the only pathway by which restriction clauses
     * applied to the inner rel will affect the join result size estimate,
     * since set_joinrel_size_estimates will multiply SEMI/ANTI selectivity by
     * only the outer rel's size.  If we clamped nd1 we'd be double-counting
     * the selectivity of outer-rel restrictions.

(Here, both "outer rel's size" and "inner rel's size" mean the size after
earlier filtering steps.)  So that's why we only clamp nd2 and only do so
in eqjoinsel_semi: in the other three cases, we'd be double-counting the
selectivity of earlier filters if we did that.
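 
(Concretely: the comment's N2*(1-nullfrac2)/nd2 is a matches-per-outer-tuple
figure computed entirely from whole-table statistics; the effect of a WHERE
clause on rel2 then enters the inner-join estimate exactly once, via the
filtered inner_rows that calc_joinrel_size_estimate multiplies in.)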

So basically the inconsistency here comes from the fact that we define
the meaning of join selectivity differently for inner and semi joins.
I've occasionally wondered if that was a bad choice and we should just
say that selectivity should always be calculated so that the join size
is outer size times inner size times selectivity.  But that would
certainly not make for any less need for the selectivity estimator to
do things differently for inner and semi joins, so I am not seeing much
upside to changing it.

> Also, I don't understand when vardata2->rel->rows and inner_rel->rows would
> be different.

vardata2->rel->rows is going to reflect the size (post restriction quals)
of the base relation that var2 came from.  inner_rel->rows will be the
same if the inner side of the current join is just that one base relation;
but if the inner side is a lower join of several base relations,
inner_rel->rows will be the size of that join.
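 
For example (a made-up query): in select * from foo where a in (select b from
bar join baz using (id)), var2 = b still comes from bar, so vardata2->rel->rows
is bar's post-qual size, while inner_rel->rows is the estimated size of the
bar/baz join.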

>> I'm confused ... isn't that exactly what this is doing?

> Sorry, typo, I was asking why
> selec = Min(selec, nd2 * selec_inner);
> could not be used instead of what is in the patch
> selec = Min(selec, inner_rel->rows * selec_inner);

Because the selectivity is defined as something you multiply the relation
size with, not the number of distinct values within the rel.

            regards, tom lane


Given that you have addressed all of my feedback and that it's a pretty
low-risk change, I will change the status to "ready for committer".

There are a couple of minor follow-up clarifications inline that relate mostly
to the questions that I asked in previous emails.

I did have one other question:
Has there been discussion in the past about adding a planner test extension
similar to those in src/test/modules for cardinality estimation? I am imagining
something that is a "soft" check that either the rows estimation that comes out
of calc_joinrel_size_estimate is within an expected range (given differing
estimates across machines) or that the selectivity estimate that comes out of
eqjoinsel is within an expected range. The former seems relatively easy to do
in a manner similar to the test_predtest extension and the latter seems like it
could be done even more trivially.

On Sat, Nov 17, 2018 at 12:22 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
(Here, both "outer rel's size" and "inner rel's size" mean the size after
earlier filtering steps.)  So that's why we only clamp nd2 and only do so
in eqjoinsel_semi: in the other three cases, we'd be double-counting the
selectivity of earlier filters if we did that.

I just want to make sure I am understanding what the comment is saying: So,
after we calculate the selectivity for inner join, when we return from
calc_joinrel_size_estimate we do this math:

    nrows = outer_rows * inner_rows * fkselec * jselec;

and in that equation, the outer and inner rows have been adjusted to account
for any restrictions on the tables, so we don't clamp the ndvs for inner join
in eqjoinsel_inner. However, for semi-join, that calculation is

    nrows = outer_rows * fkselec * jselec;

Which means that we have to adjust the rows of the inner side before we get
here?
 
So basically the inconsistency here comes from the fact that we define
the meaning of join selectivity differently for inner and semi joins.
I've occasionally wondered if that was a bad choice and we should just
say that selectivity should always be calculated so that the join size
is outer size times inner size times selectivity.  But that would
certainly not make for any less need for the selectivity estimator to
do things differently for inner and semi joins, so I am not seeing much
upside to changing it.

I see what you are saying. I got tangled up in this part of the code, so I am
inclined to say that it could stand to be more clear. Selectivity is a ratio,
and, even if you calculate the two sides of the ratio differently, that doesn't
mean the definition of the ratio should be different.
 
Also, I wanted to address a question you asked in an earlier email:
You wrote:
> Hm.  Maybe the "Psemi" and "Pinner" notation is not helpful ... would
> "Ssemi" and "Sinner" be better?

I think Ssemi and Sinner might be more clear--mostly because we haven't used
P/predicate here or in most of the other selectivity estimation comments that I
read. Also, in some cases when we have super limited information and make a
guess, the selectivity feels pretty detached from the join predicate.

Thanks!
Melanie Plageman <melanieplageman@gmail.com> writes:
> Given that you have addressed all of my feedback and that it's a pretty
> low-risk change, I will change the status to "ready for committer".

Thanks for reviewing!

> Has there been discussion in the past about adding a planner test
> extension similar to those in src/test/modules for cardinality
> estimation? I am imagining something that is a "soft" check that either
> the rows estimation that comes out of calc_joinrel_size_estimate is
> within an expected range (given differing estimates across machines) or
> that the selectivity estimate that comes out of eqjoinsel is within an
> expected range. The former seems relatively easy to do in a manner
> similar to the test_predtest extension and the latter seems like it
> could be done even more trivially.

No, I don't recall any discussion about that.  The regression tests in
general embody a lot of checking that the planner makes expected plan
choices: obviously the cases where we do an explicit EXPLAIN do that,
but even where we don't, we'd be likely to get artifacts such as varying
row order if an unexpected plan were chosen.  Perhaps there's a use-case
for a lower-level test harness such as you suggest, but I haven't really
felt a need for it.

> On Sat, Nov 17, 2018 at 12:22 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> (Here, both "outer rel's size" and "inner rel's size" mean the size after
>> earlier filtering steps.)  So that's why we only clamp nd2 and only do so
>> in eqjoinsel_semi: in the other three cases, we'd be double-counting the
>> selectivity of earlier filters if we did that.

> I just want to make sure I am understanding what the comment is saying: So,
> after we calculate the selectivity for inner join, when we return from
> calc_joinrel_size_estimate we do this math:
>     nrows = outer_rows * inner_rows * fkselec * jselec;
> and in that equation, the outer and inner rows have been adjusted to account
> for any restrictions on the tables, so we don't clamp the ndvs for inner
> join in eqjoinsel_inner. However, for semi-join, that calculation is
>     nrows = outer_rows * fkselec * jselec;
> Which means that we have to adjust the rows of the inner side before we get
> here?

Yeah.  Basically the point is that if we have some WHERE clause that
eliminates rows from the inner side of a semijoin, we can expect that
that means the size of the semijoin result will be smaller than if the
WHERE clause hadn't been there --- because some of the outer-rel rows
only had matches among those excluded rows.  But the equation in
calc_joinrel_size_estimate provides no way to factor that in, except
by adjusting the selectivity value, so that's what we do.
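 
(For instance, if the inner-side WHERE clause is expected to remove half of
the inner rows, the outer rows whose only matches were among the removed rows
should drop out of the semijoin result; jselec is the only term in
outer_rows * fkselec * jselec where that reduction can be expressed.)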

> You wrote:
>> Hm.  Maybe the "Psemi" and "Pinner" notation is not helpful ... would
>> "Ssemi" and "Sinner" be better?

> I think Ssemi and Sinner might be more clear--mostly because we haven't used
> P/predicate here or in most of the other selectivity estimation comments
> that I
> read. Also, in some cases when we have super limited information and make a
> guess, the selectivity feels pretty detached from the join predicate.

OK, thanks --- I'll have another go at writing that comment.

            regards, tom lane