On Wed, Jul 1, 2020 at 7:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Attached are a couple of quick-hack patches along each of those lines.
> Either one resolves the crazy number-of-groups estimate for Jeff's
> example; neither changes any existing regression test results.
>
> On the whole I'm not sure I like 0001 (ie, changing estimate_num_groups).
> Sure, it makes that function "more robust", but it does so at the cost
> of believing what might be a default or otherwise pretty insane
> reldistinct estimate. We put in the clamping behavior for a reason,
> and I'm not sure we should disable it just because reltuples = 0.
>
> 0002 seems like a better answer on the whole, but it has a pretty
> significant issue as well: it's changing the API for FDW
> GetForeignRelSize functions, because now we're expecting them to set
> both rows and tuples to something sane, contrary to the existing docs.
postgres_fdw already sets both rows and tuples if use_remote_estimate
is false, and we have pages=0 and tuples=0, so the contrary seems OK
to me.
In the 0002 patch:
+ /*
+ * plancat.c copied baserel->pages and baserel->tuples from pg_class.
+ * If the foreign table has never been ANALYZEd, or if its stats are
+ * out of date, baserel->tuples might now be less than baserel->rows,
+ * which will confuse assorted logic. Hack it to appear minimally
+ * sensible. (Do we need to hack baserel->pages too?)
+ */
+ baserel->tuples = Max(baserel->tuples, baserel->rows);
for consistency, this should be
baserel->tuples = clamp_row_est(baserel->rows / sel);
where sel is the selectivity of the baserestrictinfo clauses?
> What I'm sort of inclined to do is neither of these exactly, but
> instead put the
>
> baserel->tuples = Max(baserel->tuples, baserel->rows);
>
> clamping behavior into the core code, immediately after the call to
> GetForeignRelSize. This'd still let the FDW set baserel->tuples if
> it has a mind to, while not requiring that; and it prevents the
> situation where the rows and tuples estimates are inconsistent.
I'm not sure this would address the inconsistency. Consider the
postgres_fdw case where use_remote_estimate is true, and the stats are
out of date, eg, baserel->tuples copied from pg_class is much larger
than the actual tuples and hence baserel->rows (I assume here that
postgres_fdw doesn't do anything about baserel->tuples). In such a
case the inconsistency would make the estimate_num_groups() estimate
more inaccurate. I think the consistency is the responsibility of the
FDW rather than the core, so I would vote for the 0002 patch. Maybe
I'm missing something.
Thanks for working on this!
Best regards,
Etsuro Fujita