Robert Haas wrote:
>> Maybe, to come up with something remotely realistic, a formula like
>>
>> sum of locally estimated costs of sequential scan for the base table
>> plus count of estimated result rows (times a factor)
>
> Was this meant to say "the base tables", plural?
Yes.
> I think whatever we do here should try to extend the logic in
> postgres_fdw's estimate_path_cost_size() to foreign tables in some
> reasonably natural way, but I'm not sure exactly what that should look
> like. Maybe do what that function currently does for single-table
> scans, and then add all the values up, or something like that. I'm a
> little worried, though, that the planner might then view a query that
> will be executed remotely as a nested loop with inner index-scan as
> not worth pushing down, because in that case the join actually will
> not touch every row from both tables, as a hash or merge join would.
That's exactly what I meant, except that I would add a contribution
for the estimated result set size.
There are cases where a nested loop is faster than a hash join,
but I think it is rarely faster by orders of magnitude.
So I'd say it is a decent rough estimate, and that's the best we can
hope for here, if we cannot ask the remote server.
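
To make the formula concrete, here is a minimal sketch in C. It is not
postgres_fdw code: the struct, the function name and the per-row factor
are all made up for illustration, and it assumes the per-table sequential
scan costs have already been estimated locally.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical input: locally estimated seq scan cost of one base table. */
typedef struct BaseRelEstimate
{
    double seqscan_cost;    /* local estimate for scanning the whole table */
} BaseRelEstimate;

/*
 * Rough cost of a join that is executed remotely:
 * the sum of the sequential scan costs of all base tables,
 * plus the estimated number of result rows times a factor.
 *
 * per_row_factor is an assumed tuning knob, not an existing setting.
 */
static double
remote_join_cost(const BaseRelEstimate *rels, size_t nrels,
                 double result_rows, double per_row_factor)
{
    double total = 0.0;
    size_t i;

    assert(rels != NULL || nrels == 0);

    /* Every base table is charged as if it were scanned completely. */
    for (i = 0; i < nrels; i++)
        total += rels[i].seqscan_cost;

    /* Contribution for the size of the result set. */
    return total + result_rows * per_row_factor;
}
```

With two tables costing 100 and 250, 1000 estimated result rows and a
factor of 0.5, this yields 100 + 250 + 1000 * 0.5 = 850. A remote nested
loop with an inner index scan would be overestimated by this, which is
the case Robert is worried about above.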
Yours,
Laurenz Albe