Shigeru HANADA <hanada@metrosystems.co.jp> writes:
> [ 20110218-file_fdw.patch ]
I've adjusted this to fit the extensions infrastructure and the
committed version of the FDW API patch, and applied it.
>> * You might have missed some option combinations, or unspecified
>> options, in file_fdw_validator().
>> For example, the format == NULL or !csv && header cases. I've not tested
>> all cases, but please recheck the validations used in BeginCopy().
> Right, I've revised validation based on BeginCopy(), and added
> regression tests about validation.
Duplicating BeginCopy()'s checks inside file_fdw_validator() struck me as
entirely unmaintainable. I modified the core COPY code to allow its
option validation code to be called directly.
>> If so, we need alternative
>> solution in estimate_costs(). We adjust statistics with runtime relation
>> size in estimate_rel_size(). Also, we use get_rel_data_width() for not
>> analyzed tables. We could use similar technique in file_fdw, too.
> Ah, using get_relation_data_width(), the exported version of
> get_rel_data_width(), seems to help estimation. I'll research it
> a little more. By the way, is adding ANALYZE support for foreign
> tables a reasonable idea for this issue?
I did some quick hacking so that the numbers are at least a little bit
credible, but of course without ANALYZE support the qualification
selectivity estimates are likely to be pretty bogus. I am not sure
whether there's much of a use-case for supporting ANALYZE though.
I would think that if someone is going to read the same file in multiple
queries, they'd be far better off importing the data into a real table.
In any case, it's too late to worry about that for 9.1. I suggest
waiting to see what sort of real-world usage file_fdw gets before we
worry about whether it needs ANALYZE support.
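For anyone following along, the general idea of a size-based estimate
can be sketched roughly as below. This is only a standalone illustration,
not the committed file_fdw code: the names (SketchEstimate,
sketch_estimate_from_size), the 8192-byte block size, and the 24-byte
per-row overhead are all assumptions made for the example. It derives a
page count from the file size and guesses the row count by dividing the
file size by an assumed average row width.

```c
#include <assert.h>

/* Assumed values for illustration only; not taken from the patch. */
#define SKETCH_BLOCK_SIZE     8192   /* assumed page size in bytes */
#define SKETCH_TUPLE_OVERHEAD 24     /* assumed per-row header bytes */

/* Hypothetical result struct: estimated pages and rows for a data file. */
typedef struct SketchEstimate
{
    double pages;
    double ntuples;
} SketchEstimate;

/*
 * Estimate pages and rows from the file size alone.
 *
 * file_size: size of the data file in bytes (e.g. from stat()).
 * avg_width: guessed average data width of one row in bytes, such as
 *            the planner would derive from the column datatypes.
 */
static SketchEstimate
sketch_estimate_from_size(long long file_size, int avg_width)
{
    SketchEstimate est;
    double      tuple_width;

    assert(file_size >= 0);
    assert(avg_width >= 0);

    /* Round the size up to whole pages, but never report zero pages. */
    est.pages = (double) ((file_size + SKETCH_BLOCK_SIZE - 1) / SKETCH_BLOCK_SIZE);
    if (est.pages < 1.0)
        est.pages = 1.0;

    /* Without statistics, guess the row count as size / row width. */
    tuple_width = (double) avg_width + SKETCH_TUPLE_OVERHEAD;
    est.ntuples = (double) file_size / tuple_width;
    if (est.ntuples < 1.0)
        est.ntuples = 1.0;

    return est;
}
```

A guess like this only makes the row count plausible; it says nothing
about the data distribution, which is why the qualification selectivity
estimates remain weak without ANALYZE.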
regards, tom lane