On 12/4/18 10:44 AM, Alvaro Herrera wrote:
> After reading this thread, I think I like WHERE better than FILTER.
> Tally:
>
> WHERE: Adam Berlin, Lim Myungkyu, Dean Rasheed, yours truly
> FILTER: Tomas Vondra, Surafel Temesgen
>
> Couldn't find others expressing an opinion in this regard.
>
While I still like FILTER more, I won't object to using WHERE if others
think it's a better choice.

> On 2018-Nov-30, Tomas Vondra wrote:
>
>> I think it should be enough just to switch to CIM_SINGLE and
>> increment the command counter after each inserted row.
>
> Do we apply command counter increment per row with some other COPY
> option?

I don't think we increment the command counter anywhere in COPY, most
likely because COPY is not allowed to run any queries directly so far.

> Per-row CCI makes me a bit uncomfortable because you'd get in
> trouble with a large copy. I think it's particularly nasty here,
> precisely because you may want to filter out some rows of a very
> large file, and the CCI may prevent that from working.

Sure.

> I'm not convinced by the example case of reading how many tuples
> you've imported so far in the WHERE/WHEN/FILTER clause each time
> (that'd become incrementally slower as it progresses).
>
Well, I'm not sure how else I'm supposed to convince you. It's an example
of behavior that's IMHO surprising and inconsistent with things that
might reasonably be expected to behave similarly. It may not be a
perfect example, but that's the price of simplicity.

FWIW, another way to achieve mostly the same filtering feature is a
BEFORE INSERT trigger:

  create or replace function copy_filter() returns trigger as $$
  declare
      v_c int;
  begin
      -- count the rows currently visible in t
      select count(*) into v_c from t;

      -- returning null from a BEFORE trigger skips the row
      if v_c >= 100 then
          return null;
      end if;

      return NEW;
  end; $$ language plpgsql;

  create trigger filter before insert on t
      for each row execute procedure copy_filter();

With the trigger, COPY behaves consistently with INSERT, i.e. it enforces
the total count constraint the same way, whereas COPY FILTER behaves
differently.

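To make the comparison concrete, here is a minimal sketch of how the
trigger could be exercised. The definition of t is an assumption (the
table is not shown in this thread), and the COPY FROM PROGRAM line
assumes a superuser session with the "seq" utility available on the
server. No results are claimed beyond what is described above.

```sql
-- Assumption: t has a single integer column; its real definition
-- is not given in this thread.
create table t (id int);

-- (create copy_filter() and the "filter" trigger on t as shown above)

-- Plain INSERT of 1000 rows; the BEFORE INSERT trigger fires per row.
insert into t select i from generate_series(1, 1000) as s(i);

-- If the trigger enforces the limit as described, this should
-- report no more than 100 rows.
select count(*) from t;

truncate t;

-- The same 1000 rows loaded through COPY; the trigger fires per row
-- here as well. Hypothetical data source: requires superuser and "seq".
copy t from program 'seq 1 1000';

select count(*) from t;
```

Comparing the two counts is the point: with the trigger, both paths
should apply the same limit.
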
FWIW I do realize this is not a particularly great check - for example,
it will not see effects of concurrent transactions etc. All I'm saying
is I find it annoying/strange that it behaves differently.

Also, considering that the trigger does the right thing, maybe I spoke
too soon about the command counter not being incremented?

regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services