Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
> > It's not as if we don't have the ability to measure performance impact.
> > It's reasonable to make a requirement that new options to COPY
> > shouldn't slow it down noticeably if those options aren't used. And we
> > can test that, and even make such testing part of the patch review.
>
> Really? Where is your agreed-on, demonstrated-to-be-reproducible
> benchmark for COPY speed?
>
> My experience is that reliably measuring performance costs in the
> percent-or-so range is *hard*. It's only after you've added a few of
> them and they start to mount up that it becomes obvious that all those
> insignificant additions really did cost something.
>
> But in any case, I think that having a clear distinction between
> "straight data import" and "data transformation" features is a good
> thing. COPY is already pretty much of an unmanageable monstrosity,
> and continuing to accrete features into it without any sort of structure
> is something we are going to regret.

I have read up on this thread and the new copy syntax thread. I think
there is clearly documented demand for such extensions to COPY.

We are definitely opening the floodgates by allowing COPY to process
invalid data. I think everyone admits COPY is already quite
complicated, both in its API and C code.

If we are going to add to COPY, I think we need to do it in a way that
has a clean user API, doesn't make the C code any more complicated, and
doesn't introduce a performance impact for people not using these new
features. If we don't do that, we are going to end up like 'bcp', which
is perpetually buggy, as someone explained.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +