On 03/24/2014 02:50 PM, Jim Nasby wrote:
> On 3/22/14, 11:26 AM, Jim Nasby wrote:
>> On 3/21/14, 4:54 PM, Tom Lane wrote:
>>> Merlin Moncure <mmoncure@gmail.com> writes:
>>>> There is no way for psql to handle that case though unless you'd strip
>>>> *all* BOMs encountered. Compounding this problem is that there's no
>>>> practical way AFAIK to send multiple files to psql via a single command
>>>> line invocation. If you pass multiple -f arguments all but one is
>>>> ignored.
>>>
>>> Well, that seems like a solvable but rather independent problem.
>>> I guess one issue is how you'd define the meaning of --single ...
>>> one transaction per run, or one per file?
>>
>> Well, if you're catting multiple files into psql -1, you'd get all
>> the files in one transaction, right? So I'd say that's what should
>> happen.
>
> It occurs to me that we're going about this the wrong way...
>
> The error here isn't being generated by psql; it's generated by the
> backend. In the context of a statement (and not, say, a COPY command).
>
> So instead of trying to handle this on the psql side[1], I think we
> need to handle it in the backend; specifically in the parser. Is there
> an easy way to get the parser to ignore the BOM character in the
> context of commands (but not in strings)?
>
> [1]: Obviously, BOM could still screw up a psql command like \d. We'd
> want to address that as well; but I suspect backends are the more
> common scenario.

But what about COPY files? I don't see why there is less of a case for
eating a leading BOM for a COPY file (or COPY stdin for that matter,
given that it can come from \copy) than for an SQL file.

I suspect trying to do this in the parser will be quite messy.
This needs to happen before the input is converted to the server
encoding, I think.
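
As a rough illustration of that ordering (strip first, convert second), here is a minimal Python sketch. It is not PostgreSQL or psql code: the function names `strip_leading_bom` and `load_input` and the `client_encoding` parameter are hypothetical, and it assumes the input is UTF-8.

```python
import codecs


def strip_leading_bom(raw: bytes) -> bytes:
    """Drop one UTF-8 BOM, but only if it is the very first bytes of the input."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):]
    return raw


def load_input(raw: bytes, client_encoding: str = "utf-8") -> str:
    """Hypothetical input path: strip the BOM on the raw bytes, then convert.

    The same helper would apply to an SQL file and to COPY data alike,
    since both arrive as raw bytes before any encoding conversion.
    """
    return strip_leading_bom(raw).decode(client_encoding)
```

Because only the first three bytes are examined, a BOM that appears later in the input (inside a string literal, say) is left untouched.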

cheers

andrew