Re: UTF8 encoding and non-text data types

Поиск
Список
Период
Сортировка
От Steve Midgley
Тема Re: UTF8 encoding and non-text data types
Дата
Msg-id 20080114205255.60EC62E0068@postgresql.org
обсуждение исходный текст
Ответ на Re: UTF8 encoding and non-text data types  ("Medi Montaseri" <montaseri@gmail.com>)
Список pgsql-sql
At 12:43 PM 1/14/2008, Medi Montaseri wrote:<br /><blockquote cite="" class="cite" type="cite">Here is my traces from
perlCGI code, I'll include two samples one in ASCII and one UTF so we know what to expect<br /><br /> Here is actual
SQLstatement being executed in Perl and DBI. I do not quote the numerical value, just provided to DBI raw. <br /><br />
insertinto t1 (c1, cost) values ('tewt', 1234)<br /> this works find....<br /> insert into t1 (c1, cost) values
('&#1588;&#1583;',&#1777;&#1778;&#1779;&#1780;)<br />  DBD::Pg::db do failed: ERROR:  syntax
errorat or near ";" at character 59, <br /><br /> And the PG log itself is very similar and says<br /> ERROR:  syntax
errorat or near ";" at character 59<br /><br /> Char 59 by the way is the first accurance of semi-colon as in
&#17777;which is being caught by PG parser. <br /><br /> Medi<br /><br /><br /> On Jan 14, 2008 12:18 PM, Steve
Midgley<<a href="mailto:public@misuse.org">public@misuse.org</a>> wrote:<br /><dl><br /><blockquote cite=""
class="cite"type="cite"></blockquote><dd>On Jan 13, 2008 8:51 PM, Steve Midgley <<a
href="mailto:public@misuse.org">public@misuse.org</a>>wrote: <dl><dd>At 02:22 PM 1/13/2008, <a
href="mailto:pgsql-sql-owner@postgresql.org">pgsql-sql-owner@postgresql.org</a> wrote: <dd>>Date: Sat, 12 Jan 2008
14:21:00-0800 <dd>>From: "Medi Montaseri" <<a href="mailto:montaseri@gmail.com"> montaseri@gmail.com</a>>
<dd>>To:<a href="mailto:pgsql-sql@postgresql.org">pgsql-sql@postgresql.org</a><dd>>Subject: UTF8 encoding and
non-textdata types <dd>>Message-ID: <dd> ><<a
href="mailto:8078a1730801121421l1c9b90c1lc65d36cf6e752a6f@mail.gmail.com">
8078a1730801121421l1c9b90c1lc65d36cf6e752a6f@mail.gmail.com</a>><dd>> <dd>>I understand PG supports UTF-8
encodingand I have sucessfully <dd>>inserted <dd>>Unicode text into columns. I was wondering about other data
typessuch <dd>>as <dd>>numbers, decimal, dates <dd>> <dd>>That is, say I have a table t1 with
<dd>>createtable t1 { name text, cost decimal } <dd>>I can insert UTF8 text datatype into this table with no
problem<dd>>But if my application attempts to insert numbers encloded in UTF8, <dd>>then I <dd>>get wrong
datatypeerror <dd>> <dd>>Is the solution for the application layer (not database) to convert <dd>>the
<dd>>non-textUTF8 numbers to ASCII and then insert it into database ? <dd>> <dd>>Thanks <dd>>Medi<br
/><dd>HiMedi,<br /><dd>I have only limited experience in this area, but it sounds like you <dd>sending your numbers as
strings?In your example:<br /><dd>>create table t1 { name text, cost decimal }; <br /><dd>insert into t1 (name,
cost)values ('name1', '1');<br /><dd>I can't think of how else you're sending numeric values as UTF8? I know <dd>that
Pgwill accept numbers as strings and convert internally (that has <dd>worked for me in some object relational
environmentswhere I don't <dd>choose to cope with data types), but I think it would be better if you <dd>simply didn't
sendyour numeric data in quotations, whether as UTF8 or <dd>ASCII. If you don't have control over this layer (that
quotesyour <dd>values), then I'd say converting to ASCII would solve the problem. But <dd>better to convert to numeric
andnot ship quoted strings at all.<br /><dd>I may be totally off-base and missing something fundamental and I'm
<dd>veryopen to correction (by anyone), but that's what I can see here.<br /><dd>Best regards,<font color="#888888">
</font><dd>Steve</dl><dd>At 11:01 AM 1/14/2008, Medi Montaseri wrote:<br /><dd>Thanks Steve,<br /><br /><dd>Actually I
donot insert text data into my numeric field.<br /><dd>As I mentioned given <br /><dd>create table t1 { name text, cost
decimal}<br /><dd>then I would like to insert numeric data into column "cost" because then I can later benefit from
numericaloperators like SUM, AVG, etc <br /><br /><dd>More specifically, I am using HTML, Perl and PG. So from the HTML
pointof view a textfield is just some strings. So my user would enter 12345 but expressed in UTF8. Perl would get this
anduse DBI to insert it into PG <br /><br /><dd>What I am experiencing now is that DB errors that I am trying to insert
anincorrect data into column "cost" which is numeric and the data is coming in from HTML in UTF8<br /><br /><dd>Mybe I
haveto convert it to ASCII numbers in Perl before inserting  them into PG <br /><br /><dd>Thanks<br
/><dd>Medi</dl></blockquote><br/><dl><dd>Hi Medi,<br /><br /><dd>I agree that you should convert your values in Perl
beforehanding to DBI. I'm not familiar with DBI but presumably if you're sending it UTF8 values it's attempting to
quotethem or do something with them, that a numeric field in Pg can't handle. Can you trap/monitor the exact sql
statementthat is generated by DBI and sent to Pg? That would help a lot in knowing what it is doing, but I suspect if
youjust convert your numbers from the HTML/UTF8 source values into actual Perl numeric values and then ship to DBI
you'llbe better off. And you'll get some input validation for free.<br /><br /><dd>I hope this helps,<br /><br
/><dd>Steve<br/><br /></dl>Hi Medi,<br /><br /> That structure for numeric values is never going to work, as best as I
understandPostgres (and other sql pipes). You have to convert those UTF chars to straight numeric format. Hopefully
thatsolves your problem? I hope it's not too hard for you to get at the code which is sending the numbers as UTF?<br
/><br/> Steve<br /><br /><br /> 

В списке pgsql-sql по дате отправления:

Предыдущее
От: "Medi Montaseri"
Дата:
Сообщение: Re: UTF8 encoding and non-text data types
Следующее
От: Tom Lane
Дата:
Сообщение: Re: UTF8 encoding and non-text data types