Re: multi line text data/query ?bug?

From Martijn van Oosterhout
Subject Re: multi line text data/query ?bug?
Date
Msg-id 20050324101539.GA17934@svana.org
In response to Re: multi line text data/query ?bug?  ("Sim Zacks" <sim@compulab.co.il>)
Responses Re: multi line text data/query ?bug?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
On Thu, Mar 24, 2005 at 09:24:11AM +0200, Sim Zacks wrote:
> The difference between a Tab and a newline is that tab is a universally
> recognized single ascii character while newline is in flux. Aside from this,
> a tab is a quasi-viewable character as the cursor will not go to the middle
> of the tab. Meaning if the tab takes up the space of 10 characters, you
> could not scroll to the place where the 5th character would be if it were in
> fact 10 spaces. You cannot highlight half of a tab in editors that allow
> text highlighting. I would therefore say that a tab is as visible as a space
> and can be easily differentiated. On the other hand, it is impossible to
> determine which binary characters the editor stuck in at the end of a newline
> without looking at the binary/hex code.

Actually, Emacs has a space-through-tab mode in which you can just move
through a tab as if it were spaces. If you delete or insert, it
automatically reforms the tabs and spaces around it. Several editors
have an auto-lineending mode in which they'll detect the end-of-line
character and apply that everywhere. Admittedly this is an extreme case,
but editors regularly show things that don't reflect the underlying
file.
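
To illustrate what such an auto-lineending mode might do, here is a
minimal Python sketch. It is not taken from any particular editor; the
function names detect_line_ending and normalize_line_endings are
hypothetical, and "most common ending wins" is an assumed policy.

```python
def detect_line_ending(data: bytes) -> bytes:
    """Return the most common line ending in data; default to LF if none is found."""
    crlf = data.count(b"\r\n")
    # Bare LF and bare CR are whatever is left after removing the CRLF pairs.
    lf = data.count(b"\n") - crlf
    cr = data.count(b"\r") - crlf
    counts = {b"\r\n": crlf, b"\n": lf, b"\r": cr}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else b"\n"


def normalize_line_endings(data: bytes, target: bytes) -> bytes:
    """Rewrite every line ending in data as target."""
    # Collapse everything to LF first, then expand to the requested ending.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", target)


# Hypothetical file contents with one stray LF among CRLF endings.
mixed = b"one\r\ntwo\r\nthree\nfour\r\n"
ending = detect_line_ending(mixed)
normalized = normalize_line_endings(mixed, ending)
print(repr(ending), repr(normalized))
```

The point of the sketch is that what is saved can differ from what was
typed, without anything on screen showing it.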

> I understand the complexity of dealing with multiple operating systems, but
> seriously, how many non-viewable characters can be embedded in text that
> actually make a difference between operating systems? Are there any besides
> newline?

Sure, the character 0xE9 means different things depending on the
encoding and will sort differently based on the locale. Text files
generally don't indicate what encoding they are, leading to all sorts
of confusion. Unix tends to use Latin1 or UTF8. Windows has its own
encoding.
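
As a rough illustration of the 0xE9 point, the Python sketch below
decodes that single byte under a few encodings. The choice of encodings
is mine, picked only to show the spread. It does not cover the
locale-dependent sorting mentioned above.

```python
import unicodedata

raw = b"\xe9"  # one byte, no indication of which encoding produced it

# Decode the same byte under several single-byte encodings.
decoded = {}
for encoding in ("latin-1", "cp437", "cp1251"):
    decoded[encoding] = raw.decode(encoding)
    char = decoded[encoding]
    print(encoding, hex(ord(char)), unicodedata.name(char))

# On its own, 0xE9 is not even valid UTF-8: it starts a multi-byte
# sequence that never finishes.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
print("valid UTF-8:", utf8_ok)
```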

IMHO, if you're trying to write portably, don't just hit enter when you
want an end of line; use \n or \r to indicate *exactly* what you mean.
Relying on variable behaviour and expecting the server to fix it for you
is wrong. I believe the server should take exactly what the client
gives, as the client is the only one who knows for sure the type.
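
A small client-side sketch of that idea, in Python rather than SQL: the
line ending is spelled out in the string, and the text layer is told
not to translate it. The variable names and the in-memory buffer
standing in for "what gets sent" are assumptions for illustration.

```python
import io

# Spell out the line ending instead of relying on what the editor inserts.
text_unix = "first line\nsecond line"        # explicit LF
text_windows = "first line\r\nsecond line"   # explicit CR LF

# newline="" disables newline translation on write, so the bytes that
# come out are exactly the ones written in the string.
buffer = io.BytesIO()
writer = io.TextIOWrapper(buffer, encoding="utf-8", newline="")
writer.write(text_unix)
writer.flush()
sent = buffer.getvalue()
print(repr(sent))
```

Written this way, the two variants stay distinct byte sequences and
neither is silently converted into the other on the way out.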
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
