Re: PostgreSQL Top 10 Wishlist

Поиск
Список
Период
Сортировка
От Chris Browne
Тема Re: PostgreSQL Top 10 Wishlist
Дата
Msg-id 60d5iqert2.fsf@dba2.int.libertyrms.com
обсуждение исходный текст
Ответ на PostgreSQL Top 10 Wishlist  ("rlee0001" <robeddielee@hotmail.com>)
Ответы Re: PostgreSQL Top 10 Wishlist  ("Jim C. Nasby" <jnasby@pervasive.com>)
Список pgsql-general
jnasby@pervasive.com ("Jim C. Nasby") writes:
> On Mon, Jan 16, 2006 at 12:13:15PM -0500, Chris Browne wrote:
>> What you seem to be after, here, would confine your telno formatting
>> to telephone numbers for Canada and the United States, and would break
>> any time people have a need to express telephone numbers outside those
>> two countries.
>>
>> It would be quite interesting to add an EB164 type, as it could
>> represent phone numbers considerably more compactly than is the case
>> for plain strings.  The 20 digits permissible across 1. and 2. could
>> be encoded in... 68 bits.
>
> And it would be trivial to provide functions to map that into the
> customary format for various countries. In fact, since there's 4 bits
> left over, it might be possible to encode the formatting used for the
> number in the storage itself.
>
> BTW, you sure about 68 bits? That doesn't seem to allow for a full 20
> digit number; or are there restrictions on the max value for one of the
> fields?

Hmm.

 [3]> (let ((ttl 1)) (loop
           for i from 1 to 68
           do (setf ttl (* ttl 2))
           do (format t "n=~D  2^n=~D~%" i ttl)))
n=1  2^n=2
n=2  2^n=4
n=3  2^n=8
n=4  2^n=16
n=5  2^n=32
n=6  2^n=64
n=7  2^n=128
n=8  2^n=256
n=9  2^n=512
... boring bits elided :-) ...
n=60  2^n=1152921504606846976
n=61  2^n=2305843009213693952
n=62  2^n=4611686018427387904
n=63  2^n=9223372036854775808
n=64  2^n=18446744073709551616
n=65  2^n=36893488147419103232
n=66  2^n=73786976294838206464
n=67  2^n=147573952589676412928
n=68  2^n=295147905179352825856
NIL

Actually, we pass 10^20 at 2^67, so I went 1 bit too far.  We could,
in principle, get 20 digits in 67 bits.  Not that this is necessarily
the wisest move.

That's a bit over 8 bytes, which is certainly more space-efficient
than 20 bytes.  It's certainly an open question whether that
efficiency is actually worth anything.

My sense is that it would be better to use something equivalent to BCD
(which is what we do with NUMERIC), packing two digits per byte, and
have two "segments" to the telno, a country code segment, and a "local
number" segment.  Using BCD, this would occupy 10 bytes, which, by
magical happenstance, is the same size as a US/Canada phone number.

The real questions are how to deal with:

a) Parsing incoming data, since formats people use vary so inanely

   Note: I spent most of yesterday dealing with this very issue,
   writing up a total of 31 eye-destroying regular expressions to
   generate a pl/tcl function to parse cases that I had handy...  I
   daresay that even with so many regexps, I was _still_ left with a
   reasonably material number that would not be parsed...

b) What sorts of extra operators would be interesting to add in.

   Extracting country code is an obvious thing.  Applying formatting
   rules based on country code is another.
--
let name="cbbrowne" and tld="ntlug.org" in name ^ "@" ^ tld;;
http://www.ntlug.org/~cbbrowne/linux.html
Rules of the Evil Overlord #35. "I  will not grow a goatee. In the old
days they made  you look diabolic. Now they just make  you look like a
disaffected member of Generation X." <http://www.eviloverlord.com/>

В списке pgsql-general по дате отправления:

Предыдущее
От: "A. Kretschmer"
Дата:
Сообщение: Re: LIMIT + OFFSET
Следующее
От: Michael Fuhr
Дата:
Сообщение: Re: Is there a way to list runaway queries and kill them?