Re: Best way to index IP data?

Поиск
Список
Период
Сортировка
От Michael Stone
Тема Re: Best way to index IP data?
Дата
Msg-id 20080112015845.GD5294@mathom.us
обсуждение исходный текст
Ответ на Best way to index IP data?  (Kevin Kempter <kevin@kevinkempterllc.com>)
Список pgsql-performance
Tom wrote:
>There seem to be a number of people in this thread laboring under the
>illusion that we store a netmask as a mask.  It's a bit count (think
>/32 or /128) and occupies a whole one byte on disk.  Killer overhead,
>for sure.

There's no need to be quite so snarky. The netmask isn't the only part
of the overhead, but you did invite discussion of the netmask in
particular when you said "3 bytes overhead on a 16-byte address is not
'ridiculously bloated', *especially if you want a netmask with it*." You
might be hearing less about netmasks if you hadn't used them to justify
the size of the inet type. :-) There's also a number of issues being
conflated, as tends to happen when the pent up displeasure about inet
erupts on its semi-annual schedule. For myself, I mentioned two distinct
issues:

1) overhead (over absolute minimum required): 20% for IPv6 and *75%* for
IPv4. (In fairness, I actually am testing using inet for ipv6 tables, on
the assumption that I'll get another order of magnitude out of the
hardware before I really need high-volume ipv6 storage, and then it
really won't matter. But today, for the kind of addresses seen in the
real world, it really does matter. Also, recall that while you live in
the development version, those of us in the release world are dealing
with overheads of 50% for IPv6 and *200%* for IPv4. It'll take us a
while to recalibrate. :-)

2) ambiguity/completeness of data types (is it a host? is it a network?
what data type do I use if I really want to ensure that people don't
stick routing information into my host column?)

netmasks are kinda part of both, but they aren't the main point of
either.

As far as the hostility surrounding discussions of inet, I understand
and appreciate that there are reasons inet & cidr work the way they do,
and I find inet to be very useful in certain cases. But there are also
cases where they suck, and responses along the lines of "you should be
using ipv6 anyway" don't ease the pain any. :-) Responses like "it would
just be too much work to support seperate ipv4 & ipv6 data types" would
still suck :-) but at least they wouldn't be telling people that they're
imagining the problems that inet has had meeting their particular
requirements. And regardless of whether postgres gets a seperate ipv4
data type, I'd still like to have an "inethost" data type as a
complement to the "cidr" data type. :-)

Mike Stone

В списке pgsql-performance по дате отправления:

Предыдущее
От: "James DeMichele"
Дата:
Сообщение: Simple select, but takes long time
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Simple select, but takes long time