Обсуждение: Is there any plan to add unsigned integer types?

Поиск
Список
Период
Сортировка

Is there any plan to add unsigned integer types?

От
crocket
Дата:
MySQL already has unsigned INT type, and it has double the range of
signed INT type.
It's not just the bigger range that UINT type brings.
If unsigned INT type exists, I wouldn't have to execute "create domain
UINT" in every database.

If "INT unsigned" and "SERIAL unsigned" exist, PostgreSQL would bring
more convenience to users.


Re: Is there any plan to add unsigned integer types?

От
Peter Eisentraut
Дата:
On mån, 2011-09-26 at 19:41 +0900, crocket wrote:
> MySQL already has unsigned INT type, and it has double the range of
> signed INT type.
> It's not just the bigger range that UINT type brings.
> If unsigned INT type exists, I wouldn't have to execute "create domain
> UINT" in every database.
> 
> If "INT unsigned" and "SERIAL unsigned" exist, PostgreSQL would bring
> more convenience to users.
> 

I believe there have been many discussions about this in the past,
outlining the various issues that would come with this project.  A first
step would be to start implementing this in user space and see how much
breaks.




Re: Is there any plan to add unsigned integer types?

От
Merlin Moncure
Дата:
On Mon, Sep 26, 2011 at 5:41 AM, crocket <crockabiscuit@gmail.com> wrote:
> MySQL already has unsigned INT type, and it has double the range of
> signed INT type.
> It's not just the bigger range that UINT type brings.
> If unsigned INT type exists, I wouldn't have to execute "create domain
> UINT" in every database.
>
> If "INT unsigned" and "SERIAL unsigned" exist, PostgreSQL would bring
> more convenience to users.

This comes up now and then.  The problem is the benefit gained is not
really worth the pain.  In today's 64 bit world, choosing a 64 bit int
nails the cases where you need the extra range and you have the
ability to use constraints (if necessary, through a domain) to enforce
correctness.

On the downside, you have to add a lot of casts, overloads, etc.
Figuring out the casting rules is non trivial and could lead to
surprising behaviors...inferring the type of 'unknown' strings is bad
enough as it is.

TBH, what I'd greatly prefer to see is to have domains be finished up
so that you don't have to carefully consider their use (for example,
you can't make arrays out of them).  Then an unsigned int could simply
be:

create domain uint as bigint check (value >= 0);

merlin


Re: Is there any plan to add unsigned integer types?

От
Robert Haas
Дата:
On Mon, Sep 26, 2011 at 10:02 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Mon, Sep 26, 2011 at 5:41 AM, crocket <crockabiscuit@gmail.com> wrote:
>> MySQL already has unsigned INT type, and it has double the range of
>> signed INT type.
>> It's not just the bigger range that UINT type brings.
>> If unsigned INT type exists, I wouldn't have to execute "create domain
>> UINT" in every database.
>>
>> If "INT unsigned" and "SERIAL unsigned" exist, PostgreSQL would bring
>> more convenience to users.
>
> This comes up now and then.  The problem is the benefit gained is not
> really worth the pain.  In today's 64 bit world, choosing a 64 bit int
> nails the cases where you need the extra range and you have the
> ability to use constraints (if necessary, through a domain) to enforce
> correctness.
>
> On the downside, you have to add a lot of casts, overloads, etc.
> Figuring out the casting rules is non trivial and could lead to
> surprising behaviors...inferring the type of 'unknown' strings is bad
> enough as it is.
>
> TBH, what I'd greatly prefer to see is to have domains be finished up
> so that you don't have to carefully consider their use (for example,
> you can't make arrays out of them).  Then an unsigned int could simply
> be:
>
> create domain uint as bigint check (value >= 0);

Even if we did that, there might still be cases where people would
want unsigned integers as a means of reducing storage.  4 extra bytes
may not seem like that much, but if you have billions of rows, it adds
up - not just in terms of actual storage space, but also in terms of
disk and memory bandwidth requirements when you want to do anything
with that data.  I have seen some recent data (which is not entirely
conclusive) that suggests that memory bandwidth can be a huge problem
for PostgreSQL performance on large boxes; and I think Greg Smith has
made similar comments in the past (correct me if I'm wrong, Greg).

I think, though, that if we choose to attack that problem in the first
instance by adding support for unsigned integers, we're probably going
to be only nibbling around the edges of the problem.  Reducing
alignment padding and adding block-level compression would benefit a
much larger number of workloads.  Those are not easy projects, but
unfortunately, due to the constraints of our type system, neither is
this.  :-(

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Is there any plan to add unsigned integer types?

От
Merlin Moncure
Дата:
On Mon, Sep 26, 2011 at 9:21 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Sep 26, 2011 at 10:02 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> On Mon, Sep 26, 2011 at 5:41 AM, crocket <crockabiscuit@gmail.com> wrote:
>>> MySQL already has unsigned INT type, and it has double the range of
>>> signed INT type.
>>> It's not just the bigger range that UINT type brings.
>>> If unsigned INT type exists, I wouldn't have to execute "create domain
>>> UINT" in every database.
>>>
>>> If "INT unsigned" and "SERIAL unsigned" exist, PostgreSQL would bring
>>> more convenience to users.
>>
>> This comes up now and then.  The problem is the benefit gained is not
>> really worth the pain.  In today's 64 bit world, choosing a 64 bit int
>> nails the cases where you need the extra range and you have the
>> ability to use constraints (if necessary, through a domain) to enforce
>> correctness.
>>
>> On the downside, you have to add a lot of casts, overloads, etc.
>> Figuring out the casting rules is non trivial and could lead to
>> surprising behaviors...inferring the type of 'unknown' strings is bad
>> enough as it is.
>>
>> TBH, what I'd greatly prefer to see is to have domains be finished up
>> so that you don't have to carefully consider their use (for example,
>> you can't make arrays out of them).  Then an unsigned int could simply
>> be:
>>
>> create domain uint as bigint check (value >= 0);
>
> Even if we did that, there might still be cases where people would
> want unsigned integers as a means of reducing storage.  4 extra bytes
> may not seem like that much, but if you have billions of rows, it adds
> up - not just in terms of actual storage space, but also in terms of
> disk and memory bandwidth requirements when you want to do anything
> with that data.  I have seen some recent data (which is not entirely
> conclusive) that suggests that memory bandwidth can be a huge problem
> for PostgreSQL performance on large boxes; and I think Greg Smith has
> made similar comments in the past (correct me if I'm wrong, Greg).
>
> I think, though, that if we choose to attack that problem in the first
> instance by adding support for unsigned integers, we're probably going
> to be only nibbling around the edges of the problem.  Reducing
> alignment padding and adding block-level compression would benefit a
> much larger number of workloads.  Those are not easy projects, but
> unfortunately, due to the constraints of our type system, neither is
> this.  :-(

right -- exactly.  most 'savings' in this vein are nothing but due to
padding and other factors such that (at least today) there is no
disadvantage to going to 64 bit in range constrained cases.  also, I'd
submit history has been unkind to hardware dependent optimization
strategies in userland -- the engine should be dealing with this
problem.  better to define your data the proper way and let the
assembly instruction counting gurus in -hackers worry about it :-).

compression is an interesting topic: the guys over at tokudb are
making some wild claims...i'm curious if they are real, and what the
real tradeoffs are.

merlin


Re: Is there any plan to add unsigned integer types?

От
Leonardo Francalanci
Дата:
 

> compression is an interesting topic: the guys over at tokudb are
> making some wild claims...i'm curious if they are real, and what the
> real tradeoffs are.


I don't know how much of the performance they claim comes from
compression and how much from the different indexing technique they
use (see the my post here, where nobody answered...

http://archives.postgresql.org/pgsql-general/2011-09/msg00615.php


)