Обсуждение: Question about integer out of range in function
Hello ppl,
I have a question about ERROR: integer out of range in one function.
I modify the generate_ulid() function to accept also UNIX timestamp as
input parameter.
I drop old function and make new one:
CREATE FUNCTION generate_ulid(fromtime bigint default 0)
Then I declare two new variables:
rand_int INTEGER;
new_time BIGINT;
and then begin:
BEGIN
-- 6 timestamp bytes
IF fromtime = 0 THEN
unix_time = (EXTRACT(EPOCH FROM NOW()) * 1000)::BIGINT;
ELSE
rand_int = (random() * 1000)::INT;
new_time = fromtime * 1000; -- here is line 19
unix_time = (new_time + rand_int)::BIGINT;
END IF;
(Yes, I know it's can be one line, but this is for debug)
When I start the function I receive error:
test_db=# select generate_ulid(extract(epoch from now())::int);
ERROR: integer out of range
CONTEXT: PL/pgSQL function generate_ulid(integer) line 19 at assignment
If I modify line 19 to : new_time = (fromtime * 1000)::BIGINT;
Everything is work. Well, until I write the email I figured out, but I
don't know is this normal behavior or just a problem.
I think, this is happened because I send INT to function
generate_ulid(extract(epoch from now())::int) but in function I expect
this to be BIGINT and my variable is cast automatic to INT.
My question is this normal and should input param not be treated as
bigint automatic as defined ? Did I can change whit this way input type
to other functions for example
get_random_bytes(34423423423423423424234::BIGINT) ?
Version: PostgreSQL 13.2 on x86_64-slackware-linux-gnu, compiled by
x86_64-slackware-linux-gcc (GCC) 10.3.0, 64-bit
Regards,
HS
On 5/14/21 1:38 AM, Condor wrote: > > Hello ppl, > > I have a question about ERROR: integer out of range in one function. > I modify the generate_ulid() function to accept also UNIX timestamp as > input parameter. > > I drop old function and make new one: > > CREATE FUNCTION generate_ulid(fromtime bigint default 0) > > Then I declare two new variables: > > rand_int INTEGER; > new_time BIGINT; > > and then begin: > > BEGIN > -- 6 timestamp bytes > IF fromtime = 0 THEN > unix_time = (EXTRACT(EPOCH FROM NOW()) * 1000)::BIGINT; > ELSE > rand_int = (random() * 1000)::INT; > new_time = fromtime * 1000; -- here is line 19 > unix_time = (new_time + rand_int)::BIGINT; > END IF; > > (Yes, I know it's can be one line, but this is for debug) > > When I start the function I receive error: > > test_db=# select generate_ulid(extract(epoch from now())::int); > ERROR: integer out of range > CONTEXT: PL/pgSQL function generate_ulid(integer) line 19 at assignment > > If I modify line 19 to : new_time = (fromtime * 1000)::BIGINT; > > Everything is work. Well, until I write the email I figured out, but I > don't know is this normal behavior or just a problem. > > I think, this is happened because I send INT to function > generate_ulid(extract(epoch from now())::int) but in function I expect > this to be BIGINT and my variable is cast automatic to INT. > > My question is this normal and should input param not be treated as bigint > automatic as defined ? Did I can change whit this way input type to other > functions for example get_random_bytes(34423423423423423424234::BIGINT) ? What is fromtime? Since MAX_INT is 2147483648, and you're multiplying fromtime by 1000, the largest that fromtime can be is 2147483 without some INTEGER variable (possibly internal) overflowing. > Version: PostgreSQL 13.2 on x86_64-slackware-linux-gnu, compiled by > x86_64-slackware-linux-gcc (GCC) 10.3.0, 64-bit -- Angular momentum makes the world go 'round.
Condor <condor@stz-bg.com> writes:
> new_time = fromtime * 1000; -- here is line 19
Are you entirely certain that you counted lines correctly?
If new_time and fromtime are both declared bigint, and
fromtime is on the order of
# select extract(epoch from now())::bigint;
extract
------------
1621003124
(1 row)
then there's no way that that statement fails on integer overflow.
What I'm wondering about is the next line:
> unix_time = (new_time + rand_int)::BIGINT;
You've not shown us the declaration of unix_time ...
regards, tom lane
On 14-05-2021 17:42, Tom Lane wrote: > Condor <condor@stz-bg.com> writes: >> new_time = fromtime * 1000; -- here is line 19 > > Are you entirely certain that you counted lines correctly? > If new_time and fromtime are both declared bigint, and > fromtime is on the order of > > # select extract(epoch from now())::bigint; > extract > ------------ > 1621003124 > (1 row) > > then there's no way that that statement fails on integer overflow. > > What I'm wondering about is the next line: > >> unix_time = (new_time + rand_int)::BIGINT; > > You've not shown us the declaration of unix_time ... > > regards, tom lane Sorry, I was thinking I get the function from contrib and I think you guys know it. unix_time is BIGINT but I make a few more changes and now can't restore old function to reproduce the problem. I will try to recover the code later. Regards, HS
On 14-05-2021 17:42, Tom Lane wrote:
> Condor <condor@stz-bg.com> writes:
>> new_time = fromtime * 1000; -- here is line 19
>
> Are you entirely certain that you counted lines correctly?
> If new_time and fromtime are both declared bigint, and
> fromtime is on the order of
>
> # select extract(epoch from now())::bigint;
> extract
> ------------
> 1621003124
> (1 row)
>
> then there's no way that that statement fails on integer overflow.
>
> What I'm wondering about is the next line:
>
>> unix_time = (new_time + rand_int)::BIGINT;
>
> You've not shown us the declaration of unix_time ...
>
> regards, tom lane
Hello Tom,
I can't figure out where the problem is. The only explanation I can
guess is that my expectations are plpgsql will sum variables like asm or
c, well .. let me explain.
CREATE FUNCTION generate_ulid(fromtime int default 0) -- Yes is INT,
that is the problem. Explain bellow.
RETURNS TEXT
AS $$
DECLARE
unix_time BIGINT;
ulid BYTEA;
rand_int INTEGER;
new_time BIGINT;
BEGIN
-- 6 timestamp bytes
IF fromtime = 0 THEN
unix_time = (EXTRACT(EPOCH FROM NOW()) * 1000)::BIGINT;
ELSE
rand_int = (random() * 1000)::INT;
new_time = fromtime * 1000; -- here is line 19
unix_time = (new_time + rand_int)::BIGINT;
END IF;
new_time is bigint, fromtime is int so I expect to be like assembly or c
:
new_time = fromtime;
new_time * 1000;
/*
mov %rip, %eax
imul $0x3e8, %eax, %eax
mov %eax, %rip
*/
but I guest it's seems plpgsql do:
fromtime = fromtime * 1000;
new_time = fromtime;
Here is the complete func:
CREATE EXTENSION IF NOT EXISTS pgcrypto;
DROP FUNCTION IF EXISTS generate_ulid(int) CASCADE;
DROP FUNCTION IF EXISTS generate_ulid(BIGINT) CASCADE;
DROP FUNCTION IF EXISTS generate_ulid() CASCADE;
CREATE FUNCTION generate_ulid(fromtime int default 0)
RETURNS TEXT
AS $$
DECLARE
-- Crockford's Base32
encoding BYTEA = '0123456789ABCDEFGHJKMNPQRSTVWXYZ';
timestamp BYTEA = E'\\000\\000\\000\\000\\000\\000';
output TEXT = '';
unix_time BIGINT;
ulid BYTEA;
rand_int INTEGER;
new_time BIGINT;
BEGIN
-- 6 timestamp bytes
IF fromtime = 0 THEN
unix_time = (EXTRACT(EPOCH FROM NOW()) * 1000)::BIGINT;
ELSE
rand_int = (random() * 1000)::INT;
new_time = fromtime * 1000; -- I think this is the broken line
unix_time = (new_time + rand_int)::BIGINT;
-- the line will look like this after debug purpose: unix_time =
((fromtime::BIGINT * 1000) + (random() * 1000)::INT)::BIGINT;
END IF;
timestamp = SET_BYTE(timestamp, 0, (unix_time >>
40)::BIT(8)::INTEGER);
timestamp = SET_BYTE(timestamp, 1, (unix_time >>
32)::BIT(8)::INTEGER);
timestamp = SET_BYTE(timestamp, 2, (unix_time >>
24)::BIT(8)::INTEGER);
timestamp = SET_BYTE(timestamp, 3, (unix_time >>
16)::BIT(8)::INTEGER);
timestamp = SET_BYTE(timestamp, 4, (unix_time >> 8)::BIT(8)::INTEGER);
timestamp = SET_BYTE(timestamp, 5, unix_time::BIT(8)::INTEGER);
-- 10 entropy bytes
ulid = timestamp || gen_random_bytes(10);
-- Encode the timestamp
output = output || CHR(GET_BYTE(encoding, (GET_BYTE(ulid, 0) & 224) >>
5));
output = output || CHR(GET_BYTE(encoding, (GET_BYTE(ulid, 0) & 31)));
output = output || CHR(GET_BYTE(encoding, (GET_BYTE(ulid, 1) & 248) >>
3));
output = output || CHR(GET_BYTE(encoding, ((GET_BYTE(ulid, 1) & 7) <<
2) | ((GET_BYTE(ulid, 2) & 192) >> 6)));
output = output || CHR(GET_BYTE(encoding, (GET_BYTE(ulid, 2) & 62) >>
1));
output = output || CHR(GET_BYTE(encoding, ((GET_BYTE(ulid, 2) & 1) <<
4) | ((GET_BYTE(ulid, 3) & 240) >> 4)));
output = output || CHR(GET_BYTE(encoding, ((GET_BYTE(ulid, 3) & 15) <<
1) | ((GET_BYTE(ulid, 4) & 128) >> 7)));
output = output || CHR(GET_BYTE(encoding, (GET_BYTE(ulid, 4) & 124) >>
2));
output = output || CHR(GET_BYTE(encoding, ((GET_BYTE(ulid, 4) & 3) <<
3) | ((GET_BYTE(ulid, 5) & 224) >> 5)));
output = output || CHR(GET_BYTE(encoding, (GET_BYTE(ulid, 5) & 31)));
-- Encode the entropy
output = output || CHR(GET_BYTE(encoding, (GET_BYTE(ulid, 6) & 248) >>
3));
output = output || CHR(GET_BYTE(encoding, ((GET_BYTE(ulid, 6) & 7) <<
2) | ((GET_BYTE(ulid, 7) & 192) >> 6)));
output = output || CHR(GET_BYTE(encoding, (GET_BYTE(ulid, 7) & 62) >>
1));
output = output || CHR(GET_BYTE(encoding, ((GET_BYTE(ulid, 7) & 1) <<
4) | ((GET_BYTE(ulid, 8) & 240) >> 4)));
output = output || CHR(GET_BYTE(encoding, ((GET_BYTE(ulid, 8) & 15) <<
1) | ((GET_BYTE(ulid, 9) & 128) >> 7)));
output = output || CHR(GET_BYTE(encoding, (GET_BYTE(ulid, 9) & 124) >>
2));
output = output || CHR(GET_BYTE(encoding, ((GET_BYTE(ulid, 9) & 3) <<
3) | ((GET_BYTE(ulid, 10) & 224) >> 5)));
output = output || CHR(GET_BYTE(encoding, (GET_BYTE(ulid, 10) & 31)));
output = output || CHR(GET_BYTE(encoding, (GET_BYTE(ulid, 11) & 248)
>> 3));
output = output || CHR(GET_BYTE(encoding, ((GET_BYTE(ulid, 11) & 7) <<
2) | ((GET_BYTE(ulid, 12) & 192) >> 6)));
output = output || CHR(GET_BYTE(encoding, (GET_BYTE(ulid, 12) & 62) >>
1));
output = output || CHR(GET_BYTE(encoding, ((GET_BYTE(ulid, 12) & 1) <<
4) | ((GET_BYTE(ulid, 13) & 240) >> 4)));
output = output || CHR(GET_BYTE(encoding, ((GET_BYTE(ulid, 13) & 15)
<< 1) | ((GET_BYTE(ulid, 14) & 128) >> 7)));
output = output || CHR(GET_BYTE(encoding, (GET_BYTE(ulid, 14) & 124)
>> 2));
output = output || CHR(GET_BYTE(encoding, ((GET_BYTE(ulid, 14) & 3) <<
3) | ((GET_BYTE(ulid, 15) & 224) >> 5)));
output = output || CHR(GET_BYTE(encoding, (GET_BYTE(ulid, 15) & 31)));
RETURN output;
END
$$
LANGUAGE plpgsql
VOLATILE;
Regards,
HS
On Sunday, May 16, 2021, Condor <condor@stz-bg.com> wrote:
new_time = fromtime * 1000; -- here is line 19
An integer times an integer results in an integer. Period. Neither fromtime nor new_time have been assigned to yet, the in-memory result of the computation is only allocated integer bits and if you overflow that you get an error. If there is no error the result of that computation is stored in new_time. Since new_time is a bigint during assignment the in-memory integer is implicitly converted to bigint to match the assignment type.
Either fromtime or the 1000 need to be declared as bigint if you want the computation type to be bigint as well, and thus avoid the overflow. The implicit cast to store into new_time goes away. And whichever, if any, of the two values you do not change to bigint gets implicitly cast to bigint in order for the multiplication operator to match “*(bigint,bigint)”.
David J.
On 2021-05-16 11:09:38 -0700, David G. Johnston wrote:
> On Sunday, May 16, 2021, Condor <condor@stz-bg.com> wrote:
>
>
> new_time = fromtime * 1000; -- here is line 19
>
>
>
> An integer times an integer results in an integer. Period. Neither fromtime
> nor new_time have been assigned to yet, the in-memory result of the computation
> is only allocated integer bits and if you overflow that you get an error.
This is also true in C, btw, except that in C an overflow of a signed
int is undefined behaviour while an unsigned int is required to wrap
around.
hp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"