Обсуждение: INTERVAL overflow detection is terribly broken
Hi, after studying ITERVAL and having a long chat with RhoidumToad and StuckMojo on #postgresql, I am presenting you 3 bugs regarding INTERVAL.
As far as I understand, the Interval struct (binary internal representation) consists of:
Even wireder:
SELECT INTERVAL '2147483647 hours' + '1 hour'
"--2147483648:00:00"
notice the double minus? Don't ask how I came across this two bugs.
2. OPERATION ERRORS: When summing two intervals, the user is not notified when overflow occurs:
int32 months
int32 days
int64 microseconds
SELECT INTERVAL '2147483647 hours' + INTERVAL '5 hours'
"-2147483644:00:00"
1. OUTPUT ERRORS: Since 2^63 microseconds equals 2,562,047,788 > 2^31 hours, the overflow in pg_tm when displaying the value causes overflow. The value of Interval struct is actually correct, error happens only on displaying it.
"-2147483644:00:00"
Even wireder:
SELECT INTERVAL '2147483647 hours' + '1 hour'
"--2147483648:00:00"
notice the double minus? Don't ask how I came across this two bugs.
2. OPERATION ERRORS: When summing two intervals, the user is not notified when overflow occurs:
SELECT INT '2147483647' + INT '1'
ERROR: integer out of range
ERROR: integer out of range
SELECT INTERVAL '2147483647 days' + INTERVAL '1 day'
"-2147483648 days"
"-2147483648 days"
This should be analogous.
3. PARSER / INPUT ERRORS:
This is perhaps the hardest one to explain, since this is an architectural flaw. You are checking the overflows when parsing string -> pg_tm struct. However, at this point, the parser doesn't know, that weeks and days are going to get combined, or years are going to get converted to months, for example.
Unawarness of underlying Interval struct causes two types of suberrors:
a) False positive
SELECT INTERVAL '2147483648 microseconds'
ERROR: interval field value out of range: "2147483648 microseconds"
This is not right. Microseconds are internally stored as 64 bit signed integer. The catch is: this amount of microseconds is representable in Interval data structure, this shouldn't be an error.
b) False negative
SELECT INTERVAL '1000000000 years'
"-73741824 years"
We don't catch errors like this, because parser only checks for overflow in pg_tm. If overflow laters happens in Interval, we don't seem to care.
4. POSSIBLE SOLUTIONS:
a) Move the overflow checking just after the conversion of pg_tm -> Interval is made. This way, you can accurately predict if the result is really not store-able.
b) Because of 1), you have to declare tm_hour as int64, if you want to use that for the output. But, why not use Interval struct for printing directly, without intermediate pg_tm?
5. Offtopic: Correct the documentation, INTERVAL data size is 16 bytes, not 12.
Rok Kralj
So, any insights on these problems?
They might not be critical, but might be silently corrupting someone's data.
2013/6/23 Rok Kralj <rok.kralj@gmail.com>
Hi, after studying ITERVAL and having a long chat with RhoidumToad and StuckMojo on #postgresql, I am presenting you 3 bugs regarding INTERVAL.As far as I understand, the Interval struct (binary internal representation) consists of:int32 monthsint32 daysint64 microseconds1. OUTPUT ERRORS: Since 2^63 microseconds equals 2,562,047,788 > 2^31 hours, the overflow in pg_tm when displaying the value causes overflow. The value of Interval struct is actually correct, error happens only on displaying it.SELECT INTERVAL '2147483647 hours' + INTERVAL '5 hours'
"-2147483644:00:00"
Even wireder:
SELECT INTERVAL '2147483647 hours' + '1 hour'
"--2147483648:00:00"
notice the double minus? Don't ask how I came across this two bugs.
2. OPERATION ERRORS: When summing two intervals, the user is not notified when overflow occurs:SELECT INT '2147483647' + INT '1'
ERROR: integer out of rangeSELECT INTERVAL '2147483647 days' + INTERVAL '1 day'
"-2147483648 days"This should be analogous.3. PARSER / INPUT ERRORS:This is perhaps the hardest one to explain, since this is an architectural flaw. You are checking the overflows when parsing string -> pg_tm struct. However, at this point, the parser doesn't know, that weeks and days are going to get combined, or years are going to get converted to months, for example.Unawarness of underlying Interval struct causes two types of suberrors:a) False positiveSELECT INTERVAL '2147483648 microseconds'ERROR: interval field value out of range: "2147483648 microseconds"This is not right. Microseconds are internally stored as 64 bit signed integer. The catch is: this amount of microseconds is representable in Interval data structure, this shouldn't be an error.b) False negativeSELECT INTERVAL '1000000000 years'"-73741824 years"We don't catch errors like this, because parser only checks for overflow in pg_tm. If overflow laters happens in Interval, we don't seem to care.4. POSSIBLE SOLUTIONS:a) Move the overflow checking just after the conversion of pg_tm -> Interval is made. This way, you can accurately predict if the result is really not store-able.b) Because of 1), you have to declare tm_hour as int64, if you want to use that for the output. But, why not use Interval struct for printing directly, without intermediate pg_tm?5. Offtopic: Correct the documentation, INTERVAL data size is 16 bytes, not 12.Rok Kralj
eMail: rok.kralj@gmail.com
On Sun, Jun 23, 2013 at 03:00:59PM +0200, Rok Kralj wrote: > Hi, after studying ITERVAL and having a long chat with RhoidumToad and > StuckMojo on #postgresql, I am presenting you 3 bugs regarding INTERVAL. OK, I am going to merge this with the previous report/patch which fixes: SELECT INTERVAL '2000000000 years'; ERROR: interval out of range LINE 1: SELECT INTERVAL '2000000000 years'; and SELECT INTERVAL '2000000000-3 years'; ERROR: interval field value out of range: "2000000000-3 years" LINE 1: SELECT INTERVAL '2000000000-3 years'; > As far as I understand, the Interval struct (binary internal representation) > consists of: > > int32 months > int32 days > int64 microseconds > > 1. OUTPUT ERRORS: Since 2^63 microseconds equals 2,562,047,788 > 2^31 hours, > the overflow in pg_tm when displaying the value causes overflow. The value of > Interval struct is actually correct, error happens only on displaying it. > > SELECT INTERVAL '2147483647 hours' + INTERVAL '5 hours' > "-2147483644:00:00" Fixed: SELECT INTERVAL '2147483647 hours' + INTERVAL '5 hours'; ERROR: interval out of range > Even wireder: > > SELECT INTERVAL '2147483647 hours' + '1 hour' > "--2147483648:00:00" Fixed: SELECT INTERVAL '2147483647 hours' + '1 hour'; ERROR: interval out of range > notice the double minus? Don't ask how I came across this two bugs. > > 2. OPERATION ERRORS: When summing two intervals, the user is not notified when > overflow occurs: > > SELECT INT '2147483647' + INT '1' > ERROR: integer out of range > > SELECT INTERVAL '2147483647 days' + INTERVAL '1 day' > "-2147483648 days" > > This should be analogous. Fixed: SELECT INTERVAL '2147483647 days' + INTERVAL '1 day'; ERROR: interval out of range > 3. PARSER / INPUT ERRORS: > > This is perhaps the hardest one to explain, since this is an architectural > flaw. You are checking the overflows when parsing string -> pg_tm struct. > However, at this point, the parser doesn't know, that weeks and days are going > to get combined, or years are going to get converted to months, for example. > > Unawarness of underlying Interval struct causes two types of suberrors: > > a) False positive > > SELECT INTERVAL '2147483648 microseconds' > ERROR: interval field value out of range: "2147483648 microseconds" > > This is not right. Microseconds are internally stored as 64 bit signed integer. > The catch is: this amount of microseconds is representable in Interval data > structure, this shouldn't be an error. I don't see a way to fix this as we are testing too early to know what type of value it is, as you stated. > b) False negative > > SELECT INTERVAL '1000000000 years' > "-73741824 years" > > We don't catch errors like this, because parser only checks for overflow in > pg_tm. If overflow laters happens in Interval, we don't seem to care. Fixed: SELECT INTERVAL '1000000000 years'; ERROR: interval out of range LINE 1: SELECT INTERVAL '1000000000 years'; > 4. POSSIBLE SOLUTIONS: > > a) Move the overflow checking just after the conversion of pg_tm -> Interval is > made. This way, you can accurately predict if the result is really not > store-able. > > b) Because of 1), you have to declare tm_hour as int64, if you want to use that > for the output. But, why not use Interval struct for printing directly, without > intermediate pg_tm? > > 5. Offtopic: Correct the documentation, INTERVAL data size is 16 bytes, not 12. Fixed. Patch attached. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Вложения
On Jan26, 2014, at 03:50 , Bruce Momjian <bruce@momjian.us> wrote: > Patch attached. > + if ((float)tm->tm_year * MONTHS_PER_YEAR + tm->tm_mon > INT_MAX) > + return -1; Is this bullet-proof? If float and int are both 32-bit, the float's mantissa will be less than 32-bit (24 or so, I think), and thus won't be able to represent INT_MAX accurately. This means there's a value of tm_year * MONTHS_PER_YEAR which is less than INT_MAX, yet for which tm_year * MONTHS_PER_YEAR + tm_mon will return tm_year * MONTHS_PER_YEAR unmodified if tm_mon is small enough (e.g, 1). But if tm_year * MONTHS_PER_YEAR was close enough to INT_MAX, the actual value of tm_year * MONTHS_PER_YEAR + tm_mon might still be larger than INT_MAX, the floating-point computation just won't detect it. In that case, the check would succeed, yet the actual integer computation would still overflow. It should be safe with "double" instead of "float". > + if (SAMESIGN(span1->month, span2->month) && > + !SAMESIGN(result->month, span1->month)) This assumes wrapping semantics for signed integral types, which isn't guaranteed by the C standard - it says overflows of signed integral types produce undefined results. We currently depend on wrapping semantics for these types in more places, and therefore need GCC's "-fwrapv" anway, but I still wonder if adding more of these kinds of checks is a good idea. best regards, Florian Pflug
On Sun, Jan 26, 2014 at 02:13:03PM +0100, Florian Pflug wrote: > On Jan26, 2014, at 03:50 , Bruce Momjian <bruce@momjian.us> wrote: > > Patch attached. > > > + if ((float)tm->tm_year * MONTHS_PER_YEAR + tm->tm_mon > INT_MAX) > > + return -1; > > Is this bullet-proof? If float and int are both 32-bit, the float's mantissa > will be less than 32-bit (24 or so, I think), and thus won't be able to > represent INT_MAX accurately. This means there's a value of > tm_year * MONTHS_PER_YEAR which is less than INT_MAX, yet for which > tm_year * MONTHS_PER_YEAR + tm_mon will return tm_year * MONTHS_PER_YEAR > unmodified if tm_mon is small enough (e.g, 1). But if tm_year * MONTHS_PER_YEAR > was close enough to INT_MAX, the actual value of > tm_year * MONTHS_PER_YEAR + tm_mon might still be larger than INT_MAX, > the floating-point computation just won't detect it. In that case, the > check would succeed, yet the actual integer computation would still overflow. > > It should be safe with "double" instead of "float". You are correct. I have adjusted the code to use "double". > > + if (SAMESIGN(span1->month, span2->month) && > > + !SAMESIGN(result->month, span1->month)) > > This assumes wrapping semantics for signed integral types, which isn't > guaranteed by the C standard - it says overflows of signed integral types > produce undefined results. We currently depend on wrapping semantics for > these types in more places, and therefore need GCC's "-fwrapv" anway, but > I still wonder if adding more of these kinds of checks is a good idea. Well, I copied this from int.c, so what I did was to mention the other function I got this code from. If we ever change int.c we would need to change this too. The updated attached patch has overflow detection for interval subtraction, multiply, and negate. There are also some comparison cleanups. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Вложения
On Mon, Jan 27, 2014 at 04:47:22PM -0500, Bruce Momjian wrote: > The updated attached patch has overflow detection for interval > subtraction, multiply, and negate. There are also some comparison > cleanups. Oh, one odd thing about this patch. I found I needed to use INT64_MAX, but I don't see it used anywhere else in our codebase. Is this OK? Is there a better way? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Bruce Momjian <bruce@momjian.us> writes: > Oh, one odd thing about this patch. I found I needed to use INT64_MAX, > but I don't see it used anywhere else in our codebase. Is this OK? Is > there a better way? Most of the overflow tests in int.c and int8.c are coded to avoid relying on the MIN or MAX constants; which seemed like better style at the time. I'm not sure whether relying on INT64_MAX to exist is portable. regards, tom lane
On Mon, Jan 27, 2014 at 07:19:21PM -0500, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > Oh, one odd thing about this patch. I found I needed to use INT64_MAX, > > but I don't see it used anywhere else in our codebase. Is this OK? Is > > there a better way? > > Most of the overflow tests in int.c and int8.c are coded to avoid relying > on the MIN or MAX constants; which seemed like better style at the time. Yes, I looked at those but they seemed like overkill for interval. For a case where there was an int64 multiplied by a double, then cast back to an int64, I checked the double against INT64_MAX before casting to an int64. > I'm not sure whether relying on INT64_MAX to exist is portable. The only use I found was in pgbench: #ifndef INT64_MAX#define INT64_MAX INT64CONST(0x7FFFFFFFFFFFFFFF)#endif so I have just added that to my patch, and INT64_MIN: #ifndef INT64_MIN#define INT64_MIN (-INT64CONST(0x7FFFFFFFFFFFFFFF) - 1)#endif This is only used for HAVE_INT64_TIMESTAMP. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Mon, Jan 27, 2014 at 10:48:16PM -0500, Bruce Momjian wrote: > On Mon, Jan 27, 2014 at 07:19:21PM -0500, Tom Lane wrote: > > Bruce Momjian <bruce@momjian.us> writes: > > > Oh, one odd thing about this patch. I found I needed to use INT64_MAX, > > > but I don't see it used anywhere else in our codebase. Is this OK? Is > > > there a better way? > > > > Most of the overflow tests in int.c and int8.c are coded to avoid relying > > on the MIN or MAX constants; which seemed like better style at the time. > > Yes, I looked at those but they seemed like overkill for interval. For > a case where there was an int64 multiplied by a double, then cast back > to an int64, I checked the double against INT64_MAX before casting to an > int64. > > > I'm not sure whether relying on INT64_MAX to exist is portable. > > The only use I found was in pgbench: > > #ifndef INT64_MAX > #define INT64_MAX INT64CONST(0x7FFFFFFFFFFFFFFF) > #endif > > so I have just added that to my patch, and INT64_MIN: > > #ifndef INT64_MIN > #define INT64_MIN (-INT64CONST(0x7FFFFFFFFFFFFFFF) - 1) > #endif > > This is only used for HAVE_INT64_TIMESTAMP. Adjusted patch applied for PG 9.4. Thanks for the report. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +