Re: Re: A bug when use get_bit() function for a long bytea string

Поиск
Список
Период
Сортировка
От Ashutosh Bapat
Тема Re: Re: A bug when use get_bit() function for a long bytea string
Дата
Msg-id CAG-ACPXeQEuvdVBzz+NWgFYoBNwcJfv1W=pJExp4LM_3fgar6w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Re: A bug when use get_bit() function for a long bytea string  ("movead.li@highgo.ca" <movead.li@highgo.ca>)
Список pgsql-hackers


On Fri, 13 Mar 2020 at 08:48, movead.li@highgo.ca <movead.li@highgo.ca> wrote:
Thanks for the reply.

>Why have you used size? Shouldn't we use int64?
Yes, thanks for the point, I have changed the patch.
 

Thanks for the patch.
 
>If get_bit()/set_bit() accept the second argument as int32, it can not
>be used to set bits whose number does not fit 32 bits. I think we need
>to change the type of the second argument as well.
Because int32 can cover the length of bytea that PostgreSQL support,

I think the second argument indicates the bit position, which would be max bytea length * 8. If max bytea length covers whole int32, the second argument needs to be wider i.e. int64.

Some more comments on the patch
 struct pg_encoding
 {
- unsigned (*encode_len) (const char *data, unsigned dlen);
+ int64 (*encode_len) (const char *data, unsigned dlen);
  unsigned (*decode_len) (const char *data, unsigned dlen);
  unsigned (*encode) (const char *data, unsigned dlen, char *res);
  unsigned (*decode) (const char *data, unsigned dlen, char *res);

Why not use return type of int64 for rest of the functions here as well?

  res = enc->encode(VARDATA_ANY(data), datalen, VARDATA(result));
 
  /* Make this FATAL 'cause we've trodden on memory ... */
- if (res > resultlen)
+ if ((int64)res > resultlen)

if we change return type of all those functions to int64, we won't need this cast.

Right now we are using int64 because bytea can be 1GB, but what if we increase
that limit tomorrow, will int64 be sufficient? That may be unlikely in the near
future, but nevertheless a possibility. Should we then define a new datatype
which resolves to int64 right now but really depends upon the bytea length. I
am not suggesting that we have to do it right now, but may be something to
think about.
 
 hex_enc_len(const char *src, unsigned srclen)
 {
- return srclen << 1;
+ return (int64)(srclen << 1);

why not to convert srclen also to int64. That will also change the pg_encoding
member definitions. But if encoded length requires int64 to fit the possibly
values, same would be true for the source lengths. Why can't the source length
also be int64?

If still we want the cast, I think it should be ((int64)srclen << 1) rather
than casting the result.
 
  /* 3 bytes will be converted to 4, linefeed after 76 chars */
- return (srclen + 2) * 4 / 3 + srclen / (76 * 3 / 4);
+ return (int64)((srclen + 2) * 4 / 3 + srclen / (76 * 3 / 4));
similar comments as above.
 
 SELECT set_bit(B'0101011000100100', 16, 1); -- fail
 ERROR:  bit index 16 out of valid range (0..15)
+SELECT get_bit(
+       set_bit((repeat('Postgres', 512 * 1024 * 1024 / 8))::bytea, 0, 0)
+       ,0);
+ get_bit
+---------
+       0
+(1 row)

It might help to add a test where we could pass the second argument something
greater than 1G. But it may be difficult to write such a test case.

--
Best Wishes,
Ashutosh

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: [PATCH] Use PKG_CHECK_MODULES to detect the libxml2 library
Следующее
От: Thunder
Дата:
Сообщение: Standby got fatal after the crash recovery