Re: A bug when use get_bit() function for a long bytea string

Поиск
Список
Период
Сортировка
От Ashutosh Bapat
Тема Re: A bug when use get_bit() function for a long bytea string
Дата
Msg-id CAG-ACPUbQwpCHhn_KkLAGEc08A91yZhARazj2BGjXvK+eFCVog@mail.gmail.com
обсуждение исходный текст
Ответ на Re: A bug when use get_bit() function for a long bytea string  ("movead.li@highgo.ca" <movead.li@highgo.ca>)
Ответы Re: A bug when use get_bit() function for a long bytea string  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: A bug when use get_bit() function for a long bytea string  ("Daniel Verite" <daniel@manitou-mail.org>)
Список pgsql-hackers


On Wed, 18 Mar 2020 at 08:18, movead.li@highgo.ca <movead.li@highgo.ca> wrote:

Hello thanks for the detailed review,

>I think the second argument indicates the bit position, which would be max bytea length * 8. If max bytea length covers whole int32, the second argument >needs to be wider i.e. int64.
Yes, it makes sence and followed.

I think we need a similar change in byteaGetBit() and byteaSetBit() as well.
 

> Some more comments on the patch
> struct pg_encoding
> {
>- unsigned (*encode_len) (const char *data, unsigned dlen);
>+ int64 (*encode_len) (const char *data, unsigned dlen);
 unsigned (*decode_len) (const char *data, unsigned dlen);
 unsigned (*encode) (const char *data, unsigned dlen, char *res);
>  unsigned (*decode) (const char *data, unsigned dlen, char *res);
> Why not use return type of int64 for rest of the functions here as well?
>  res = enc->encode(VARDATA_ANY(data), datalen, VARDATA(result));
>  /* Make this FATAL 'cause we've trodden on memory ... */
>- if (res > resultlen)
>+ if ((int64)res > resultlen)
>
>if we change return type of all those functions to int64, we won't need this cast.
I change the 'encode' function, it needs an int64 return type, but keep other 
functions in 'pg_encoding', because I think it of no necessary reason.

Ok, let's leave it for a committer to decide. 


>Right now we are using int64 because bytea can be 1GB, but what if we increase
>that limit tomorrow, will int64 be sufficient? That may be unlikely in the near
>future, but nevertheless a possibility. Should we then define a new datatype
>which resolves to int64 right now but really depends upon the bytea length. I
>am not suggesting that we have to do it right now, but may be something to
>think about.
I decide to use int64 because if we really want to increase the limit,  it should be
the same change with 'text', 'varchar' which have the same limit. So it may need
a more general struct. Hence I give up the idea.

Hmm, Let's see what a committer says.

Some more review comments.
+   int64       res,resultlen;

We need those on separate lines, possibly.

+   byteNo = (int32)(n / BITS_PER_BYTE);
Does it hurt to have byteNo as int64 so as to avoid a cast. Otherwise, please
add a comment explaining the reason for the cast. The comment applies at other
places where this change appears.

-       int         len;
+       int64       len;
Why do we need this change?
        int         i;
 


>It might help to add a test where we could pass the second argument something
>greater than 1G. But it may be difficult to write such a test case.
Add two test cases.
 
+
+select get_bit(
+        set_bit((repeat('Postgres', 512 * 1024 * 1024 / 8))::bytea, 1024 * 1024 * 1024 + 1, 0)
+       ,1024 * 1024 * 1024 + 1);

This bit position is still within int4.
postgres=# select pg_column_size(1024 * 1024 * 1024 + 1);
 pg_column_size
----------------
              4  
(1 row)

You want something like
postgres=# select pg_column_size(512::bigint * 1024 * 1024 * 8);
 pg_column_size
----------------
              8  
(1 row)

--
Best Wishes,
Ashutosh

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Julien Rouhaud
Дата:
Сообщение: Re: Planning counters in pg_stat_statements (using pgss_store)
Следующее
От: Fujii Masao
Дата:
Сообщение: Re: Patch: to pass query string to pg_plan_query()