Re: [DOCS] Let's document a bytea bug

From Joe Conway
Subject Re: [DOCS] Let's document a bytea bug
Date
Msg-id 8ba460ba-9947-73e0-6920-a8a2bca436b9@joeconway.com
In response to Re: [DOCS] Let's document a bytea bug  (Bruce Momjian <bruce@momjian.us>)
List pgsql-docs
On 8/3/20 4:20 PM, Bruce Momjian wrote:
> On Fri, Jul 31, 2020 at 10:13:48AM +0500, Andrey M. Borodin wrote:
>> Hi Anna!
>>
>> > On May 23, 2018, at 20:33, Anna Akenteva <a.akenteva@postgrespro.ru> wrote:
>> >
>> >
>> > Some time ago I encountered a problem with the bytea type: we can't SELECT
>> > bytea strings whose textual representation is too big to fit into StringInfoData.
>> > And as a side effect, pg_dump refuses to dump tables with big bytea strings.
>> >
>> > It's a bug, it's pretty confusing, but it seems like there's no pretty way
>> > to fix it so far. Here's a link to a recent discussion on the issue:
>> >
https://www.postgresql.org/message-id/flat/c8bdf802d41ec37003ec3b726db79428@postgrespro.ru#c8bdf802d41ec37003ec3b726db79428@postgrespro.ru
>> >
>> > Since it won't be fixed anytime soon, I thought it could be worth documenting.
>> > Attaching a patch for the documentation: I added some text to the "Binary Data Types"
>> > part where I tried to describe the issue and to explain how to deal with it.
>> >
>> > My patch in plain text (for convenience):
>> >
>> > It is not recommended to use bytea strings whose textual representation
>> > exceeds 1GB, as it may not be possible to SELECT them due to output size
>> > limitations. Consequently, a table containing such big strings cannot be
>> > properly processed by pg_dump, as pg_dump will try to SELECT these values from the
>> > table and fail. The exact size limit advised for bytea strings depends on their
>> > content, the external format and encoding that you are using, and the context in
>> > which they will be selected. The general rule is that when you use SELECT,
>> > the returned tuple should not exceed 1GB. Even if SELECT does not
>> > work, you can still retrieve big bytea strings using COPY in binary format.
>>
>> Thanks for this message. It took me a while to find out what the problem was.
>> +1 for documenting this, maybe even with the exact error message, like
>> [ 2020-07-30 01:20:32.248 MSK pg_dump - 10.3.3.30,XX000 ]:ERROR:  invalid memory alloc request size 1472599557
>> It's really really scary. My first feeling was that it's TOAST corruption.
>
> I still have Anna Akenteva's 2018 email in my mailbox because I wanted
> to research this further.  Now that you have re-reported the problem, I
> am on it!  ;-)
>
> I looked for a clearer reproducible test case, and I have found this
> one:

It is easier to reproduce than that:

select repeat('x',600000000)::bytea;
ERROR:  invalid memory alloc request size 1200000003

select octet_length(repeat('x',600000000)::bytea);
 octet_length
--------------
    600000000
(1 row)

CREATE TABLE big_data AS
 select repeat('x',600000000)::bytea;
SELECT 1

SELECT repeat FROM big_data;
ERROR:  invalid memory alloc request size 1200000003
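The figure 1,200,000,003 in the error message is not arbitrary; a sketch of the arithmetic, assuming (from PostgreSQL internals) that bytea values are emitted in hex format as "\x" plus two hex digits per byte, with one extra byte allocated for a terminating NUL:

```python
# Arithmetic behind "invalid memory alloc request size 1200000003".
# Assumptions are labeled; the constants come from PostgreSQL source
# (MaxAllocSize in memutils.h) and the hex output format of bytea.

DATA_BYTES = 600_000_000           # octet_length of the stored bytea value

HEX_PREFIX = 2                     # the literal "\x" prefix
hex_digits = 2 * DATA_BYTES        # two hex characters per byte
NUL = 1                            # trailing NUL in the allocated buffer (assumption)

alloc_request = HEX_PREFIX + hex_digits + NUL
print(alloc_request)               # 1200000003, matching the ERROR line

MAX_ALLOC_SIZE = 0x3FFFFFFF        # PostgreSQL's MaxAllocSize, just under 1 GB
print(alloc_request > MAX_ALLOC_SIZE)  # True: palloc refuses the request
```

So a 600 MB bytea value is fine on disk, but its textual representation doubles in size and blows past the 1 GB allocation cap.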

\q

pg_dump -d postgres > /tmp/a
pg_dump: error: Dumping the contents of table "big_data" failed: PQgetResult()
failed.
pg_dump: error: Error message from server: ERROR:  invalid memory alloc request
size 1200000003
pg_dump: error: The command was: COPY public.big_data (repeat) TO stdout;


There are other cases that also hit the StringInfo limit (i.e. MaxAllocSize)
with the TEXT data type (and, I would guess, others such as JSON or XML):

select repeat('x',600000000), repeat('y',600000000);
ERROR:  out of memory
DETAIL:  Cannot enlarge string buffer containing 600000010 bytes by 600000000
more bytes.
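The DETAIL line makes the same limit visible from the other side; a sketch, assuming the per-row output buffer (a StringInfo) already holds the first column plus a few bytes of protocol framing when the second column is appended:

```python
# Why enlarging the string buffer fails for two 600M-character columns:
# the combined size would exceed MaxAllocSize (~1 GB). The 600000010
# figure is taken directly from the DETAIL line above; the ~10 bytes over
# the column length are assumed to be row/protocol framing overhead.

MAX_ALLOC_SIZE = 0x3FFFFFFF        # 1073741823 bytes

buffered = 600_000_010             # bytes already in the buffer (first column)
needed = 600_000_000               # bytes the second column would add

print(buffered + needed)                   # 1200000010
print(buffered + needed > MAX_ALLOC_SIZE)  # True: the buffer cannot be enlarged
```

Either way, the hard wall is the same 1 GB allocation limit; only the error message differs depending on where the limit is hit.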


Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


