Re: pg_dump / copy bugs with "big lines" ?

From: Jim Nasby
Subject: Re: pg_dump / copy bugs with "big lines" ?
Date:
Msg-id: 5522C7BC.3070705@BlueTreble.com
In reply to: Re: pg_dump / copy bugs with "big lines" ?  (Ronan Dunklau <ronan.dunklau@dalibo.com>)
Responses: Re: pg_dump / copy bugs with "big lines" ?  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
On 3/31/15 3:46 AM, Ronan Dunklau wrote:
>> StringInfo uses int's to store length, so it could possibly be changed,
>> but then you'd just error out due to MaxAllocSize.
>>
>> Now perhaps those could both be relaxed, but certainly not to the extent
>> that you can shove an entire 1.6TB row into an output buffer.
> Another way to look at it would be to work in small chunks. For the first test
> case (rows bigger than 1GB), maybe the copy command could be rewritten to work
> in chunks, flushing the output more often if needed.

Possibly; I'm not sure how well the FE/BE protocol or code would 
actually support that.
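
FWIW, the v3 protocol doesn't force one CopyData message per row, so in
principle the backend could flush its output buffer mid-row. A very rough
sketch of the idea (not actual copy.c code; COPY_FLUSH_THRESHOLD and the
exact buffer handling are made up, and the real CopySendData also has to
deal with the file and old-protocol destinations):

/*
 * Hypothetical: push the COPY output buffer to the client whenever it
 * grows past a threshold, instead of only at end-of-row, so a single
 * huge row never has to fit in one StringInfo.
 */
#define COPY_FLUSH_THRESHOLD    (64 * 1024)

static void
CopySendData(CopyState cstate, const void *databuf, int datasize)
{
    appendBinaryStringInfo(cstate->fe_msgbuf, (const char *) databuf, datasize);

    if (cstate->copy_dest == COPY_NEW_FE &&
        cstate->fe_msgbuf->len >= COPY_FLUSH_THRESHOLD)
    {
        /* ship a partial CopyData ('d') message and start over */
        pq_putmessage('d', cstate->fe_msgbuf->data, cstate->fe_msgbuf->len);
        resetStringInfo(cstate->fe_msgbuf);
    }
}

The receiving side would need similar treatment, and anything that assumes
fe_msgbuf holds a complete row at end-of-row would have to cope.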

>> The other issue is that there's a LOT of places in code that blindly
>> copy detoasted data around, so while we technically support 1GB toasted
>> values you're probably going to be quite unhappy with performance. I'm
>> actually surprised you haven't already seen this with 500MB objects.
>>
>> So long story short, I'm not sure how worthwhile it would be to try and
>> fix this. We probably should improve the docs though.
>>
> I think that having data that can't be output by pg_dump is quite surprising,
> and if this is not fixable, I agree that it should clearly be documented.
>
>> Have you looked at using large objects for what you're doing? (Note that
>> those have their own set of challenges and limitations.)
> Yes, I have. This particular customer of ours did not mind the performance
> penalty of using bytea objects as long as it was convenient to use.

What do they do when they hit 1GB? Presumably if they're this close to 
the limit they're already hitting 1GB, no? Or is this mostly hypothetical?

>>
>>> We also hit a second issue, this time related to bytea encoding.
>>
>> There's probably several other places this type of thing could be a
>> problem. I'm thinking of conversions in particular.
> Yes, that's what the two other test cases I mentioned are about: any conversion
> leading to a size greater than 1GB results in an error, even implicit
> conversions like doubling backslashes in the output.

I think the big issue with encoding is going to be the risk of changing 
encoding and ending up with something too large to fit back into 
storage. They might need to consider using something like bytea(990MB).
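
To put rough numbers on that: with bytea_output = 'hex', COPY text output
needs a hair over two output bytes per stored byte (two hex digits, plus
the "\x" prefix whose backslash COPY doubles), and in 'escape' format a
worst-case byte can become five characters. Back-of-the-envelope only; the
sizes below are illustrative, not measured:

#include <stdio.h>
#include <stdint.h>

int
main(void)
{
    uint64_t stored = 600ULL * 1024 * 1024;     /* a 600MB bytea value */
    uint64_t hex_out = 3 + 2 * stored;          /* "\\x" + 2 hex digits/byte */
    uint64_t escape_worst = 5 * stored;         /* "\\nnn" per worst-case byte */
    uint64_t max_alloc = 0x3fffffff;            /* MaxAllocSize, 1GB - 1 */

    printf("hex COPY output:   %llu bytes\n", (unsigned long long) hex_out);
    printf("escape worst case: %llu bytes\n", (unsigned long long) escape_worst);
    printf("MaxAllocSize:      %llu bytes\n", (unsigned long long) max_alloc);
    return 0;
}

So anything much past 500MB of stored bytea is already over MaxAllocSize
once it's formatted for output, before any encoding conversion even enters
the picture.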

In any case, I don't think it would be terribly difficult to allow a bit
more than 1GB in a StringInfo. We might need to tweak palloc as well; ISTR
there are some 1GB limits in there too.
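
For reference, the two ceilings involved (paraphrasing from memory, so the
exact wording may be off): palloc() rejects anything above MaxAllocSize,
and enlargeStringInfo() refuses to grow a buffer past it:

/* src/include/utils/memutils.h */
#define MaxAllocSize    ((Size) 0x3fffffff)     /* 1 gigabyte - 1 */
#define AllocSizeIsValid(size)  ((Size) (size) <= MaxAllocSize)

/* enlargeStringInfo() in stringinfo.c bails out along these lines: */
if (needed < 0 || ((Size) needed) >= (MaxAllocSize - (Size) str->len))
    ereport(ERROR,
            (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
             errmsg("out of memory"),
             errdetail("Cannot enlarge string buffer containing %d bytes by %d more bytes.",
                       str->len, needed)));

Relaxing StringInfo alone wouldn't buy much, because repalloc() of the
underlying buffer runs into the same AllocSizeIsValid() check.
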
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


