Re: Aggregate versions of hashing functions (md5, sha1, etc...)

From Dominique Devienne
Subject Re: Aggregate versions of hashing functions (md5, sha1, etc...)
Date
Msg-id CAFCRh-9dMQC99F22VreuOF9sv7kNjqVzXvaHZQerk0aBHUyhTA@mail.gmail.com
In reply to Re: Aggregate versions of hashing functions (md5, sha1, etc...)  (Dominique Devienne <ddevienne@gmail.com>)
List pgsql-general
On Fri, Jul 11, 2025 at 11:00 AM Dominique Devienne <ddevienne@gmail.com> wrote:
> The current md5() and pgcrypto.digest() functions roll the x1
> init, xN process, and x1 finish into a single call, processing a
> single bytea (or perhaps more intelligently for TOAST'ed values, the
> 2K "rows" of those in streaming-fashion, hopefully. Can a dev confirm?)

FWIW, I've [asked ChatGPT about that][1]. Assuming it's right (md5()
and pgcrypto.digest() do not leverage the "substring optimization" on
TOASTed bytea values), that's an unfortunate lost opportunity, especially
for byteas close to the 1GB limit. And again (sorry to lay it on
thick...), when one has to manually chunk values larger than 1GB, the
lack of an aggregate is a bit crippling, I'm afraid.
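
To make the chunking point concrete, here is a minimal sketch in Python (not PostgreSQL code) of the init / process / finish split: feeding a digest one chunk at a time yields the same result as hashing the whole value in one call, which is the property a streaming aggregate would rely on. The helper names (one_shot_digest, streaming_digest, split_chunks) are made up for illustration, and sha256 is used only because hashlib always provides it.

```python
import hashlib


def one_shot_digest(data: bytes, algo: str = "sha256") -> str:
    """Hash the whole value in a single call (init + process + finish)."""
    return hashlib.new(algo, data).hexdigest()


def streaming_digest(chunks, algo: str = "sha256") -> str:
    """Hash an ordered sequence of chunks without concatenating them."""
    h = hashlib.new(algo)      # x1 init
    for chunk in chunks:       # xN process, one call per chunk
        h.update(chunk)
    return h.hexdigest()       # x1 finish


def split_chunks(data: bytes, size: int):
    """Yield consecutive slices of at most `size` bytes (hypothetical helper)."""
    for offset in range(0, len(data), size):
        yield data[offset:offset + size]


# 16 KiB payload split into 2000-byte pieces; the chunk size is arbitrary.
payload = bytes(range(256)) * 64
assert streaming_digest(split_chunks(payload, 2000)) == one_shot_digest(payload)
```

The chunks must be fed in their original order, so an SQL aggregate built on this idea would presumably need an ORDER BY inside the aggregate call.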

So again, can a dev confirm what ChatGPT blurted out?

And if true, is there any interest in improving TOAST support so that
the current scalar digests hash in a truly streaming fashion?

And of course, the main point of this thread: any interest in adding
(true streaming) aggregate support in a future version?

Thanks, --DD

[1]: https://chatgpt.com/share/6870fe03-416c-800e-8633-a76e478a794a


