Re: MD5 aggregate

From: Marko Kreen
Subject: Re: MD5 aggregate
Msg-id: CACMqXCJNrpTttpMFW8u5fvy7sEJCkYCep5278nJB3-vpGHcdcw@mail.gmail.com
In reply to: Re: MD5 aggregate  (Dean Rasheed <dean.a.rasheed@gmail.com>)
Responses: Re: MD5 aggregate  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers

On Thu, Jun 27, 2013 at 11:28 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> On 26 June 2013 21:46, Peter Eisentraut <peter_e@gmx.net> wrote:
>> On 6/26/13 4:04 PM, Dean Rasheed wrote:
>>> A quick google search reveals several people asking for something like
>>> this, and people recommending md5(string_agg(...)) or
>>> md5(string_agg(md5(...))) based solutions, which are doomed to failure
>>> on larger tables.
>>
>> The thread discussed several other options of checksumming tables that
>> did not have the air of a cryptographic offering, as Noah put it.
>>
>
> True but md5 has the advantage of being directly comparable with the
> output of Unix md5sum, which would be useful if you loaded data from
> external files and wanted to confirm that your import process didn't
> mangle it.

The problem with md5_agg() is that it's only useful in toy scenarios.

It's more useful to give people a script that does the same sum(hash(row))
on a dump file than to try to run MD5 on ordered rows.
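
As a rough illustration of the sum(hash(row)) idea, here is a minimal
Python sketch. Everything in it is an assumption made for the example:
the function names, the choice of the first 8 bytes of MD5 as the per-row
hash, the 64-bit wraparound, and the one-row-per-line dump format. Because
addition is commutative, the result does not depend on row order, so no
sort is needed.

```python
import hashlib

# Keep the running sum within 64 bits (an arbitrary choice for this sketch).
MASK = (1 << 64) - 1


def row_hash(row: bytes) -> int:
    """Hash one row to a 64-bit integer (first 8 bytes of its MD5 digest)."""
    digest = hashlib.md5(row).digest()
    return int.from_bytes(digest[:8], "big")


def table_checksum(rows) -> int:
    """Sum the per-row hashes modulo 2**64; row order does not matter."""
    total = 0
    for row in rows:
        total = (total + row_hash(row)) & MASK
    return total


def dump_checksum(path: str) -> int:
    """Checksum a dump file, treating each line as one row (assumed format)."""
    with open(path, "rb") as f:
        return table_checksum(line.rstrip(b"\r\n") for line in f)
```

The same sum would have to be computed on the database side with an
identical row serialization and hash for the two values to be comparable;
that part is not shown here.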

Also, I don't think anybody actually cares about MD5(table-as-bytes); instead,
people want a way to check whether two tables, or a table and a dump, are the same.

-- 
marko


