Обсуждение: murmur3 hash binary data migration from Oracle to PostgreSQL

Поиск
Список
Период
Сортировка

murmur3 hash binary data migration from Oracle to PostgreSQL

От
Jagmohan Kaintura
Дата:
Hi Team,

We are doing Migration from Oracle to PostgreSQL. In SOurce database we have Binary data stored using murmur3 hashing function. In Oracle this data is being generated from the Java code and inserted into the Oracle database. 

As part of Migration processes the reference data on which this murmur3 is generated is also getting changed while migrating to PostgreSQL. 

In PostgreSQL do we have any mechanism for fetching this murmur3 hash function for any UUID.

Please let me know, what ever solution is available for the implementation.



--
Best Regards,
Jagmohan 

Re: murmur3 hash binary data migration from Oracle to PostgreSQL

От
Erik Wienhold
Дата:
> On 25/04/2023 03:21 CEST Jagmohan Kaintura <jagmohan@tecorelabs.com> wrote:
>
> We are doing Migration from Oracle to PostgreSQL. In SOurce database we have
> Binary data stored using murmur3 hashing function. In Oracle this data is
> being generated from the Java code and inserted into the Oracle database.

Do you store the hash and the binary data?  The hash is a key to the binary
data?

> As part of Migration processes the reference data on which this murmur3 is
> generated is also getting changed while migrating to PostgreSQL.

Why is the data changing during migration?  Shouldn't a migration preserve
the data and only adapt it if the database model needs to change?

> In PostgreSQL do we have any mechanism for fetching this murmur3 hash
> function for any UUID.

I don't understand what you mean by that.  What does it have to do with UUID?

Do you want to generate the MurmurHash in Postgres?  Postgres has no builtin
support for that hash function and I can't find any extension in a quick
online search.

Or do you want to just look up rows by the MurmurHash?  That's a trivial
SELECT statement.  Store the hash in an indexed column of type bytea to have
performant lookups.

--
Erik



Re: murmur3 hash binary data migration from Oracle to PostgreSQL

От
Jeffrey Walton
Дата:
On Mon, Apr 24, 2023 at 9:22 PM Jagmohan Kaintura
<jagmohan@tecorelabs.com> wrote:
>
> We are doing Migration from Oracle to PostgreSQL. In SOurce database we have Binary data stored using murmur3 hashing
function.In Oracle this data is being generated from the Java code and inserted into the Oracle database. 
>
> As part of Migration processes the reference data on which this murmur3 is generated is also getting changed while
migratingto PostgreSQL. 
>
> In PostgreSQL do we have any mechanism for fetching this murmur3 hash function for any UUID.
>
> Please let me know, what ever solution is available for the implementation.

My apologies if I misparsed a couple of statements. I am having
trouble determining if you are migrating away from Murmur3.

If you are selecting a new digest, then SipHash would be an excellent
choice. It was designed to avoid collisions and be fast. Plus it was
designed by Jean-Philippe Aumasson and Daniel J. Bernstein. It doesn't
get much better than those two fellows.

Jeff



Re: murmur3 hash binary data migration from Oracle to PostgreSQL

От
Jagmohan Kaintura
Дата:
Hi ,
No we want to generate murmur3 format only.

BR,
Jagmohan

> On 25-Apr-2023, at 8:52 AM, Jeffrey Walton <noloader@gmail.com> wrote:
>
> On Mon, Apr 24, 2023 at 9:22 PM Jagmohan Kaintura
> <jagmohan@tecorelabs.com> wrote:
>>
>> We are doing Migration from Oracle to PostgreSQL. In SOurce database we have Binary data stored using murmur3
hashingfunction. In Oracle this data is being generated from the Java code and inserted into the Oracle database. 
>>
>> As part of Migration processes the reference data on which this murmur3 is generated is also getting changed while
migratingto PostgreSQL. 
>>
>> In PostgreSQL do we have any mechanism for fetching this murmur3 hash function for any UUID.
>>
>> Please let me know, what ever solution is available for the implementation.
>
> My apologies if I misparsed a couple of statements. I am having
> trouble determining if you are migrating away from Murmur3.
>
> If you are selecting a new digest, then SipHash would be an excellent
> choice. It was designed to avoid collisions and be fast. Plus it was
> designed by Jean-Philippe Aumasson and Daniel J. Bernstein. It doesn't
> get much better than those two fellows.
>
> Jeff



Re: murmur3 hash binary data migration from Oracle to PostgreSQL

От
Erik Wienhold
Дата:
> On 25/04/2023 12:44 CEST Jagmohan Kaintura <jagmohan@tecorelabs.com> wrote:
>
> No we want to generate murmur3 format only.
>
> > On 25-Apr-2023, at 8:52 AM, Jeffrey Walton <noloader@gmail.com> wrote:
> >
> > My apologies if I misparsed a couple of statements. I am having
> > trouble determining if you are migrating away from Murmur3.
> >
> > If you are selecting a new digest, then SipHash would be an excellent
> > choice. It was designed to avoid collisions and be fast. Plus it was
> > designed by Jean-Philippe Aumasson and Daniel J. Bernstein. It doesn't
> > get much better than those two fellows.

I don't understand.  In your original message you write:

> On 25/04/2023 03:21 CEST Jagmohan Kaintura <jagmohan@tecorelabs.com> wrote:
>
> In Oracle this data is being generated from the Java code and inserted into
> the Oracle database.

Do you generate the hash in Java or in Oracle?  I don't know if Oracle
Database provides a MurmurHash function but in Java you could use
apache-commons[0] and change your Java code to also insert the hash (if you
don't already do so).

[0] https://commons.apache.org/proper/commons-codec/apidocs/org/apache/commons/codec/digest/MurmurHash3.html

PS: Please don't top post. https://wiki.postgresql.org/wiki/Mailing_Lists#Email_etiquette_mechanics

--
Erik



Re: murmur3 hash binary data migration from Oracle to PostgreSQL

От
Imre Samu
Дата:
No we want to generate murmur3 format only. 

If you need a server-side murmur3 function 
-  one alternative is the https://github.com/markokr/pghashlib extension.

psql (15.2 (Debian 15.2-2))
Type "help" for help.

db=# create extension hashlib;
CREATE EXTENSION

db=# 
select hash_string('', 'murmur3');
select hash_string('a', 'murmur3');
select hash_string('abcdefg', 'murmur3');
select encode(hash128_string('abcdefg', 'murmur3'), 'hex');
+-------------+
| hash_string |
+-------------+
|           0 |
+-------------+
(1 row)

+-------------+
| hash_string |
+-------------+
|  1009084850 |
+-------------+
(1 row)

+-------------+
| hash_string |
+-------------+
| -2009294074 |
+-------------+
(1 row)

+----------------------------------+
|              encode              |
+----------------------------------+
| 069b3c88000000000000000000000000 |
+----------------------------------+
(1 row)




RUN   mkdir -p /pghashlib \
    && git clone  --depth 1 --branch bgdevlab/builds https://github.com/bgdevlab/pghashlib.git  /pghashlib \
    && cd /pghashlib \
    && make USE_PGXS=1 \
    && make USE_PGXS=1 install \
    && rm -rf /pghashlib

Regards,
 Imre