Discussion: Exporting TBs of data in Oracle schema/database into CSV format to migrate into PostgreSQL


Exporting TBs of data in Oracle schema/database into CSV format to migrate into PostgreSQL

From:
Prashant Kulkarni
Date:
Hi Team,

Can you please let me know the methods/tools/approaches to export the large amount of data (TBs) present in the schemas into CSV/delimited format, as we have to import it into PostgreSQL later.

I tried options like SQL Developer/Toad, but those are not feasible for this data size; these tools are OK for exporting GBs of data.

Kindly let me know if you have any solution to complete this task.

Thanks.

Ora2pg, Oracle GoldenGate, and the EDB Migration Toolkit.

From: Prashant Kulkarni <ppk10.prashant@gmail.com>
Sent: Monday, 20 April 2020 13:53
To: pgsql-admin <pgsql-admin@postgresql.org>
Subject: Exporting TBs of data in Oracle schema/database into CSV format to migrate into PostgreSQL

Hi Team,

Can you please let me know the methods/tools/approaches to export the large amount of data (TBs) present in the schemas into CSV/delimited format, as we have to import it into PostgreSQL later.

I tried options like SQL Developer/Toad, but those are not feasible for this data size; these tools are OK for exporting GBs of data.

Kindly let me know if you have any solution to complete this task.

Thanks.

On 4/20/20 6:53 AM, Prashant Kulkarni wrote:
> Hi Team,
>
> Can you please let me know the methods/tools/approaches to export the
> large amount of data (TBs) present in the schemas into CSV/delimited
> format, as we have to import it into PostgreSQL later.
>
> I tried options like SQL Developer/Toad, but those are not feasible for
> this data size; these tools are OK for exporting GBs of data.
>
> Kindly let me know if you have any solution to complete this task.

What kind of data (transaction history, binary blobs, GIS, etc)?

-- 
Angular momentum makes the world go 'round.



Re: Exporting TBs of data in Oracle schema/database into CSV format to migrate into PostgreSQL

From:
Thomas Kellerer
Date:
Prashant Kulkarni wrote on 20.04.2020 at 13:53:
> Hi Team,
>
> Can you please let me know the methods/tools/approaches to export
> the large amount of data (TBs) present in the schemas into
> CSV/delimited format, as we have to import it into PostgreSQL
> later.
>
> I tried options like SQL Developer/Toad, but those are not
> feasible for this data size; these tools are OK for exporting GBs
> of data.
>
> Kindly let me know if you have any solution to complete this task.

Using a foreign table would remove the need to dump and transfer
the data to begin with.

Then you could access the Oracle table directly from within Postgres:

insert into some_table (...)
select ...
from oracle_fdw_table;
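
For completeness, a minimal oracle_fdw setup sketch; the connect string, credentials, and table definitions here are illustrative assumptions, not values from this thread:

CREATE EXTENSION oracle_fdw;

CREATE SERVER oradb FOREIGN DATA WRAPPER oracle_fdw
    OPTIONS (dbserver '//orahost:1521/ORCL');

CREATE USER MAPPING FOR CURRENT_USER SERVER oradb
    OPTIONS (user 'SCOTT', password 'tiger');

-- define a single table by hand ...
CREATE FOREIGN TABLE oracle_fdw_table (
    id   integer,
    name text
) SERVER oradb OPTIONS (schema 'SCOTT', table 'EMP');

-- ... or pull in a whole Oracle schema at once
CREATE SCHEMA scott_fdw;
IMPORT FOREIGN SCHEMA "SCOTT" FROM SERVER oradb INTO scott_fdw;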




Re: Exporting TBs of data in Oracle schema/database into CSV format to migrate into PostgreSQL

From:
Pepe TD Vo
Date:
Mr. Kellerer,

Can we access an Oracle table from within Postgres?  How?  Do we need to create a database link, and/or how can the two talk to each other?

My data is from production; I can turn off the foreign keys.  Is there a faster way to copy data from Oracle to Postgres?  Should I export the Oracle tables to a text/CSV dump file and insert it into Postgres?  Can we use an Oracle dump file?

thank you.

Bach-Nga

No one in this world is pure and perfect.  If you avoid people for their mistakes you will be alone. So judge less, love and forgive more.
To call him a dog hardly seems to do him justice though in as much as he had four legs, a tail, and barked, I admit he was, to all outward appearances. But to those who knew him well, he was a perfect gentleman (Hermione Gingold)

**Live simply **Love generously **Care deeply **Speak kindly.
*** Genuinely rich *** Faithful talent *** Sharing success


On Monday, April 20, 2020, 09:28:18 AM EDT, Thomas Kellerer <shammat@gmx.net> wrote:


Prashant Kulkarni wrote on 20.04.2020 at 13:53:

> Hi Team,
>
> Can you please let me know the methods/tools/approaches to export
> the large amount of data (TBs) present in the schemas into
> CSV/delimited format, as we have to import it into PostgreSQL
> later.
>
> I tried options like SQL Developer/Toad, but those are not
> feasible for this data size; these tools are OK for exporting GBs
> of data.
>
> Kindly let me know if you have any solution to complete this task.


Using a foreign table would remove the need to dump and transfer
the data to begin with.

Then you could access the Oracle table directly from within Postgres:

insert into some_table (...)
select ...
from oracle_fdw_table;



Re: Exporting TBs of data in Oracle schema/database into CSV format to migrate into PostgreSQL

From:
Laurenz Albe
Date:
On Mon, 2020-04-20 at 14:02 +0000, Pepe TD Vo wrote:
> Can we access an Oracle table from within Postgres?  How?  Do we need to create a database link,
> and/or how can the two talk to each other?

That would be the Oracle foreign data wrapper:
https://laurenz.github.io/oracle_fdw/

> My data is from production; I can turn off the foreign keys.  Is there a faster way to copy
> data from Oracle to Postgres?  Should I export the Oracle tables to a text/CSV
> dump file and insert it into Postgres?  Can we use an Oracle dump file?

You cannot use an Oracle dump, because that is in a proprietary format.

Using the foreign data wrapper, you define a foreign table in PostgreSQL.
When you select from that table, the data are directly fetched from Oracle.

Then you can do

   INSERT INTO localtable SELECT * FROM foreign_table;

to transfer the data, without any intermediary file storage.
In a way, the data are streamed from Oracle to PostgreSQL.
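
As a side note, for TB-sized tables it can help to raise the foreign table's prefetch option (how many rows oracle_fdw fetches from Oracle per round trip) and to copy very large tables in key ranges; the names and values below are only illustrative:

   ALTER FOREIGN TABLE foreign_table OPTIONS (ADD prefetch '1000');

   -- copy a very large table in slices, e.g. by primary key range
   INSERT INTO localtable SELECT * FROM foreign_table
   WHERE id BETWEEN 1 AND 50000000;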

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: Exporting TBs of data in Oracle schema/database into CSV format to migrate into PostgreSQL

From:
Manish Yadav
Date:
You may consider pgloader for copying data from CSV into PostgreSQL; it is fast (it loads in parallel) and gives explicit error details.
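
For illustration, a pgloader command file for one exported table might look roughly like this; the file path, column names, and connection string are made up, with syntax as described in the pgloader documentation:

LOAD CSV
     FROM '/data/orders.csv'
          HAVING FIELDS (id, customer_id, amount, created_at)
     INTO postgresql://migrator@localhost/targetdb?orders
     WITH skip header = 1,
          fields optionally enclosed by '"',
          fields terminated by ',';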

Thanks and Regards,

Manish Yadav




On Monday, 20 April 2020, 5:23:52 PM IST, Prashant Kulkarni <ppk10.prashant@gmail.com> wrote:


Hi Team,

Can you please let me know the methods/tools/approaches to export the large amount of data (TBs) present in the schemas into CSV/delimited format, as we have to import it into PostgreSQL later.

I tried options like SQL Developer/Toad, but those are not feasible for this data size; these tools are OK for exporting GBs of data.

Kindly let me know if you have any solution to complete this task.

Thanks.

Re: Exporting TBs of data in Oracle schema/database into CSV format to migrate into PostgreSQL

From:
Avinash Kumar
Date:
Hi,

On Mon, Apr 20, 2020 at 11:39 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> On Mon, 2020-04-20 at 14:02 +0000, Pepe TD Vo wrote:
> > Can we access an Oracle table from within Postgres?  How?  Do we need to create a database link,
> > and/or how can the two talk to each other?
>
> That would be the Oracle foreign data wrapper:
> https://laurenz.github.io/oracle_fdw/
>
> > My data is from production; I can turn off the foreign keys.  Is there a faster way to copy
> > data from Oracle to Postgres?  Should I export the Oracle tables to a text/CSV
> > dump file and insert it into Postgres?  Can we use an Oracle dump file?
>
> You cannot use an Oracle dump, because that is in a proprietary format.
>
> Using the foreign data wrapper, you define a foreign table in PostgreSQL.
> When you select from that table, the data are directly fetched from Oracle.
>
> Then you can do
>
>    INSERT INTO localtable SELECT * FROM foreign_table;
>
> to transfer the data, without any intermediary file storage.
> In a way, the data are streamed from Oracle to PostgreSQL.
I echo that. In my experience, a copy using oracle_fdw performs much faster than anything else available today for offline data copy (almost 3 times faster than Ora2Pg in some tests). So one may just avoid the longer route of generating a CSV and then looking for options to load it faster.

However, for data validation: do you think we could do something like creating an MD5 hash of the Oracle foreign table (using dbms_crypto.hash()) and of the local Postgres table (using md5())?
What else do you think is best for validating the data between Oracle and Postgres, other than application and QA tests related to the app?

> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com


--
Regards,
Avinash Vallarapu

Re: Exporting TBs of data in Oracle schema/database into CSV format to migrate into PostgreSQL

From:
Laurenz Albe
Date:
On Mon, 2020-04-20 at 11:59 -0300, Avinash Kumar wrote:
> On Mon, Apr 20, 2020 at 11:39 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> > On Mon, 2020-04-20 at 14:02 +0000, Pepe TD Vo wrote:
> > > we can access Oracle table within Postgres?  How?  Need to create a database link
> > > and/or how can both talk to each other?
> > 
> > That would be the Oracle foreign data wrapper:
> > https://laurenz.github.io/oracle_fdw/
>
> I echo that. In my experience, a copy using oracle_fdw performs much faster than anything
> else available today for offline data copy (almost 3 times faster than Ora2Pg in some
> tests). So one may just avoid the longer route of generating a CSV and then looking
> for options to load it faster.
> 
> However, for data validation: do you think we could do something like creating
> an MD5 hash of the Oracle foreign table (using dbms_crypto.hash()) and of the local
> Postgres table (using md5())?
> What else do you think is best for validating the data between Oracle and Postgres,
> other than application and QA tests related to the app?

If there is no error transferring the data, why should there be any need for validation?
Unless you don't trust oracle_fdw, of course.

Calculating MD5 sums or similar of the data before and after would be difficult,
because, for example, timestamps are stored quite differently.
You'd have to convert the data into an identical text format on both systems,
and I cannot think of a good way to do that.

I think that everything beyond comparing the row count would be paranoid.
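
A row-count comparison could be as simple as the following; the table names are the ones from the earlier example:

   SELECT (SELECT count(*) FROM localtable)    AS pg_rows,
          (SELECT count(*) FROM foreign_table) AS oracle_rows;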

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: Exporting TBs of data in Oracle schema/database into CSV format to migrate into PostgreSQL

From:
Avinash Kumar
Date:
Hi,

On Mon, Apr 20, 2020 at 12:12 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> On Mon, 2020-04-20 at 11:59 -0300, Avinash Kumar wrote:
> > On Mon, Apr 20, 2020 at 11:39 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> > > On Mon, 2020-04-20 at 14:02 +0000, Pepe TD Vo wrote:
> > > > Can we access an Oracle table from within Postgres?  How?  Do we need to create a database link,
> > > > and/or how can the two talk to each other?
> > >
> > > That would be the Oracle foreign data wrapper:
> > > https://laurenz.github.io/oracle_fdw/
> >
> > I echo that. In my experience, a copy using oracle_fdw performs much faster than anything
> > else available today for offline data copy (almost 3 times faster than Ora2Pg in some
> > tests). So one may just avoid the longer route of generating a CSV and then looking
> > for options to load it faster.
> >
> > However, for data validation: do you think we could do something like creating
> > an MD5 hash of the Oracle foreign table (using dbms_crypto.hash()) and of the local
> > Postgres table (using md5())?
> > What else do you think is best for validating the data between Oracle and Postgres,
> > other than application and QA tests related to the app?

> If there is no error transferring the data, why should there be any need for validation?
> Unless you don't trust oracle_fdw, of course.

I have seen proper ORA errors or PG errors/exceptions whenever something goes wrong with a copy using the FDW, so oracle_fdw has little chance of making a mistake. However, the QA teams still have their job to do, for various reasons, when critical application data is being copied from Oracle to PG. So it is difficult to deny testing, even with a solution like Oracle GoldenGate (GG does not do this job, but as an example).

> Calculating MD5 sums or similar of the data before and after would be difficult,
> because, for example, timestamps are stored quite differently.
> You'd have to convert the data into an identical text format on both systems,
> and I cannot think of a good way to do that.

I think for this reason we may just import the Oracle schema into a separate schema, let's say scott_fdw, and then JOIN each PG table to its corresponding foreign table, generate the hashes, and validate. Since PostgreSQL renders both sides, that should rule out the problem of different storage formats. A sketch follows below; let me know if you still see anything going wrong.
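
A rough sketch of that idea; the schema, table, and key names are hypothetical:

   -- count rows whose text rendering differs; both sides are rendered
   -- by PostgreSQL, so formatting (e.g. of timestamps) is identical
   SELECT count(*)
   FROM public.emp l
   FULL JOIN scott_fdw.emp f USING (empno)
   WHERE md5(l::text) IS DISTINCT FROM md5(f::text);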

> I think that everything beyond comparing the row count would be paranoid.

Maybe, if the above does not work.

> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com



--
Regards,
Avinash Vallarapu

Re: Exporting TBs of data in Oracle schema/database into CSV format to migrate into PostgreSQL

From:
Scott Ribe
Date:
So, the big question is: how much downtime can you afford on the current Oracle DB?

Yes, you should investigate the most efficient way to transfer the data, but with TBs it will still take significant time.
If you can afford the downtime, great--kill all other connections to Oracle and go for it. If not, it's more complicated,
since you'll have to figure out a way to track and apply the changes that happen during the copy...





Re: Exporting TBs of data in Oracle schema/database into CSV format to migrate into PostgreSQL

From:
Avinash Kumar
Date:
Hi,
On Mon, Apr 20, 2020 at 12:31 PM Scott Ribe <scott_ribe@elevated-dev.com> wrote:
> So, the big question is: how much downtime can you afford on the current Oracle DB?
>
> Yes, you should investigate the most efficient way to transfer the data, but with TBs it will still take significant time. If you can afford the downtime, great--kill all other connections to Oracle and go for it. If not, it's more complicated, since you'll have to figure out a way to track and apply the changes that happen during the copy...

If downtime is a concern, then start a transaction on PG using isolation level REPEATABLE READ, get the SCN from Oracle, and copy the existing data table by table using the FDW for the initial data load (see the sketch below). Then, if you have an Oracle GoldenGate license, start the replicat from that SCN so that the data eventually gets in sync.
And then, when all the tables are in sync, go ahead and switch your apps. If you don't have a GG license, you need to look for some other extension, or Kafka, or something else that can do this job. Remember, migration is always challenging when lots of data is involved; there is never a magic shortcut.
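
A rough sketch of the initial-load part; the table names are hypothetical, and this assumes (per the oracle_fdw documentation) that a REPEATABLE READ transaction in PostgreSQL maps to a serializable Oracle transaction, so all the copies below see one consistent Oracle snapshot:

   -- on the Oracle side, note the SCN to start the replicat from later:
   --   SELECT current_scn FROM v$database;

   BEGIN ISOLATION LEVEL REPEATABLE READ;
   INSERT INTO emp  SELECT * FROM scott_fdw.emp;
   INSERT INTO dept SELECT * FROM scott_fdw.dept;
   COMMIT;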


--
Regards,
Avinash Vallarapu
+1-902-221-5976

Re: Exporting TBs of data in Oracle schema/database into CSV format to migrate into PostgreSQL

From:
Marlene Villanueva
Date:

On Mon, Apr 20, 2020, 8:39 AM Avinash Kumar <avinash.vallarapu@gmail.com> wrote:
> Hi,
> On Mon, Apr 20, 2020 at 12:31 PM Scott Ribe <scott_ribe@elevated-dev.com> wrote:
> > So, the big question is: how much downtime can you afford on the current Oracle DB?
> >
> > Yes, you should investigate the most efficient way to transfer the data, but with TBs it will still take significant time. If you can afford the downtime, great--kill all other connections to Oracle and go for it. If not, it's more complicated, since you'll have to figure out a way to track and apply the changes that happen during the copy...
>
> If downtime is a concern, then start a transaction on PG using isolation level REPEATABLE READ, get the SCN from Oracle, and copy the existing data table by table using the FDW for the initial data load. Then, if you have an Oracle GoldenGate license, start the replicat from that SCN so that the data eventually gets in sync.
> And then, when all the tables are in sync, go ahead and switch your apps. If you don't have a GG license, you need to look for some other extension, or Kafka, or something else that can do this job. Remember, migration is always challenging when lots of data is involved; there is never a magic shortcut.
>
> --
> Regards,
> Avinash Vallarapu
> +1-902-221-5976

Re: Exporting TBs of data in Oracle schema/database into CSV format to migrate into PostgreSQL

From:
"Peter M. Groen"
Date:
Actually, we have something called the "ModelMapper", which is used
for transporting large amounts of data between databases, and even
between different database models; it is part of our "Caelus" software
infrastructure.

Contact us for more information.

-- 
Open Systems Development B.V.
Peter M. Groen ( Managing Director )
Rooseveltstraat 18P
2321 BM  Leiden
Email : peter@osdev.nl


On Mon, 20 Apr 2020 11:58:42 +0000, soumik.bhattacharjee@kpn.com wrote:

>Ora2pg, Oracle GoldenGate, and the EDB Migration Toolkit.
>
>From: Prashant Kulkarni <ppk10.prashant@gmail.com>
>Sent: Monday, 20 April 2020 13:53
>To: pgsql-admin <pgsql-admin@postgresql.org>
>Subject: Exporting TBs of data in Oracle schema/database into CSV
>format to migrate into PostgreSQL
>
>Hi Team,
>
>Can you please let me know the methods/tools/approaches to export the large
>amount of data (TBs) present in the schemas into CSV/delimited
>format, as we have to import it into PostgreSQL later.
>
>I tried options like SQL Developer/Toad, but those are not feasible
>for this data size; these tools are OK for exporting GBs of data.
>
>Kindly let me know if you have any solution to complete this task.
>
>Thanks.



Ora2pg would be the most straightforward way.
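
For reference, a minimal ora2pg data-export sketch; the DSN, credentials, and schema below are placeholders:

# ora2pg.conf (excerpt)
ORACLE_DSN      dbi:Oracle:host=orahost;service_name=ORCL;port=1521
ORACLE_USER     scott
ORACLE_PWD      tiger
SCHEMA          SCOTT
FILE_PER_TABLE  1

# then export the table data as COPY statements:
#   ora2pg -c ora2pg.conf -t COPY -o data.sql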



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-admin-f2076596.html



Re: Exporting TBs of data in Oracle schema/database into CSV format to migrate into PostgreSQL

From:
Pepe TD Vo
Date:
Laurenz,

Is the foreign data wrapper open source?

v/r,

Bach-Nga

No one in this world is pure and perfect.  If you avoid people for their mistakes you will be alone. So judge less, love and forgive more.
To call him a dog hardly seems to do him justice though in as much as he had four legs, a tail, and barked, I admit he was, to all outward appearances. But to those who knew him well, he was a perfect gentleman (Hermione Gingold)

**Live simply **Love generously **Care deeply **Speak kindly.
*** Genuinely rich *** Faithful talent *** Sharing success


On Monday, April 20, 2020, 11:12:03 AM EDT, Laurenz Albe <laurenz.albe@cybertec.at> wrote:


On Mon, 2020-04-20 at 11:59 -0300, Avinash Kumar wrote:
> On Mon, Apr 20, 2020 at 11:39 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> > On Mon, 2020-04-20 at 14:02 +0000, Pepe TD Vo wrote:
> > > Can we access an Oracle table from within Postgres?  How?  Do we need to create a database link,
> > > and/or how can the two talk to each other?
> >
> > That would be the Oracle foreign data wrapper:
> > https://laurenz.github.io/oracle_fdw/
>
> I echo that. In my experience, a copy using oracle_fdw performs much faster than anything
> else available today for offline data copy (almost 3 times faster than Ora2Pg in some
> tests). So one may just avoid the longer route of generating a CSV and then looking
> for options to load it faster.
>
> However, for data validation: do you think we could do something like creating
> an MD5 hash of the Oracle foreign table (using dbms_crypto.hash()) and of the local
> Postgres table (using md5())?
> What else do you think is best for validating the data between Oracle and Postgres,
> other than application and QA tests related to the app?

If there is no error transferring the data, why should there be any need for validation?
Unless you don't trust oracle_fdw, of course.

Calculating MD5 sums or similar of the data before and after would be difficult,
because, for example, timestamps are stored quite differently.
You'd have to convert the data into an identical text format on both systems,
and I cannot think of a good way to do that.

I think that everything beyond comparing the row count would be paranoid.


Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

Re: Exporting TBs of data in Oracle schema/database into CSV format to migrate into PostgreSQL

From:
Laurenz Albe
Date:
On Fri, 2020-04-24 at 13:28 +0000, Pepe TD Vo wrote:
> Is the foreign data wrapper open source?

Yes, see the license.
The page is https://laurenz.github.io/oracle_fdw/

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: Exporting TBs of data in Oracle schema/database into CSV format to migrate into PostgreSQL

From:
Avinash Kumar
Date:


On Fri, Apr 24, 2020 at 10:31 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> On Fri, 2020-04-24 at 13:28 +0000, Pepe TD Vo wrote:
> > Is the foreign data wrapper open source?
>
> Yes, see the license.
> The page is https://laurenz.github.io/oracle_fdw/
Laurenz,

Was it because of the tricky Oracle library linking and the dependency on the Oracle client that you did not plan on creating an RPM for it?
Not sure if it was discussed before, but what was the reason for oracle_fdw not being available in the PGDG repo?


> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com



--
Regards,
Avinash Vallarapu
+1-902-221-5976

Re: Exporting TBs of data in Oracle schema/database into CSV format to migrate into PostgreSQL

From:
Laurenz Albe
Date:
On Fri, 2020-04-24 at 10:35 -0300, Avinash Kumar wrote:
> On Fri, Apr 24, 2020 at 10:31 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> > On Fri, 2020-04-24 at 13:28 +0000, Pepe TD Vo wrote:
> > > is a foreign-data wrapper open source? 
> > 
> > Yes, see the license.
> > The page is https://laurenz.github.io/oracle_fdw/
> 
> Was it because of the tricky Oracle library linking and the dependency on the Oracle client that you did not
> plan on creating an RPM for it?
>
> Not sure if it was discussed before, but what was the reason for oracle_fdw not being available in the PGDG repo?

No, it is rather that there is a plethora of Linux distributions, in various versions
and on various architectures, and I'd have to create an RPM for each of them
(not to speak of the distributions that use other package managers).

And on Linux it is really easy to build the binaries.
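
For illustration, a typical build on Linux looks roughly like this; the client path is a placeholder and the details vary by distribution:

   # with pg_config in the PATH and an Oracle client installed:
   export ORACLE_HOME=/path/to/oracle/client
   make
   make install

followed by CREATE EXTENSION oracle_fdw; in each database that needs it.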

About the PGDG packages, that's not for me to answer.
Perhaps it is not deemed important enough, perhaps there are technical or
license problems.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: Exporting TBs of data in Oracle schema/database into CSV format to migrate into PostgreSQL

From:
"Jonah H. Harris"
Date:
On Mon, Apr 20, 2020 at 10:03 AM Pepe TD Vo <pepevo@yahoo.com> wrote:
> Can we use an Oracle dump file?

While I generally think the FDW copy is the best (and fastest) approach, I have a tool that can import Oracle export files into Postgres. What version of Oracle?

--
Jonah H. Harris

Re: Exporting TBs of data in Oracle schema/database into CSV format to migrate into PostgreSQL

From:
Pepe TD Vo
Date:
Oracle version 12.1 on Linux 5.6 in production,
and in the AWS lab, Oracle 12.1 as well.

The government needs to see how we can migrate, or whether we can at all, before approving the use of Postgres and moving everything off Oracle.

thank you

Bach-Nga

No one in this world is pure and perfect.  If you avoid people for their mistakes you will be alone. So judge less, love and forgive more.
To call him a dog hardly seems to do him justice though in as much as he had four legs, a tail, and barked, I admit he was, to all outward appearances. But to those who knew him well, he was a perfect gentleman (Hermione Gingold)

**Live simply **Love generously **Care deeply **Speak kindly.
*** Genuinely rich *** Faithful talent *** Sharing success


On Friday, April 24, 2020, 12:57:29 PM EDT, Jonah H. Harris <jonah.harris@gmail.com> wrote:


On Mon, Apr 20, 2020 at 10:03 AM Pepe TD Vo <pepevo@yahoo.com> wrote:
> Can we use an Oracle dump file?


While I generally think the FDW copy is the best (and fastest) approach, I have a tool that can import Oracle export files into Postgres. What version of Oracle?

--
Jonah H. Harris