Обсуждение: COPY from STDIN vs file with large CSVs
Hi Wells,
I am trying import CSV file in to Amazon RDS PostgreSQL
I am getting following error , In internal PostgreSQL it works
Here is the PSQL Script : can you also help me with Powershell to execute this script
COPY product."tblProductLicenseType" FROM E'//XXXXXXXXXXX/Product_Licensing_Orders/Development/tblProductLicenseType.csv' DELIMITER E',' NULL '\N' QUOTE '"' HEADER CSV ;
COPY product."tblProductLicense" FROM E'//XXXXXXXXXXX/Product_Licensing_Orders/Development/tblProductLicense.csv' WITH (format csv, DELIMITER ',', header true, force_null ("LicenseId", "Season", "ProductCode", "Account", "UserId", "StatusId", "CreateDate", "UpdateDate", "SourceId", ord_num, item_num, "lType"), ENCODING 'UTF8' );
COPY product."tblCobranderUserSeasonProduct" FROM E'//XXXXXXXXXXX/Product_Licensing_Orders/Development/tblCobranderUserSeasonProduct.csv' WITH (format csv, DELIMITER ',', header true, force_null ("uspId", "CobranderUserLinkId", season, prod_cd, "statusId", "statusDate", "createdBy", "createdDate", "modifiedBy", "modifiedDate"), ENCODING 'UTF8' );
COPY product."tblDailyFundedBankProducts" FROM E'//XXXXXXXXXXX/Product_Licensing_Orders/Development/tblDailyFundedBankProducts.csv' WITH (format csv, DELIMITER ',', header true, force_null ("Account", "UserID", "BankProductSeason", "TransmitterPaymentDate", "FundedBankProducts", "ProtectionPlusFunded"), ENCODING 'UTF8' );
COPY product."tblProductSoftwarePackage"("ProductCode", "Season", "SoftwarePackageId", "CreateDate", "UpdateDate" , "createdBy", "updatedBy") FROM E'//XXXXXXXXXXX/Product_Licensing_Orders/Development/tblProductSoftwarePackage.csv' WITH (format csv, DELIMITER ',', header true, force_null ("ProductCode", "Season", "SoftwarePackageId", "CreateDate", "UpdateDate", "createdBy", "updatedBy"), ENCODING 'UTF8' );
COPY product."tblUserProductEval" FROM E'//XXXXXXXXXXX/Product_Licensing_Orders/Development/tblUserProductEval.csv' WITH (format csv, DELIMITER ',', header true, force_null ("evalId", "userId", "prodCd", "licenseId", "evalStartDate", "evalEndDate", "createdBy", "createdDate", "modifiedBy", "modifiedDate"), ENCODING 'UTF8' );
COPY orders."OrderType" FROM E'//XXXXXXXXXXX/Product_Licensing_Orders/Development/OrderType.csv' WITH (format csv, DELIMITER ',', header true, force_null ("OrderTypeCode", "Description"), ENCODING 'UTF8' );
COPY orders.orders(ord_num, ord_stat, account, user_id, season, county, tax_rate, tax_amt, disc_pct, disc_amt, freight, ord_tot, bal_due, pmt1_amt, pmt1_dt, pmt2_amt, pmt2_dt, note1, note2, sales_exec, split, chk_num1, chk_name1, chk_amt1, exp_mon1, exp_yr1, auth1, ref1, chk_num2, chk_name2, chk_amt2, exp_mon2, exp_yr2, auth2, ref2, b_fname, b_lname, b_company, b_addr1, b_addr2, b_city, b_state, b_zip, s_phone, s_fname, s_lname, s_company, s_addr1, s_addr2, s_city, s_state, s_zip, orig_by, orig_dt, edit_by, edit_dt, security_cd1, security_cd2, export_date, exec_email, s_fax, do_not_ship, ship_email, "CommissionType", "OrderType", "PaymentPlan", "sfOrderId")
FROM E'//XXXXXXXXXXX/Product_Licensing_Orders/Development/orders.csv' WITH (format csv, DELIMITER ',', header true, force_null (ord_num, ord_stat, account, user_id, season, county, tax_rate, tax_amt, disc_pct, disc_amt, freight, ord_tot, bal_due, pmt1_amt, pmt1_dt, pmt2_amt, pmt2_dt, note1, note2, sales_exec, split, chk_num1, chk_name1, chk_amt1, exp_mon1, exp_yr1, auth1, ref1, chk_num2, chk_name2, chk_amt2, exp_mon2, exp_yr2, auth2, ref2, b_fname, b_lname, b_company, b_addr1, b_addr2, b_city, b_state, b_zip, s_phone, s_fname, s_lname, s_company, s_addr1, s_addr2, s_city, s_state, s_zip, orig_by, orig_dt, edit_by, edit_dt, security_cd1, security_cd2, export_date, exec_email, s_fax, do_not_ship, ship_email, "CommissionType", "OrderType", "PaymentPlan", "sfOrderId"), ENCODING 'UTF8' );
COPY orders.ord_items FROM E'//XXXXXXXXXXX/Product_Licensing_Orders/Development/ord_items.csv' WITH (format csv, DELIMITER ',', header true, force_null (item_num, ord_num, prod_cd, qty, uprice, dt, bank_id, ship_via, bank_stat, "sfOrderLineId"), ENCODING 'UTF8' );
COPY orders.ord_pkg FROM E'//XXXXXXXXXXX/Product_Licensing_Orders/Development/ord_pkg.csv' WITH (format csv, DELIMITER ',', header true, force_null ("rowID", parent_cd, child_cd, season, sequence_number), ENCODING 'UTF8' );
COPY orders.ord_stock FROM E'//XXXXXXXXXXX/Product_Licensing_Orders/Development/ord_stock.csv' WITH (format csv, DELIMITER ',', header true, force_null ("rowID", prod_cd, season, idescr, qty, uprice, backorder, shippable, bank_dependent, derived, taxable, pkg_id1, pkg_id2, act_code, "Discountable", "UserActivationTypeID", "eFileActive"), ENCODING 'UTF8' );
COPY orders.orders_ext FROM E'//XXXXXXXXXXX/Product_Licensing_Orders/Development/orders_ext.csv' WITH (format csv, DELIMITER ',', header true, force_null (ord_num, svc_fee, rest_fee, ref_credit), ENCODING 'UTF8' );
COPY orders.sls_comm4ord FROM E'//XXXXXXXXXXX/Product_Licensing_Orders/Development/sls_comm4ord.csv' WITH (format csv, DELIMITER ',', header true, force_null (ord_comm_id, ord_num, sales_exec, amount, upd_by, upd_dt, com_dt), ENCODING 'UTF8' );
RETURN True;
Thanks,
chetty
From: Wells Oliver <wells.oliver@gmail.com>
Sent: Wednesday, January 8, 2020 8:54 AM
To: pgsql-admin <pgsql-admin@postgresql.org>
Subject: COPY from STDIN vs file with large CSVs
I have a CSV that's ~30GB. Some 400m rows. Would there be a meaningful performance difference to run COPY from STDIN using: cat f.csv | psql "COPY .. FROM STDIN WITH CSV" versus just doing "COPY ... FROM 'f.csv' WITH CSV"?
Thanks. It took about four hours to copy one and I felt that was a little much.
--
Wells Oliver
wells.oliver@gmail.com
Вложения
On 1/8/20 10:54 AM, Wells Oliver wrote: > I have a CSV that's ~30GB. Some 400m rows. Would there be a meaningful > performance difference to run COPY from STDIN using: cat f.csv | psql > "COPY .. FROM STDIN WITH CSV" versus just doing "COPY ... FROM 'f.csv' > WITH CSV"? > > Thanks. It took about four hours to copy one and I felt that was a little > much. catting the file starts another process, and opens a pipe. That can't be faster than "COPY ... FROM ... WITH CSV". pg_bulkload (which might be in your repository) is probably what you really want. -- Angular momentum makes the world go 'round.
> On Jan 8, 2020, at 10:05 AM, Indirani Venkopa Chetty <ivchetty@crosslinktax.com> wrote: > > I am trying import CSV file in to Amazon RDS PostgreSQL > I am getting following error , In internal PostgreSQL it works > <image001.jpg> > The error message is telling you that you will have to connect to postgres as a user with pg superuser privileges.
On 1/8/20 10:54 AM, Wells Oliver wrote:
> I have a CSV that's ~30GB. Some 400m rows. Would there be a meaningful
> performance difference to run COPY from STDIN using: cat f.csv | psql
> "COPY .. FROM STDIN WITH CSV" versus just doing "COPY ... FROM 'f.csv'
> WITH CSV"?
>
> Thanks. It took about four hours to copy one and I felt that was a little
> much.
catting the file starts another process, and opens a pipe. That can't be
faster than "COPY ... FROM ... WITH CSV".
pg_bulkload (which might be in your repository) is probably what you really
want.
--
Angular momentum makes the world go 'round.
--
wells.oliver@gmail.com
Hi Wells,
I am trying import CSV file in to Amazon RDS PostgreSQL
I am getting following error , In internal PostgreSQL it works
Вложения
I have a CSV that's ~30GB. Some 400m rows. Would there be a meaningful performance difference to run COPY from STDIN using: cat f.csv | psql "COPY .. FROM STDIN WITH CSV" versus just doing "COPY ... FROM 'f.csv' WITH CSV"?