Обсуждение: 10 row/second insert in ssis

Поиск
Список
Период
Сортировка

10 row/second insert in ssis

От
Bill Kuhn
Дата:
Greetings.

Is there anything special to configure to use the 32 bit odbc driver?

I'm attempting to use the odbc driver to transfer data from microsoft sql server to postgresql and I'm seeing dismal insert performance.  It is as if each insert is being performed individually.

The rows I'm inserting only have one integer column.

Is there anything special that needs to be done to configure the driver to improve performance?

Many thanks.

Re: 10 row/second insert in ssis

От
Adrian Klaver
Дата:
On 03/30/2016 12:00 PM, Bill Kuhn wrote:
> Greetings.
>
> Is there anything special to configure to use the 32 bit odbc driver?
>
> I'm attempting to use the odbc driver to transfer data from microsoft
> sql server to postgresql and I'm seeing dismal insert performance.  It
> is as if each insert is being performed individually.
>
> The rows I'm inserting only have one integer column.
>
> Is there anything special that needs to be done to configure the driver
> to improve performance?

Well given the almost complete lack of information provided, I would the
answer is maybe, depending on:

psqlodbc version?

SQL Server version?

Postgres version?

Exactly how are you transferring the data?

Or to put it another way, what client are you using?

What does the Postgres log show?

What does the ODBC log show?

>
> Many thanks.


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: 10 row/second insert in ssis

От
Adrian Klaver
Дата:
On 03/30/2016 03:35 PM, Bill Kuhn wrote:

Ccing list
> Adrian,
>
> Sorry for not including details earlier.
>
> Here is detail information:
> psqlodbc version = psqlodbc_09_05_0100-x86
> sql server version = 11.0.3401.0 (2012)
> postgres version = PostgreSQL 9.5.1 on x86_64-pc-linux-gnu, compiled by
> gcc (Debian 4.9.2-10) 4.9.2, 64-bit
> client/ETL tool = ssis
> postgresql log = ?  I don't have ssh access to the box.  If this
> information is needed I'll work with the admin to get it
> odbc log = didn't see anything jumping out at me but I attached a copy
> if you'd like to take a look
>
> Iteration 1:
> 1.  Created a odbc system dsn via odbad32 tool (windows) and
> successfully tested connection
> 2.  Created an odbc connection manager (ssis) using odbc dsn created in
> step #1
> 3.  Created an odbc destination using connection manager created in step #2
> 4.  Created a source (sql server ole db, query that select a single
> integer column)
> 5.  Mapped the source column to the destination column
> 6.  Executed data flow for 100 records
> 7.  Observed execution time = 10 seconds (10 records/second)
>
> Iteration 2:
> The only thing I changed from Iteration 1 was to use an ado.net
> <http://ado.net> connection manager but it still used the same odbc dsn.
> Same result.
>
> To prove that postgres is not having issues I cross joined the table
> several times in postgres and inserted 11 million records in 14
> seconds.  I did this just to make sure it wasn't something poorly
> configured on the postgres side.

>
> Finally, I reproduced the steps in Iteration 1 on my laptop (originally
> I was developing on a remote server) and I observed the same 10
> records/second throughput.
>
> After trying the above I then resorted to tweaking a setting (either on
> the odbc dsn or within ssis), running, not seeing any difference,
> reverting, tweaking a different setting, etc.  No change in performance
> regardless of any settings.
>
> I work a lot with sql server and oracle in ssis and I'm used to commits
> being done in batches.  With postgresql it seems like each insert is
> being executed/committed separately (not in batch).


By default psqlodbc runs in autocommit mode which would explain the above.

I do not use SSIS so I am not going to be of much help there. All I can
say is maybe check the 2,3,4 steps you mention above for a setting that
turns autocommit off. This means though there will need to be  explicit
BEGIN/COMMIT commands somewhere. Do you know how the Oracle setup does that?

You might also want to look at:

https://technet.microsoft.com/en-us/library/ms131281%28v=sql.105%29.aspx

>
> Thanks in advance for any help you can provide.
>
> My real task is to migrate an entire database containing a few hundred
> million rows from sql server to postgresql and continue to send data to
> it every 10 minutes via an ssis package.  I have no doubt that I could
> figure out a way (dump to text and load into postgres) to load the
> database initially but given the throughput I'm seeing the incremental
> load would likely not be able to keep up.
>
>
>
> On Wed, Mar 30, 2016 at 4:31 PM, Adrian Klaver
> <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
>
>     On 03/30/2016 12:00 PM, Bill Kuhn wrote:
>
>         Greetings.
>
>         Is there anything special to configure to use the 32 bit odbc
>         driver?
>
>         I'm attempting to use the odbc driver to transfer data from
>         microsoft
>         sql server to postgresql and I'm seeing dismal insert
>         performance.  It
>         is as if each insert is being performed individually.
>
>         The rows I'm inserting only have one integer column.
>
>         Is there anything special that needs to be done to configure the
>         driver
>         to improve performance?
>
>
>     Well given the almost complete lack of information provided, I would
>     the answer is maybe, depending on:
>
>     psqlodbc version?
>
>     SQL Server version?
>
>     Postgres version?
>
>     Exactly how are you transferring the data?
>
>     Or to put it another way, what client are you using?
>
>     What does the Postgres log show?
>
>     What does the ODBC log show?
>
>
>         Many thanks.
>
>
>
>     --
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: 10 row/second insert in ssis

От
"Tsunakawa, Takayuki"
Дата:
Hello, Bill,

10 row inserts/s is very slow, isn't it?  Let's where the slow spot is.

* To know how long the server takes to execute inserts, add "log_min_duration_statement = 0" in postgresql.conf.  This
willrecord timing infomation like this in the server log:
 

LOG:  duration: 129.236 ms  statement: insert into a values(1);

* To know how long for what the ODBC driver takes, enable logging by adding the following registry values and
restartingthe application (SSIS?):
 

key: {HKEY_LOCAL_MACHINE or HKEY_CURRENT_USER}\SOFTWARE\ODBC\ODBC.INI\<your_DSN>
value:
  CommLog: REG_DWORD: 1
  Debug: REG_DWORD: 1
  Logdir: REG_SZ: folder path where you want to store the log files

This will create psqlodbc_<pid>.log and mylog_<pid>.log in the specified folder.  <pid> is the process ID of the
applicationusing ODBC.  If not found, the log files may be stored in the following places:
 

* C:\
* C:\podbclog
* %HOMEDRIVE%%HOMEPATH%

Regards
Takayuki Tsunakawa