Thread: Slow transfer speeds


Slow transfer speeds

From:
hansell baran
Date:
Hi. I'm new to using PostgreSQL. I have found posts related to this one, but none with a definitive answer or solution. Here it goes.
Where I work, all databases were built with MS Access. The Access files are hosted on computers running Windows 2000 and Windows XP. A new server is on its way, and only open source software will be installed on it. The OS will be SUSE Linux 10.1, and we are comparing MySQL, PostgreSQL, and MS Access. We installed MySQL and PostgreSQL on both SUSE and Windows XP (MySQL and PostgreSQL do NOT run at the same time; there is one HDD for Windows and one for Linux).
The "Test Server" in which we install the DBMS has the following characteristics:

CPU speed = 1.3 GHz
RAM = 512 MB
HDD = 40 GB

The biggest table has 544,371 rows (tuples?) and 55 columns. All fields are float8 except one varchar(255) and one timestamp.
We query the MS Access databases through Visual Basic programs and ODBC drivers. We made a Visual Basic program that uses ADO to connect to all three DBMSs through their ODBC drivers.

When we run the query "SELECT * FROM big_table", we get the following results:

MS Access
- Execution time ~ 51 seconds (Depending on the client machine, it can go as low as 20 seconds)
- Network Utilization ~ 75 Mbps (According to Windows Task Manager)

MySQL 5.0 (under Windows)
- Execution time ~ 630 seconds
- Network Utilization ~ 8 Mbps

PostgreSQL 8.1 (under Windows)
- Execution time ~ 290 seconds
- Network Utilization ~ 13 Mbps


MS Access (under Linux; the MS Access files are on the Linux computer, which runs a Samba server, and the client computer has a mapped network drive that connects to the Linux files)
- Execution time ~ 55 seconds (Depending on the client machine, it can go as low as 20 seconds)
- Network Utilization ~ 70 Mbps (According to Windows Task Manager)

MySQL 5.0 (under Linux)
- Execution time ~ 440 seconds
- Network Utilization ~ 11 Mbps

PostgreSQL 8.1 (under Linux)
- Execution time ~ 180 seconds
- Network Utilization ~ 18 Mbps

Because the query returns so many rows, I cannot use the ODBC driver with the "Use Declare/Fetch" option disabled. If I run the query with that option disabled, the transfer speed goes up to about 20 Mbps (PostgreSQL on Windows) and ~35 Mbps (PostgreSQL on Linux). (The transfer speed never goes beyond 40 Mbps even if we query from several clients at the same time; if we query MS Access from several machines, the transfer speed goes almost to 85 Mbps, although those simultaneous queries obviously run slower.) The problem with running the query with "Use Declare/Fetch" disabled is that the client computer shows an error saying "Out of memory while reading tuples".
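
As I understand it, with "Use Declare/Fetch" enabled the driver pulls the rows through a server-side cursor in batches instead of buffering the whole result set on the client, which is why disabling it runs out of memory. At the SQL level it amounts to roughly the following (the cursor name and batch size here are only placeholders):

    BEGIN;
    DECLARE big_cur CURSOR FOR SELECT * FROM big_table;
    FETCH 5000 FROM big_cur;  -- repeated until no more rows are returned
    CLOSE big_cur;
    COMMIT;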

Very different results are obtained with the query "SELECT * FROM big_table ORDER BY some_column". In this scenario PostgreSQL is faster than MS Access or MySQL by more than 100 seconds. The transfer speed, however, is still slower for PostgreSQL than for MS Access.

We have run many other queries (nothing complex, at most five nested inner joins) and MS Access is always faster. By looking at the network activity in the Windows Task Manager, we have seen that the main problem is the transfer speed. We have also noticed that MS Access quickly downloads the file that has the necessary information and works on it locally on the client computer. The queries, obviously, run faster if the client computer has more resources (CPU speed, RAM, etc.). The fact that the client computer does not use any resources to execute the query, only to receive the results, is one big plus for PostgreSQL (we think). We need, however, to improve the performance of the queries that return a lot of rows, because those are the most used queries.

We searched the PostgreSQL archives, mailing lists, etc., and have tried changing the parameters of the PostgreSQL server (on both Linux and Windows; we also tried the default parameters) and changing the parameters of the ODBC driver as suggested. We still get approximately the same results. We have even changed some TCP/IP parameters (only on Windows), but saw no improvement.

We have turned off all tracing, logging, and debugging in the ODBC driver. The behaviour is the same when querying from pgAdmin III.

To get to the point: is this problem with the transfer rates a limitation of the PostgreSQL server or of the PostgreSQL ODBC driver?
Is there a way to increase the transfer rates?

Thank you very much for any help received!

Hansell E. Baran Altuve



Re: Slow transfer speeds

From:
Scott Marlowe
Date:
On Mon, 2006-08-07 at 12:26, hansell baran wrote:
> Hi. I'm new at using PostgreSQL. I have found posts related to this
> one but there is not a definite answer or solution. Here it goes.
> Where I work, all databases were built with MS Access. The Access
> files are hosted by computers with Windows 2000 and Windows XP. A new
> server is on its way and only Open Source Software is going to be
> installed. The OS is going to be SUSE Linux 10.1 and we are making
> comparisons between MySQL, PostgreSQL and MS Access. We installed
> MySQL and PostgreSQL on both SUSE and Windows XP (MySQL & PostgreSQL
> DO NOT run at the same time)(There is one HDD for Windows and one for
> Linux)
> The "Test Server" in which we install the DBMS has the following
> characteristics:
>
> CPU speed = 1.3 GHz
> RAM = 512 MB
> HDD = 40 GB

Just FYI, that's not only not much in terms of a server, it's not even
much in terms of a workstation.  My laptop is about on par with that.

Just sayin.

OK, just so you know, you're comparing apples and oranges.  A client-side
application like Access has little or none of the overhead that a real
database server has.

The advantage PostgreSQL has is that many people can read AND write to
the same data store simultaneously and the database server will make
sure that the underlying data in the files never gets corrupted.
Further, with proper constraints in place, it can make sure that the
data stays coherent (i.e. that data dependencies are honored.)
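
For instance (just an illustrative sketch, and the table names here are
made up), a foreign key constraint makes the server reject any write
that would break such a dependency:

    CREATE TABLE orders (
        order_id    serial PRIMARY KEY,
        customer_id integer NOT NULL REFERENCES customers (customer_id)
    );
    -- an INSERT into orders with a customer_id that does not exist in
    -- customers is rejected by the server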

As you can imagine, there's gonna be some overhead there.  And it's
wholly unfair to compare a database's ability to stream out data in a
single read to Access.  It is the worst-case scenario.

Try having 30 employees connect to the SAME access database and start
updating lots and lots of records.  Have someone read out the data while
that's going on.  Repeat on PostgreSQL.

If you're mostly going to be reading data, then maybe some intermediate
system is needed, something to "harvest" the data into some flat files.
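
One simple way to do that kind of harvesting (just a sketch; the file
path is made up, and COPY to a file runs on the server side and needs
the appropriate permissions) is PostgreSQL's COPY command:

    COPY big_table TO '/tmp/big_table.csv' WITH CSV HEADER;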

But if your users need to read out 500,000 rows, change a few, and write
the whole thing back, your business process is likely not currently
suited to a database and needs to be rethought.