Обсуждение: Linux PostgreSQL - Data Mart & Small Data Warehousing

Поиск
Список
Период
Сортировка

Linux PostgreSQL - Data Mart & Small Data Warehousing

От
"QUINN, JAY (SBCSI)"
Дата:
Hi all, I was wondering if there is anyone out there who has successfully
used a Linux OS with a version of PostgreSQL for a active data mart, and
perhaps for small to medium sized business as a data warehouse?  I'm
interested in building a data mart/warehouse type of application, and would
like to know if PostgreSQL is as good or a better DB to use than the usual
MS SQL Server & Oracle.  Also, what is the most common client software for a
typical user to interface with PostgreSQL?  And the most common client
software for a developer interfacing to PostgreSQL?  Thanks, Jay
mailto:jq8274@sbc.com

Re: Linux PostgreSQL - Data Mart & Small Data Warehousing

От
"scott.marlowe"
Дата:
On Mon, 11 Nov 2002, QUINN, JAY (SBCSI) wrote:

> Hi all, I was wondering if there is anyone out there who has successfully
> used a Linux OS with a version of PostgreSQL for a active data mart, and
> perhaps for small to medium sized business as a data warehouse?  I'm
> interested in building a data mart/warehouse type of application, and would
> like to know if PostgreSQL is as good or a better DB to use than the usual
> MS SQL Server & Oracle.  Also, what is the most common client software for a
> typical user to interface with PostgreSQL?  And the most common client
> software for a developer interfacing to PostgreSQL?  Thanks, Jay
> mailto:jq8274@sbc.com

We've just started using it internally and it seems a better choice than
MS SQL, where we were looking at $20,000 just for licensing MSSQL, not
counting the server.  I can build a heckuva box for th $25,000 we would
have spent on MSSQL server.

Generally, postgresql has been faster than MSSQL server.  We were loading
about 250Megs worth of data through an application in a Notes server
accessing both MSSQL and pgsql via ODBC.  The run time of the script was
14 minutes on postgresql and 35 or more minutes on MSSQL.  I dumped the
data out in SQL insert format (--inserts switch) and reloading it locally
took 1.5 minutes, remotely it took 2.5 minutes across the network.

We also have a table of sales figures that weighs in at about 150Megs of
raw data, about 350 on the drive.  We can search for individual records in
about 0.00044 seconds if there's an index on them.

You can build a nice little Dual PIII machine with 1 gig ram and a pair of
120 Gig UDMA133 drives for very little (<$2,000) that will be a real
screamer for data warehousing.

We interface to our Postgresql database in a variety of ways:

Access (read only for sales and marketing) via ODBC
PHP via postgresql native AND ODBC
Java client apps via jdbc (nothing heavy, just a few custom desktop apps.)
Lotus Domino via ODBC
Import data from the mainframe with a cron job running ftp and a shell
script to format and import the data.

When we first started using postgresql, version 6.5.3, it was slower than
MSSQL and Oracle by a fair bit and still had some problems.  Now that we
are running 7.2.3 (soon to be 7.3) we are glad we stuck it out.

Grab an old PII-350 with 256 Megs of ram and give it a spin, it's quite
nice, and you may find that your "test" server becomes a production server
with little or no cost to your company.