Обсуждение: Drill-downs and OLAP type data

Поиск
Список
Период
Сортировка

Drill-downs and OLAP type data

От
Anthony Presley
Дата:
Hi there!

We have a typical data-warehouse type application, and we'd like to set up a star-schema type data analysis software product (which we'll be programming), on top of PG.  The goal is to do fast roll-up, drill-down, and drill-through of objects / tables like locations, inventory items, and sales volume.

After a few weeks of searching around, we're running into dead-ends on the front-end, and the back-end.  PG doesn't support OLAP / MDX .... and the GUI tools that do this, for the most part, require MDX / OLAP (SPSS and DB2, MS SQL Analytics, etc...).

What's the PG route here?  Are there some secrets / tips / tricks / contrib modules for handling this?


--
Anthony Presley

Re: Drill-downs and OLAP type data

От
Craig Ringer
Дата:
On 10/12/2011 11:50 AM, Anthony Presley wrote:

> What's the PG route here?  Are there some secrets / tips / tricks /
> contrib modules for handling this?

I don't see much discussion of DW, OLAP-type workloads here. Pg doesn't
support index-oriented tables (though IIRC 9.2 will add covering indexes
- yay!), column-oriented storage, or other features that're pretty basic
to OLAP workloads.

Have you looked at Greenplum?

--
Craig Ringer

Re: Drill-downs and OLAP type data

От
Filip Rembiałkowski
Дата:

Mondrian (which is a part of Pentaho BI stack) is an open source OLAP engine with MDX.
See http://community.pentaho.com/projects/bi_platform/


2011/10/12 Anthony Presley <anthony@resolution.com>
Hi there!

We have a typical data-warehouse type application, and we'd like to set up a star-schema type data analysis software product (which we'll be programming), on top of PG.  The goal is to do fast roll-up, drill-down, and drill-through of objects / tables like locations, inventory items, and sales volume.

After a few weeks of searching around, we're running into dead-ends on the front-end, and the back-end.  PG doesn't support OLAP / MDX .... and the GUI tools that do this, for the most part, require MDX / OLAP (SPSS and DB2, MS SQL Analytics, etc...).

What's the PG route here?  Are there some secrets / tips / tricks / contrib modules for handling this?


--
Anthony Presley

Re: Drill-downs and OLAP type data

От
Ondrej Ivanič
Дата:
Hi,

On 12 October 2011 14:50, Anthony Presley <anthony@resolution.com> wrote:
> After a few weeks of searching around, we're running into dead-ends on the
> front-end, and the back-end.  PG doesn't support OLAP / MDX .... and the GUI
> tools that do this, for the most part, require MDX / OLAP (SPSS and DB2, MS
> SQL Analytics, etc...).
> What's the PG route here?  Are there some secrets / tips / tricks / contrib
> modules for handling this?

Our db has very simple star schema and several materialised tables for
aggregates. We need OLAP style queries with OLTP performance :). After
several upgrades we ended up with 16 cores, 80GB of RAM and fast SAN
but performance wasn't good. Meanwhile we evaluated several NoSQL
solutions and several comercial MPP solutions -- and the winner is
Greenplum! (you can get CE version for free -- max two processors (but
unlimited cores) and almost no HA options). AsterData's nCluster has
very nice MapReduce integration but looks quite immature.

I would say that roll-ups/drill-downs/cubes sound obsolete to me :) I
suggest the following tools (list is not sorted by anything)
- RapidMiner (http://rapid-i.com/content/view/181/196/)
- Tableau (http://www.tableausoftware.com/public/gallery)
- AlpineMiner (http://www.alpinedatalabs.com/, http://alpineminer.org/)
- Jasper (http://www.jaspersoft.com/)
- Pentaho (http://www.pentaho.com/)

--
Ondrej Ivanic
(ondrej.ivanic@gmail.com)

Re: Drill-downs and OLAP type data

От
Anthony Presley
Дата:
2011/10/12 Ondrej Ivanič <ondrej.ivanic@gmail.com>
Hi,

On 12 October 2011 14:50, Anthony Presley <anthony@resolution.com> wrote:
> After a few weeks of searching around, we're running into dead-ends on the
> front-end, and the back-end.  PG doesn't support OLAP / MDX .... and the GUI
> tools that do this, for the most part, require MDX / OLAP (SPSS and DB2, MS
> SQL Analytics, etc...).
> What's the PG route here?  Are there some secrets / tips / tricks / contrib
> modules for handling this?

Our db has very simple star schema and several materialised tables for
aggregates. We need OLAP style queries with OLTP performance :). After
several upgrades we ended up with 16 cores, 80GB of RAM and fast SAN
but performance wasn't good. Meanwhile we evaluated several NoSQL
solutions and several comercial MPP solutions -- and the winner is
Greenplum! (you can get CE version for free -- max two processors (but
unlimited cores) and almost no HA options). AsterData's nCluster has
very nice MapReduce integration but looks quite immature.

The *problem* with Greenplum is that it's ultra-expensive once you leave the CE version - and you're not supposed to be using the CE version for commercial usage last I read the license.  Has that changed?

The last pricing I saw was around $16k per CPU (it may have been per core?).  At that point, you're now in the realm of SQL Server Analysis and DB2 SPSS.
 
I would say that roll-ups/drill-downs/cubes sound obsolete to me :) I
suggest the following tools (list is not sorted by anything)
- RapidMiner (http://rapid-i.com/content/view/181/196/)
- Tableau (http://www.tableausoftware.com/public/gallery)
- AlpineMiner (http://www.alpinedatalabs.com/, http://alpineminer.org/)
- Jasper (http://www.jaspersoft.com/)
- Pentaho (http://www.pentaho.com/)

Thanks, but I'm well aware of all of these packages.  Our problem is pairing up a web based GUI to a database we love using.  Doesn't seem possible, because the user-friendly OLAP / data analysis / dashboard tools are all expecting functionality that PG doesn't have.

It sounds like, we're either choosing a different DB to work with the pretty GUI tools, or writing a GUI tool to work with PG.


--
Anthony Presley

Re: Drill-downs and OLAP type data

От
Anthony Presley
Дата:
2011/10/12 Filip Rembiałkowski <plk.zuber@gmail.com>

Mondrian (which is a part of Pentaho BI stack) is an open source OLAP engine with MDX.
See http://community.pentaho.com/projects/bi_platform/

Thanks ... we're familiar with Mondrian.  Unfortunately, the only stable GUI for it under $30k is jPivot, which is basically dead since 2008.  And, IMHO, it's a very ugly product.

The newer jPivot (which Pentaho and Jasper have both "forked"), is somewhat an improvement.

But now, we need a PG database to store the raw data in, and a Mondrian OLAP server to store the drill-downs in.

... Was hoping this wasn't going to be 30 tools to accomplish our needs.
 
 
--
Anthony Presley

Re: Drill-downs and OLAP type data

От
Ondrej Ivanič
Дата:
Hi,

> The *problem* with Greenplum is that it's ultra-expensive once you leave the
> CE version - and you're not supposed to be using the CE version for
> commercial usage last I read the license.  Has that changed?

Not sure. I haven't seen something like that in the license. After POC
we bought HW and license from EMC

> The last pricing I saw was around $16k per CPU (it may have been per core?).
>  At that point, you're now in the realm of SQL Server Analysis and DB2 SPSS.

Yeah, it is not cheap but it is "drop in" replacement for Postgres and
we can connect to it from PHP without any issues. (PHP PDO + ODBC
doesn't work very well)

>
> Our problem is pairing
> up a web based GUI to a database we love using.  Doesn't seem possible,
> because the user-friendly OLAP / data analysis / dashboard tools are all
> expecting functionality that PG doesn't have.

Could you please name few of them? I'm looking for something like
Tableau but web based (without .Net) and hackable (our DB is partially
encrypted)

> It sounds like, we're either choosing a different DB to work with the pretty
> GUI tools, or writing a GUI tool to work with PG.

I think you are right here.

--
Ondrej Ivanic
(ondrej.ivanic@gmail.com)

Re: Drill-downs and OLAP type data

От
Anthony Presley
Дата:
Notes are below ...

2011/10/12 Ondrej Ivanič <ondrej.ivanic@gmail.com>
Hi,

> The *problem* with Greenplum is that it's ultra-expensive once you leave the
> CE version - and you're not supposed to be using the CE version for
> commercial usage last I read the license.  Has that changed?

Not sure. I haven't seen something like that in the license. After POC
we bought HW and license from EMC

Yes - I understand with a full license from EMC, or one of their appliances, the data does fly.
 
> Our problem is pairing
> up a web based GUI to a database we love using.  Doesn't seem possible,
> because the user-friendly OLAP / data analysis / dashboard tools are all
> expecting functionality that PG doesn't have.

Could you please name few of them? I'm looking for something like
Tableau but web based (without .Net) and hackable (our DB is partially
encrypted)

Try InetSoft - we have used them for reporting in the past, and they have an awesome drill-down, dashboard, reporting tool.  It's also very hackable.

I haven't looked at LogiXML in a while - but their Ad-Hoc tool was pretty impressive.

I had forgotten about Tableau, and it appears they have a server component now.  That's good to hear.


--
Anthony Presley