Обсуждение: Performance Benchmarking for data-warehousing instance?

Поиск
Список
Период
Сортировка

Performance Benchmarking for data-warehousing instance?

От
Huy Nguyen
Дата:
We're using PostgreSQL to host our analytics (OLAP) database and trying to tune our configuration for better performance.

Is there any existing PG performance benchmarking tool set that tailors for OLAP purpose? Right now what we do is just go through the list of documentations 

I think pgtune is optimized more for OLTP application. Is there something similar to pgtune/pgbench for OLAP?

Thanks,
Huy

Re: Performance Benchmarking for data-warehousing instance?

От
Peter Geoghegan
Дата:
On Fri, Feb 7, 2014 at 7:36 PM, Huy Nguyen <huy@viki.com> wrote:
> I think pgtune is optimized more for OLTP application. Is there something
> similar to pgtune/pgbench for OLAP?

IIRC pgtune can be told to give out an OLAP-optimized postgresql.conf.
Maybe that's only recent versions?

--
Regards,
Peter Geoghegan


Re: Performance Benchmarking for data-warehousing instance?

От
Huy Nguyen
Дата:
Do you happen to have a link to it? Though I think different machine specs should yield different optimal postgresql.conf.

I'm looking for a hand-crafted set of data + queries tailored for OLAP so that I can try to manually tweak one config at a time and run against the benchmark.

I might considering creating one if no one has done it before.


On Sat, Feb 8, 2014 at 12:12 PM, Peter Geoghegan <peter.geoghegan86@gmail.com> wrote:
On Fri, Feb 7, 2014 at 7:36 PM, Huy Nguyen <huy@viki.com> wrote:
> I think pgtune is optimized more for OLTP application. Is there something
> similar to pgtune/pgbench for OLAP?

IIRC pgtune can be told to give out an OLAP-optimized postgresql.conf.
Maybe that's only recent versions?

--
Regards,
Peter Geoghegan

Re: Performance Benchmarking for data-warehousing instance?

От
Wattana Hinchaisri
Дата:
I used PostgreSQL 8.4  for Pentaho OLAP (mondrian) for a while. It's work like i want. But all i've done are the combination of Application level, Database level and Hardware level.

Our application is CRM, Then we did ETL to new PostgreSQL server, and do OLAP there. The application and OLAP DBMS is totally difference configuration. In Hardware level with hardware optimization and postgresql.conf configuration. We can run OLAP query for our need.

I think you have to know your database behavior and what OLAP you need to query. Then you will know how to configure your OLAP and postgresql.conf and Hardware you need.

Benchmark tool is only a test. But you have to know all about pg_log and performance monitoring.


Wattana


2014-02-08 13:41 GMT+07:00 Huy Nguyen <huy@viki.com>:
Do you happen to have a link to it? Though I think different machine specs should yield different optimal postgresql.conf.

I'm looking for a hand-crafted set of data + queries tailored for OLAP so that I can try to manually tweak one config at a time and run against the benchmark.

I might considering creating one if no one has done it before.


On Sat, Feb 8, 2014 at 12:12 PM, Peter Geoghegan <peter.geoghegan86@gmail.com> wrote:
On Fri, Feb 7, 2014 at 7:36 PM, Huy Nguyen <huy@viki.com> wrote:
> I think pgtune is optimized more for OLTP application. Is there something
> similar to pgtune/pgbench for OLAP?

IIRC pgtune can be told to give out an OLAP-optimized postgresql.conf.
Maybe that's only recent versions?

--
Regards,
Peter Geoghegan




--
Life has no boundaries...

Re: Performance Benchmarking for data-warehousing instance?

От
Tomas Vondra
Дата:
Hi,

On 8.2.2014 07:41, Huy Nguyen wrote:
> Do you happen to have a link to it? Though I think different machine
> specs should yield different optimal postgresql.conf.

An optimal configuration is not just about machine specs, it's about the
workload and application configuration too. So there's no benchmark that
would give you the best config for your application.

> I'm looking for a hand-crafted set of data + queries tailored for OLAP
> so that I can try to manually tweak one config at a time and run against
> the benchmark.

I think using pgtune is the best starting point you can get, and you may
tweak it based on your actual workload. If you can prepare a sample of
the workload (i.e. a representative amount of data) and run a set of
actual queries (generated by the application), that'd be an excellent
situation.

> I might considering creating one if no one has done it before.

So how exactly is that going to work? There's an benchmark for this,
called TPC-H [1], but again - this is just a model of how a DWH/DSS
application may look like.

I've spent a lot of time working with it a while ago (see [2]), and IMHO
the values recommended by pgtune are quite fine.

[1] http://www.tpc.org/tpch/default.asp
[2] http://www.fuzzy.cz/en/articles/dss-tpc-h-benchmark-with-postgresql/

regards
Tomas