Thread: Up-to-date reports database

Up-to-date reports database

From
Herouth Maoz
Date:
Hi guys,

I'm interested in a solution that will allow our customers to run reports - which may involve complicated queries - on
data which is as up-to-date as possible.

One thing I don't want to do is to let the reporting system connect to the production database. I want the indexes in
production to be limited to what production needs, and not add indexes that are required for reports, for instance. And
basically, I don't want a customer to run a complicated report and degrade the performance of my production system.

A replication solution is not very good, either, because of course I can't define indexes differently, I don't want
*all* transactions in all tables to be sent, and also, because I may want to cross-reference data from different
systems. So ideally, I want to have a reporting database, where specific tables (or maybe even just specific columns)
from various databases are collected, and have a reporting tool connect to this database. But I want to push the data
into that database as close to real time as possible.

The most important data I am currently considering are two tables which have an average of 7,600 transactions per hour
(standard deviation 10,000, maximum in May is 62,000 transactions per hour). There may be similar pairs of tables
collected from more than one database.
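
For a rough sense of scale, the hourly figures above convert to per-second rates with simple arithmetic. The sketch
below assumes the load is spread evenly within each hour, which the large standard deviation suggests is optimistic, so
real bursts may be higher.

```python
# Convert the hourly figures quoted above into per-second rates.
SECONDS_PER_HOUR = 3600


def per_second(transactions_per_hour: float) -> float:
    """Return the average number of transactions per second."""
    return transactions_per_hour / SECONDS_PER_HOUR


average_rate = per_second(7_600)   # average hour
peak_rate = per_second(62_000)     # busiest hour in May

print(f"average: {average_rate:.1f} tx/s, peak: {peak_rate:.1f} tx/s")
```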

I assume this is not an uncommon scenario. What solutions would you recommend?


Herouth

Re: Up-to-date reports database

From
Chris Ernst
Date:
On 05/23/2012 07:11 AM, Herouth Maoz wrote:
> Hi guys,
>
> I'm interested in a solution that will allow our customers to run reports - which may involve complicated queries -
> on data which is as up-to-date as possible.
>
> One thing I don't want to do is to let the reporting system connect to the production database. I want the indexes in
> production to be limited to what production needs, and not add indexes that are required for reports, for instance.
> And basically, I don't want a customer to run a complicated report and degrade the performance of my production
> system.
>
> A replication solution is not very good, either, because of course I can't define indexes differently, I don't want
> *all* transactions in all tables to be sent, and also, because I may want to cross-reference data from different
> systems. So ideally, I want to have a reporting database, where specific tables (or maybe even just specific columns)
> from various databases are collected, and have a reporting tool connect to this database. But I want to push the data
> into that database as close to real time as possible.
>
> The most important data I am currently considering are two tables which have an average of 7,600 transactions per
> hour (standard deviation 10,000, maximum in May is 62,000 transactions per hour). There may be similar pairs of tables
> collected from more than one database.
>
> I assume this is not an uncommon scenario. What solutions would you recommend?

Hi Herouth,

I would have a look at Slony.  It is a trigger-based replication system
that allows you to replicate only the tables you define and you can have
different indexing on the slave.  The only requirement is that each
table you want to replicate has the same primary key or unique
constraint on the master and slave.  Other than that, you can index the
tables on the slave however you want.
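
To illustrate the general idea of trigger-based replication with different indexing on the slave, here is a minimal
sketch. It is not how Slony itself is implemented or configured: it uses Python's built-in sqlite3 module so that it
runs on its own, the table, trigger, and index names are hypothetical, and it only captures inserts.

```python
import sqlite3

# Two in-memory databases stand in for the master and the slave.
master = sqlite3.connect(":memory:")
slave = sqlite3.connect(":memory:")

master.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL);
    -- Change log filled by the trigger below (hypothetical name).
    CREATE TABLE orders_log (seq INTEGER PRIMARY KEY AUTOINCREMENT,
                             id INTEGER, customer TEXT, amount REAL);
    CREATE TRIGGER orders_capture AFTER INSERT ON orders
    BEGIN
        INSERT INTO orders_log (id, customer, amount)
        VALUES (NEW.id, NEW.customer, NEW.amount);
    END;
""")

# Same primary key on the slave, plus an index that only the reports need.
slave.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL);
    CREATE INDEX orders_by_customer ON orders (customer);
""")


def replicate(last_seq: int) -> int:
    """Copy log rows newer than last_seq to the slave; return the new position."""
    rows = master.execute(
        "SELECT seq, id, customer, amount FROM orders_log WHERE seq > ? ORDER BY seq",
        (last_seq,),
    ).fetchall()
    for seq, row_id, customer, amount in rows:
        slave.execute(
            "INSERT OR REPLACE INTO orders (id, customer, amount) VALUES (?, ?, ?)",
            (row_id, customer, amount),
        )
        last_seq = seq
    slave.commit()
    return last_seq


master.execute("INSERT INTO orders (customer, amount) VALUES ('acme', 10.0)")
master.execute("INSERT INTO orders (customer, amount) VALUES ('globex', 25.5)")
master.commit()
position = replicate(0)
```

The shared primary key is what lets the slave apply each change to the right row; the extra index exists only on the
slave and never touches the master.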

Cheers!

    - Chris

Re: Up-to-date reports database

From
Herouth Maoz
Date:
On 23/05/2012, at 17:20, Chris Ernst wrote:

> I would have a look at Slony.  It is a trigger-based replication system
> that allows you to replicate only the tables you define and you can have
> different indexing on the slave.  The only requirement is that each
> table you want to replicate has the same primary key or unique
> constraint on the master and slave.  Other than that, you can index the
> tables on the slave however you want.

Thanks for the pointer. I will read up about it.

Herouth

Re: Up-to-date reports database

From
Bartosz Dmytrak
Date:
hi,
my suggestion is to redesign the reporting database to fit reporting specifics (e.g. break the normal form of the database; in some cases this will speed up reports). Then you can use some ETL tool to sync production and reporting. It is also a good idea to use some OLAP software for multidimensional analysis - this will make queries easier (with the MDX language). I think this kind of discussion is a huge one :)

there are some open source ETL and BI suites available.
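
As a small illustration of the denormalize-and-load idea, here is a minimal sketch. It stands in for what an ETL tool
would do and is not any specific product: the tables and column names are hypothetical, and sqlite3 is used only so the
example runs on its own.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Normalized production-style tables (hypothetical).
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE messages (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT);
    INSERT INTO customers VALUES (1, 'acme'), (2, 'globex');
    INSERT INTO messages VALUES (10, 1, 'sent'), (11, 1, 'failed'), (12, 2, 'sent');

    -- Denormalized reporting table: the customer name is copied into each row.
    CREATE TABLE report_messages (message_id INTEGER PRIMARY KEY,
                                  customer_name TEXT, status TEXT);
""")


def load_report_table(conn: sqlite3.Connection) -> None:
    """ETL step: join the normalized tables and refill the flat table."""
    conn.execute("DELETE FROM report_messages")
    conn.execute("""
        INSERT INTO report_messages (message_id, customer_name, status)
        SELECT m.id, c.name, m.status
        FROM messages AS m JOIN customers AS c ON c.id = m.customer_id
    """)
    conn.commit()


load_report_table(db)

# Reports can now group by customer name without a join.
summary = db.execute(
    "SELECT customer_name, COUNT(*) FROM report_messages "
    "GROUP BY customer_name ORDER BY customer_name"
).fetchall()
```

A full refill like this is the simplest form; a real sync would normally load only the rows changed since the last run.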

Regards,
Bartek 

Re: Up-to-date reports database

From
Herouth Maoz
Date:
On 23/05/2012, at 18:54, Bartosz Dmytrak wrote:

> hi,
> my suggestion is to redesign the reporting database to fit reporting specifics (e.g. break the normal form of the
> database; in some cases this will speed up reports). Then you can use some ETL tool to sync production and reporting.
> It is also a good idea to use some OLAP software for multidimensional analysis - this will make queries easier (with
> the MDX language). I think this kind of discussion is a huge one :)
>
> there are some open source ETL and BI suites available.

Thanks, I'll take that into consideration.


Re: Up-to-date reports database

From
Sergey Konoplev
Date:
On Wed, May 23, 2012 at 5:11 PM, Herouth Maoz <herouth@unicell.co.il> wrote:
> A replication solution is not very good, either, because of course I can't define indexes differently, I don't want
> *all* transactions in all tables to be sent, and also, because I may want to cross-reference data from different
> systems. So ideally, I want to have a reporting database, where specific tables (or maybe even just specific columns)
> from various databases are collected, and have a reporting tool connect to this database. But I want to push the data
> into that database as close to real time as possible.

Look at PgQ from Skytools. You can queue your OLTP data changes and
restore specific columns only into your OLAP database.
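
The queue-and-project idea can be sketched roughly as follows. This is not the PgQ API: it is a plain in-memory model,
and the event layout, table names, and column list are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical change events, as a trigger on the OLTP side might enqueue them.
change_queue = deque([
    {"table": "messages", "id": 1, "sender": "a", "body": "hello", "status": "sent"},
    {"table": "messages", "id": 2, "sender": "b", "body": "hi", "status": "failed"},
    {"table": "audit", "id": 7, "detail": "not needed for reports"},
])

# Which columns of which tables the reporting side wants (assumed).
REPORT_COLUMNS = {"messages": ("id", "sender", "status")}


def consume_batch(queue: deque, batch_size: int) -> list:
    """Pop up to batch_size events and keep only the reporting columns."""
    projected = []
    for _ in range(min(batch_size, len(queue))):
        event = queue.popleft()
        columns = REPORT_COLUMNS.get(event["table"])
        if columns is None:
            continue  # this table is not sent to the reporting database
        projected.append({name: event[name] for name in columns})
    return projected


reporting_rows = consume_batch(change_queue, batch_size=10)
```

Consuming in batches keeps the reporting database close to real time while the column filter drops everything the
reports do not need.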

>
> The most important data I am currently considering are two tables which have an average of 7,600 transactions per
> hour (standard deviation 10,000, maximum in May is 62,000 transactions per hour). There may be similar pairs of tables
> collected from more than one database.
>
> I assume this is not an uncommon scenario. What solutions would you recommend?
>
>
> Herouth



--
Sergey Konoplev
