Re: Normalization or Performance

Поиск
Список
Период
Сортировка
От Richard Huxton
Тема Re: Normalization or Performance
Дата
Msg-id 41AF6050.9030302@archonet.com
обсуждение исходный текст
Ответ на Normalization or Performance  (Alvaro Nunes Melo <al_nunes@atua.com.br>)
Список pgsql-performance
Alvaro Nunes Melo wrote:
> Hi,
>
> Before writing this mail, I'd researched a little about this topic,
> and got some opinions from guys like Fabien Pascal, who argues that
> logical design should be separated from physical design, and other
> sources. As this is not fact, I'm writing to you guys, that make
> things work in real world.

I believe he's right. Or at least that you should only compromise your
logical design once it becomes absolutely necessary due to physical
limitations.

> We started our first big (for our company standards) project always
> thinking in normalization. But once we imported legacy data into the
> DB, things got harder.
>
> One example is the clients status. A client might be active, inactive
> or pending (for many reasons). We store all the status a client have
> since it is in the system. To check what is the actual status of a
> client, we get the last status from this historical status table.
> This take a considerable time, so our best results were achieved
> building  a function that checks the status and indexing this
> function. The problem is that indexed functions mus bu immutable, so
> as you can figure, if the status change after the creation of the
> index, the retunr of the function is still the same.
>
> What do you suggest for situations like this? Should I add a field to
>  clients table and store its actual status, keep storing data in the
> historical table an control its changes with a trigger?

Trigger + history table is a common solution, it's easy to implement and
there's nothing non-relational about it as a solution.

> There are other situations that are making things difficult to us.
> For example, one query must return the total amount a client bought
> in the last 90 days. It's taking too long, when we must know it for
> many clients, many times. So should I create summarization tables to
> store this kind of stuff, update it with a trigger in daily basis
> (for example), and solve this problem with one join?

One solution I use for this sort of thing is a summary table grouped by
date, and accurate until the start of today. Then, I check the summary
table and the "live" table for todays information and sum those.

> Our database is not that big. The larger table has about 7.000.000
> rows. About 50.000 clients, half of them active. All that I'd point
> out above uses indexes for queries, etc. But even with this it's not
> been fast enough. We have a Dell server for this (I know, the Dell
> issue), a Dual Xeon 2.8, SCSI HD, 1 GB mem. Do we need better
> hardware for our system?

Swap one of your processors for more RAM and disks, perhaps.

--
   Richard Huxton
   Archonet Ltd

В списке pgsql-performance по дате отправления:

Предыдущее
От: Andrew McMillan
Дата:
Сообщение: Re: pg_restore taking 4 hours!
Следующее
От: Dave Cramer
Дата:
Сообщение: Re: Alternatives to Dell?