Re: Large tables (was: RAID 0 not as fast as expected)

From Bucky Jordan
Subject Re: Large tables (was: RAID 0 not as fast as expected)
Date
Msg-id 78ED28FACE63744386D68D8A9D1CF5D42099E1@MAIL.corp.lumeta.com
In response to Re: Large tables (was: RAID 0 not as fast as expected)  ("Merlin Moncure" <mmoncure@gmail.com>)
Responses Re: Large tables (was: RAID 0 not as fast as expected)  ("Alex Turner" <armtuk@gmail.com>)
List pgsql-performance
> good normalization skills are really important for large databases,
> along with materialization strategies for 'denormalized sets'.

Good points - thanks. I'm especially curious what others have done for
the materialization. The matview project on gborg appears dead, and I've
only found a smattering of references on Google. My guess is you roll
your own for optimal performance...
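
For what it's worth, a hand-rolled materialized view in this era of
PostgreSQL is usually just a plain table plus a refresh function. A
minimal sketch, with made-up table and column names (scan_records,
scan_summary):

CREATE TABLE scan_summary (
    scan_month    date PRIMARY KEY,
    record_count  bigint NOT NULL
);

CREATE OR REPLACE FUNCTION refresh_scan_summary() RETURNS void AS $$
BEGIN
    -- Full rebuild in one transaction; readers keep seeing the old
    -- rows until this commits.
    DELETE FROM scan_summary;
    INSERT INTO scan_summary (scan_month, record_count)
        SELECT date_trunc('month', scan_date)::date, count(*)
        FROM scan_records
        GROUP BY 1;
END;
$$ LANGUAGE plpgsql;

You'd then schedule SELECT refresh_scan_summary(); from cron at whatever
staleness you can tolerate.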

> regarding the number of rows, there is no limit to how much pg can
> handle per se, just some practical limitations, especially vacuum and
> reindex times.  these are important because they are required to keep
> a handle on mvcc bloat and it's very nice to be able to vacuum bits of
> your database at a time.
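
(For reference, doing that maintenance in pieces just means running the
commands one table at a time; the table name here is made up:)

VACUUM ANALYZE scan_records;
REINDEX TABLE scan_records;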

I was hoping for some actual numbers on "practical". Hardware isn't too
much of an issue (within reason - we're not talking an Amazon or Google
here... the SunFire X4500 looks interesting...) - if a customer wants to
store that much data, and pay for it, we'll figure out how to do it. I'd
just rather not have to re-design the database. Say the requirement is
to keep 12 months of data accessible, each "scan" produces 100M records,
and I run one per month. What happens if the customer wants to run it
once a week? I was more trying to figure out at what point (ballpark)
I'm going to have to look into archive tables and things of that nature
(or at Bizgres/MPP). It's easier for us to add more/bigger hardware, but
not so easy to redesign/add history tables...
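
If it does come to history tables, my understanding is the usual shape
is inheritance-based partitions with CHECK constraints, so that 8.1's
constraint exclusion can skip old months entirely. A rough sketch with
made-up names:

CREATE TABLE scan_records (
    scan_date  date NOT NULL,
    host_ip    inet,
    result     text
);

CREATE TABLE scan_records_2006_09 (
    CHECK (scan_date >= DATE '2006-09-01'
       AND scan_date <  DATE '2006-10-01')
) INHERITS (scan_records);

SET constraint_exclusion = on;
-- A query constrained on scan_date then only scans the matching
-- child tables.
SELECT count(*) FROM scan_records
WHERE scan_date >= DATE '2006-09-01'
  AND scan_date <  DATE '2006-10-01';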

>
> just another fyi, if you have a really big database, you can forget
> about doing pg_dump for backups (unless you really don't care about
> being x day or days behind)...you simply have to do some type of
> replication/failover strategy.  i would start with pitr.
>
> merlin
I was originally thinking replication, but I did notice some nice PITR
features in 8.x - I'll have to look into that some more.
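
From what I've seen of the 8.x docs, the basic PITR recipe looks
roughly like this (the archive path is made up):

-- in postgresql.conf:
--   archive_command = 'cp %p /mnt/backup/wal_archive/%f'
SELECT pg_start_backup('weekly_base');
-- copy the $PGDATA directory with tar or rsync
SELECT pg_stop_backup();

After that, archived WAL segments let you roll a restored base backup
forward to any point in time.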

Thanks for the pointers though...

- Bucky
