Обсуждение: Distant mirroring

Поиск
Список
Период
Сортировка

Distant mirroring

От
dforum
Дата:
Hello,

I'm trying to install a solution to permit me to :
- Secure the datas, without RAID
- Giving ability to increase the potentiality of the database towards
the needs.

I have read about slony, DRBD, pgpool....

I don't find the good system to do what I want.

I manage for now 50 millions of request per month.

I will reach 100 millions in the end of the year I suppose.

There is 2 difficulties :
1 - is the storage : to get faster access,it is recommend to use SAS 15
000 tps. But the disk I can get are 149 GO of space. As the database is
growing par 1,7 Go per week at the moment, it will reach is maximum in 3
month. I can add 3 disk at least so It can go to 9 month. What to do
after, and especially what to do today to prevent it?
2 - The machine will treat more and more simultaneous entrance, so I
need to loadbalance those inserts/updates on several machine and to
replicate the datas between them. It's not a real problem if the data
are asynchrony.

I'm migrating to postgresql 8.3.3.

Thanks for all your remarks, suggestions and helps

David

Re: Distant mirroring

От
"Scott Marlowe"
Дата:
On Sat, Aug 9, 2008 at 11:29 AM, dforum <dforums@vieonet.com> wrote:
> Hello,
>
> I'm trying to install a solution to permit me to :
> - Secure the datas, without RAID

Nothing beats a simple mirror set for simplicity while protecting the
data, and for a pretty cheap cost.  How much is your data worth?

> - Giving ability to increase the potentiality of the database towards the
> needs.
>
> I have read about slony, DRBD, pgpool....
>
> I don't find the good system to do what I want.
>
> I manage for now 50 millions of request per month.

Assuming they all happen from 9 to 5 and during business days only,
that's about 86 transactions per second.  Well within the realm of a
single mirror set to keep up if you don't make your db work real fat.

> I will reach 100 millions in the end of the year I suppose.

That takes us to 172 transactions per second.

> There is 2 difficulties :
> 1 - is the storage : to get faster access,it is recommend to use SAS 15 000
> tps. But the disk I can get are 149 GO of space. As the database is growing
> par 1,7 Go per week at the moment, it will reach is maximum in 3 month. I
> can add 3 disk at least so It can go to 9 month. What to do after, and
> especially what to do today to prevent it?

No, don't piecemeal just enough to outrun the disk space boogieman
each month.  Buy enough to last you at least 1 year in the future.
More if you can afford it.

> 2 - The machine will treat more and more simultaneous entrance, so I need to
> loadbalance those inserts/updates on several machine and to replicate the
> datas between them. It's not a real problem if the data are asynchrony.

Then PostgreSQL might not be your best choice.    But I think you're
wrong.  You can easily handle the load you're talking about on a
mid-sized box for about $5000 to $10000.

You can use 7200 rpm SATA drives, probably 8 to 12 or so, in a RAID-10
with a battery backed cache and hit 172 transactions per second.

Given the 1+ G a week storage requirement, you should definitely look
at using inheritance  to do partitioning.  Then use slony or something
to replicate the data into the back office for other things.  There's
always other things most the time that are read only.

Re: Distant mirroring

От
dforums
Дата:
Houlala

I got headache !!!

So please help...........;;

"Assuming they all happen from 9 to 5 and during business days only,
that's about 86 transactions per second.  Well within the realm of a
single mirror set to keep up if you don't make your db work real fat."

OK i like, But my reality is that to make an insert of a table that have
27 millions of entrance it took 200 ms.
so it took between 2 minutes and 10 minutes to treat 3000 records and
dispatch/agregate in other tables. And I have for now 20000 records
every 3 minutes.

At the moment I have a

I have a Linux 2.6.24.2-xxxx-std-ipv4-64 #3 SMP Tue Feb 12 12:27:47 CET
2008 x86_64 Intel(R) Xeon(R) CPU X5355 @ 2.66GHz GenuineIntel GNU/Linux
with 8Gb of memory. Using sata II disk in RAID 1 (I known that is bad,
but it would change has quickly I can).

I got 1-2 GO per week

I can change to 2 kinds of server, using 8.3.3 postgresql server, and
even taking more sever if need. But it is the biggest computer that I
can rent for now.

Intel  2x Xeon X5355
2x 4x 2.66 GHz
L2: 8Mo, FSB: 1333MHz
Double Quadruple Coeur
64 bits
12 Go FBDIMM DDR2
2x 147 Go
SAS 15 000 tr/min
RAID 1 HARD

I can add 500 Go under sataII

OR

Intel  2x Xeon X5355
2x 4x 2.66 GHz
L2: 8Mo, FSB: 1333MHz
Double Quadruple Coeur
64 bits
12 Go FBDIMM DDR2
5x 750 Go (2.8 To **)
SATA2 RAID HARD 5

I can add 500 Go under sataII


After several tunings, reading, ect...

The low speed seems to be definetly linked to the SATA II in RAID 1.

So I need a solution to be able to 1st supporting more transaction,
secondly I need to secure the data, and being able to load balancing the
charge.

Please, give me any advise or suggestion that can help me.

regards to all

David





Scott Marlowe a écrit :
> On Sat, Aug 9, 2008 at 11:29 AM, dforum <dforums@vieonet.com> wrote:
>> Hello,
>>
>> I'm trying to install a solution to permit me to :
>> - Secure the datas, without RAID
>
> Nothing beats a simple mirror set for simplicity while protecting the
> data, and for a pretty cheap cost.  How much is your data worth?
>
>> - Giving ability to increase the potentiality of the database towards the
>> needs.
>>
>> I have read about slony, DRBD, pgpool....
>>
>> I don't find the good system to do what I want.
>>
>> I manage for now 50 millions of request per month.
>
> Assuming they all happen from 9 to 5 and during business days only,
> that's about 86 transactions per second.  Well within the realm of a
> single mirror set to keep up if you don't make your db work real fat.
>
>> I will reach 100 millions in the end of the year I suppose.
>
> That takes us to 172 transactions per second.
>
>> There is 2 difficulties :
>> 1 - is the storage : to get faster access,it is recommend to use SAS 15 000
>> tps. But the disk I can get are 149 GO of space. As the database is growing
>> par 1,7 Go per week at the moment, it will reach is maximum in 3 month. I
>> can add 3 disk at least so It can go to 9 month. What to do after, and
>> especially what to do today to prevent it?
>
> No, don't piecemeal just enough to outrun the disk space boogieman
> each month.  Buy enough to last you at least 1 year in the future.
> More if you can afford it.
>
>> 2 - The machine will treat more and more simultaneous entrance, so I need to
>> loadbalance those inserts/updates on several machine and to replicate the
>> datas between them. It's not a real problem if the data are asynchrony.
>
> Then PostgreSQL might not be your best choice.    But I think you're
> wrong.  You can easily handle the load you're talking about on a
> mid-sized box for about $5000 to $10000.
>
> You can use 7200 rpm SATA drives, probably 8 to 12 or so, in a RAID-10
> with a battery backed cache and hit 172 transactions per second.
>
> Given the 1+ G a week storage requirement, you should definitely look
> at using inheritance  to do partitioning.  Then use slony or something
> to replicate the data into the back office for other things.  There's
> always other things most the time that are read only.
>

--
<http://www.1st-affiliation.fr>

*David Bigand
*Président Directeur Générale*
*51 chemin des moulins
73000 CHAMBERY - FRANCE

Web : htttp://www.1st-affiliation.fr
Email : david@1st-affiliation.com
Tel. : +33 479 696 685
Mob. : +33 666 583 836
Skype : firstaffiliation_support


Re: Distant mirroring

От
"Fernando Hevia"
Дата:
> -----Mensaje original-----
> De: pgsql-performance-owner@postgresql.org
> [mailto:pgsql-performance-owner@postgresql.org] En nombre de dforums
> Enviado el: Lunes, 11 de Agosto de 2008 11:27
> Para: Scott Marlowe; pgsql-performance@postgresql.org
> Asunto: Re: [PERFORM] Distant mirroring
>
> Houlala
>
> I got headache !!!
>
> So please help...........;;
>
> "Assuming they all happen from 9 to 5 and during business
> days only, that's about 86 transactions per second.  Well
> within the realm of a single mirror set to keep up if you
> don't make your db work real fat."
>
> OK i like, But my reality is that to make an insert of a
> table that have
> 27 millions of entrance it took 200 ms.
> so it took between 2 minutes and 10 minutes to treat 3000
> records and dispatch/agregate in other tables. And I have for
> now 20000 records every 3 minutes.
>

You must try to partition that table. It should considerably speed up your
inserts.

>
> So I need a solution to be able to 1st supporting more
> transaction, secondly I need to secure the data, and being
> able to load balancing the charge.
>
> Please, give me any advise or suggestion that can help me.
>

Have you taken into consideration programming a solution on BerkeleyDB? Its
an API that provides a high-performance non-SQL database. With such a
solution you could achieve several thousands tps on a much smaller hardware.
You could use non-work hours to dump your data to Postgres for SQL support
for reporting and such.

Regards,
Fernando


Re: Distant mirroring

От
"Scott Marlowe"
Дата:
On Mon, Aug 11, 2008 at 8:26 AM, dforums <dforums@vieonet.com> wrote:
> Houlala
>
> I got headache !!!
>
> So please help...........;;
>
> "Assuming they all happen from 9 to 5 and during business days only,
> that's about 86 transactions per second.  Well within the realm of a
> single mirror set to keep up if you don't make your db work real fat."
>
> OK i like, But my reality is that to make an insert of a table that have 27
> millions of entrance it took 200 ms.
> so it took between 2 minutes and 10 minutes to treat 3000 records and
> dispatch/agregate in other tables. And I have for now 20000 records every 3
> minutes.

Can you partition your data on some logical barrier like a timestamp
or something?  that would probably help a lot.  also, are you doing
all 3000 records in one transaction or individual transactions?  If
one at a time, can you batch them together for better performance or
are you stuck doing them one at a time?

> At the moment I have a
>
> I have a Linux 2.6.24.2-xxxx-std-ipv4-64 #3 SMP Tue Feb 12 12:27:47 CET 2008
> x86_64 Intel(R) Xeon(R) CPU X5355 @ 2.66GHz GenuineIntel GNU/Linux
> with 8Gb of memory. Using sata II disk in RAID 1 (I known that is bad, but
> it would change has quickly I can).

Yeah, you're gonna be I/O bound as long as you've only got a single
mirror set.  A machine with 8 or 12 SAS 15K drives should make it much
more likely you can handle the load.

>
> I got 1-2 GO per week

Definitely let's look at partitioning then if we can do it.

> I can change to 2 kinds of server, using 8.3.3 postgresql server, and even
> taking more sever if need. But it is the biggest computer that I can rent
> for now.
>
> Intel  2x Xeon X5355
> 2x 4x 2.66 GHz
> L2: 8Mo, FSB: 1333MHz
> Double Quadruple Coeur
> 64 bits
> 12 Go FBDIMM DDR2
> 2x 147 Go
> SAS 15 000 tr/min
> RAID 1 HARD

All that memory and CPU power will be wasted on a db with just two
drives.  Do you at least have a decent RAID controller in that setup?

>
> I can add 500 Go under sataII
>
> OR
>
> Intel  2x Xeon X5355
> 2x 4x 2.66 GHz
> L2: 8Mo, FSB: 1333MHz
> Double Quadruple Coeur
> 64 bits
> 12 Go FBDIMM DDR2
> 5x 750 Go (2.8 To **)
> SATA2 RAID HARD 5
>
> I can add 500 Go under sataII

RAID5 is generally a poor choice for a write limited DB.  I'd guess
that the dual SAS drives above would work better than the 5 SATA
drives in RAID 5 here.

> After several tunings, reading, ect...
>
> The low speed seems to be definetly linked to the SATA II in RAID 1.

Going to 15k SAS RAID 1 will just about double your write rate
(assuming it's a commits/second issue and it likely is).  going to a 4
disk SAS RAID10 will double that, and so on.

> So I need a solution to be able to 1st supporting more transaction, secondly
> I need to secure the data, and being able to load balancing the charge.

Look at slony for read only slaves and the master db as write only.
If you can handle the slight delay in updates from master to slave.
Otherwise you'll need sync replication, and that is generally not as
fast.

Take a look at something like this server:

http://www.aberdeeninc.com/abcatg/Stirling-229.htm

With 8 15k SAS 146G drives it runs around $5k or so.  Right now all
the servers your hosting provider is likely to provide you with are
gonna be big on CPU and memory and light on I/O, and that's the
opposite of what you need for databases.

Re: Distant mirroring

От
dforums
Дата:
Tx to all.

I reach the same reflection on partitionning the data to those tables.

And postgresql is giving very good tools for that with the rules features.

I got the SAS server for immediate fix.

But I'm looking for buying a machine that will handle my needs for more
long time.

Regarding partitionning it seems that I could just use a daily tables
for daily treatment and keeping a another one for mass reading. I even
things to partition per years or half years.

One question is on table that have FK constraint, I don't know how to
maintain it ? Could I use rules for it too ? Tx for helps

Regards

David

Scott Marlowe a écrit :
> On Mon, Aug 11, 2008 at 8:26 AM, dforums <dforums@vieonet.com> wrote:
>> Houlala
>>
>> I got headache !!!
>>
>> So please help...........;;
>>
>> "Assuming they all happen from 9 to 5 and during business days only,
>> that's about 86 transactions per second.  Well within the realm of a
>> single mirror set to keep up if you don't make your db work real fat."
>>
>> OK i like, But my reality is that to make an insert of a table that have 27
>> millions of entrance it took 200 ms.
>> so it took between 2 minutes and 10 minutes to treat 3000 records and
>> dispatch/agregate in other tables. And I have for now 20000 records every 3
>> minutes.
>
> Can you partition your data on some logical barrier like a timestamp
> or something?  that would probably help a lot.  also, are you doing
> all 3000 records in one transaction or individual transactions?  If
> one at a time, can you batch them together for better performance or
> are you stuck doing them one at a time?
>
>> At the moment I have a
>>
>> I have a Linux 2.6.24.2-xxxx-std-ipv4-64 #3 SMP Tue Feb 12 12:27:47 CET 2008
>> x86_64 Intel(R) Xeon(R) CPU X5355 @ 2.66GHz GenuineIntel GNU/Linux
>> with 8Gb of memory. Using sata II disk in RAID 1 (I known that is bad, but
>> it would change has quickly I can).
>
> Yeah, you're gonna be I/O bound as long as you've only got a single
> mirror set.  A machine with 8 or 12 SAS 15K drives should make it much
> more likely you can handle the load.
>
>> I got 1-2 GO per week
>
> Definitely let's look at partitioning then if we can do it.
>
>> I can change to 2 kinds of server, using 8.3.3 postgresql server, and even
>> taking more sever if need. But it is the biggest computer that I can rent
>> for now.
>>
>> Intel  2x Xeon X5355
>> 2x 4x 2.66 GHz
>> L2: 8Mo, FSB: 1333MHz
>> Double Quadruple Coeur
>> 64 bits
>> 12 Go FBDIMM DDR2
>> 2x 147 Go
>> SAS 15 000 tr/min
>> RAID 1 HARD
>
> All that memory and CPU power will be wasted on a db with just two
> drives.  Do you at least have a decent RAID controller in that setup?
>
>> I can add 500 Go under sataII
>>
>> OR
>>
>> Intel  2x Xeon X5355
>> 2x 4x 2.66 GHz
>> L2: 8Mo, FSB: 1333MHz
>> Double Quadruple Coeur
>> 64 bits
>> 12 Go FBDIMM DDR2
>> 5x 750 Go (2.8 To **)
>> SATA2 RAID HARD 5
>>
>> I can add 500 Go under sataII
>
> RAID5 is generally a poor choice for a write limited DB.  I'd guess
> that the dual SAS drives above would work better than the 5 SATA
> drives in RAID 5 here.
>
>> After several tunings, reading, ect...
>>
>> The low speed seems to be definetly linked to the SATA II in RAID 1.
>
> Going to 15k SAS RAID 1 will just about double your write rate
> (assuming it's a commits/second issue and it likely is).  going to a 4
> disk SAS RAID10 will double that, and so on.
>
>> So I need a solution to be able to 1st supporting more transaction, secondly
>> I need to secure the data, and being able to load balancing the charge.
>
> Look at slony for read only slaves and the master db as write only.
> If you can handle the slight delay in updates from master to slave.
> Otherwise you'll need sync replication, and that is generally not as
> fast.
>
> Take a look at something like this server:
>
> http://www.aberdeeninc.com/abcatg/Stirling-229.htm
>
> With 8 15k SAS 146G drives it runs around $5k or so.  Right now all
> the servers your hosting provider is likely to provide you with are
> gonna be big on CPU and memory and light on I/O, and that's the
> opposite of what you need for databases.
>

--
<http://www.1st-affiliation.fr>

*David Bigand
*Président Directeur Générale*
*51 chemin des moulins
73000 CHAMBERY - FRANCE

Web : htttp://www.1st-affiliation.fr
Email : david@1st-affiliation.com
Tel. : +33 479 696 685
Mob. : +33 666 583 836
Skype : firstaffiliation_support