Обсуждение: Inserts optimization?

Поиск
Список
Период
Сортировка

Inserts optimization?

От
Francisco Reyes
Дата:
Doing my first write heavy database.
What settings will help improve inserts?
Only a handfull of connections, but each doing up to 30 inserts/second.
Plan to have 2 to 3 clients which most of the time will not run at the
same time, but ocasionaly it's possible two of them may bump into each
other.


If anyone recalls a previous thread like this please suggest keywords to
search on. My search on this topic came back pretty empty.

Re: Inserts optimization?

От
Chris
Дата:
Francisco Reyes wrote:
> Doing my first write heavy database.
> What settings will help improve inserts?
> Only a handfull of connections, but each doing up to 30 inserts/second.
> Plan to have 2 to 3 clients which most of the time will not run at the
> same time, but ocasionaly it's possible two of them may bump into each
> other.

If you can, use copy instead:

http://www.postgresql.org/docs/8.1/interactive/sql-copy.html

MUCH quicker (and don't worry about using multiple clients).

--
Postgresql & php tutorials
http://www.designmagick.com/

Re: Inserts optimization?

От
Tom Lane
Дата:
Chris <dmagick@gmail.com> writes:
> Francisco Reyes wrote:
>> Doing my first write heavy database.
>> What settings will help improve inserts?
>> Only a handfull of connections, but each doing up to 30 inserts/second.

> If you can, use copy instead:
> http://www.postgresql.org/docs/8.1/interactive/sql-copy.html

Or at least try to do multiple inserts per transaction.

Also, increasing checkpoint_segments and possibly wal_buffers helps a
lot for write-intensive loads.  Try to get the WAL onto a separate disk
spindle if you can.  (These things don't matter for SELECTs, but they
do matter for writes.)

            regards, tom lane

Re: Inserts optimization?

От
Francisco Reyes
Дата:
Chris writes:

> If you can, use copy instead:
> http://www.postgresql.org/docs/8.1/interactive/sql-copy.html

I am familiar with copy.
Can't use it in this scenario.

The data is coming from a program called Bacula (Backup server).
It is not static data.

Re: Inserts optimization?

От
Francisco Reyes
Дата:
Tom Lane writes:

> Or at least try to do multiple inserts per transaction.

Will see if the program has an option like that.


> Also, increasing checkpoint_segments and possibly wal_buffers helps a

Will try those.

>Try to get the WAL onto a separate disk
>spindle if you can.  (These things don't matter for SELECTs, but they
>do matter for writes.)

This particular server is pretty much what I inherited for now for this
project.and its Raid 5. There is a new server I am setting up
soon... 8 disks which we are planning to setup
6 disks in RAID 10
2 Hot spares

In RAID 10 would it matter that WALL is in the same RAID set?
Would it be better:
4 disks in RAID10 Data
2 disks RAID 1 WALL
2 hot spares

All in the same RAID controller

Re: Inserts optimization?

От
"Jim C. Nasby"
Дата:
On Thu, Apr 13, 2006 at 02:59:23PM -0400, Francisco Reyes wrote:
> In RAID 10 would it matter that WALL is in the same RAID set?
> Would it be better:
> 4 disks in RAID10 Data
> 2 disks RAID 1 WALL
> 2 hot spares

Well, benchmark it with your app and find out, but generally speaking
unless your database is mostly read you'll see a pretty big benefit to
seperating WAL from table/index data.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Inserts optimization?

От
Francisco Reyes
Дата:
Jim C. Nasby writes:

> On Thu, Apr 13, 2006 at 02:59:23PM -0400, Francisco Reyes wrote:
>> In RAID 10 would it matter that WALL is in the same RAID set?
>> Would it be better:
>> 4 disks in RAID10 Data
>> 2 disks RAID 1 WALL
>> 2 hot spares
>
> Well, benchmark it with your app and find out, but generally speaking
> unless your database is mostly read you'll see a pretty big benefit to
> seperating WAL from table/index data.

That will not be easy to compare.. it would mean setting up the machine..
trashing it.. then redoing the whole setup..

I am leaning towards using pgbench against the current machine to see what
parameters affect inserts.. perhaps also doing dome tests with just inserts
from a file. Then using the same setups on the next machine and just go with
RAID10 on the 6 disks. Split the raid into 10 may give me space issues to
deal with.

Will also find out if the app, Bacula, batches transactions or not.

Re: Inserts optimization?

От
Michael Stone
Дата:
>On Thu, Apr 13, 2006 at 02:59:23PM -0400, Francisco Reyes wrote:
>>In RAID 10 would it matter that WALL is in the same RAID set?
>>Would it be better:
>>4 disks in RAID10 Data
>>2 disks RAID 1 WALL
>>2 hot spares

I guess the first question is why 2 hot spares? You don't have many
spindles, so you don't want to waste them. It might turn out that a
larger array with more spindles with outperform a smaller one with
fewer, regardless of RAID level (assuming a decent battery-backed
cache). You might try
5 RAID5
2 RAID1
1 spare

Mike Stone

Re: Inserts optimization?

От
Marc Cousin
Дата:
I hope I'm not going to say stupid things, but here's what i know (or i think
i know :) ) about bacula + postgresql

If I remember correctly (I allready discussed this with Kern Sibbald a while
ago), bacula does each insert in its own transaction : that's how the program
is done, and of course it works ok with mysql and MyIsam tables, as mysql
doesn't have transactions with myisam...

So, you'll probably end up being slowed down by WAL fsyncs ... and you won't
have a lot of solutions. Maybe you should start with trying to set fsync=no
as a test to confirm that (you should have a lot of iowaits right now if you
haven't disabled fsync).

For now, I only could get good performance with bacula and postgresql when
disabling fsync...


On Thursday 13 April 2006 20:45, Francisco Reyes wrote:
> Chris writes:
> > If you can, use copy instead:
> > http://www.postgresql.org/docs/8.1/interactive/sql-copy.html
>
> I am familiar with copy.
> Can't use it in this scenario.
>
> The data is coming from a program called Bacula (Backup server).
> It is not static data.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster

Re: Inserts optimization?

От
Francisco Reyes
Дата:
Marc Cousin writes:

> If I remember correctly (I allready discussed this with Kern Sibbald a while
> ago), bacula does each insert in its own transaction : that's how the program
> is done

Thanks for the info.

> For now, I only could get good performance with bacula and postgresql when
> disabling fsync...


Isn't that less safe?
I think I am going to try increasing wal_buffers
Specially was reading http://www.powerpostgresql.com. Towards the middle it
mentions improvements when increasing wal_buffers. So far performance,
against a single client was fair.

How did you test your bacula setup with postgresql for time?
Doing full backups with lots of files?


Also planning to check commit_delay and see if that helps.
I will try to avoid 2 or more machines backing up at the same time.. plus in
a couple of weeks I should have a better machine for the DB anyways..


The description of commit_delay sure sounds very promissing:

---
commit_delay (integer)

Time delay between writing a commit record to the WAL buffer and flushing
the buffer out to disk, in microseconds. A nonzero delay can allow multiple
transactions to be committed with only one fsync() system call, if system
load is high enough that additional transactions become ready to commit
within the given interval. But the delay is just wasted if no other
transactions become ready to commit. Therefore, the delay is only performed
if at least commit_siblings other transactions are active at the instant
that a server process has written its commit record. The default is zero (no
delay).
---

I only wonder what is safer.. using a second or two in commit_delay or using
fsync = off.. Anyone cares to comment?

I plan to re-read carefully the WALL docs on the site.. so I can better
decide.. however any field expierences would be much welcome.

Marc are you on the Bacula list? I plan to join it later today and will
bring up the PostgreSQL issue there.

Re: Inserts optimization?

От
Tom Lane
Дата:
Francisco Reyes <lists@stringsutils.com> writes:
> I think I am going to try increasing wal_buffers

That will help not at all, if the problem is too-short transactions
as it sounds to be.  You really need to pester the authors of bacula
to try to wrap multiple inserts per transaction.  Or maybe find some
other software that can do that, if they are uninterested in supporting
anything but mysql.

            regards, tom lane

Re: Inserts optimization?

От
Francisco Reyes
Дата:
Tom Lane writes:

> That will help not at all, if the problem is too-short transactions
> as it sounds to be.


How about commit_delay?

>  You really need to pester the authors of bacula
> to try to wrap multiple inserts per transaction.

Like any volunteer project I am sure it's more an issue of resources than an
issue of interest.


Re: Inserts optimization?

От
Francisco Reyes
Дата:
Michael Stone writes:

> I guess the first question is why 2 hot spares?


Because we are using RAID 10

> larger array with more spindles with outperform a smaller one with
> fewer, regardless of RAID level (assuming a decent battery-backed
> cache).

Based on what I have read RAID 10 is supposed to be better with lots of
random access.


> 5 RAID5
> 2 RAID1
> 1 spare

That is certainly something worth considering... Still I wonder if 2 more
spindles will help enough to justify going to RAID 5. My understanding is
that RAID10 has simpler computations requirements which is partly what makes
it better for lots of random read/write.

Re: Inserts optimization?

От
Michael Stone
Дата:
On Fri, Apr 14, 2006 at 02:01:56PM -0400, Francisco Reyes wrote:
>Michael Stone writes:
>>I guess the first question is why 2 hot spares?
>
>Because we are using RAID 10

I still don't follow that. Why would the RAID level matter? IOW, are you
actually wanting 2 spares, or are you just stick with that because you
need a factor of two disks for your mirrors?

>>larger array with more spindles with outperform a smaller one with
>>fewer, regardless of RAID level (assuming a decent battery-backed
>>cache).
>
>Based on what I have read RAID 10 is supposed to be better with lots of
>random access.

Mmm, it's a bit more complicated than that. RAID 10 can be better if you
have lots of random writes (though a large RAID cache can mitigate
that). For small random reads the limiting factor is how fast you can
seek, and that number is based more on the number of disks than the RAID
level.

>>5 RAID5
>>2 RAID1
>>1 spare
>
>That is certainly something worth considering... Still I wonder if 2 more
>spindles will help enough to justify going to RAID 5. My understanding is
>that RAID10 has simpler computations requirements which is partly what
>makes it better for lots of random read/write.

If your RAID hardware notices a difference between the parity
calculations for RAID 5 and the mirroring of RAID 1 it's a fairly lousy
unit for 2006--those calculations are really trivial for modern
hardware. The reason that RAID 10 can give better random small block
write performance is that fewer disks need to be involved per write.
That's something that can be mitigated with a large cache to aggregate
the writes, but some controllers are much better than others in that
regard. This is really a case where you have to test with your
particular hardware & data, because the data access patterns are
critical in determining what kind of performance is required.

Mike Stone

Re: Inserts optimization?

От
Francisco Reyes
Дата:
Michael Stone writes:

> I still don't follow that. Why would the RAID level matter? IOW, are you
> actually wanting 2 spares, or are you just stick with that because you
> need a factor of two disks for your mirrors?

RAID 10 needs pairs.. so we can either have no spares or 2 spares.

> Mmm, it's a bit more complicated than that. RAID 10 can be better if you
> have lots of random writes (though a large RAID cache can mitigate
> that).

We are using a 3ware 9550SX with 128MB RAM (at least I believe that is what
that card has installed).

>For small random reads the limiting factor is how
>fast you can  seek, and that number is based more on the number of disks than the RAID
> level.

I don't have any solid stats, but I would guess the machines will fairly
close split between reads and writes.


> hardware. The reason that RAID 10 can give better random small block
> write performance is that fewer disks need to be involved per write.


That makes sense.

> That's something that can be mitigated with a large cache

128MB enough in your opinion?


> the writes, but some controllers are much better than others in that
> regard.

The controller we are using is 3Ware 9550SX.

> This is really a case where you have to test with your
> particular hardware & data


That is obviously the ideal way to go, but it is very time consuming. :-(
To setup a machine with one set of raid setup.. test, then re-do with
different set of raid.. re test.. that's anywhere from 1 to 2 days worth of
testing. Unlikely I will be given that time to test.

Re: Inserts optimization?

От
Scott Marlowe
Дата:
On Fri, 2006-04-14 at 15:09, Francisco Reyes wrote:
> Michael Stone writes:
>
> > I still don't follow that. Why would the RAID level matter? IOW, are you
> > actually wanting 2 spares, or are you just stick with that because you
> > need a factor of two disks for your mirrors?
>
> RAID 10 needs pairs.. so we can either have no spares or 2 spares.

Spares are placed in service one at a time.  You don't need 2 spares for
RAID 10, trust me.


Re: Inserts optimization?

От
Francisco Reyes
Дата:
Scott Marlowe writes:

> Spares are placed in service one at a time.

Ah.. that's your point. I know that. :-)

>  You don't need 2 spares for
> RAID 10, trust me.

We bought the machine with 8 drives. At one point we were considering RAID
5, then we decided to give RAID 10 a try. We have a simmilar machine with
raid 5 and less memory (2GB, vs 4GB on the new machine)  and the old machine
was having some very serious issues handling the load.

So far the RAID10 machine (twice the memory, newer disks, faster CPUs,..
the previous machine was about 1 year old) has been performing very well.

We are now looking to put our 3rd NFS machine into production with
identical specs as the machine we currently use with RAID10. This 3rd
machine with do NFS work (mail store) and also will be our database server
(until we can afford to buy a 4th.. dedicated DB machine).

The whole reason I started the thread was because most PostgreSQL setups I
have done in the past were mostly read.. whereas now the Bacula backups plus
another app we will be developing.. will be doing considerable writes to the
DB.

Re: Inserts optimization?

От
Gábriel Ákos
Дата:
Francisco Reyes wrote:

> That is certainly something worth considering... Still I wonder if 2
> more spindles will help enough to justify going to RAID 5. My
> understanding is that RAID10 has simpler computations requirements which
> is partly what makes it better for lots of random read/write.

you are right. raid5 is definitely not suitable for database activities.
it is good for file servers, though.

--
Üdvözlettel,
Gábriel Ákos
-=E-Mail :akos.gabriel@i-logic.hu|Web:  http://www.i-logic.hu=-
-=Tel/fax:+3612367353            |Mobil:+36209278894            =-

Re: Inserts optimization?

От
Gábriel Ákos
Дата:
Francisco Reyes wrote:
> Michael Stone writes:
>
>> I still don't follow that. Why would the RAID level matter? IOW, are
>> you actually wanting 2 spares, or are you just stick with that because
>> you need a factor of two disks for your mirrors?
>
> RAID 10 needs pairs.. so we can either have no spares or 2 spares.

hm, interesting. I have recently set up a HP machine with smartarray 6i
controller, and it is able to handle 4 disks in raid10 plus 1 as spare.
with some scripting you can even use linux software raid in the same setup.

--
Üdvözlettel,
Gábriel Ákos
-=E-Mail :akos.gabriel@i-logic.hu|Web:  http://www.i-logic.hu=-
-=Tel/fax:+3612367353            |Mobil:+36209278894            =-

Re: Inserts optimization?

От
Francisco Reyes
Дата:
Gábriel Ákos writes:

>> RAID 10 needs pairs.. so we can either have no spares or 2 spares.
>
> hm, interesting. I have recently set up a HP machine with smartarray 6i
> controller, and it is able to handle 4 disks in raid10 plus 1 as spare.

:-)
Ok so let me be a bit more clear...

We have 6 disks in RAID10.
We can have the following with the remaining 2 disks
RAID1
1 Spare, 1 for storage
2 spares

Having at least 1 spare is a must.. which means that we can use the
remaining disk as spare or as data... using it to store ANY data means the
data will not be protected by the RAID. I can not think of almost any data
we would care so little that we would put it in the single disk.

For a second I thought logs.. but even that is not good, because the
programs that write the logs may fail.. so even if we don't care about
loosing the logs, in theory it may cause problems to the operation of the
system.

That's how we ended up with 2 hot spares.

Re: Inserts optimization?

От
Francisco Reyes
Дата:
Gábriel Ákos writes:

> you are right. raid5 is definitely not suitable for database activities.

That is not entirely true. :-)
Right now the new server is not ready and the ONLY place I could put the DB
for Bacula was a machine with RAID 5. So far it is holding fine. HOWEVER...
only one bacula job at a time so far and the machine doesn't do anything
else. :-)


> it is good for file servers, though.

Even for a DB server, from what I have read and what I have experienced so
far for mostly read DBs RAID5 should be ok. Specially if you have enough
memory.

As I add more clients to Bacula and the jobs bump into each other I will
better know how the DB holds up. I am doing each FULL backup at a time so
hopefully by the time I do multiple backups from multiple machines less
records will need to be inserted to the DB and there will be a more balanced
operation between reads and writes. Right now there are lots of inserts
going on.

Re: Inserts optimization?

От
Francisco Reyes
Дата:
Tom Lane writes:

> Also, increasing checkpoint_segments and possibly wal_buffers helps a
> lot for write-intensive loads.

Following up on those two recomendations from Tom.
Tom mentioned in a different message that if the inserst are small that
increasing wal_buffers would not help.

How about checkpoint_segments?
Also commit_delays seems like may be helpfull. Is it too much a risk to set
commit_delays to 1 second? If the machine has any problems while having a
higher commit_delay will the end result simply be that a larger amount of
transactions will be rolled back?

p.s. did not CC Tom because he uses an RBL which is rather "selective" about
what it lets through (one of the worst in my opinion).

Re: Inserts optimization?

От
Markus Schaber
Дата:
Hi, Francisco,

Francisco Reyes wrote:

> I only wonder what is safer.. using a second or two in commit_delay or
> using fsync = off.. Anyone cares to comment?

It might be that you misunderstood commit_delay. It will not only delay
the disk write, but also block your connnection until the write actually
is performed.

It will rise the throughput in multi-client scenarios, but will also
rise the latency, and it will absolutely bring no speedup in
single-client scenarios.

It does not decrease safety (in opposite to fsync=off), data will be
consistent, and any application that has successfully finished a commit
can be shure their data is on the platters.[1]

HTH,
Markus

[1] As long as the platters don't lie, but that's another subject.

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org

Re: Inserts optimization?

От
"Magnus Hagander"
Дата:
> > For now, I only could get good performance with bacula and
> postgresql
> > when disabling fsync...
>
>
> Isn't that less safe?

Most definitly.

FWIW, I'm getting pretty good speeds with Bacula and PostgreSQL on a
reasonably small db (file table about 40 million rows, filename about
5.2 million and path 1.5 million).

Config changes are increasing shared mem and work mems, fsm pages,
wal_sync_method=fdatasync, wal_buffers=16, checkpoint_segments=8,
default_with_oids=off (before creating the bacula tables, so they don't
use oids).

Used to run with full_pages_writes=off, but not anymore since it's not
safe.


> Also planning to check commit_delay and see if that helps.
> I will try to avoid 2 or more machines backing up at the same
> time.. plus in a couple of weeks I should have a better
> machine for the DB anyways..

Bacula already serializes access to the database (they have to support
mysql/myisam), so this shouldn't help. Actually, it might well hurt by
introducing extra delays.

> I only wonder what is safer.. using a second or two in
> commit_delay or using
> fsync = off.. Anyone cares to comment?

Absolutely a commit_delay.


//Magnus

Re: Inserts optimization?

От
"Jim C. Nasby"
Дата:
On Fri, Apr 14, 2006 at 03:15:33PM -0500, Scott Marlowe wrote:
> On Fri, 2006-04-14 at 15:09, Francisco Reyes wrote:
> > Michael Stone writes:
> >
> > > I still don't follow that. Why would the RAID level matter? IOW, are you
> > > actually wanting 2 spares, or are you just stick with that because you
> > > need a factor of two disks for your mirrors?
> >
> > RAID 10 needs pairs.. so we can either have no spares or 2 spares.
>
> Spares are placed in service one at a time.  You don't need 2 spares for
> RAID 10, trust me.

Sadly, 3ware doesn't produce any controllers with the ability to do an
odd number of channels, so you end up burning through 2 slots to get a
hot spare (unless you spend substantially more money and go with the
next model up).
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Inserts optimization?

От
"Jim C. Nasby"
Дата:
On Tue, Apr 18, 2006 at 01:56:44PM +0200, Magnus Hagander wrote:
> Bacula already serializes access to the database (they have to support
> mysql/myisam), so this shouldn't help. Actually, it might well hurt by
> introducing extra delays.

You have any contact with the developers? Maybe they're a possibility
for our summer of code...
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Inserts optimization?

От
Markus Schaber
Дата:
Hi, Magnus,

Magnus Hagander wrote:

> Bacula already serializes access to the database (they have to support
> mysql/myisam), so this shouldn't help.

Ouch, that hurts.

To support mysql, they break performance for _every other_ database system?

<cynism>
Now, I understand how the mysql people manage to spread the legend of
mysql being fast. They convince software developers to thwart all others.
</>

Seriously: How can we convince developers to either fix MySQL or abandon
and replace it with a database, instead of crippling client software?

> Actually, it might well hurt by introducing extra delays.

Well, if you read the documentation, you will see that it will only wait
if there are at least commit_siblings other transactions active. So when
Bacula serializes access, there will be no delays, as there is only a
single transaction alive.


HTH
Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org

Re: Inserts optimization?

От
"Magnus Hagander"
Дата:
> > Bacula already serializes access to the database (they have
> to support
> > mysql/myisam), so this shouldn't help.
>
> Ouch, that hurts.
>
> To support mysql, they break performance for _every other_
> database system?

Actually, it probably helps on SQLite as well. And considering they only
support postgresql, mysql and sqlite, there is some merit to it from
their perspective.

You can find a thread about it in the bacula archives from a month or
two back.

> <cynism>
> Now, I understand how the mysql people manage to spread the
> legend of mysql being fast. They convince software developers
> to thwart all others.
> </>

Yes, same as the fact that most (at least FOSS) web project-du-jour are
"dumbed down" to the mysql featureset. (And not just mysql, but
mysql-lowest-common-factors, which means myisam etc)


> > Actually, it might well hurt by introducing extra delays.
>
> Well, if you read the documentation, you will see that it
> will only wait if there are at least commit_siblings other
> transactions active. So when Bacula serializes access, there
> will be no delays, as there is only a single transaction alive.

Hm. Right. Well, it still won't help :-)

//Magnus

Re: Inserts optimization?

От
Markus Schaber
Дата:
Hi, Magnus,

Magnus Hagander wrote:

>>To support mysql, they break performance for _every other_
>>database system?
> Actually, it probably helps on SQLite as well.

AFAICS from the FAQ http://www.sqlite.org/faq.html#q7 and #q8, SQLite
does serialize itsself.

> And considering they only
> support postgresql, mysql and sqlite, there is some merit to it from
> their perspective.

Okay, I understand, but I hesitate to endorse it.

IMHO, they should write their application in a "normal" way, and then
have the serialization etc. encapsulated in the database driver
interface (possibly a wrapper class or so).

>><cynism>
>>Now, I understand how the mysql people manage to spread the
>>legend of mysql being fast. They convince software developers
>>to thwart all others.
>></>
> Yes, same as the fact that most (at least FOSS) web project-du-jour are
> "dumbed down" to the mysql featureset. (And not just mysql, but
> mysql-lowest-common-factors, which means myisam etc)

Well, most of those projects don't need a database, they need a bunch of
tables and a lock.

Heck, they even use client-side SELECT-loops in PHP instead of a JOIN
because "I always confuse left and right".


Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org

Re: Inserts optimization?

От
Tom Lane
Дата:
"Magnus Hagander" <mha@sollentuna.net> writes:
>>> Actually, [commit_delay] might well hurt by introducing extra delays.
>>
>> Well, if you read the documentation, you will see that it
>> will only wait if there are at least commit_siblings other
>> transactions active. So when Bacula serializes access, there
>> will be no delays, as there is only a single transaction alive.

> Hm. Right. Well, it still won't help :-)

It could actually hurt, because nonzero time is required to go look
whether there are any other active transactions.  I'm not sure whether
this overhead is enough to be measurable when there's only one backend
running, but it might be.

            regards, tom lane

Re: Inserts optimization?

От
Scott Marlowe
Дата:
On Wed, 2006-04-19 at 07:08, Markus Schaber wrote:
> Hi, Magnus,
>
> Magnus Hagander wrote:
>
> > Bacula already serializes access to the database (they have to support
> > mysql/myisam), so this shouldn't help.
>
> Ouch, that hurts.
>
> To support mysql, they break performance for _every other_ database system?

Note that should be "to support MySQL with MyISAM tables".

If they had written it for MySQL with innodb tables they would likely be
able to use the same basic methods for performance tuning MySQL as or
Oracle or PostgreSQL.

It's the refusal of people to stop using MyISAM table types that's the
real issue.

Of course, given the shakey ground MySQL is now on with Oracle owning
innodb...

Re: Inserts optimization?

От
Tom Lane
Дата:
Scott Marlowe <smarlowe@g2switchworks.com> writes:
> It's the refusal of people to stop using MyISAM table types that's the
> real issue.

Isn't MyISAM still the default over there?  It's hardly likely that the
average MySQL user would use anything but the default table type ...

            regards, tom lane

Re: Inserts optimization?

От
PFC
Дата:
> Isn't MyISAM still the default over there?

    Yes, it's the default.
    Personnally I compile MySQL without InnoDB... and for any new development
I use postgres.

> It's hardly likely that the average MySQL user would use anything but
> the default table type ...

    Double yes ; also many hosts provide MySQL 4.0 or even 3.x, both of which
have no subquery support and are really brain-dead ; and most OSS PHP apps
have to be compatible... argh.

Re: Inserts optimization?

От
Scott Marlowe
Дата:
On Wed, 2006-04-19 at 10:31, Tom Lane wrote:
> Scott Marlowe <smarlowe@g2switchworks.com> writes:
> > It's the refusal of people to stop using MyISAM table types that's the
> > real issue.
>
> Isn't MyISAM still the default over there?  It's hardly likely that the
> average MySQL user would use anything but the default table type ...

Sure.  But bacula supplies its own setup scripts.  and it's not that
hard to make them a requirement for the appication.  Most versions of
MySQL that come with fedora core and other distros nowadays support
innodb.



Re: Inserts optimization?

От
Christopher Kings-Lynne
Дата:
> Scott Marlowe <smarlowe@g2switchworks.com> writes:
>> It's the refusal of people to stop using MyISAM table types that's the
>> real issue.
>
> Isn't MyISAM still the default over there?  It's hardly likely that the
> average MySQL user would use anything but the default table type ...

Since MySQL 5, InnoDB tables are default I recall.


Re: Inserts optimization?

От
Mark Kirkwood
Дата:
Christopher Kings-Lynne wrote:
>> Scott Marlowe <smarlowe@g2switchworks.com> writes:
>>> It's the refusal of people to stop using MyISAM table types that's the
>>> real issue.
>>
>> Isn't MyISAM still the default over there?  It's hardly likely that the
>> average MySQL user would use anything but the default table type ...
>
> Since MySQL 5, InnoDB tables are default I recall.

Might be for the binaries from www.mysql.com - but if you build from
source, it still seems to be MYISAM (checked with 5.0.18 and 5.1.7 here).

Cheers

Mark

Re: Inserts optimization?

От
Scott Marlowe
Дата:
On Wed, 2006-04-19 at 20:07, Christopher Kings-Lynne wrote:
> > Scott Marlowe <smarlowe@g2switchworks.com> writes:
> >> It's the refusal of people to stop using MyISAM table types that's the
> >> real issue.
> >
> > Isn't MyISAM still the default over there?  It's hardly likely that the
> > average MySQL user would use anything but the default table type ...
>
> Since MySQL 5, InnoDB tables are default I recall.

It gets built by default, but when you do a plain create table, it will
still default to myisam tables.

Note that there is a setting somewhere in my.cnf that will make the
default table type anything you want.

For Bacula though, what I was suggesting was that they simply declare
that you need innodb table type support if you want decent performance,
then coding to that, and if someone doesn't have innodb table support,
then they have no right to complain about poor performance.  Seems a
fair compromise to me.  The Bacula folks would get to program to a real
database model with proper serlialization and all that, and the people
who refuse to move up to a later model MySQL get crappy performance.

Re: Inserts optimization?

От
Vivek Khera
Дата:
On Apr 13, 2006, at 2:59 PM, Francisco Reyes wrote:

> This particular server is pretty much what I inherited for now for
> this project.and its Raid 5. There is a new server I am setting up
> soon... 8 disks which we are planning to setup
> 6 disks in RAID 10
> 2 Hot spares
>
> In RAID 10 would it matter that WALL is in the same RAID set?
> Would it be better:
> 4 disks in RAID10 Data
> 2 disks RAID 1 WALL
> 2 hot spares

why do you need two hot spares?

I'd go with 6 disk RAID10 for data
2 disk RAID1 for WAL (and OS if you don't have other disks from which
to boot)

and run nothing else but Postgres on that box.

bump up checkpoint_segments to some huge  number like 256 and use the
bg writer process.

if a disk fails, just replace it quickly with a cold spare.

and if your RAID controller has two channels, pair the mirrors across
channels.


Re: Inserts optimization?

От
Vivek Khera
Дата:
On Apr 14, 2006, at 8:00 AM, Marc Cousin wrote:

> So, you'll probably end up being slowed down by WAL fsyncs ... and
> you won't
> have a lot of solutions. Maybe you should start with trying to set
> fsync=no
> as a test to confirm that (you should have a lot of iowaits right
> now if you
> haven't disabled fsync).

Instead of doing that, why not use commit_delay to some nominal value
to try and group the fsyncs.  If they're coming in at 30 per second,
this should help a bit, I suspect.