Обсуждение: Increasing number of PG connections.
I am running a Dual Xeon hyperthreaded server with 4GB RAM RAID-5. The only thing running on the server is Postgres running under Fedora. I have a 700 connection limit. The DB is setup as a backend for a very high volume website. Most of the queries are simple, such as logging accesses, user login verification etc. There are a few bigger things suchas reporting etc but for the most part each transaction lasts less then a second. The connections are not persistant (I'm using pg_connect in PHP) The system was at 2 GB with a 400 connection limit. We ran into problems because we hit the limit of connections during high volume. 1. Does 400 connections sound consistant with the 2GB of RAM? Does 700 sound good with 4 GB. I've read a little on optimizing postgres. Is there anything else I can do maybe OS wise to increase how many connections I get before I start swapping? 2. Are there any clustering technologies that will work with postgres? Specifically I'm looking at increasing the number of connections. The bottom line is since the website launched (middle of January) we have increased the number of http connections, and increased bandwidth allowances by over 10 times. The site continues to grow and we are looking at our options. Some of the ideas have been possible DB replication. Write to master and read from multiple slaves. Other ideas including increasing hardware. This is the biggest site I have ever worked with. Almost everything else fits in a T1 with a single DB server handling multiple sites. Does anybody with experence in this realm have any suggestions? Thank you in advance for whatever help you can provide. -- Kevin Barnard
I am new here. I have a question related to this in some way. Our web site needs to upload a large volume of data into Postgres at a time. The performance deterioates as number of rows becomes larger. When it reaches 2500 rows, it never come back to GUI. Since the tests were run through GUI, my suspision is that it might be caused by the way the application server talking to Postgres server, the connections, etc.. What might be the factors involved here? Does anyone know? Thanks a lot! Qing On Feb 2, 2004, at 11:14 AM, Kevin Barnard wrote: > I am running a Dual Xeon hyperthreaded server with 4GB RAM RAID-5. > The only > thing running on the server is Postgres running under Fedora. I have > a 700 > connection limit. > > The DB is setup as a backend for a very high volume website. Most of > the queries > are simple, such as logging accesses, user login verification etc. > There are a few > bigger things suchas reporting etc but for the most part each > transaction lasts less > then a second. The connections are not persistant (I'm using > pg_connect in PHP) > > The system was at 2 GB with a 400 connection limit. We ran into > problems because > we hit the limit of connections during high volume. > > 1. Does 400 connections sound consistant with the 2GB of RAM? Does > 700 sound > good with 4 GB. I've read a little on optimizing postgres. Is there > anything else I can > do maybe OS wise to increase how many connections I get before I start > swapping? > > 2. Are there any clustering technologies that will work with > postgres? Specifically I'm > looking at increasing the number of connections. > > The bottom line is since the website launched (middle of January) we > have increased > the number of http connections, and increased bandwidth allowances by > over 10 > times. The site continues to grow and we are looking at our options. > Some of the > ideas have been possible DB replication. Write to master and read > from multiple > slaves. Other ideas including increasing hardware. > > This is the biggest site I have ever worked with. Almost everything > else fits in a T1 > with a single DB server handling multiple sites. Does anybody with > experence in this > realm have any suggestions? > > Thank you in advance for whatever help you can provide. > -- > Kevin Barnard > > > > ---------------------------(end of > broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster >
inserting large number of rows was: Re: Increasing number of PG connections.
От
 
		    	"scott.marlowe"
		    Дата:
		        On Mon, 2 Feb 2004, Qing Zhao wrote: > I am new here. I have a question related to this in some way. > > Our web site needs to upload a large volume of data into Postgres at a > time. The performance deterioates as number of rows becomes larger. > When it reaches 2500 rows, it never come back to GUI. Since the tests > were run through GUI, my suspision is > that it might be caused by the way the application server talking to > Postgres server, the connections, etc.. What might be the factors > involved here? Does anyone know? Actually, I'm gonna go out on a limb here and assume two things: 1. you've got lotsa fk/pk relationships setup. 2. you're analyzing the table empty before loading it up. What happens in this instance is that the analyze on an empty, or nearly so, table, means that during the inserts, postgresql thinks you have only a few rows. At first, this is fine, as pgsql will seq scan the tables to make sure there is a proper key in both. As the number of rows increases, the planner needs to switch to index scans but doesn't, because it doesn't know that the number of rows is increasing. Fix: insert a few hundred rows, run analyze, check to see if the explain for inserts is showing index scans or not. If not, load a few more hundred rows, analyze, rinse, repeat. Also, look for fk/pk mismatches. I.e. an int4 field pointing to an int8 field. That's a performance killer, so if the pk/fk types don't match, see if you can change your field types to match and try again.
On Monday 02 February 2004 19:39, Qing Zhao wrote: > I am new here. I have a question related to this in some way. Hmm - no real connection I can see - might have been better to start a new thread rather than replying to this one. Also, it is usually considered best practice not to quote large amounts of the previous message if you're not replying to it, > Our web site needs to upload a large volume of data into Postgres at a > time. The performance deterioates as number of rows becomes larger. > When it reaches 2500 rows, it never come back to GUI. Since the tests > were run through GUI, my suspision is > that it might be caused by the way the application server talking to > Postgres server, the connections, etc.. What might be the factors > involved here? Does anyone know? You don't really give us enough information. What GUI are you talking about? How are you loading this data - as a series of INSERT statements, text-file with separators, from Access/MySQL etc? In general, the fastest way to add a large number of rows is via the COPY sql command. Next best is to batch your inserts together into larger transactions of say 100-1000 inserts. Two other things to be aware of are: use of VACUUM/ANALYZE and configuration tuning (see http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php). PG shouldn't have a problem with inserting a few thousand rows, so I suspect it's something to do with your application/GUI setup. Hope that helps, if not try turning on statement logging for PG and then we can see what commands your GUI is sending. -- Richard Huxton Archonet Ltd
I must have missed this post when it was made earlier. Pardon the noise if my suggestion has already been made. Unlike MySQL (and possibly other database servers) PostgreSQL is faster when inserting inside a transaction. Depending on the method in which you are actually adding the records. In my own experience (generating a list of INSERT statements from a perl script and using psql to execute them) the difference in performance was incredibly dramatic when I added a "BEGIN WORK" at the beginning and "COMMIT WORK" at the end. scott.marlowe wrote: > On Mon, 2 Feb 2004, Qing Zhao wrote: > > >>I am new here. I have a question related to this in some way. >> >>Our web site needs to upload a large volume of data into Postgres at a >>time. The performance deterioates as number of rows becomes larger. >>When it reaches 2500 rows, it never come back to GUI. Since the tests >>were run through GUI, my suspision is >>that it might be caused by the way the application server talking to >>Postgres server, the connections, etc.. What might be the factors >>involved here? Does anyone know? > > > Actually, I'm gonna go out on a limb here and assume two things: > > 1. you've got lotsa fk/pk relationships setup. > 2. you're analyzing the table empty before loading it up. > > What happens in this instance is that the analyze on an empty, or nearly > so, table, means that during the inserts, postgresql thinks you have only > a few rows. At first, this is fine, as pgsql will seq scan the > tables to make sure there is a proper key in both. As the number of > rows increases, the planner needs to switch to index scans but doesn't, > because it doesn't know that the number of rows is increasing. > > Fix: insert a few hundred rows, run analyze, check to see if the explain > for inserts is showing index scans or not. If not, load a few more > hundred rows, analyze, rinse, repeat. > > Also, look for fk/pk mismatches. I.e. an int4 field pointing to an int8 > field. That's a performance killer, so if the pk/fk types don't match, > see if you can change your field types to match and try again. > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) > -- Bill Moran Potential Technologies http://www.potentialtech.com
On Mon, 2 Feb 2004, Kevin Barnard wrote: > I am running a Dual Xeon hyperthreaded server with 4GB RAM RAID-5. The only > thing running on the server is Postgres running under Fedora. I have a 700 > connection limit. > > The DB is setup as a backend for a very high volume website. Most of the queries > are simple, such as logging accesses, user login verification etc. There are a few > bigger things suchas reporting etc but for the most part each transaction lasts less > then a second. The connections are not persistant (I'm using pg_connect in PHP) > > The system was at 2 GB with a 400 connection limit. We ran into problems because > we hit the limit of connections during high volume. what do you mean at 2 GB? Is that how much is in kernel cache plus buffer, plus used, plus etc??? Could you give us the top of top output to make sure? If most of that is kernel cache, then that's fine. My experience has been that individual postgresql backends only weigh in at a mega byte at most, and they share buffer, so 700 connections can be anywhere from 300meg to 1 gig. the rest would be buffer memory. It's not a good idea to give up too much to shared buffers, as the database isn't as good at caching as the kernel. What do you have in postgresql.conf? sort_mem, shared_buffers, etc??? sort_mem can be a real killer if it lets the processes chew up too much memory. Once sort_mem gets high enough to make the machine start swapping it is doing more harm than good being that high, and should usually be lowered a fair bit. How many disks in your RAID5? The more the better. Is it hardware with battery backed cache? If you write much to it it will help to have battery backed cache on board. If it's a megaraid / LSI board, get the megaraid2 driver, it's supposedly much faster. You may find it hard to get postgresql to use any more memory than you have, as 32 bit apps can only address 2 gigs anyway, but the extra can certainly be used by the kernel as cache, which will help.
On 2 Feb 2004 at 13:58, scott.marlowe wrote:
> what do you mean at 2 GB?  Is that how much is in kernel cache plus
> buffer, plus used, plus etc???  Could you give us the top of top output to
> make sure?  If most of that is kernel cache, then that's fine.
2GB was total system memory.  We upgraded to 4GB to prior to increasing the
number of connections.
Here's the top of top
 16:14:17  up 2 days, 16:15,  1 user,  load average: 7.60, 6.56, 4.61
730 processes: 721 sleeping, 9 running, 0 zombie, 0 stopped
CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
           total    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
           cpu00    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
           cpu01    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
           cpu02    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
           cpu03    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
Mem:  3747644k av, 3298344k used,  449300k free,       0k shrd,  147880k buff
      2158532k active,             760040k inactive
Swap: 1048088k av,       0k used, 1048088k free                 2262156k cached
The DB is pretty close to max connections at this point in time.  I don't know why
CPU shows 0% in every bucket.  It looks like I can increase the number of
connections a little from here.  This is a fairly standard Fedora install.  It's using
version 2.4.22 of the Kernel.  Postgres is a complied version using 7.4.1
> experience has been that individual postgresql backends only weigh in at a
> mega byte at most, and they share buffer, so 700 connections can be
> anywhere from 300meg to 1 gig.  the rest would be buffer memory.  It's not
> a good idea to give up too much to shared buffers, as the database isn't
> as good at caching as the kernel.
OK I take this as I should keep shared buffers around 2x connections then correct?
>
> What do you have in postgresql.conf?  sort_mem, shared_buffers, etc???
Here is what I have that is not set from the defaults.
max_connections = 700
shared_buffers = 1500
sort_mem = 512
random_page_cost = 2
stats_start_collector = true
stats_command_string = true
stats_block_level = true
stats_row_level = true
> sort_mem can be a real killer if it lets the processes chew up too much
> memory.  Once sort_mem gets high enough to make the machine start swapping
> it is doing more harm than good being that high, and should usually be
> lowered a fair bit.
I dropped it down to 512 as you can see.  Should I be running with all of the stats on?
I am no longer using pg_autovacuum.  I seem to be getting better results with an
hourly Vacuum anaylse.
> How many disks in your RAID5?  The more the better.  Is it hardware with
> battery backed cache?  If you write much to it it will help to have
> battery backed cache on board.  If it's a megaraid / LSI board, get the
> megaraid2 driver, it's supposedly much faster.
4 disk IBM ServeRAID 5i with battery backed cache.
> You may find it hard to get postgresql to use any more memory than you
> have, as 32 bit apps can only address 2 gigs anyway, but the extra can
> certainly be used by the kernel as cache, which will help.
Isn't that only true for each indivdual process space.  Shouldn't each process have
access at most 2GB.  If each backend is in it's own process space is this really a limit
since all of my queries are pretty small.
I have been monitoring the system has it gets up to load.  For most of the time the
sytem sits around 100-300 connections.  Once it ramps up it ramps up hard.  Top
starts cycling at 0 and 133% CPU for irq, softirq and iowait.  The system stays at 700
connections until users give up.  I can watch bandwidth utilization drop to almost
nothing right before the DB catches up.
--
Kevin Barnard
Speed Fulfillment and Call Center
kbarnard@speedfc.com
214-258-0120
			
		On Mon, 2 Feb 2004, Kevin Barnard wrote: > On 2 Feb 2004 at 13:58, scott.marlowe wrote: > > > what do you mean at 2 GB? Is that how much is in kernel cache plus > > buffer, plus used, plus etc??? Could you give us the top of top output to > > make sure? If most of that is kernel cache, then that's fine. > > 2GB was total system memory. We upgraded to 4GB to prior to increasing the > number of connections. Oh, ok. I thought you meant the system was using 2 gigs of RAM for postgresql > Here's the top of top > > 16:14:17 up 2 days, 16:15, 1 user, load average: 7.60, 6.56, 4.61 > 730 processes: 721 sleeping, 9 running, 0 zombie, 0 stopped > CPU states: cpu user nice system irq softirq iowait idle > total 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% > cpu00 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% > cpu01 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% > cpu02 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% > cpu03 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% > Mem: 3747644k av, 3298344k used, 449300k free, 0k shrd, 147880k buff > 2158532k active, 760040k inactive > Swap: 1048088k av, 0k used, 1048088k free 2262156k cached when you have a high load but load CPU usage, you are usually I/O bound. > The DB is pretty close to max connections at this point in time. I don't know why > CPU shows 0% in every bucket. It looks like I can increase the number of > connections a little from here. This is a fairly standard Fedora install. It's using > version 2.4.22 of the Kernel. Postgres is a complied version using 7.4.1 On this machine you could probably handle even more. What I want is to get your page return times down enough so you don't need to increase the number of connections. I.e. if you've got 2 second response times and you drop those to 0.2 seconds, then you won't need as many processes to handle the load (theoretically... :-) > > experience has been that individual postgresql backends only weigh in at a > > mega byte at most, and they share buffer, so 700 connections can be > > anywhere from 300meg to 1 gig. the rest would be buffer memory. It's not > > a good idea to give up too much to shared buffers, as the database isn't > > as good at caching as the kernel. > > OK I take this as I should keep shared buffers around 2x connections then correct? Not really. What happens is that if the shared buffers are so large that they are as large as or god forbid, larger than the kernel cache, then the kernel cache becomes less effective. The general rule of thumb is 25% of memory, or 256 Megs, whichever is less. The real test is that you want enough shared_buffers so that all the result sets currently being smooshed up against each other in joins, sorts, etc... can fit in postgresql's shared buffers, or at least the buffers can hold a fair chunk of it. So, the number of buffers can be anywhere from a few thousand, up to 40000 or 50000, sometimes even higher. But for most tuning you won't be needing to be above 32768, which is 256 Megs of ram. > > What do you have in postgresql.conf? sort_mem, shared_buffers, etc??? > > Here is what I have that is not set from the defaults. > > max_connections = 700 > shared_buffers = 1500 > sort_mem = 512 > random_page_cost = 2 > stats_start_collector = true > stats_command_string = true > stats_block_level = true > stats_row_level = true > > > > sort_mem can be a real killer if it lets the processes chew up too much > > memory. Once sort_mem gets high enough to make the machine start swapping > > it is doing more harm than good being that high, and should usually be > > lowered a fair bit. > > I dropped it down to 512 as you can see. Should I be running with all of the stats on? > I am no longer using pg_autovacuum. I seem to be getting better results with an > hourly Vacuum anaylse. Seeing as how top shows 2262156k kernel cache, you can afford to give up a fair bit more than 512k per sort. I generally run 8192 (8 meg) but I don't handle 700 simos. Try running it a little higher, 2048, 4096, etc... and see if that helps. Note you can change sort_mem and just do a pg_ctl reload to make the change, without interrupting service, unlike shared_buffers, which requires a restart. > > How many disks in your RAID5? The more the better. Is it hardware with > > battery backed cache? If you write much to it it will help to have > > battery backed cache on board. If it's a megaraid / LSI board, get the > > megaraid2 driver, it's supposedly much faster. > > 4 disk IBM ServeRAID 5i with battery backed cache. Do you have the cache set to write back or write through? Write through can be a performance killer. But I don't think your RAID is the problem, it looks to me like postgresql is doing a lot of I/O. When you run top, do the postgresql processes show a lot of D status? That's usually waiting on I/O > > You may find it hard to get postgresql to use any more memory than you > > have, as 32 bit apps can only address 2 gigs anyway, but the extra can > > certainly be used by the kernel as cache, which will help. > > Isn't that only true for each indivdual process space. Shouldn't each process have > access at most 2GB. If each backend is in it's own process space is this really a limit > since all of my queries are pretty small. Right, each process can use a big chunk, but shared_buffers will top out at ~2 gig. Most tests have shown a negative return on a shared_buffers setting that big though, so the nicest thing about the extra memory is that the kernel can use it to cache, AND you can increase your sort_mem to something larger. > I have been monitoring the system has it gets up to load. For most of the time the > sytem sits around 100-300 connections. Once it ramps up it ramps up hard. Top > starts cycling at 0 and 133% CPU for irq, softirq and iowait. The system stays at 700 > connections until users give up. I can watch bandwidth utilization drop to almost > nothing right before the DB catches up. Yeah, it sounds to me like it's grinding down to a halt because postgresql isn't being able to hold enough data in memory for what it's doing. Try increasing shared_buffers to 5000 to as high as 30000 or 50000, a but at a time, as well as increasing sort_mem to 4096 or 8192. Note that increasing shared_buffers will have a very positive effect on performance at first, then less effect, then slowly bring it back down as it goes too high, but isn't likely to starve the machine (64k buffers = 512 meg, you've got the memory to spare, so no great loss). however, sort_mem will have a huge effect right up until it's big enough for all your sorts to fit into memory. once that happens, increasing it won't help or hurt UNTIL the machine gets enough load to make the sorts use up all memory and send it into a swap storm, so be careful about overdoing sorts. I.e. shared_buffers = too big, no big deal, sort_mem too big = time bomb. what you want to do is get the machine to a point where the kernel cache is about twice the size or larger, than the shared_buffers. I'd start at 10000 shared buffers and 4096 sort mem and see what happens. If you've still got >2 gig kernel cache at that point, then increase both a bit (2x or so) and see how much kernel cache you've got. If your kernel cache stays above 1Gig, and the machine is running faster, you're doing pretty good. you may need to increase shmmax and friends to increase the shared_buffers that high, but sort_mem requires no new kernel configuration.
You could do high speed inserts with COPY command: http://developer.postgresql.org/docs/postgres/sql-copy.html Check whenether your database adapter/client lib supports it (i guess it does). Note that it doesnt help very much if there are fk's/triggers's on the target table. Bill Moran wrote: > I must have missed this post when it was made earlier. Pardon the > noise if > my suggestion has already been made. > > Unlike MySQL (and possibly other database servers) PostgreSQL is > faster when > inserting inside a transaction. Depending on the method in which you are > actually adding the records. > > In my own experience (generating a list of INSERT statements from a perl > script and using psql to execute them) the difference in performance was > incredibly dramatic when I added a "BEGIN WORK" at the beginning and > "COMMIT WORK" at the end. > > scott.marlowe wrote: > >> On Mon, 2 Feb 2004, Qing Zhao wrote: >> >> >>> I am new here. I have a question related to this in some way. >>> >>> Our web site needs to upload a large volume of data into Postgres at >>> a time. The performance deterioates as number of rows becomes >>> larger. When it reaches 2500 rows, it never come back to GUI. Since >>> the tests were run through GUI, my suspision is >>> that it might be caused by the way the application server talking to >>> Postgres server, the connections, etc.. What might be the factors >>> involved here? Does anyone know? >> >> >> >> Actually, I'm gonna go out on a limb here and assume two things: >> >> 1. you've got lotsa fk/pk relationships setup. >> 2. you're analyzing the table empty before loading it up. >> >> What happens in this instance is that the analyze on an empty, or >> nearly so, table, means that during the inserts, postgresql thinks >> you have only a few rows. At first, this is fine, as pgsql will seq >> scan the tables to make sure there is a proper key in both. As the >> number of rows increases, the planner needs to switch to index scans >> but doesn't, because it doesn't know that the number of rows is >> increasing. >> >> Fix: insert a few hundred rows, run analyze, check to see if the >> explain for inserts is showing index scans or not. If not, load a >> few more hundred rows, analyze, rinse, repeat. >> >> Also, look for fk/pk mismatches. I.e. an int4 field pointing to an >> int8 field. That's a performance killer, so if the pk/fk types don't >> match, see if you can change your field types to match and try again. >> >> >> ---------------------------(end of broadcast)--------------------------- >> TIP 2: you can get off all lists at once with the unregister command >> (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) >> > >
On 2 Feb 2004 at 16:45, scott.marlowe wrote: > Do you have the cache set to write back or write through? Write through > can be a performance killer. But I don't think your RAID is the problem, > it looks to me like postgresql is doing a lot of I/O. When you run top, > do the postgresql processes show a lot of D status? That's usually waiting > on I/O > Actually I'm not sure. It's setup with the factory defaults from IBM. Actually when I start hitting the limit I was surprised to find only a few D status indicators. Most of the processes where sleeping. > what you want to do is get the machine to a point where the kernel cache > is about twice the size or larger, than the shared_buffers. I'd start at > 10000 shared buffers and 4096 sort mem and see what happens. If you've > still got >2 gig kernel cache at that point, then increase both a bit (2x > or so) and see how much kernel cache you've got. If your kernel cache > stays above 1Gig, and the machine is running faster, you're doing pretty > good. > I've set shared to 10000 and sort to 4096. I just have to wait until the afternoon before I see system load start to max out. Thanks for the tips I'm crossing my fingers. -- Kevin Barnard