Discussion: Really bad insert performance: what did I do wrong?
I've used PostgreSQL for some pretty amazing things, and was a proponent of using it here. I set up 7.3.1 on my development RedHat 8 box, and it was fine. I need to load about 700,000 rows into one table, the rows only having 6 columns, and the load on my box happens in just a couple of minutes (there's some calculation while the data is loading, and that time was acceptable to me).

My box, however, isn't a production server, so I attempted to create the database again on a Sun Blade:

    SunOS trident 5.8 Generic_108528-17 sun4u sparc SUNW,UltraAX-i2
    Status of processor 0 as of: 02/21/03 10:10:10
    Processor has been on-line since 01/13/03 13:53:51.
    The sparcv9 processor operates at 500 MHz,
    and has a sparcv9 floating point processor.

It isn't the world's fastest box, but should be fast enough for this. However... it took almost 2 days to load the table on this box. The postgresql process was constantly eating up all of the CPU time it could get, while loading, at times, less than 1 row a second. Now, I KNOW something is wrong, and it probably isn't postgres. :)

I turned off the index on the table (there was only one, on three fields) and made the whole load happen in a transaction. The 2 day result was after those changes were made. I compiled postgres myself, using the gcc 3.2.2 package from sunfreeware.

Any thoughts? I know that postgres should be more than capable of loading this data quickly on this box... I figured it could possibly take twice as long... and I thought my problem would be i/o related, not CPU.

Thanks!

Kevin
> Yipes.  We found awhile ago that Solaris' standard qsort() really sucks,
> but 7.3 should work around that --- and I don't think qsort would be
> invoked during data load anyway.
>
> Do you want to rebuild Postgres with profiling enabled, and get a gprof
> trace so we can see where the time is going?  You don't need to run it
> for two days --- a few minutes' worth of runtime should be plenty.

Great... I'd love to... as I've never had a problem with Postgres like this, I didn't even know where to start... I'll look for how to do that.

Kevin
Kevin White <kwhite@digital-ics.com> writes:
> My box, however, isn't a production server, so I attempted to create the
> database again on a Sun Blade:
> ...
> It took almost 2 days to load the table on this box.

Yipes.  We found awhile ago that Solaris' standard qsort() really sucks, but 7.3 should work around that --- and I don't think qsort would be invoked during data load anyway.

Do you want to rebuild Postgres with profiling enabled, and get a gprof trace so we can see where the time is going?  You don't need to run it for two days --- a few minutes' worth of runtime should be plenty.

			regards, tom lane
On Friday 21 Feb 2003 9:02 pm, you wrote:
> Any thoughts?  I know that postgres should be more than capable of
> loading this data quickly on this box... I figured it could possibly take
> twice as long... and I thought my problem would be i/o related, not CPU.

First, check vmstat or similar on SunOS and see what is maxing out. Second, turn postgresql's trace output on and see where it is spending most of the CPU. Needless to say, did you tune shared buffers?

Shridhar
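For anyone following along in the archives, a minimal version of the check Shridhar suggests (standard Solaris commands; which columns to watch is a generic diagnostic hint, not output from this box):

```
vmstat 5        # sample every 5 seconds: cpu us/sy pegged while disks
                # are idle suggests CPU-bound; high b/w columns suggest i/o
iostat -x 5     # per-disk view: %b (busy) and svc_t reveal an i/o bottleneck
```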
On Fri, Feb 21, 2003 at 10:32:37AM -0500, Kevin White wrote:
> I've used PostgreSQL for some pretty amazing things, and was a proponent
> of using it here.  I set up 7.3.1 on my development RedHat 8 box, and it
> was fine.  I need to load about 700,000 rows into one table, the rows
> only having 6 columns, and the load on my box happens in just a couple
> of minutes (there's some calculation while the data is loading, and that
> time was acceptable to me).
>
> My box, however, isn't a production server, so I attempted to create the
> database again on a Sun Blade:
>
> SunOS trident 5.8 Generic_108528-17 sun4u sparc SUNW,UltraAX-i2
> Status of processor 0 as of: 02/21/03 10:10:10
> Processor has been on-line since 01/13/03 13:53:51.
> The sparcv9 processor operates at 500 MHz,
> and has a sparcv9 floating point processor.
>
> It isn't the world's fastest box, but should be fast enough for this.

What's the disk subsystem? Is fsync turned on in both cases? And is your IDE drive lying to you about what it's doing?

My experience in moving from a Linux box to a low-end Sun is pretty similar. The problem usually turns out to be a combination of overhead on fsync (which shows up as processor load instead of i/o, oddly) and memory contention, especially when there are too-large numbers of shared buffers (on a 16 Gig box, we find that 2 Gig of shared buffers is too large -- the shared memory management is crummy).

A

-- 
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110
Andrew Sullivan wrote:
> What's the disk subsystem? Is fsync turned on in both cases? And is
> your IDE drive lying to you about what it's doing?

It is IDE. How do I turn fsync on or off? (All I can find in the man is a function call to fsync... is there something else?)

> My experience in moving from a Linux box to a low-end Sun is pretty
> similar.  The problem usually turns out to be a combination of
> overhead on fsync (which shows up as processor load instead of i/o,
> oddly) and memory contention, especially when there are too-large
> numbers of shared buffers

This box only has 1 gig, and I've only set up 200 shared buffers... at this point, it is only me hitting it. Increasing the shared buffers might help, but I haven't yet found the info I need to do that intelligently.

Shridhar Daithankar wrote:
> First, check vmstat or similar on SunOS and see what is maxing out.
> Second, turn postgresql's trace output on and see where it is spending
> most of the CPU.

Do you mean turning on the statistics generators, or gprof?

> Needless to say, did you tune shared buffers?

Like I mentioned above, I haven't yet found good info on what to do to actually tune shared buffers... I know how to change them, but that's about it. I'll poke around some more.

Tom Lane wrote:
> You should be able to find details in the archives, but the key point
> is to do
>     cd .../src/backend
>     gmake clean
>     gmake PROFILE="-pg" all
> to build a profile-enabled backend.  You may need a more complex
> incantation than -pg on Solaris, but it works on other platforms.

I did this, but my gmon.out doesn't appear to have much data from the actual child postgres process, just the parent. I might be wrong, and I'm letting some stats generate.

However, to everyone, I DID find a problem in my code that was making it take super forever long... the code doesn't just insert. It is also written to do updates if it needs to, and because of what I'm doing, I end up doing selects back against the table during the load to find previously loaded rows. In this case, there aren't any, because the table's been truncated, but... having turned the indexes off, those selects were really really slow. Using the stats tools told me that. So, that may have been a large part of my problem.

I'm still seeing the process just SIT there, though, for seconds at a time, so there's probably something else that I can fix. Maybe I should just randomly try a larger buffers setting...

Being able to analyze the output of gmon would be nice, but as I said before, it doesn't appear to actually change much... Right now, the load has been running continuously for several minutes. It is 12:20pm, but the time on the gmon.out file is 12:18pm. I should be able to let the load finish while I'm at lunch, and I might be able to get something out of gmon when it is done... maybe writing to gmon just doesn't happen constantly.

Thanks all...

Kevin
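(For the archives: the setting Kevin is looking for lives in postgresql.conf. The numbers below are generic illustrations for a 1 GB box of that era, not tuned advice from the thread:)

```
# postgresql.conf (7.3) -- illustrative values only
# each buffer is one 8K page, so 4096 buffers = 32 MB of shared memory
shared_buffers = 4096
sort_mem = 8192      # KB per sort; mainly helps index builds after the load
```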
On Fri, Feb 21, 2003 at 12:21:38PM -0500, Kevin White wrote:
> Andrew Sullivan wrote:
> > What's the disk subsystem? Is fsync turned on in both cases? And is
> > your IDE drive lying to you about what it's doing?
>
> It is IDE.  How do I turn fsync on or off?  (All I can find in the man
> is a function call to fsync... is there something else?)

By default, Postgres calls fsync() at every COMMIT. Lots of IDE drives lie about whether the fsync() succeeded, so you get better performance than you do with SCSI drives; but you're not really getting that performance, because the fsync isn't effective. On Linux, I think you can use hdparm to fix this. I believe the write caching is turned off under Solaris, but I'm not sure. Anyway, you can adjust your postgresql.conf file to turn off fsync.

> This box only has 1 gig, and I've only set up 200 shared buffers... at
> this point, it is only me hitting it.  Increasing the shared buffers
> might help, but I haven't yet found the info I need to do that
> intelligently.

For straight inserts, it shouldn't matter, and that seems low enough that it shouldn't be a problem. You should put WAL on another disk, as well, if you can. Also, try using truss to see what the backend is doing. (truss -c gives you a count and gives some time info.)

A
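For anyone reading along, the knob Andrew mentions looks like this in 7.3 (a sketch, not a recommendation: with fsync off, a crash mid-load can corrupt the cluster, so it only makes sense for a bulk load you can redo from scratch):

```
# postgresql.conf -- skip the fsync() normally issued at every COMMIT
fsync = false
```

And on Solaris a running backend can be watched with `truss -c -p <backend-pid>`, which prints per-syscall counts and timing when detached.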
OK, I'm gonna make a couple of observations here that may help out.

1: Sun's performance on IDE hardware is abysmal. Both Solaris x86 and Solaris Sparc are utter dogs at IDE, even when you do get DMA and prefetch set up and running. Linux or BSD are much much better at handling IDE interfaces.

2: Postgresql under Solaris on Sparc is about 1/2 as fast as Postgresql under Linux on Sparc, all other things being equal. On 32-bit Sparc the chasm widens even more.

3: Inserting ALL 700,000 rows in one transaction is probably not optimal. Try putting a test in every 1,000 or 10,000 rows to toss a "commit;begin;" pair at the database while loading. Inserting all 700,000 rows at once means postgresql can't recycle the transaction logs, so you'll have 700,000 rows worth of data in the transaction logs waiting for you to commit at the end. That's a fair bit of space, and a large set of files to keep track of. My experience has been that anything over 1,000 inserts in a transaction gains little.

4: If you want to make sure you don't insert any duplicates, it's probably faster to use a unique multi-column key on all your columns. (There's a limit on columns in an index depending on which flavor of postgresql you are running, but I think it's 16 on 7.2 and before and 32 on 7.3 and up. I could be off by a factor of two there.)
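Scott's point 3 in SQL form -- a hypothetical load script that commits every 1,000 rows instead of holding one giant transaction (the table and its columns are made up for illustration):

```sql
-- hypothetical table standing in for the real one
CREATE TABLE prices (sym text, price numeric, rank int);

BEGIN;
INSERT INTO prices VALUES ('ABC', 10.00, 1);
-- ... 999 more rows ...
COMMIT; BEGIN;   -- every 1,000 rows: commit and open a fresh transaction
INSERT INTO prices VALUES ('XYZ', 12.50, 2);
-- ... and so on ...
COMMIT;          -- final partial batch
```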
> 1: Sun's performance on IDE hardware is abysmal.

OK, good to know. This box won't always be the production server: an x86 Dell server with Linux will happen... thanks for the info.

> 2: Postgresql under Solaris on Sparc is about 1/2 as fast as Postgresql
> under Linux on Sparc, all other things being equal.  On 32-bit Sparc the
> chasm widens even more.

Wow...

> 3: Inserting ALL 700,000 rows in one transaction is probably not optimal.
> Try putting a test in every 1,000 or 10,000 rows to toss a "commit;begin;"
> pair at the database while loading.

My original test was 700,000 at once... for today's, I realized that was, um, dumb :) so I fixed it... it commits every 1,000 now...

> 4: If you want to make sure you don't insert any duplicates, it's
> probably faster to use a unique multi-column key on all your columns

The problem isn't inserting the dupes, but at times I need to update the data rather than load a new batch of it... and the rows have a "rank" (by price)... so when one group of, say, 5 gets an updated row, it could change the rank of the other 4, so all 5 need updating... so I have to do the select first to find the other values so I can calculate the rank. In THIS specific case, with the table empty, I don't need to do that, but the code had been changed to do it, because normally my table won't be empty. This is just the initial setup of a new server...

Thanks for all the help... It looks like the load finished... I might try turning the sync off.

Kevin
Kevin White <kwhite@digital-ics.com> writes:
> I did this, but my gmon.out doesn't appear to have much data from the
> actual child postgres process, just the parent.

Are you looking in the right place? Child processes will dump gmon.out into $PGDATA/base/yourdbnum/, which is not where the parent process does.

			regards, tom lane
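Putting Tom's pointer into concrete commands (the database OID and install prefix below are made-up examples, not values from the thread):

```
# the child backend writes its profile under its database's directory
cd $PGDATA/base/16384      # 16384 is a hypothetical database OID
gprof /usr/local/pgsql/bin/postgres gmon.out > profile.txt
head -40 profile.txt       # the flat profile shows where the CPU time went
```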
"scott.marlowe" <scott.marlowe@ihs.com> writes: > 3: Inserting ALL 700,000 rows in one transaction is probably not optimal. > Try putting a test in every 1,000 or 10,000 rows to toss a "commit;begin;" > pair at the database while loading. Inserting all 700,000 rows at once > means postgresql can't recycle the transaction logs, so you'll have > 700,000 rows worth of data in the transaction logs waiting for you to > commit at the end. That was true in 7.1.0, but we got rid of that behavior *very* quickly (by 7.1.3, according to the release notes). Long transactions do not currently stress the WAL storage any more than the same amount of work in short transactions. Which is not to say that there's anything wrong with divvying the work into 1000-row-or-so transactions. I agree that that's enough to push the per-transaction overhead down into the noise. regards, tom lane
Does anyone know if there is a fast way to find out how many records are in a table?

"Select count(*) from table" is very slow.

I would think that PG would keep up with the number of undeleted rows on a realtime basis. Is that the case? If so, how would I query it?

Hope this is the correct list for this question. Thanks!

-- 
Matt Mello
On Tue, 25 Feb 2003, Matt Mello wrote:
> Does anyone know if there is a fast way to find out how many records are
> in a table?
>
> "Select count(*) from table" is very slow.
>
> I would think that PG would keep up with the number of undeleted rows on
> a realtime basis.  Is that the case?  If so, how would I query it?

Sorry, it doesn't, and it's one of the areas where having an MVCC-style database costs you. Also, if postgresql kept up with this automatically, it would have an overhead for each table, but how often do you use it on ALL your tables? Most of the time, folks use count(*) on a few tables only, and it would be a waste to have a separate counting mechanism for all tables when you'd only need it for a few.

The general mailing list has several postings in the last 12 months about how to set up a trigger to a single-row table that keeps the current count(*) of the master table. If you need a rough count, you can get one from the statistics gathered by analyze in the pg_* tables.
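Sketches of both workarounds Scott describes, in 7.3-era syntax (table, function, and trigger names are invented; assumes the plpgsql language is installed in the database):

```sql
-- exact count: a single-row side table kept current by a trigger
CREATE TABLE master_count (n bigint NOT NULL);
INSERT INTO master_count SELECT count(*) FROM master;

CREATE FUNCTION track_count() RETURNS trigger AS '
BEGIN
    IF TG_OP = ''INSERT'' THEN
        UPDATE master_count SET n = n + 1;
    ELSE
        UPDATE master_count SET n = n - 1;
    END IF;
    RETURN NULL;
END;
' LANGUAGE 'plpgsql';

CREATE TRIGGER master_count_trig AFTER INSERT OR DELETE ON master
    FOR EACH ROW EXECUTE PROCEDURE track_count();

SELECT n FROM master_count;   -- fast, exact replacement for count(*)

-- rough count: the planner's estimate, refreshed each time you ANALYZE
SELECT reltuples FROM pg_class WHERE relname = 'master';
```

Note the trigger fires per row, so it adds a small cost to every insert and delete on the master table -- the trade-off the thread describes.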