Обсуждение: Large database help
Hi everyone, I'm more or less new to PostgreSQL and am trying to setup a rather large database for a data analysis application. Data is collected and dropped into a single table, which will become ~20GB. Analysis happens on a Windows client (over a network) that queries the data in chunks across parallel connections. I'm running the DB on a dual gig p3 w/ 512 memory under Redhat 6 (.0 I think). A single index exists that gives the best case for lookups, and the table is clustered against this index. My problem is this: during the query process the hard drive is being tagged excessively, while the cpu's are idling at 50% (numbers from Linux command: top), and this is bringing down the speed pretty dramatically since the process is waiting on the hard disk. How do I get the database to be completely resident in memory such that selects don't cause hdd activity? How do I pin how exactly why the hard disk is being accessed? I am setting 'echo 402653184 >/proc/sys/kernel/shmmax', which is being reflected in top. I also specify '-B 48000' when starting postmaster. My test DB is only 86MB, so in theory the disk has no business being active once the data is read into memory unless I perform a write operation.. What am I missing? I appreciate any help anyone could give me. -Xavier _________________________________________________________ Do You Yahoo!? Get your free @yahoo.com address at http://mail.yahoo.com
xbdelacour@yahoo.com writes: > Hi everyone, I'm more or less new to PostgreSQL and am trying to setup a > rather large database for a data analysis application. Data is collected > and dropped into a single table, which will become ~20GB. Analysis happens > on a Windows client (over a network) that queries the data in chunks across > parallel connections. I'm running the DB on a dual gig p3 w/ 512 memory > under Redhat 6 (.0 I think). > I am setting 'echo 402653184 >/proc/sys/kernel/shmmax', which is being > reflected in top. I also specify '-B 48000' when starting postmaster. Hm. 384M shared memory request on a 512M machine. I'll bet that the kernel is deciding you don't need all that stuff in RAM, and is swapping out chunks of the shared memory region to make room for processes and its own disk buffering activity. Try a more reasonable -B setting, like maybe a quarter of your physical RAM, max. There's no percentage in -B large enough to risk getting swapped. Moreover, any physical RAM that does happen to be free will be exploited by the kernel for disk buffering at its level, so you aren't really saving any I/O by increasing Postgres' internal buffering. BTW, what Postgres version are you using? regards, tom lane
pgsql/PG_VERSION says "7.0". postmaster --version and psql --version don't tell me anything. If the machine is completely dedicated to this database, under what conditions would the kernel make such decisions? Where can I find more information on this? Are there other users with similar requirements? Thanks for your reply. -Xavier At 06:08 PM 4/22/01 -0400, Tom Lane wrote: >xbdelacour@yahoo.com writes: > > Hi everyone, I'm more or less new to PostgreSQL and am trying to setup a > > rather large database for a data analysis application. Data is collected > > and dropped into a single table, which will become ~20GB. Analysis happens > > on a Windows client (over a network) that queries the data in chunks > across > > parallel connections. I'm running the DB on a dual gig p3 w/ 512 memory > > under Redhat 6 (.0 I think). > > > I am setting 'echo 402653184 >/proc/sys/kernel/shmmax', which is being > > reflected in top. I also specify '-B 48000' when starting postmaster. > >Hm. 384M shared memory request on a 512M machine. I'll bet that the >kernel is deciding you don't need all that stuff in RAM, and is swapping >out chunks of the shared memory region to make room for processes and >its own disk buffering activity. Try a more reasonable -B setting, like >maybe a quarter of your physical RAM, max. There's no percentage in -B >large enough to risk getting swapped. Moreover, any physical RAM that >does happen to be free will be exploited by the kernel for disk >buffering at its level, so you aren't really saving any I/O by >increasing Postgres' internal buffering. > >BTW, what Postgres version are you using? > > regards, tom lane > >---------------------------(end of broadcast)--------------------------- >TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org _________________________________________________________ Do You Yahoo!? Get your free @yahoo.com address at http://mail.yahoo.com
On Sun, Apr 22, 2001 at 05:12:20PM -0400, xbdelacour@yahoo.com wrote: > My problem is this: during the query process the hard drive is being tagged > excessively, while the cpu's are idling at 50% (numbers from Linux command: > top), and this is bringing down the speed pretty dramatically since the > process is waiting on the hard disk. How do I get the database to be > completely resident in memory such that selects don't cause hdd activity? > How do I pin how exactly why the hard disk is being accessed? 50% idle on a 2 CPU system mean one CPU is busy and one is idle - as expected if you run a single CPU-intensive job. > I am setting 'echo 402653184 >/proc/sys/kernel/shmmax', which is being > reflected in top. I also specify '-B 48000' when starting postmaster. My > test DB is only 86MB, so in theory the disk has no business being active > once the data is read into memory unless I perform a write operation.. What > am I missing? What does top say? How much memory do each of your postgres processes take, and how much of it is shared? Is there any free memory? -- Ragnar Kjørstad
I'm spawning 6 backends to query the data. top lists 6 postmaster processes working, and therefore the idle time should hit 0% easily. Also, the hard drive light goes nuts when I'm running this. Here is the pertinent information from top. To be clear, I'm NOT spawning a new postmaster per chunk. These same six processes are alive and kicking for over 4 minutes. I hope the formatting works ok. PID USER PRI NI SIZE RSS SHARE STAT LIB %CPU %MEM TIME COMMAND 2379 mg 13 0 14476 14M 12908 S 0 5.8 2.7 0:01 postmaster 2380 mg 10 0 14436 14M 12868 S 0 2.9 2.7 0:01 postmaster 2381 mg 10 0 13572 13M 12012 S 0 2.9 2.6 0:00 postmaster 2377 mg 10 0 13640 13M 12072 S 0 2.4 2.6 0:01 postmaster 2378 mg 11 0 14476 14M 12908 S 0 2.4 2.7 0:01 postmaster 2376 mg 8 0 13556 13M 11984 S 0 1.9 2.6 0:00 postmaster 27 processes: 24 sleeping, 3 running, 0 zombie, 0 stopped CPU states: 16.3% user, 3.8% system, 0.0% nice, 79.8% idle Mem: 517292K av, 508400K used, 8892K free, 94444K shrd, 197224K buff Swap: 65988K av, 0K used, 65988K free 160740K cached -Xavier At 12:46 AM 4/23/01 +0200, Ragnar Kjørstad wrote: >On Sun, Apr 22, 2001 at 05:12:20PM -0400, xbdelacour@yahoo.com wrote: > > My problem is this: during the query process the hard drive is being > tagged > > excessively, while the cpu's are idling at 50% (numbers from Linux > command: > > top), and this is bringing down the speed pretty dramatically since the > > process is waiting on the hard disk. How do I get the database to be > > completely resident in memory such that selects don't cause hdd activity? > > How do I pin how exactly why the hard disk is being accessed? > >50% idle on a 2 CPU system mean one CPU is busy and one is idle - as >expected if you run a single CPU-intensive job. > > > I am setting 'echo 402653184 >/proc/sys/kernel/shmmax', which is being > > reflected in top. I also specify '-B 48000' when starting postmaster. My > > test DB is only 86MB, so in theory the disk has no business being active > > once the data is read into memory unless I perform a write operation.. > What > > am I missing? > >What does top say? >How much memory do each of your postgres processes take, and how much of >it is shared? >Is there any free memory? > > >-- >Ragnar Kjørstad _________________________________________________________ Do You Yahoo!? Get your free @yahoo.com address at http://mail.yahoo.com
> xbdelacour@yahoo.com writes: > > Hi everyone, I'm more or less new to PostgreSQL and am trying to setup a > > rather large database for a data analysis application. Data is collected > > and dropped into a single table, which will become ~20GB. Analysis happens > > on a Windows client (over a network) that queries the data in chunks across > > parallel connections. I'm running the DB on a dual gig p3 w/ 512 memory > > under Redhat 6 (.0 I think). > > > I am setting 'echo 402653184 >/proc/sys/kernel/shmmax', which is being > > reflected in top. I also specify '-B 48000' when starting postmaster. > > Hm. 384M shared memory request on a 512M machine. I'll bet that the > kernel is deciding you don't need all that stuff in RAM, and is swapping > out chunks of the shared memory region to make room for processes and > its own disk buffering activity. Try a more reasonable -B setting, like > maybe a quarter of your physical RAM, max. There's no percentage in -B > large enough to risk getting swapped. Moreover, any physical RAM that > does happen to be free will be exploited by the kernel for disk > buffering at its level, so you aren't really saving any I/O by > increasing Postgres' internal buffering. FYI, I will complete a PostgreSQL performance tuning article this week that will appear on the PostgreSQL web site. It will talk about the problems if making -B too high, causing pages to be swapped out. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
> 27 processes: 24 sleeping, 3 running, 0 zombie, 0 stopped > CPU states: 16.3% user, 3.8% system, 0.0% nice, 79.8% idle > Mem: 517292K av, 508400K used, 8892K free, 94444K shrd, 197224K buff > Swap: 65988K av, 0K used, 65988K free 160740K cached > I see zero swap used, meaning -B is not too large. I was wrong. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
>> 27 processes: 24 sleeping, 3 running, 0 zombie, 0 stopped >> CPU states: 16.3% user, 3.8% system, 0.0% nice, 79.8% idle >> Mem: 517292K av, 508400K used, 8892K free, 94444K shrd, 197224K buff >> Swap: 65988K av, 0K used, 65988K free 160740K cached These numbers don't add up. If there's a 384M shared-memory block in the system, how can there be 197M of kernel disk buffers (not to mention the kernel and user programs)? regards, tom lane
On Sun, Apr 22, 2001 at 06:52:26PM -0400, xbdelacour@yahoo.com wrote: > I'm spawning 6 backends to query the data. top lists 6 postmaster processes > working, and therefore the idle time should hit 0% easily. Also, the hard > drive light goes nuts when I'm running this. > > Here is the pertinent information from top. To be clear, I'm NOT spawning a > new postmaster per chunk. These same six processes are alive and kicking > for over 4 minutes. > > I hope the formatting works ok. > > PID USER PRI NI SIZE RSS SHARE STAT LIB %CPU %MEM TIME COMMAND > 2379 mg 13 0 14476 14M 12908 S 0 5.8 2.7 0:01 postmaster > 2380 mg 10 0 14436 14M 12868 S 0 2.9 2.7 0:01 postmaster > 2381 mg 10 0 13572 13M 12012 S 0 2.9 2.6 0:00 postmaster > 2377 mg 10 0 13640 13M 12072 S 0 2.4 2.6 0:01 postmaster > 2378 mg 11 0 14476 14M 12908 S 0 2.4 2.7 0:01 postmaster > 2376 mg 8 0 13556 13M 11984 S 0 1.9 2.6 0:00 postmaster The processes should have ~ 400M shared memory, not ~ 10M, right? stracing (or maybe just running with debugging enabled) should tell you what went wrong. My first guess is that the OS is not able to provide a single 400M block of memory, and postgres falls back to some lower setting. You could also verify this by writing a small program that just allocates different sizes of shared memory, and see what the biggest size you can allocate is. -- Ragnar Kjørstad
Below are snippets from: strace postmaster -i -D `pwd` -B 48000 A large snippet: . . . socket(PF_UNIX, SOCK_STREAM, 0) = 4 bind(4, {sun_family=AF_UNIX, sun_path="/tmp/.s.PGSQL.5432"}, 20) = 0 listen(4, 128) = 0 chmod("/tmp/.s.PGSQL.5432", 0777) = 0 shmget(5432010, 144, IPC_CREAT|0x1c0|0600) = 2944 shmat(2944, 0, 0) = 0x40014000 shmget(5432001, 400385024, IPC_CREAT|0x180|0600) = 2945 shmget(5432001, 400385024, 0) = 2945 shmat(2945, 0, 0) = 0x40176000 brk(0x81b7000) = 0x81b7000 mmap(0, 192512, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x57f4d000 brk(0x81c3000) = 0x81c3000 mmap(0, 577536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x57f7c000 mmap(0, 3076096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x58009000 brk(0x81cf000) = 0x81cf000 semget(5432014, 0, 0) = -1 ENOENT (No such file or directory) semget(5432014, 16, IPC_CREAT|0x180|0600) = 1792 semctl(1792, 0, SETALL, 0xbffff750) = 0 semget(5432015, 0, 0) = -1 ENOENT (No such file or directory) semget(5432015, 16, IPC_CREAT|0x180|0600) = 1793 semctl(1793, 0, SETALL, 0xbffff750) = 0 shmget(5432007, 0, 0) = -1 ENOENT (No such file or directory) shmget(5432007, 66060, IPC_CREAT|0x180|0600) = 2946 shmat(2946, 0, 0) = 0x582f8000 getpid() = 2909 open("/home/mg/pgsql/postmaster.pid", O_RDWR|O_CREAT|O_EXCL, 0600) = 5 write(5, "2909", 4) = 4 close(5) = 0 open("/home/mg/pgsql/postmaster.opts", O_RDWR|O_CREAT|O_TRUNC, 0600) = 5 . . . A smaller snippet: shmget(5432001, 400385024, IPC_CREAT|0x180|0600) = 2945 shmget(5432001, 400385024, 0) = 2945 shmat(2945, 0, 0) = 0x40176000 I'm no Unix expert, but this would seem to indicate that shmget is successfully allocating 400385024/1024/1024=381MB of shared memory. I don't know enough about how the postgres parent/child/shmem scheme works to know why this is working yet the children only register 12MB of shared memory under top. I do need to get this to work however, and firing up a debugger is becoming my only option rather quickly.. -Xavier At 06:15 PM 4/23/01 +0200, Ragnar Kjørstad wrote: >On Sun, Apr 22, 2001 at 06:52:26PM -0400, xbdelacour@yahoo.com wrote: > > I'm spawning 6 backends to query the data. top lists 6 postmaster > processes > > working, and therefore the idle time should hit 0% easily. Also, the hard > > drive light goes nuts when I'm running this. > > > > Here is the pertinent information from top. To be clear, I'm NOT > spawning a > > new postmaster per chunk. These same six processes are alive and kicking > > for over 4 minutes. > > > > I hope the formatting works ok. > > > > PID USER PRI NI SIZE RSS SHARE STAT LIB %CPU %MEM TIME COMMAND > > 2379 mg 13 0 14476 14M 12908 S 0 5.8 2.7 0:01 > postmaster > > 2380 mg 10 0 14436 14M 12868 S 0 2.9 2.7 0:01 > postmaster > > 2381 mg 10 0 13572 13M 12012 S 0 2.9 2.6 0:00 > postmaster > > 2377 mg 10 0 13640 13M 12072 S 0 2.4 2.6 0:01 > postmaster > > 2378 mg 11 0 14476 14M 12908 S 0 2.4 2.7 0:01 > postmaster > > 2376 mg 8 0 13556 13M 11984 S 0 1.9 2.6 0:00 > postmaster > >The processes should have ~ 400M shared memory, not ~ 10M, right? >stracing (or maybe just running with debugging enabled) should tell you >what went wrong. My first guess is that the OS is not able to provide a >single 400M block of memory, and postgres falls back to some lower >setting. > >You could also verify this by writing a small program that just >allocates different sizes of shared memory, and see what the biggest >size you can allocate is. > > > >-- >Ragnar Kjørstad _________________________________________________________ Do You Yahoo!? Get your free @yahoo.com address at http://mail.yahoo.com
I'm trying to figure that out myself :-) According to the strace info I sent in my last message, it is in fact creating a 381MB shmem block.. which makes no sense, I agree. -Xavier At 01:07 AM 4/23/01 -0400, Tom Lane wrote: > >> 27 processes: 24 sleeping, 3 running, 0 zombie, 0 stopped > >> CPU states: 16.3% user, 3.8% system, 0.0% nice, 79.8% idle > >> Mem: 517292K av, 508400K used, 8892K free, 94444K shrd, 197224K buff > >> Swap: 65988K av, 0K used, 65988K free 160740K cached > >These numbers don't add up. If there's a 384M shared-memory block >in the system, how can there be 197M of kernel disk buffers (not to >mention the kernel and user programs)? > > regards, tom lane > >---------------------------(end of broadcast)--------------------------- >TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org _________________________________________________________ Do You Yahoo!? Get your free @yahoo.com address at http://mail.yahoo.com
On Mon, Apr 23, 2001 at 06:13:38PM -0400, xbdelacour@yahoo.com wrote: > A smaller snippet: > > shmget(5432001, 400385024, IPC_CREAT|0x180|0600) = 2945 > shmget(5432001, 400385024, 0) = 2945 > shmat(2945, 0, 0) = 0x40176000 > > I'm no Unix expert, but this would seem to indicate that shmget is > successfully allocating 400385024/1024/1024=381MB of shared memory. I don't > know enough about how the postgres parent/child/shmem scheme works to know > why this is working yet the children only register 12MB of shared memory > under top. I believe you're right that it allocates the memory (succesfully). The reason top only show 12 MB shared memory may be that the process haven't actually used the whole segment yet. (linux only allocates the memory when it's first written to) I suppose it could be that postgres doesn't write to the memory-segment, because it doesn't need it. For read access to the file, the OS wil cache the data anyway, and there is no need to use process memory to access the file-data. Maybe the whole shared-memory issue was a blind track - it should not be related to disk activety in this case? Note: some disk activety should be expected. Maybe postgresql updates the log? Or at the very least it will update the atime timestamps for the files everytime they're read. This shouldn't cause enough disk activity to become a performance-problem, but if I remember your initial post correctly, you indicated that one processor was fully saturated. Maybe the problem is in fact related to locking and smp, and not related to shared-memory and disk activity? -- Ragnar Kjørstad
ANote: some disk activety should be expected. Maybe postgresql updates >the log? Or at the very least it will update the atime timestamps for >the files everytime they're read. This shouldn't cause enough disk >activity to become a performance-problem, but if I remember your initial >post correctly, you indicated that one processor was fully saturated. > >Maybe the problem is in fact related to locking and smp, and not related >to shared-memory and disk activity? Then, please correct me if I'm wrong: I should be able to test your hypothesis by creating a small DB (of say 2MB) and setting up at least a dozen backends to tag it. If I get the same symptoms w/ respect to disk activity/performance then we could say that the problem is not related to shared memory/the amount of data/swapping. What log files are output from postgres? I was under the impression that postmaster's stdout/stderr were the only output. I've been postmaster xyz >& log.txt'ing it. I'll check into this. Thanks a lot for your help :-) -Xavier _________________________________________________________ Do You Yahoo!? Get your free @yahoo.com address at http://mail.yahoo.com
xbdelacour@yahoo.com writes: > I'm no Unix expert, but this would seem to indicate that shmget is > successfully allocating 400385024/1024/1024=381MB of shared memory. I don't > know enough about how the postgres parent/child/shmem scheme works to know > why this is working yet the children only register 12MB of shared memory > under top. On most of the systems I've worked on, top does not seem to count shmem blocks that a process is attached to in the process' memory usage. So that doesn't prove much one way or the other. I am wondering if your version of 'top' fails to count swapped-out shmem segments against swap space, or something like that. That'd be a tad weird, but it seems very improbable that your machine is not swapping; I just do not believe top's claim that no swapping is happening. Anyway, the most direct experiment would be to reduce your -B request to 100MB or so and see how things change... regards, tom lane
By my reading, the machine is definitely swapping, and not writing to a log file (unless its writing obscene amounts of data to the log, which presumably the default settings won't do). postmaster -i -D /home/mg/pgsql -B 100 produces almost identical results in terms of performance and disk activity. top shows that each child only has 2.2MB shared instead of 12MB. What other program/means do I have to tell if the machine is swapping? Can I get a reading as to the amount of data that is actually in physical memory within a process? -Xavier At 08:24 PM 4/23/01 -0400, Tom Lane wrote: >xbdelacour@yahoo.com writes: > > I'm no Unix expert, but this would seem to indicate that shmget is > > successfully allocating 400385024/1024/1024=381MB of shared memory. I > don't > > know enough about how the postgres parent/child/shmem scheme works to know > > why this is working yet the children only register 12MB of shared memory > > under top. > >On most of the systems I've worked on, top does not seem to count shmem >blocks that a process is attached to in the process' memory usage. So >that doesn't prove much one way or the other. > >I am wondering if your version of 'top' fails to count swapped-out shmem >segments against swap space, or something like that. That'd be a tad >weird, but it seems very improbable that your machine is not swapping; >I just do not believe top's claim that no swapping is happening. > >Anyway, the most direct experiment would be to reduce your -B request to >100MB or so and see how things change... > > regards, tom lane _________________________________________________________ Do You Yahoo!? Get your free @yahoo.com address at http://mail.yahoo.com
On Mon, Apr 23, 2001 at 08:15:05PM -0400, xbdelacour@yahoo.com wrote: > Then, please correct me if I'm wrong: I should be able to test your > hypothesis by creating a small DB (of say 2MB) and setting up at least a > dozen backends to tag it. If I get the same symptoms w/ respect to disk > activity/performance then we could say that the problem is not related to > shared memory/the amount of data/swapping. I'm no expert, but I guess so. > What log files are output from postgres? I was under the impression that > postmaster's stdout/stderr were the only output. I've been postmaster > xyz >& log.txt'ing it. I'll check into this. There is a transaction log named pg_log. I'm not sure exactly under what conditions postgres writes to that file, and what - others will be able to answer that. -- Ragnar Kjørstad
> and dropped into a single table, which will become ~20GB. Analysis happens > on a Windows client (over a network) that queries the data in chunks across > parallel connections. I'm running the DB on a dual gig p3 w/ 512 memory > under Redhat 6 (.0 I think). A single index exists that gives the best case > for lookups, and the table is clustered against this index. Sorry for my ignorant question, but I think I'll learn if I ask it: Wouldn't one *expect* lots of heavy disk activity if one were querying a 20GB database on a system with only 512MB of RAM? Does the same thing happen on, say, 300MB of data? -Clueless in Seattle, Dan B.