Discussion: Out of file-descriptors message
Hello all,

We are having an issue that appears to involve file descriptors. We are running PostgreSQL 7.4.13 on a RHEL 3.0 box. The system runs Red Hat Cluster Suite, with each node running multiple postmasters. The error has occurred on both boxes, and we can't find any consistent pattern in what causes it. I don't see this message in dmesg or any other log file. It appears only in the log of this single database. Other databases run on the same host under different postmasters, and I don't see the same message there.

One thing to add: this has also happened on another machine, with a different database. When the problem has been recurring for several days, the server gets fenced by the other machine. I don't know whether that is related, but it seems to happen while the descriptor messages are being logged.

Here is a snippet from the log of one of the databases:

2006-11-29 14:57:51 [3823] LOG: out of file descriptors: Too many open files; release and retry
2006-11-29 14:58:01 [3823] LOG: out of file descriptors: Too many open files; release and retry
2006-11-29 15:00:26 [3823] LOG: out of file descriptors: Too many open files; release and retry

--
John Allgood
Senior Systems Administrator
Turbo Logistics - Division of Ozburn-Hessey Logistics
2251 Jesse Jewell Pky. NE
Gainesville, GA 30507
tel (678) 989-3051 fax (770) 531-7878
john@turbocorp.com
www.turbocorp.com
John Allgood <john@turbocorp.com> writes:
> 2006-11-29 14:57:51 [3823] LOG: out of file descriptors: Too many open
> files; release and retry

Consider reducing PG's max_files_per_process setting. Postgres itself will usually not have a serious problem when you've run the kernel out of file descriptors, but everything else on the machine will :-(

regards, tom lane
Hey Tom

I assume that if it is the kernel running out of descriptors that I would get the messages in dmesg. This message only appears in the log file for that database.

Thanks

Tom Lane wrote:
> John Allgood <john@turbocorp.com> writes:
>> 2006-11-29 14:57:51 [3823] LOG: out of file descriptors: Too many open
>> files; release and retry
>
> Consider reducing PG's max_files_per_process setting. Postgres itself
> will usually not have a serious problem when you've run the kernel out
> of file descriptors, but everything else on the machine will :-(
>
> regards, tom lane
John Allgood wrote:
> Hey Tom
>
> I assume that if it is the kernel running out of descriptors that I
> would get the messages in dmesg. This message only appears in the log
> file for that database.

Yeah, the point is that you have the max_files_per_process setting higher than what the kernel likes. So decrease it, and Postgres will adjust itself to use less file descriptors by closing and reopening files as needed.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera wrote:
> John Allgood wrote:
>> Hey Tom
>>
>> I assume that if it is the kernel running out of descriptors that I
>> would get the messages in dmesg. This message only appears in the log
>> file for that database.
>
> Yeah, the point is that you have the max_files_per_process setting
> higher than what the kernel likes. So decrease it, and Postgres will
> adjust itself to use less file descriptors by closing and reopening
> files as needed.

I really can't see how that is possible: file-max is set to 838860. I believe it is based on the amount of RAM on the machine, and we have 8GB. ulimit -a reports a limit of 1024 open files. I thought maybe we were running out of descriptors for the postgres user, but that should not bring down the machine. Thoughts, anyone?
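The two figures quoted in this message measure different things: file-max is the system-wide ceiling on open files, while the 1024 reported by ulimit is the limit applied to each process. A minimal sketch, assuming Linux and Python 3 (neither appears in the thread, and the function name is hypothetical), that reads both so they can be compared side by side:

```python
import resource
from pathlib import Path


def descriptor_limits():
    """Return the per-process and system-wide open-file limits.

    The per-process values are what `ulimit -n` reports for this process.
    The system-wide value is read from /proc/sys/fs/file-max, which exists
    only on Linux; it is reported as None elsewhere.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    file_max_path = Path("/proc/sys/fs/file-max")
    system_wide = None
    if file_max_path.exists():
        system_wide = int(file_max_path.read_text())
    return {"soft": soft, "hard": hard, "system_wide": system_wide}


if __name__ == "__main__":
    limits = descriptor_limits()
    print("per-process soft limit:", limits["soft"])
    print("per-process hard limit:", limits["hard"])
    print("system-wide file-max:  ", limits["system_wide"])
```

The limits that matter for a running postmaster are the ones it inherited when it was started, so the script is only informative when run from the same account and startup environment.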
Alvaro Herrera wrote:
> John Allgood wrote:
>> Hey Tom
>>
>> I assume that if it is the kernel running out of descriptors that I
>> would get the messages in dmesg. This message only appears in the log
>> file for that database.
>
> Yeah, the point is that you have the max_files_per_process setting
> higher than what the kernel likes. So decrease it, and Postgres will
> adjust itself to use less file descriptors by closing and reopening
> files as needed.

I work with the original poster and wanted to make sure the problem here is clear. The 'out of file descriptors' message is coming from Postgresql, not the kernel. Thus, it doesn't make sense to me that the max_files_per_process setting is too high. I would think we need to increase it so that Postgresql will stop generating these errors.

--
Until later, Geoffrey

Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety.
- Benjamin Franklin
Geoffrey wrote:
> Alvaro Herrera wrote:
>> John Allgood wrote:
>>> Hey Tom
>>>
>>> I assume that if it is the kernel running out of descriptors that I
>>> would get the messages in dmesg. This message only appears in the log
>>> file for that database.
>>
>> Yeah, the point is that you have the max_files_per_process setting
>> higher than what the kernel likes. So decrease it, and Postgres will
>> adjust itself to use less file descriptors by closing and reopening
>> files as needed.
>
> I work with the original poster and wanted to make sure the problem here
> is clear.

Yes, that was understood from the beginning.

> The 'out of file descriptors' message is coming from
> Postgresql, not the kernel. Thus, it doesn't make sense to me that the
> max_files_per_process setting is too high. I would think we need to
> increase it so that Postgresql will stop generating these errors.

No, you need to lower it so that Postgres doesn't _try_ to use as many file descriptors. Read this again:

>> So decrease it, and Postgres will
>> adjust itself to use less file descriptors by closing and reopening
>> files as needed.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera wrote:
> Geoffrey wrote:
>> Alvaro Herrera wrote:
>>> John Allgood wrote:
>>>> Hey Tom
>>>>
>>>> I assume that if it is the kernel running out of descriptors that I
>>>> would get the messages in dmesg. This message only appears in the log
>>>> file for that database.
>>> Yeah, the point is that you have the max_files_per_process setting
>>> higher than what the kernel likes. So decrease it, and Postgres will
>>> adjust itself to use less file descriptors by closing and reopening
>>> files as needed.
>> I work with the original poster and wanted to make sure the problem here
>> is clear.
>
> Yes, that was understood from the beginning.
>
>> The 'out of file descriptors' message is coming from
>> Postgresql, not the kernel. Thus, it doesn't make sense to me that the
>> max_files_per_process setting is too high. I would think we need to
>> increase it so that Postgresql will stop generating these errors.
>
> No, you need to lower it so that Postgres doesn't _try_ to use as many
> file descriptors. Read this again:
>
>>> So decrease it, and Postgres will
>>> adjust itself to use less file descriptors by closing and reopening
>>> files as needed.

Okay, I'm just not getting it. Postgres complains that it is out of file descriptors. The kernel is not complaining about any such issues.

So I should lower the max_files_per_process value and this will rid us of the 'out of file descriptors' error?

Is it because the max_files_per_process is greater than the number of file descriptors that are allotted to Postgres by the kernel?

--
Until later, Geoffrey
Bradley Kieser wrote:
> Hmm, not entirely true. You may well only see the error in the process
> itself.

You're telling me that the kernel could be running out of file descriptors and I wouldn't see a message regarding this in the kernel logs? I don't believe that is possible.

> Geoffrey wrote:
>> Alvaro Herrera wrote:
>>> John Allgood wrote:
>>>> Hey Tom
>>>>
>>>> I assume that if it is the kernel running out of descriptors that I
>>>> would get the messages in dmesg. This message only appears in the log
>>>> file for that database.
>>>
>>> Yeah, the point is that you have the max_files_per_process setting
>>> higher than what the kernel likes. So decrease it, and Postgres will
>>> adjust itself to use less file descriptors by closing and reopening
>>> files as needed.
>>
>> I work with the original poster and wanted to make sure the problem
>> here is clear. The 'out of file descriptors' message is coming from
>> Postgresql, not the kernel. Thus, it doesn't make sense to me that
>> the max_files_per_process setting is too high. I would think we need
>> to increase it so that Postgresql will stop generating these errors.

--
Until later, Geoffrey
Geoffrey wrote:
> Okay, I'm just not getting it. Postgres complains that it is out of
> file descriptors. The kernel is not complaining about any such issues.
>
> So I should lower the max_files_per_process
> value and this will rid us of the 'out of file descriptors' error?
>
> Is it because the max_files_per_process is greater than the number of
> file descriptors that are allotted to Postgres by the kernel?

Yes. Which is kinda weird, because Postgres actually tests the number when it starts, so that if you set the number too high, it will decrease it according to what the kernel allows.

Maybe the test is newer than the version you are running -- what was it, again?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
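The startup test mentioned in this reply can be pictured as a probe: keep duplicating a descriptor until the kernel refuses or a cap is reached, count the successes, then release everything. The following is only an illustrative Python sketch of that idea, not PostgreSQL's own C code; the function name and the `max_to_probe` parameter are hypothetical.

```python
import errno
import os


def count_usable_fds(max_to_probe):
    """Count how many extra descriptors this process can open right now.

    Duplicates a descriptor we own until the kernel reports EMFILE or
    ENFILE, or until max_to_probe duplicates exist, then closes them all.
    """
    probe_r, probe_w = os.pipe()  # descriptors guaranteed to belong to us
    opened = []
    try:
        while len(opened) < max_to_probe:
            try:
                opened.append(os.dup(probe_r))
            except OSError as exc:
                if exc.errno not in (errno.EMFILE, errno.ENFILE):
                    raise  # unrelated failure: do not hide it
                break  # the kernel will not hand out any more descriptors
    finally:
        for fd in opened:
            os.close(fd)
        os.close(probe_r)
        os.close(probe_w)
    return len(opened)


if __name__ == "__main__":
    # 1000 is an arbitrary cap chosen for the illustration.
    print("usable descriptors (capped at 1000):", count_usable_fds(1000))
```

A process that sizes its own descriptor budget from such a probe should never ask for more than the kernel allows, which is why the log message is surprising if the probe is in place.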
Maybe the whole problem is with ulimit. Ulimit sets the max files per user at the shell and postgres might be exceeding the 1024 limit. What would increasing this value to 2048 do?

Geoffrey wrote:
> Alvaro Herrera wrote:
>> Geoffrey wrote:
>>> Alvaro Herrera wrote:
>>>> John Allgood wrote:
>>>>> Hey Tom
>>>>>
>>>>> I assume that if it is the kernel running out of descriptors
>>>>> that I
>>>>> would get the messages in dmesg. This message only appears in the log
>>>>> file for that database.
>>>> Yeah, the point is that you have the max_files_per_process setting
>>>> higher than what the kernel likes. So decrease it, and Postgres will
>>>> adjust itself to use less file descriptors by closing and reopening
>>>> files as needed.
>>> I work with the original poster and wanted to make sure the problem
>>> here is clear.
>>
>> Yes, that was understood from the beginning.
>>
>>> The 'out of file descriptors' message is coming from Postgresql, not
>>> the kernel. Thus, it doesn't make sense to me that the
>>> max_files_per_process setting is too high. I would think we need to
>>> increase it so that Postgresql will stop generating these errors.
>>
>> No, you need to lower it so that Postgres doesn't _try_ to use as many
>> file descriptors. Read this again:
>>
>>>> So decrease it, and Postgres will
>>>> adjust itself to use less file descriptors by closing and reopening
>>>> files as needed.
>
> Okay, I'm just not getting it. Postgres complains that it is out of
> file descriptors. The kernel is not complaining about any such issues.
>
> So I should lower the max_files_per_process
> value and this will rid us of the 'out of file descriptors' error?
>
> Is it because the max_files_per_process is greater than the number of
> file descriptors that are allotted to Postgres by the kernel?
Alvaro Herrera wrote:
> Geoffrey wrote:
>> Okay, I'm just not getting it. Postgres complains that it is out of
>> file descriptors. The kernel is not complaining about any such issues.
>>
>> So I should lower the max_files_per_process
>> value and this will rid us of the 'out of file descriptors' error?
>>
>> Is it because the max_files_per_process is greater than the number of
>> file descriptors that are allotted to Postgres by the kernel?
>
> Yes. Which is kinda weird, because Postgres actually tests the number
> when it starts, so that if you set the number too high, it will decrease
> it according to what the kernel allows.
>
> Maybe the test is newer than the version you are running -- what was it,
> again?

7.4.13

--
Until later, Geoffrey
Geoffrey <esoteric@3times25.net> writes:
> I work with the original poster and wanted to make sure the problem here
> is clear. The 'out of file descriptors' message is coming from
> Postgresql, not the kernel. Thus, it doesn't make sense to me that the
> max_files_per_process setting is too high. I would think we need to
> increase it so that Postgresql will stop generating these errors.

These are not "errors", they are only log messages. It would be perfectly safe to ignore them. However, as Alvaro says, if you want to get rid of them then the way to do that is to *decrease* max_files_per_process to something less than whatever per-process file limit the kernel is enforcing.

regards, tom lane
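The advice above reduces to a comparison between one configuration value and the per-process limit. A small sketch of that check, assuming Python 3 and a simplified reading of postgresql.conf (one `name = value` setting per line, `#` starting a comment); the helper names are hypothetical, and the fallback of 1000 is assumed to be the server's built-in default:

```python
import re
import resource


def read_max_files_per_process(conf_text, default=1000):
    """Return the last max_files_per_process value found in conf_text.

    Simplified parser: strips '#' comments and accepts an optional '='
    and optional single quotes around the number.
    """
    value = default
    for raw_line in conf_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        match = re.fullmatch(r"max_files_per_process\s*=?\s*'?(\d+)'?", line)
        if match:
            value = int(match.group(1))
    return value


def exceeds_soft_limit(setting, soft_limit):
    """True when the setting is not strictly below the per-process limit."""
    if soft_limit == resource.RLIM_INFINITY:
        return False
    return setting >= soft_limit


if __name__ == "__main__":
    sample_conf = "max_files_per_process = 1000   # illustrative line only\n"
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    setting = read_max_files_per_process(sample_conf)
    print("setting:", setting, "soft limit:", soft,
          "too high:", exceeds_soft_limit(setting, soft))
```

The soft limit read here belongs to the process running the script, so the comparison is meaningful only when that matches the environment the postmaster is started in.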
Tom Lane wrote:
> Geoffrey <esoteric@3times25.net> writes:
>> I work with the original poster and wanted to make sure the problem here
>> is clear. The 'out of file descriptors' message is coming from
>> Postgresql, not the kernel. Thus, it doesn't make sense to me that the
>> max_files_per_process setting is too high. I would think we need to
>> increase it so that Postgresql will stop generating these errors.
>
> These are not "errors", they are only log messages. It would be
> perfectly safe to ignore them. However, as Alvaro says, if you want
> to get rid of them then the way to do that is to *decrease*
> max_files_per_process to something less than whatever per-process file
> limit the kernel is enforcing.

Understood, but we had one database that was generating these messages continuously over a period of 4 days. Eventually, the machine became unresponsive. Still, we did not have any file descriptor errors generated by the kernel.

Thus we are considering whether we should decrease max_files_per_process or increase the postgres id's 'ulimit -n' value.

--
Until later, Geoffrey
Geoffrey <esoteric@3times25.net> writes:
> Alvaro Herrera wrote:
>> Yes. Which is kinda weird, because Postgres actually tests the number
>> when it starts, so that if you set the number too high, it will decrease
>> it according to what the kernel allows.
>>
>> Maybe the test is newer than the version you are running -- what was it,
>> again?
>
> 7.4.13

Hmph ... 7.4.13 does have the same set_max_safe_fds() logic as HEAD. So this shouldn't be happening really, no matter what the relative values of max_files_per_process and ulimit -n are.

Explanations I can think of:

* ulimit -n decreased after process startup (is this even possible?)

* something is leaking file descriptors that are outside fd.c's control, such that eventually there are more than NUM_RESERVED_FDS of 'em.

Theory B seems a lot more plausible. It'd be interesting to look at "lsof" output for one of the backend processes that is emitting this message, to see if we can identify what's getting leaked.

regards, tom lane
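Where lsof is not at hand, a rough stand-in for the inspection suggested above is to read the /proc/<pid>/fd directory of the backend. This is a sketch assuming Linux and Python 3, run as the same user as the target process or as root; the function names and the coarse categories are hypothetical.

```python
import os
import sys
from collections import Counter


def open_descriptors(pid):
    """Map each open descriptor number of `pid` to what it points at."""
    fd_dir = f"/proc/{pid}/fd"
    targets = {}
    for name in os.listdir(fd_dir):
        try:
            targets[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed between listing and reading
    return targets


def summarize(targets):
    """Count descriptors per coarse kind: socket, pipe, file, or other."""
    kinds = Counter()
    for target in targets.values():
        if target.startswith("socket:"):
            kinds["socket"] += 1
        elif target.startswith("pipe:"):
            kinds["pipe"] += 1
        elif target.startswith("/"):
            kinds["file"] += 1
        else:
            kinds["other"] += 1
    return kinds


if __name__ == "__main__":
    # Pass the PID of the backend that emits the message (3823 in the log
    # excerpt at the top of the thread); defaults to this process.
    pid = int(sys.argv[1]) if len(sys.argv) > 1 else os.getpid()
    fds = open_descriptors(pid)
    print(len(fds), "open descriptors:", dict(summarize(fds)))
    for fd, target in sorted(fds.items()):
        print(fd, target)
```

Running it a few times against the same backend and comparing the listings should show which kind of descriptor keeps growing, if the leak theory holds.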
Tom Lane wrote:
> Geoffrey <esoteric@3times25.net> writes:
>> Alvaro Herrera wrote:
>>> Yes. Which is kinda weird, because Postgres actually tests the number
>>> when it starts, so that if you set the number too high, it will decrease
>>> it according to what the kernel allows.
>>>
>>> Maybe the test is newer than the version you are running -- what was it,
>>> again?
>>
>> 7.4.13
>
> Hmph ... 7.4.13 does have the same set_max_safe_fds() logic as HEAD.
> So this shouldn't be happening really, no matter what the relative
> values of max_files_per_process and ulimit -n are.
>
> Explanations I can think of:
>
> * ulimit -n decreased after process startup (is this even possible?)
>
> * something is leaking file descriptors that are outside fd.c's control,
> such that eventually there are more than NUM_RESERVED_FDS of 'em.
>
> Theory B seems a lot more plausible. It'd be interesting to look at
> "lsof" output for one of the backend processes that is emitting this
> message, to see if we can identify what's getting leaked.

We have been looking at a particular issue related to a library we've built into the backend. It does appear that this library could well be the culprit. We are researching it further and will post more as we get more info.

--
Until later, Geoffrey
Hmm, not entirely true. You may well only see the error in the process itself.

Geoffrey wrote:
> Alvaro Herrera wrote:
>> John Allgood wrote:
>>> Hey Tom
>>>
>>> I assume that if it is the kernel running out of descriptors that I
>>> would get the messages in dmesg. This message only appears in the log
>>> file for that database.
>>
>> Yeah, the point is that you have the max_files_per_process setting
>> higher than what the kernel likes. So decrease it, and Postgres will
>> adjust itself to use less file descriptors by closing and reopening
>> files as needed.
>
> I work with the original poster and wanted to make sure the problem
> here is clear. The 'out of file descriptors' message is coming from
> Postgresql, not the kernel. Thus, it doesn't make sense to me that
> the max_files_per_process setting is too high. I would think we need
> to increase it so that Postgresql will stop generating these errors.
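The claim that the error may be visible only inside the process can be tried out directly: when a process reaches its own descriptor limit, the failing open call simply returns an error code (EMFILE) to that process. A minimal sketch, assuming Linux and Python 3; the function name and the limit of 64 are arbitrary choices for the illustration.

```python
import errno
import os
import resource


def hit_per_process_limit(limit=64):
    """Lower this process's soft limit, open files until refused, restore.

    Returns the errno of the failing open and how many opens succeeded.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Never raise the soft limit; only lower it for the experiment.
    new_soft = limit if soft == resource.RLIM_INFINITY else min(limit, soft)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    opened = []
    try:
        while True:
            opened.append(os.open(os.devnull, os.O_RDONLY))
    except OSError as exc:
        return exc.errno, len(opened)
    finally:
        for fd in opened:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))


if __name__ == "__main__":
    code, count = hit_per_process_limit()
    print("opened", count, "files, then failed with:", os.strerror(code))
```

The failure is reported through the return value of the system call, so the process itself decides whether and where to log it.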