Thread: HINT: Perhaps out of disk space?

HINT: Perhaps out of disk space?

From
Michael Adler
Date:
I'm investigating a problem that happened last night and I would
appreciate any recommendations. The logs indicate that the disks were
full, but I truly doubt that since we only use about 14GB out of the
available 65GB.

I found entries like this in the logs:

ERROR:  could not write block 2354 of temporary file: No space left on device
HINT:  Perhaps out of disk space?
....
ERROR:  could not extend relation "parent_table": No space left on device
HINT:  Check free disk space.
....
LOG:  could not close temporary statistics file "/var/lib/postgres/data/global/pgstat.tmp.1464": No space left on device

According to the logs, the problem went away after a reboot. I wonder
if the kernel or the RAID device got confused and postgres was simply
echoing what it was told. We run a couple hundred postgres servers and
we have not seen this before (except when the disks truly were full).

Everything is in the root filesystem, which has plenty of room.

Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda1             67756724  14344392  49970408  23% /
tmpfs                  1034768         0   1034768   0% /dev/shm
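
The df figures above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the function names are made up here, and it assumes GNU df's convention of computing Use% as used / (used + available), rounded up. It also reads the free block and inode counts for a path via os.statvfs, since the block columns of df do not show inode usage, and an exhausted inode table is reported with the same "No space left on device" error.

```python
import math
import os


def df_use_percent(used_kb: int, available_kb: int) -> int:
    """Use% the way GNU df computes it: used / (used + available), rounded up."""
    return math.ceil(100 * used_kb / (used_kb + available_kb))


def reserved_kb(total_kb: int, used_kb: int, available_kb: int) -> int:
    """Blocks that are neither used nor available to unprivileged users."""
    return total_kb - used_kb - available_kb


def free_space_report(path: str = "/") -> dict:
    """Free space and free inodes that a non-root process can use on `path`."""
    st = os.statvfs(path)
    return {
        "free_kb_unprivileged": st.f_bavail * st.f_frsize // 1024,
        "free_inodes_unprivileged": st.f_favail,
    }


# Numbers copied from the /dev/sda1 line of the df output above.
print(df_use_percent(14344392, 49970408))
print(reserved_kb(67756724, 14344392, 49970408))
print(free_space_report("/"))
```

The gap between the total and used + available is space the filesystem holds back from unprivileged users, so the "Available" column is the figure that matters for a server running as a non-root user.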

PostgreSQL 7.4.7 on i386-pc-linux-gnu, compiled by GCC i386-linux-gcc (GCC) 3.3.5 (Debian 1:3.3.5-12)
Debian Sarge with Linux kernel 2.4.27-2-686-smp
Dell PowerEdge 1800
Dell MegaRAID PERC 4/DC RAID Controller, 128MB cache w/BBU
2x SEAGATE Cheetah 10K.7 ST373207LC in RAID 1 (mirroring)

Folks are a little jittery because our customers do very heavy
business this month and we don't want frantic support calls when we
should be drinking eggnog.

 -Mike

Re: HINT: Perhaps out of disk space?

From
Tom Lane
Date:
Michael Adler <adler@pobox.com> writes:
> I'm investigating a problem that happened last night and I would
> appreciate any recommendations. The logs indicate that the disks were
> full, but I truly doubt that since we only use about 14GB out of the
> available 65GB.

> I found entries like this in the logs:

> ERROR:  could not write block 2354 of temporary file: No space left on device
> HINT:  Perhaps out of disk space?
> ....
> ERROR:  could not extend relation "parent_table": No space left on device
> HINT:  Check free disk space.
> ....
> LOG:  could not close temporary statistics file "/var/lib/postgres/data/global/pgstat.tmp.1464": No space left on device

> According to the logs, the problem went away after a reboot. I wonder
> if the kernel or the RAID device got confused and postgres was simply
> echoing what it was told. We run a couple hundred postgres servers and
> we have not seen this before (except when the disks truly were full).

I'm inclined to think that a query created a 50GB temporary file ...
the postmaster cleans out temp files when restarted, so that would
have destroyed the evidence.

            regards, tom lane
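
Since a restart removes the temp files, the only way to confirm this theory is to catch them while the server is still running. The following is a rough sketch of such a check, not anything PostgreSQL ships: it assumes the temp files live in directories named pgsql_tmp somewhere under the data directory, and it takes the data directory path /var/lib/postgres/data from the log line quoted above. The function name is hypothetical.

```python
import os


def temp_file_usage(data_dir: str):
    """Sum file sizes in every directory named 'pgsql_tmp' under data_dir.

    Returns (total_bytes, {directory: bytes}).
    Assumption: PostgreSQL temp files are kept in 'pgsql_tmp' directories.
    """
    total = 0
    per_dir = {}
    for root, _dirs, files in os.walk(data_dir):
        if os.path.basename(root) != "pgsql_tmp":
            continue
        size = 0
        for name in files:
            try:
                size += os.path.getsize(os.path.join(root, name))
            except OSError:
                # Temp files can vanish between listing and stat; skip them.
                pass
        per_dir[root] = size
        total += size
    return total, per_dir


if __name__ == "__main__":
    # Path inferred from the pgstat.tmp log line; adjust for other installs.
    total, per_dir = temp_file_usage("/var/lib/postgres/data")
    for directory, size in sorted(per_dir.items()):
        print(f"{size / 1024 ** 2:10.1f} MB  {directory}")
    print(f"{total / 1024 ** 2:10.1f} MB  total")
```

Run periodically (for example from cron, as a user allowed to read the data directory), this would show whether temp files are growing toward the free space on the volume before the next failure.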

Re: HINT: Perhaps out of disk space?

From
Michael Adler
Date:
On Fri, Dec 23, 2005 at 11:36:54AM -0500, Tom Lane wrote:
> Michael Adler <adler@pobox.com> writes:
> > I'm investigating a problem that happened last night and I would
> > appreciate any recommendations. The logs indicate that the disks were
> > full, but I truly doubt that since we only use about 14GB out of the
> > available 65GB.
>
> > I found entries like this in the logs:
>
> > ERROR:  could not write block 2354 of temporary file: No space left on device
> > HINT:  Perhaps out of disk space?
> > ....
> > ERROR:  could not extend relation "parent_table": No space left on device
> > HINT:  Check free disk space.
> > ....
> > LOG:  could not close temporary statistics file "/var/lib/postgres/data/global/pgstat.tmp.1464": No space left on device
>
> > According to the logs, the problem went away after a reboot. I wonder
> > if the kernel or the RAID device got confused and postgres was simply
> > echoing what it was told. We run a couple hundred postgres servers and
> > we have not seen this before (except when the disks truly were full).
>
> I'm inclined to think that a query created a 50GB temporary file ...
> the postmaster cleans out temp files when restarted, so that would
> have destroyed the evidence.

I'm curious about what could have resulted in so much temporary
storage for a database that fits entirely in 2.5GB space. I can
imagine taking the largest table and joining it against itself many
times without a WHERE clause. What else would use a lot of temp
storage?
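
To put a number on the self-join case, here is a back-of-the-envelope sketch. The table size and row width are hypothetical, not taken from the actual database, and the estimate only applies if the join result has to be sorted or otherwise spilled to disk rather than streamed to the client.

```python
def cross_join_bytes(rows: int, row_bytes: int, copies: int) -> int:
    """Raw size of joining `copies` copies of a table with no join condition.

    Every combination of rows is produced, so the row count is rows ** copies
    and each result row is `copies` times as wide as an input row.
    """
    result_rows = rows ** copies
    return result_rows * row_bytes * copies


GB = 1024 ** 3

# Hypothetical table: 100,000 rows of 100 bytes each, about 10 MB on disk.
for copies in (1, 2, 3):
    size = cross_join_bytes(100_000, 100, copies)
    print(f"{copies} copies: {size / GB:,.2f} GB")
```

Under these assumptions even a two-way unconstrained join of a roughly 10 MB table comes to about 2 TB of raw result data, so a small database is no protection against a very large temp file.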

How long would it take to clean out 50GB of temp files? It looks like
the postmaster was able to start up instantly after the reboot (it was
ready less than 1 second after "LOG: database system was shut down at...").

I really appreciate any guidance you could offer.

 -Mike

Re: HINT: Perhaps out of disk space?

From
John Koller
Date:
On Fri, 23 Dec 2005 13:42:13 -0500, Michael Adler wrote:

> On Fri, Dec 23, 2005 at 11:36:54AM -0500, Tom Lane wrote:
>> Michael Adler <adler@pobox.com> writes:
>> > I'm investigating a problem that happened last night and I would
>> > appreciate any recommendations. The logs indicate that the disks were
>> > full, but I truly doubt that since we only use about 14GB out of the
>> > available 65GB.
>>
>> > I found entries like this in the logs:
>>
>> > ERROR:  could not write block 2354 of temporary file: No space left on device
>> > HINT:  Perhaps out of disk space?
>> > ....
>> > ERROR:  could not extend relation "parent_table": No space left on device
>> > HINT:  Check free disk space.
>> > ....
>> > LOG:  could not close temporary statistics file "/var/lib/postgres/data/global/pgstat.tmp.1464": No space left on device
>>
>> > According to the logs, the problem went away after a reboot. I wonder
>> > if the kernel or the RAID device got confused and postgres was simply
>> > echoing what it was told. We run a couple hundred postgres servers and
>> > we have not seen this before (except when the disks truly were full).
> I really appreciate any guidance you could offer.
>

Are there any errors about running out of shared memory? I have seen the
"No space left on device" error for that on FreeBSD before.
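
For context on why the same text can show up for different causes: "No space left on device" is simply the operating system's message for the error code ENOSPC, and the server appends that message to whatever call failed. File writes return ENOSPC when the filesystem is full, and shmget(2) and semget(2) can return it when kernel limits on shared memory or semaphores are reached. The small sketch below just shows the mapping; the helper name is made up for illustration.

```python
import errno
import os

# The message text comes from the C library, not from PostgreSQL.
print(errno.ENOSPC, os.strerror(errno.ENOSPC))


def is_no_space(exc: BaseException) -> bool:
    """True if an exception carries ENOSPC, whichever system call raised it."""
    return isinstance(exc, OSError) and exc.errno == errno.ENOSPC
```

In the log entries quoted at the top of the thread the failing operations are all file writes (a temporary file, a relation extension, the statistics file), so a shared memory failure would be expected to show up as a separate error naming the shared memory segment.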