Discussion: When adding millions of rows at once, getting out of disk space errors


When adding millions of rows at once, getting out of disk space errors

From:
Mike Christensen
Date:
Hi all -

I'm doing some perf testing and need huge amounts of data.  So I have a
program that is adding data to a few tables ranging from 500,000 to 15M
rows.  The program is just a simple C# program that blasts data into the
DB, but after about 3M rows or so I get an error:

ERROR:  could not extend relation 1663/41130/41177: No space left on device
HINT:  Check free disk space.

If I do a full VACUUM on the table being inserted into, the error goes
away but it comes back very quickly.  Obviously, I wouldn't want this
happening in a production environment.

I've noticed some auto-vacuum settings as well (I just checked the box
and left all the defaults) but that doesn't seem to help too much.
What's the recommended setup in a production environment for tables
where tons of data will be inserted?

It seems to me there's some sort of "max table size" before you have to
allocate more space on the disk, however I can't seem to find where
these settings are and how to allow millions of rows to be inserted into
a table without having to vacuum every few million rows..

Mike



Re: When adding millions of rows at once, getting out of disk space errors

From:
Scott Marlowe
Date:
On Wed, Feb 18, 2009 at 1:39 PM, Mike Christensen <imaudi@comcast.net> wrote:
> Hi all -
> ERROR:  could not extend relation 1663/41130/41177: No space left on device
> HINT:  Check free disk space.

You're running out of disk space.

Re: When adding millions of rows at once, getting out of disk space errors

From:
Mike Christensen
Date:
I have well over 50 gigs free on that drive..  I doubt it.

Scott Marlowe wrote:
> On Wed, Feb 18, 2009 at 1:39 PM, Mike Christensen <imaudi@comcast.net> wrote:
>> Hi all -
>> ERROR:  could not extend relation 1663/41130/41177: No space left on device
>> HINT:  Check free disk space.
>
> You're running out of disk space.

Re: When adding millions of rows at once, getting out of disk space errors

From:
Tom Lane
Date:
Mike Christensen <imaudi@comcast.net> writes:
> I'm doing some perf testing and need huge amounts of data.  So I have a
> program that is adding data to a few tables ranging from 500,000 to 15M
> rows.  The program is just a simple C# program that blasts data into the
> DB, but after about 3M rows or so I get an error:

> ERROR:  could not extend relation 1663/41130/41177: No space left on device
> HINT:  Check free disk space.

Not to put too fine a point on it, but maybe you need to buy a bigger
disk.

            regards, tom lane

Re: When adding millions of rows at once, getting out of disk space errors

From:
Alan Hodgson
Date:
On Wednesday 18 February 2009, Mike Christensen <imaudi@comcast.net> wrote:
>
> ERROR:  could not extend relation 1663/41130/41177: No space left on
> device HINT:  Check free disk space.

> It seems to me there's some sort of "max table size" before you have to
> allocate more space on the disk, however I can't seem to find where
> these settings are and how to allow millions of rows to be inserted into
> a table without having to vacuum every few million rows..

The error indicates that your file system is full. It's not a PostgreSQL
problem. Hence the "hint".

--
A hybrid Escalade is missing the point much in the same way that having a
diet soda with your extra large pepperoni pizza is missing the point.

Re: When adding millions of rows at once, getting out of disk space errors

From:
Mark Roberts
Date:
On Wed, 2009-02-18 at 12:55 -0800, Mike Christensen wrote:
> I have well over 50 gigs free on that drive..  I doubt it.

Are you sure the pg data directory is on the drive you think it is?  Are
you doing a lot of deletes, or are you merely inserting?  Are you doing
any sorting and therefore running out of temp space in your tmp
partition?  [Supposition; I've run into something like this before, but
not specifically with Postgres.]

-Mark


Re: When adding millions of rows at once, getting out of disk space errors

From:
Scott Marlowe
Date:
On Wed, Feb 18, 2009 at 1:55 PM, Mike Christensen <imaudi@comcast.net> wrote:
> I have well over 50 gigs free on that drive..  I doubt it.

Quotas?  Something's making the OS think the drive is full.
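[Editorial note: a quick way to see what the filesystem itself reports to the process, as opposed to what a quota allows, is a small script like the following. This is a generic sketch, not specific to PostgreSQL; the path is a placeholder you would point at the data directory.]

```python
import shutil

def report_free_space(path):
    """Report total/used/free bytes for the filesystem holding `path`.

    Note: shutil.disk_usage reports filesystem-level figures; a per-user
    quota can make writes fail long before these numbers look full,
    which is exactly the symptom being discussed here.
    """
    usage = shutil.disk_usage(path)
    gib = 1024 ** 3
    return {
        "total_gib": usage.total / gib,
        "used_gib": usage.used / gib,
        "free_gib": usage.free / gib,
    }

# In practice, point this at the PostgreSQL data directory.
info = report_free_space(".")
```

If the figures here look healthy but writes still fail with "No space left on device", a quota (or a wrong partition) is the likely culprit.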

Re: When adding millions of rows at once, getting out of disk space errors

From:
Bill Moran
Date:
In response to Mike Christensen <imaudi@comcast.net>:

> I have well over 50 gigs free on that drive..  I doubt it.

I'm not aware of that error having false positives associated with it.

Common confusion on this point can result from having quotas enabled.
Alternatively, you really are running out of space: by the time you check
free space (after the error has occurred), the space has already been
reclaimed from the failed transaction, so it looks like you have plenty
free, but it disappears while the process is running.

Also, double-check that Postgres' data files are on the same partition as
where you've got the 50G free (I've made that mistake more than once).

If none of those help, provide more details.  Based on the detail level
you've provided, you've run out of disk space.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

Re: When adding millions of rows at once, getting out of disk space errors

From:
Sam Mason
Date:
On Wed, Feb 18, 2009 at 12:39:50PM -0800, Mike Christensen wrote:
> I'm doing some perf testing and need huge amounts of data.  So I have a
> program that is adding data to a few tables ranging from 500,000 to 15M
> rows.

I assume you're repeatedly inserting data and then deleting it?  If so,
PG won't get much of a chance to clean up after you.  Because of the way
it handles transactions, all of the old row versions are left in the
table until it is vacuumed and the space held by the dead tuples is
reclaimed.

> The program is just a simply C# program that blasts data into the
> DB,

Just out of interest, do you know about the COPY command?  Things will
go much faster than with a large number of INSERT statements.
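[Editorial note: the COPY suggestion can be sketched as follows. This is an illustrative example, not the poster's actual loader (which was C#): it builds the tab-separated text that `COPY ... FROM STDIN` expects, and the psycopg2 call shown in the comment is one common way to stream it.]

```python
import io

def rows_to_copy_buffer(rows):
    """Render rows as the tab-separated text COPY ... FROM STDIN expects.

    Each row is a tuple of values; None becomes \\N, PostgreSQL's COPY
    null marker.  Real data would also need embedded tabs, newlines, and
    backslashes escaped.
    """
    buf = io.StringIO()
    for row in rows:
        fields = ["\\N" if v is None else str(v) for v in row]
        buf.write("\t".join(fields) + "\n")
    buf.seek(0)
    return buf

buf = rows_to_copy_buffer([(1, "alice"), (2, None)])
# With psycopg2 one would then stream the whole buffer in a single COPY:
#   cur.copy_from(buf, "mytable", columns=("id", "name"))
# which is typically much faster than millions of individual INSERTs.
```

Table and column names above are placeholders for illustration only.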

> but after about 3M rows or so I get an errror:
>
> ERROR:  could not extend relation 1663/41130/41177: No space left on device
> HINT:  Check free disk space.
>
> If I do a full VACUUM on the table being inserted into, the error goes
> away but it comes back very quickly.  Obviously, I wouldn't want this
> happening in a production environment.

VACUUM FULLs should very rarely be done; routine maintenance would be
to do plain VACUUMs or to let the auto-vacuum daemon handle things.
This marks the space as available so that subsequent operations will
reuse it.

> What's the recommended setup in a production environment for tables
> where tons of data will be inserted?

If you're repeatedly inserting and deleting data then you'll probably want
to intersperse some VACUUMs in there.
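[Editorial note: the interleaving Sam suggests can be sketched as a simple load plan; the table name and batch notation below are placeholders, and the point is only the periodic VACUUM between batches.]

```python
def batched_load_plan(total_rows, batch_size, vacuum_every, table="mytable"):
    """Yield a plan of steps: load batches with a VACUUM every N batches.

    Placeholder sketch of interspersing VACUUMs in an insert/delete-heavy
    run; a real loader would execute COPY or INSERT for each batch.
    """
    n_batches = (total_rows + batch_size - 1) // batch_size
    for i in range(1, n_batches + 1):
        yield f"-- load batch {i} of {n_batches} into {table}"
        if i % vacuum_every == 0:
            yield f"VACUUM {table};"

plan = list(batched_load_plan(total_rows=1_000_000,
                              batch_size=100_000,
                              vacuum_every=5))
```

With a million rows in 100k batches and a VACUUM every 5 batches, the plan interleaves two VACUUMs among ten loads.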

> It seems to me there's some sort of "max table size" before you have to
> allocate more space on the disk, however I can't seem to find where
> these settings are and how to allow millions of rows to be inserted into
> a table without having to vacuum every few million rows..

There's no maximum table size you get control over; 15 million rows on
its own isn't considered particularly big, but you need to start being
careful at that stage.  If you've got a particularly "wide" table (i.e.
lots of attributes/columns), this is obviously going to take more space,
and you may consider normalizing the data out into separate tables.
Once your row count gets to 10 or 100 times what you're dealing with,
you'd probably need to start thinking about partitioning the tables, and
how to do that would depend on your usage patterns.

--
  Sam  http://samason.me.uk/

Re: When adding millions of rows at once, getting out of disk space errors

From:
Mike Christensen
Date:
Ooo good call, the account is part of the "Users" group which has a quota:

The users will have the following disk quota:
    Disk space limited to 1024 MB
    Warning sent at 900 MB

Which is the exact size of the database..

However, anyone have a clue on how to change this?  This is Windows Server 2003 SBS, I'm not an expert <g>

Bill Moran wrote:
> In response to Mike Christensen <imaudi@comcast.net>:
>
>> I have well over 50 gigs free on that drive..  I doubt it.
>
> I'm not aware of that error having false-positives associated with it.
>
> Common confusion on this point could result from having quotas enabled,
> or possibly you're running out of space, then when you check free space
> (after the error has occurred) the space has already been reclaimed from
> the failed transaction and therefore it looks like you have plenty of
> free space, but it disappears when you're running the process.
>
> Also, 2x check that Postgres' data files are on the same partition as
> where you've got the 50G free (I've made that mistake more than once)
>
> If none of those help, provide more details.  Based on the detail level
> you've provided, you've run out of disk space.

Re: When adding millions of rows at once, getting out of disk space errors

From:
John R Pierce
Date:
Mike Christensen wrote:
> Ooo good call, the account is part of the "Users" group which has a quota:
>
> The users will have the following disk quota:
>     Disk space limited to 1024 MB
>     Warning sent at 900 MB
>
> Which is the exact size of the database..
>
> However, anyone have a clue on how to change this?  This is Windows
> Server 2003 SBS, I'm not an expert <g>

I'm not particularly familiar with SBS, but in general, the postgres
service account is not normally a part of -any- group on Windows.



Re: When adding millions of rows at once, getting out of disk space errors

From:
Mike Christensen
Date:
I just changed the account manually and did not set a disk quota..

However, now I have bigger problems since the service will not start up
anymore.  I tried re-booting twice.

First I get:

2009-02-18 21:24:25 GMT FATAL:  lock file "postmaster.pid" already exists
2009-02-18 21:24:25 GMT HINT:  Is another postmaster (PID 1888) running
in data directory "C:/Program Files/PostgreSQL/8.3/data"?

After I delete the file, I try to start the service again and get:

2009-02-18 13:27:18 PST FATAL:  could not create any TCP/IP sockets

Any ideas?

John R Pierce wrote:
> Mike Christensen wrote:
>> Ooo good call, the account is part of the "Users" group which has a
>> quota:
>>
>> The users will have the following disk quota:
>>     Disk space limited to 1024 MB
>>     Warning sent at 900 MB
>>
>> Which is the exact size of the database..
>>
>> However, anyone have a clue on how to change this?  This is Windows
>> Server 2003 SBS, I'm not an expert <g>
>
> I'm not particularly familiar with SBS, but in general, the postgres
> service account is not normally a part of -any- group on Windows.
>
>
>


Re: When adding millions of rows at once, getting out of disk space errors

From:
Grzegorz Jaśkiewicz
Date:
I bet it is on Windows (judging by the HTML in that email), but if it isn't:
open a console and issue:
watch -n 0.5 df -h
and run that insert again ;)

Re: When adding millions of rows at once, getting out of disk space errors

From:
Mike Christensen
Date:
Actually I'm writing emails on my Mac <g>

However, the Postgres service is running on my Windows 2003 machine..

The disk space issue turned out to be a disk quota which was easy to
solve.  Unfortunately, the fact it crashed Postgres and with a massive
transaction log left the server in a state where it wouldn't boot
anymore.  I was eventually able to fix it by resetting the transaction
log manually.  I'm hoping future versions of Postgres will handle this
scenario a lot better..

Mike

Grzegorz Jaśkiewicz wrote:
> I bet it is on windows (judging by html in that email), but if isn't:
> open a console and issue:
> watch -n 0.5 df -h
> and run that insert again ;)
>
>

Re: When adding millions of rows at once, getting out of disk space errors

From:
Scott Marlowe
Date:
On Thu, Feb 19, 2009 at 3:11 AM, Mike Christensen <imaudi@comcast.net> wrote:
> Actually I'm writing emails on my Mac <g>
>
> However, the Postgres service is running on my Windows 2003 machine..
>
> The disk space issue turned out to be a disk quota which was easy to solve.
>  Unfortunately, the fact it crashed Postgres and with a massive transaction
> log left the server in a state where it wouldn't boot anymore.  I was
> eventually able to fix it by resetting the transaction log manually.  I'm
> hoping future versions of Postgres will handle this scenario a lot better..

They're certainly supposed to.  I've had no such problems running out
of space on linux in the past.  I wonder if it's a windows thing.

Re: When adding millions of rows at once, getting out of disk space errors

From:
Mike Christensen
Date:
I would assume if the account the service is running under has limited disk space, it won't really matter what OS you're running under.  Postgres will throw an "out of disk space" error.

The problem for me is I was in the middle of a transaction which inserted about 50,000 rows into a table, and that transaction was pending in the transaction log.  For some reason, the service didn't quit properly, and when it started back up, it began a repair process.  When it got to the pending transactions (I think about 16 megs worth of data) it just hung, with no useful errors or anything output to the log or stderr.  I think this needs to be fixed!

First off, when Postgres starts and sees that your database was not closed properly, it should tell you there's pending transactions and ask if you want to dump them or try to process them (or maybe save them for later).  If you process them, there should be clear status and you should know what's going on.  It's very possible the service would have /eventually/ started up for me had I waited long enough (I tried 5-6 hrs, with no logs, output, or change in memory consumption; thus I assumed it was dead)..

Also, if there are errors processing the transaction log, rather than just error out and exit, it should at least tell you to run pg_resetxlog.exe.  Another idea is if I just delete everything in the pg_xlog directory, it should recover and boot up without any errors rather than complaining about missing checkpoint files.

No big problem, the good news is Postgres never once actually corrupted my data and I didn't lose a single byte (well, except for the pending transactions that I didn't care about).  More good news is I learned a lot and even got familiar with some of the source code and debugging, and it's made me finally get around to writing an automatic backup script that runs every midnight.

Mike

Scott Marlowe wrote:
> On Thu, Feb 19, 2009 at 3:11 AM, Mike Christensen <imaudi@comcast.net> wrote:
>> Actually I'm writing emails on my Mac <g>
>>
>> However, the Postgres service is running on my Windows 2003 machine..
>>
>> The disk space issue turned out to be a disk quota which was easy to solve.
>> Unfortunately, the fact it crashed Postgres and with a massive transaction
>> log left the server in a state where it wouldn't boot anymore.  I was
>> eventually able to fix it by resetting the transaction log manually.  I'm
>> hoping future versions of Postgres will handle this scenario a lot better..
>
> They're certainly supposed to.  I've had no such problems running out
> of space on linux in the past.  I wonder if it's a windows thing.

Re: When adding millions of rows at once, getting out of disk space errors

From:
Geoffrey
Date:
Mike Christensen wrote:
> I have well over 50 gigs free on that drive..  I doubt it.

Perhaps you're out of inodes.

>
> Scott Marlowe wrote:
>> On Wed, Feb 18, 2009 at 1:39 PM, Mike Christensen <imaudi@comcast.net> wrote:
>>
>>> Hi all -
>>> ERROR:  could not extend relation 1663/41130/41177: No space left on device
>>> HINT:  Check free disk space.
>>>
>>
>> You're running out of disk space.
>>
>>


--
Until later, Geoffrey

Those who would give up essential Liberty, to purchase a little
temporary Safety, deserve neither Liberty nor Safety.
  - Benjamin Franklin

Re: When adding millions of rows at once, getting out of disk space errors

From:
John R Pierce
Date:
Mike Christensen wrote:
> First off, when Postgres starts and sees that your database was not
> closed properly, it should tell you there's pending transactions and
> ask if you want to dump them or try to process them (or maybe save
> them for later).  If you process them, there should be clear status
> and you should know what's going on.


How does a system service ask you anything?  A service normally starts
when the system boots, before anyone logs onto the console.



Re: When adding millions of rows at once, getting out of disk space errors

From:
Sam Mason
Date:
On Thu, Feb 19, 2009 at 02:58:02AM -0800, Mike Christensen wrote:
> I would assume if the account the service is running under has limited
> disk space, it won't really matter what OS you're running under.
> Postgres will throw an "out of disk space" error.

Similarly to Scott, every time I've come across this PG has recovered
quite happily on its own.

> The problem for me is I was in the middle of a transaction which
> inserted about 50,000 rows into a table, and that transaction was
> pending in the transaction log.  For some reason, the service didn't
> quit property and when it started back up, it began a repair process.
> When it got to the pending transactions (I think about 16 megs worth of
> data) it just hung with no useful errors or anything outputted to the
> log or stderr.  I think this needs to be fixed!

By default PG always creates 16MB WAL files, so if you're modifying a
lot of data then multiple log files will be involved.  I've done silly
things like inserting the cross product of two large tables and wondered
why things were taking so long; it failed after a while with an out of
disk space error, but PG never misbehaved when I did this.  This was
inserting several tens of GB of data, i.e. hundreds of millions of rows.
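[Editorial note: the 16 MB segment size makes for easy back-of-the-envelope arithmetic on how many WAL segments a bulk load cycles through. This is a rough sketch only; real WAL volume also depends on indexes, full-page writes, and checkpoint timing.]

```python
WAL_SEGMENT_BYTES = 16 * 1024 ** 2  # default WAL segment size mentioned above

def wal_segments_for(bytes_written):
    """Rough count of 16 MB WAL segments a given write volume spans.

    Back-of-the-envelope only: assumes WAL volume is roughly equal to
    the data written, which real workloads will over- or under-shoot.
    """
    return -(-bytes_written // WAL_SEGMENT_BYTES)  # ceiling division

# e.g. ~10 GB of inserted data cycles through on the order of 640 segments
segments = wal_segments_for(10 * 1024 ** 3)
```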

> First off, when Postgres starts and sees that your database was not
> closed properly, it should tell you there's pending transactions and ask
> if you want to dump them or try to process them (or maybe save them for
> later).

Doing much of that would break things quite impressively.  Applications
are, or should be, designed to assume that once a transaction has
committed then it's committed.  If the database can decide to roll back a
committed transaction because you ran out of space doing something else,
that's bad.  Similarly, a transaction can't commit halfway through,
which is almost what you're saying you'd like to happen.  This would
lead to all sorts of inconsistency.

> If you process them, there should be clear status and you
> should know what's going on.  It's very possible the service would have
> /eventually/ started up for me had I waited long enough (I tried 5-6
> hrs, with no logs, output, or change in memory consumption; thus I
> assumed it was dead)..

PG, at least under Linux, is very verbose about things when they look
suspicious.  They must be appearing somewhere under Windows as well,
maybe others can suggest where to look.

> Also, if there are errors processing the transaction log, rather than
> just error out and exit, it should at least tell you to run
> pg_resetxlog.exe.  Another idea is if I just delete everything in the
> pg_xlog directory, it should recover and boot up without any errors
> rather than complaining about missing checkpoint files.

And potentially corrupt the entire database?  PG's behavior is designed
to be safely pessimistic and to keep your data safe.  E.g. if a disk is
on the way out and starts corrupting the log then you want to contain
the error rather than spreading it further.

--
  Sam  http://samason.me.uk/

Re: When adding millions of rows at once, getting out of disk space errors

From:
Mike Christensen
Date:
Yeah, sorry, good point..  It's probably at least safe to say the
process should not just hang, though, and there should be more info in
the log as to what it's doing..

John R Pierce wrote:
> Mike Christensen wrote:
>> First off, when Postgres starts and sees that your database was not
>> closed properly, it should tell you there's pending transactions and
>> ask if you want to dump them or try to process them (or maybe save
>> them for later).  If you process them, there should be clear status
>> and you should know what's going on.
>
>
> how does a system service ask you anything?   a service normally
> starts when the system boots, before anyone logs onto the console.
>
>
>