Discussion: RE: WAL Log using all my disk space!
> I do a large bulk copy once a day (100,000 records of Radius data),
> tearing down indices, truncating a large table that contains summary
> information, and rebuilding everything after the copy. Over the course
> of this operation, I can generate up to 1.5 gigs of WAL data in
> pg_xlog. Sometimes (like just now), I will run out of disk space and
> the postmaster will crash. I try to restart it, and it errors out.
> Then I delete all the WAL logs, try to restart, and (surprise) it errors
> out again.

(Removing WAL logs is not a good idea.)

> I tried to set some of the WAL parameters in postgres.conf like so:
>
> wal_buffers = 4 # min 4

(More is better and doesn't affect disk space usage.)

> wal_files = 8 # range 0-64

Oops. With wal_files > 0 the server pre-allocates log files in advance!
It should be used only if disk space is not a problem...

> I would like to recover without an initdb, but if that isn't
> possible, I would definitely like to avoid this problem in the
> future.

So, are you able to restart? If not, send us the startup server log.
You should be able to remove some of the preallocated wal_files, but I
need the numbers from the server log to say what is safe to remove.

Now, how to reduce disk space usage. First, keep in mind that the
server removes old (useless) log files at checkpoint time. Second, a
log file becomes useless (from the transaction subsystem's POV) once it
holds no record from any running transaction. Third, unfortunately
(from my POV), we currently require two checkpoints in the log files,
so we do not remove files with records between the last two
checkpoints.

Recommendation: try to split your bulk operation into a few
transactions with smaller write traffic and run CHECKPOINT commands
between them. You could also try to change the checkpoint_segments
and/or checkpoint_timeout parameters, but IMHO an explicit CHECKPOINT
is better for bulk ops, because it will not affect normal operations.

Vadim
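For concreteness, a minimal sketch of what Vadim recommends might look
like this (the table and file names are made up; the real load would be
split wherever the data naturally divides):

-- Split the daily bulk load into smaller transactions and run an
-- explicit CHECKPOINT between them, so the server can recycle the WAL
-- segments each chunk produced instead of accumulating 1.5 gigs.
BEGIN;
COPY radius_log FROM '/data/radius_part1.copy';
COMMIT;
CHECKPOINT;

BEGIN;
COPY radius_log FROM '/data/radius_part2.copy';
COMMIT;
CHECKPOINT;

-- Rebuild indexes and summary data in their own transactions as well.
BEGIN;
CREATE INDEX radius_log_user_idx ON radius_log (username);
COMMIT;
CHECKPOINT;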
I just put a CHECKPOINT command on both sides of my COPY statement, and
it seemed to go quite a bit faster and not give me any problems. Does
that make sense?

Thanks
W

"Mikheev, Vadim" wrote:
> Recommendation: try to split your bulk operation into a few
> transactions with smaller write traffic and run CHECKPOINT commands
> between them.
> [...]
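In other words, a sketch of the pattern W describes (the table and file
names are made up):

-- Bracket the single bulk COPY with explicit checkpoints so old WAL
-- segments can be recycled before and after the heavy write traffic.
CHECKPOINT;
COPY radius_log FROM '/data/radius_daily.copy';
CHECKPOINT;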
> I just put a CHECKPOINT command on both sides of my COPY
> statement, and it seemed to go quite a bit faster and not

Hmm, why would COPY be faster just after a CHECKPOINT? I assume this
was caused by some fortuity (like preparing a new log file at
checkpoint time; the server does this even if wal_files = 0, whenever
the current log file is 75% full).

> give me any problems. Does that make sense?

Of course. Also remember:

> > Third, unfortunately (from my POV), we currently require two
> > checkpoints in the log files, so we do not remove files with
> > records between the last two checkpoints.

Some additional CHECKPOINTs may be required. BUT, I forgot about a
fourth point: the server logs *entire* data pages on the first
modification of a page after the last checkpoint. This doesn't affect
COPY-ed tables (when the first row goes to a new page, just this row
will be logged), but it may affect indices and UPDATE/DELETE ops. So,
one should be very careful when using CHECKPOINTs.

Vadim
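Vadim's fourth point may also explain why the original poster's
existing drop-and-rebuild routine helps: a freshly COPY-ed table writes
only row-level WAL records, while touching existing index pages right
after a checkpoint forces whole-page images into the log. A sketch
under that assumption (index and table names are made up):

-- Drop indexes before the bulk COPY so no index pages are modified
-- (and therefore no full-page images are logged) during the load.
DROP INDEX radius_log_user_idx;

COPY radius_log FROM '/data/radius_daily.copy';
CHECKPOINT;

-- Rebuild the index afterwards in one pass.
CREATE INDEX radius_log_user_idx ON radius_log (username);
CHECKPOINT;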
I am trying to determine how many seconds a query takes, and I can't
figure it out. I seem to remember some weird, mostly undocumented query
that went like:

foo# get diagnostics select processed into <int4 variable>;
foo# show <int4 variable>;

But this doesn't work, and I think it may only work in a plpgsql
function anyway. I know how to use EXPLAIN, but I want to see whether
postgres is correct about which query is actually fastest. Does anybody
know how to find this out?

Thx
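Two common ways to measure actual query time, neither of which is
mentioned in the thread (both assume a psql client and a server version
recent enough to support them; the table name is made up):

-- In psql, the \timing toggle reports elapsed wall-clock time for
-- each statement:
\timing
SELECT count(*) FROM radius_log;

-- EXPLAIN ANALYZE actually executes the query and reports the real
-- run time, rather than just the planner's cost estimate:
EXPLAIN ANALYZE SELECT count(*) FROM radius_log;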