Обсуждение: Function won't complete

Поиск
Список
Период
Сортировка

Function won't complete

От
"Josh Berkus"
Дата:
Folks,

I am currently testing a massive data update function, which takes a
large amount of raw data (in 3 tables comprising over 2 million records
total) and organizes and cleans that data in a series of successive
steps for the final relational database.

The problem is, it doesn't complete.  About 1/3 of the way in to the
data updates, postgres gets stuck in an endless cycle of recycling
transaction logs and child processes and won't finish (current longest
abort time is 37 minutes).

I've upped the various memory parameters significantly ... working with
wal_files 32, wal_buffers 32, sort_mem 65536, shared_buffers 256, etc.The CPU (1400mhz athalon) and memory (512Mb DDR
RAM)are hardly being
 
taxed.  The only hardware bottleneck I can see is that the system only
has one IDE drive (though a fast, large one).  I've added many indexes.

Does anyone have an suggestions on making this process complete, short
of running out and buying a RAID SCSI array?

-Josh Berkus


Re: Function won't complete

От
Tom Lane
Дата:
"Josh Berkus" <josh@agliodbs.com> writes:
> I am currently testing a massive data update function, which takes a
> large amount of raw data (in 3 tables comprising over 2 million records
> total) and organizes and cleans that data in a series of successive
> steps for the final relational database.

> The problem is, it doesn't complete.  About 1/3 of the way in to the
> data updates, postgres gets stuck in an endless cycle of recycling
> transaction logs and child processes and won't finish (current longest
> abort time is 37 minutes).

You sure it's not just a programming error (the proverbial unconstrained
join, for example)?

Without seeing the query(s) I doubt we can help you much.
        regards, tom lane


Re: Function won't complete

От
"Josh Berkus"
Дата:
Tom,

> Uh, what PG version are you running?  7.1.3 or later should not have
> a
> problem with WAL bloat because of long transactions.

7.2.1.  The problem is not specifically the WAL bloat ... it's
large/many WAL files combined with single-IDE-disk-access.

To recap, because I think this experience might benefit other folks on
this list:

The setup:
1. I had a large, data-heavy system that, because of problems at the
client's site, I had to move to my development machine.
2. Unlike the client's server (which has dual-controller ultra SCSI
drives), my development machine has a single 30gb IDE drive.
3. I was debugging the process of moving data from the client's
immense, not-normalized, accounting database to a smaller, normalized,
backend database for a web interface.  
4. The accounting data consists primarily of six tables with 110,000,
900,000, 110,000, 800, 250,000, and 300 records.
5. This data was fed into a function that tests for irregular data and
flags it, fills in missing defaults, aggregates the data, and loads it
into 7 tables in the new database with a better relational structure
and more normalization controls.

The Problem:
The function never completed.  Even when left to a running time of 3.5
hours, it was still "thinking."

Analysis:
By upping the debug level and tailing the logs, I could see what was
happening.  Because all of the activity was taking place inside a
single function, the database had to be prepared to roll it all back as
a single transaction.  As a result, with each succeeding operation
within the function, the calls to the transaction log got longer and
slower.  On a dual-SCSI or RAID system, this probably would have finished OK
anyway.  But, on a consumer-grade single IDE drive, the system kept
suffering longer and longer waits for reading back data from the drive
(both the database and the WAL files) before it could proceed to the
next step.A check of vmstat confirmed this.  The CPU was not swamped (60-92%
activity) and the system RAM was barely touched (about 128 of 512 mb in
use).  However, there was a constant stream of disk I/O which hovered
around the limit for the drive.

The Solution
Since my system was not to be the permanent home of the application, I
didn't want to run out and spend $700 on drives and a SCSI controller.What to do?
1. The first step was to break up the single function into 6 functions
that write their results to a log.  Each succeeding function would then
check the log for the previous function's result before proceeding, and
run VACUUM ANALYZE between functions.
2. The second step was to check Explain on each of the significant data
steps.  Through this, I discovered three missing indexes (including one
column where I was doing a SELECT DISTINCT ON, a real
perfromance-killer if you don't have an index).  I created those
indexes.
3. I wrote a shell script to launch all 6 functions in succession.
4. The re-built operation now completes in about 35 minutes.

-Josh Berkus




______AGLIO DATABASE SOLUTIONS___________________________                                      Josh Berkus Complete
informationtechnology      josh@agliodbs.com  and data management solutions       (415) 565-7293 for law firms, small
businesses       fax 621-2533   and non-profit organizations.      San Francisco
 


Re: Function won't complete

От
Tom Lane
Дата:
"Josh Berkus" <josh@agliodbs.com> writes:
> Analysis:
> By upping the debug level and tailing the logs, I could see what was
> happening.  Because all of the activity was taking place inside a
> single function, the database had to be prepared to roll it all back as
> a single transaction.  As a result, with each succeeding operation
> within the function, the calls to the transaction log got longer and
> slower.  

This analysis is nonsense, because PG does not rely on WAL for
transaction rollback, and the amount of WAL activity is *not*
proportional to transaction length.  (At least not since 7.1.2.)

It might be that the real issue is growth of the list of pending
triggers, if you have deferred triggers (eg RI triggers) on the
relations you are updating.  Or it could be something else; since you
say the thing remains I/O-bound, it seems like we must be talking about
tuple access or updates somewhere.  (If VACUUM between steps helps, it
might just be accumulation of dead tuples.)  However, until you drop
your focus on the WAL we'll not find out what's really the bottleneck...
        regards, tom lane


Re: Function won't complete

От
"Josh Berkus"
Дата:
Tom,

> This analysis is nonsense, because PG does not rely on WAL for
> transaction rollback, and the amount of WAL activity is *not*
> proportional to transaction length.  (At least not since 7.1.2.)
<snip>
> However, until you drop
> your focus on the WAL we'll not find out what's really the
> bottleneck...

Sorry, Tom.  I was used to the problems of 7.1.2, and didn't really
"get it" when you told me things had changed.

I still say it's the disk I/O, and I think your explanation of dead
tuples makes a lot of sense.   The debug log is full of this:

DEBUG:  proc_exit(0)
DEBUG:  shmem_exit(0)
DEBUG:  exit(0)
DEBUG:  reaping dead processes
DEBUG:  child process (pid 11933) exited with exit code 0
DEBUG:  proc_exit(0)
DEBUG:  shmem_exit(0)
DEBUG:  exit(0)
DEBUG:  reaping dead processes
DEBUG:  child process (pid 11939) exited with exit code 0

And each one of the cycles about takes 5-10 minutes.   

I'm a little reluctant to dump everything to the list, as we're talking
about a lot of data and code.  Lemme do some judicious editing and I'll
send you a gzip package this week.

-Josh Berkus


convert from sybase to postgresql

От
"Frank Morton"
Дата:
Anyone know of a utility to convert a sybase .db
file to something postgres can use? If not, to convert
to anything that I know the file format of to itself
to convert into postgres. All preferably open-source.

Thanks.




Re: convert from sybase to postgresql

От
"Nick Fankhauser"
Дата:
Frank-

Try this link:

http://pgadmin.postgresql.org/pgadmin2.php?ContentID=15



You might find some other clues in the techdocs:

http://techdocs.postgresql.org/

-Nick

--------------------------------------------------------------------------
Nick Fankhauser  nickf@ontko.com  Phone 1.765.935.4283  Fax 1.765.962.9788
Ray Ontko & Co.     Software Consulting Services     http://www.ontko.com/



> -----Original Message-----
> From: pgsql-sql-owner@postgresql.org
> [mailto:pgsql-sql-owner@postgresql.org]On Behalf Of Frank Morton
> Sent: Monday, April 29, 2002 6:09 PM
> To: pgsql-sql@postgresql.org
> Subject: [SQL] convert from sybase to postgresql
> 
> 
> Anyone know of a utility to convert a sybase .db
> file to something postgres can use? If not, to convert
> to anything that I know the file format of to itself
> to convert into postgres. All preferably open-source.
> 
> Thanks.
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
> 
> http://www.postgresql.org/users-lounge/docs/faq.html
>