Re: SQL Loader?

Поиск
Список
Период
Сортировка
От Guy Fraser
Тема Re: SQL Loader?
Дата
Msg-id 3B9CF40A.1D84D909@incentre.net
обсуждение исходный текст
Список pgsql-general
Joel Pang wrote:
>
> Hi Fraser,
>
> I've read your mail in postgresql web site that you've a procedure that will do bulk loading of records into
database. I've been looking a utility just like the sql loader of Oracle for the Postgres database.  I was wondering if
youwill share your procedure will me? or give me some URL links that I can get reference on how to write my own sql
loaderprocedure? 
>
> Thanks in advance for your help.
>
> Cheers,
> Joel

Please not the script below uses some special program to modify radius
accounting files for import as an ordered tab delimited file format.

The script also shows a way to import into an intermediate table for
seperating different uniqu records into seperate files then moving the
duplicate entries to a table for data integrity.

This also demonstrates a method of using tables determined from the data
being imported.

I wrote this software a long time ago and have spent little time
patching or rewriting. Since the script was initially written, I now
know some better ways of performing some of these tasks. I don't have
time to rewrite the script and it has worked for over 3 years so it is
low priority.

Guy Fraser

======== begin radimport ========
#!/bin/bash
ORIG=`pwd`
WORKDIR='/usr/local/pgsql'

cd $WORKDIR
echo Start collecting files
wget -P $WORKDIR/ -t 10 -T 30 -c \
    ftp://username:password@host.domain/path/*.acct.gz   # modified for
security
echo Finished collecting files
echo Start decompressing files
gunzip -v *.acct.gz
echo Finished decompressing files

for fname in `ls *.acct`;
do {
    [ -f $fname ] && {
        date
        echo Start inputing $fname
        # parserad - convert radius detail file to tab delimited format.
        /usr/local/sbin/parserad -i $fname \
            | /usr/local/pgsql/bin/psql -c "\
                copy detail from stdin;" radius
        echo Finished inputing $fname
        echo Start compressing $fname
        gzip -v $fname
        echo Finished compressing $fname
#
#
# Clean up detail
#
/usr/local/bin/psql -c "vacuum detail;" radius
#
# If more than one month determine Current and Previous, Month and Year.
#
#
MINTIME=`/usr/local/pgsql/bin/psql -c "\
select date_trunc('month',min(\"Time-Stamp\")::abstime) from detail;\
" -A -t radius`
MINMON=`echo $MINTIME | awk '{print $2}' -`
MINYEAR=`echo $MINTIME | awk '{print $5}' -`
MAXTIME=`/usr/local/pgsql/bin/psql -c "\
select date_trunc('month',max(\"Time-Stamp\")::abstime) from detail;\
" -A -t radius`
MAXMON=`echo $MAXTIME | awk '{print $2}' -`
MAXYEAR=`echo $MAXTIME | awk '{print $5}' -`
[ "$MAXYEAR" = "" ] && (echo "Exiting: No Data in detail table." || exit
1)
        echo Moving $fname
        mv $fname.gz /mnt/sdb3/done/$MAXYEAR
        echo Start processing data from $fname
#
# Process records in detail file and create a unique record file called
radius.
#
#
        echo Creating lookup table
/usr/local/bin/psql -c "\
select min(oid) as recnum,max(\"Acct-Session-Time\"),\

\"Acct-Status-Type\",\"Acct-Session-Id\",\"NAS-IP-Address\",\"NAS-Port-Id\",\
 \"User-Name\",\"Realm\",\"Framed-IP-Address\",\"Calling-Station-Id\" \
into radius \
from detail \
group by
\"Acct-Session-Id\",\"NAS-IP-Address\",\"NAS-Port-Id\",\"User-Name\",\

\"Realm\",\"Framed-IP-Address\",\"Calling-Station-Id\",\"Acct-Status-Type\";\
" radius
/usr/local/bin/psql -c "vacuum radius;" radius

#
#
# Move stop records to stop table
#
        echo Creating unique stop record table
/usr/local/bin/psql -c "\
select recnum into radstop from radius where \"Acct-Status-Type\" =
'Stop';\
" radius
        echo Filtering stop records
/usr/local/bin/psql -c "\
select a.* into stop from detail as a,radstop as b where a.oid =
b.recnum;\
" radius
/usr/local/bin/psql -c "select count(*) as \"Filtered\" from stop;"
radius
        echo Cleaning stop records
/usr/local/bin/psql -c "\
delete from detail where oid = radstop.recnum;\
" radius
        echo Cleaning temporary tables
/usr/local/bin/psql -c "drop table radstop;" radius
/usr/local/bin/psql -c "vacuum detail;" radius

#
# Move start and alive records to start table
#
        echo Creating unique start record table
/usr/local/bin/psql -c "\
select recnum into radstart from radius where \"Acct-Status-Type\" !=
'Stop';\
" radius
        echo Filtering start records
/usr/local/bin/psql -c "\
select a.* into start from detail as a,radstart as b where a.oid =
b.recnum;\
" radius
/usr/local/bin/psql -c "select count(*) as \"Filtered\" from start;"
radius
        echo Cleaning start records
/usr/local/bin/psql -c "\
delete from detail where oid = radstart.recnum;\
" radius
        echo Cleaning temporary tables
/usr/local/bin/psql -c "drop table radstart;" radius
/usr/local/bin/psql -c "drop table radius;" radius
/usr/local/bin/psql -c "vacuum detail;" radius

#
# Move rest of records to dups table
#
        echo Filtering duplicate records
/usr/local/bin/psql -c "select * into dups from detail;" radius
/usr/local/bin/psql -c "select count(*) as \"Filtered\" from dups;"
radius
        echo Cleaning duplicate records
/usr/local/bin/psql -c "delete from detail;" radius
        echo Cleaning temporary tables
/usr/local/bin/psql -c "vacuum detail;" radius

#
#
# Determine if more than one month.
#
        [ "$MAXMON" != "$MINMON" ] && {
            echo  Previous Month
#
# Starting Previous Month
#
#
#
#
# Put Stop records in Previous stop
#
#
        echo Inserting stop records into ${MINMON}start
/usr/local/bin/psql -c "\
insert into ${MINMON}stop \
 select * \
 from stop \
 where \"Time-Stamp\" between \
  date_trunc('month','$MINMON 1 $MINYEAR 00:00:00'::abstime) \
   and date_trunc('month','$MINMON 1 $MINYEAR 00:00:00'::abstime \
    + '32 days')::abstime - '1 second';
" radius
#
# Put Start and Alive records in Previous start
#
        echo Inserting start records into ${MINMON}start
/usr/local/bin/psql -c "\
insert into ${MINMON}start \
 select * \
 from start \
 where \"Time-Stamp\" between \
  date_trunc('month','$MINMON 1 $MINYEAR 00:00:00'::abstime) \
   and date_trunc('month','$MINMON 1 $MINYEAR 00:00:00'::abstime \
    + '32 days')::abstime - '1 second';\
" radius
#
# Put rest of data in Previous dups
#
        echo Inserting duplicate records into ${MINMON}dups
/usr/local/bin/psql -c "\
insert into ${MINMON}dups \
 select * \
 from dups \
 where \"Time-Stamp\" between \
  date_trunc('month','$MINMON 1 $MINYEAR 00:00:00'::abstime) \
   and date_trunc('month','$MINMON 1 $MINYEAR 00:00:00'::abstime \
    + '32 days')::abstime - '1 second';\
" radius
#
#
# clean up after previous month.
#
echo Finished processing data from $fname for $MINMON $MINYEAR
date
echo Cleaning up after $fname
/usr/local/bin/psql -c "vacuum ${MINMON}stop;" radius
/usr/local/bin/psql -c "vacuum ${MINMON}start;" radius
/usr/local/bin/psql -c "vacuum ${MINMON}dups;" radius
date
echo Making Daily and Monthly Tables for $MINMON
/usr/local/pgsql/make-day-time $MINMON
}
        echo Current Month
#
# Starting Current Month
#
#
#
#
# Put Stop records in Current stop
#
        echo Inserting stop records into ${MINMON}stop
/usr/local/bin/psql -c "\
insert into ${MAXMON}stop \
 select * \
 from stop \
 where \"Time-Stamp\" between \
  date_trunc('month','$MAXMON 1 $MAXYEAR 00:00:00'::abstime) \
   and date_trunc('month','$MAXMON 1 $MAXYEAR 00:00:00'::abstime \
    + '32 days')::abstime - '1 second';\
" radius
/usr/local/bin/psql -c "drop table stop;" radius
#
# Put Stop records in Current stop
#
        echo Inserting start records into ${MINMON}start
/usr/local/bin/psql -c "\
insert into ${MAXMON}start \
 select * \
 from start \
 where \"Time-Stamp\" between \
  date_trunc('month','$MAXMON 1 $MAXYEAR 00:00:00'::abstime) \
   and date_trunc('month','$MAXMON 1 $MAXYEAR 00:00:00'::abstime \
    + '32 days')::abstime - '1 second';\
" radius
/usr/local/bin/psql -c "drop table start;" radius
#
# Put rest of data in Current dups
#
        echo Inserting duplicate records into ${MINMON}dups
/usr/local/bin/psql -c "\
insert into ${MAXMON}dups \
 select * \
 from dups \
 where \"Time-Stamp\" between \
  date_trunc('month','$MAXMON 1 $MAXYEAR 00:00:00'::abstime) \
   and date_trunc('month','$MAXMON 1 $MAXYEAR 00:00:00'::abstime \
    + '32 days')::abstime - '1 second';\
" radius
/usr/local/bin/psql -c "drop table dups;" radius
#
#
#
#
#
#
# clean up after current month.
#
echo Finished processing data from  $fname for $MAXMON $MAXYEAR
date
echo Cleaning up after $fname
/usr/local/bin/psql -c "vacuum ${MAXMON}stop;" radius
/usr/local/bin/psql -c "vacuum ${MAXMON}start;" radius
/usr/local/bin/psql -c "vacuum ${MAXMON}dups;" radius
date
echo Making Daily and Monthly Tables for $MAXMON
/usr/local/pgsql/make-day-time $MAXMON
    }
}
date
done
echo "Finished, no more files to process ."
cd $ORIG
======== end radimport ========

--
There is a fine line between genius and lunacy, fear not, walk the
line with pride. Not all things will end up as you wanted, but you
will certainly discover things the meek and timid will miss out on.

В списке pgsql-general по дате отправления:

Предыдущее
От: Stephan Szabo
Дата:
Сообщение: Re: Performance question
Следующее
От: Culley Harrelson
Дата:
Сообщение: unicode in 7.1