Thread: Performance Problems


Performance Problems

From:
Alex Paulusberger
Date:
Hi,
I came across a problem for which I don't really have an explanation. If
anyone has some ideas, help would be greatly appreciated.

Here we go...

I run Postgres 7.2.1 on a SunFire, 1 GB memory.
I run a Perl batch job (DBI, same machine) that does the following.

1. Iterates through all the records in a table (4,500 records in total),
   selecting one record at a time.
2. Each record is then used to select a handful of records (10-15 on average)
   from another table (400,000 records in total), complements them with
   additional information, and stores the result in a temp table.
3. The temp table is then consulted, and one record is selected and inserted
   into a results table. The results table grows by 4,500 records a day; the
   temp table is primarily used to apply the selection logic by sorting records.

The process:
The whole process loops 4,500 times. For every loop
- the temp table is cleared
- 10-15 records are inserted into the temp table
- one record is selected from the temp table and inserted into the results table.
In between, some calculations are done in Perl.

The total number of selects is ca. 400,000, the total number of table deletes
is 4,500, and the total number of inserts is ca. 400,000.
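
In rough pseudo-Perl, the loop looks something like this (table and column
names are made up for illustration, not the real schema):

    use DBI;

    # Made-up schema, only to illustrate the shape of the per-record loop.
    my $dbh = DBI->connect('dbi:Pg:dbname=mydb', '', '',
                           { AutoCommit => 0, RaiseError => 1 });

    # ~4,500 driver rows
    my $drivers = $dbh->selectall_arrayref(q{SELECT id, key FROM driver_table});

    for my $row (@$drivers) {
        my ($id, $key) = @$row;

        # clear the temp table (this is where DELETE vs. TRUNCATE matters)
        $dbh->do(q{DELETE FROM temp_results});

        # pull the matching 10-15 rows from the 400,000-row table
        my $sth = $dbh->prepare(q{SELECT score FROM big_table WHERE key = ?});
        $sth->execute($key);
        while (my ($score) = $sth->fetchrow_array) {
            # ... complement the record with calculations done in Perl ...
            $dbh->do(q{INSERT INTO temp_results (id, score) VALUES (?, ?)},
                     undef, $id, $score);
        }

        # keep only the "best" row for this record
        $dbh->do(q{INSERT INTO results (id, score)
                   SELECT id, score FROM temp_results
                   ORDER BY score DESC LIMIT 1});
        $dbh->commit;
    }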

The problem:
Initially I am able to evaluate 5-6 records a second and insert them into the
results table, but towards the end of the evaluation the figure drops to under 2.
I optimized the code, use transaction blocks and tuned the database (it has
700 MB of memory allocated to it). What I don't quite understand is why the
system slows down, since the process steps are the same. Could Perl be the reason?

Any ideas?

Thanks
Alex






Re: Performance Problems

From:
Tom Lane
Date:
Alex Paulusberger <alexp@meta-bit.com> writes:
> initially if am able to evaluate 5-6 records a second and insert them
> into the results table, however towards the end
> of the evaluation, the figure drops to under 2.

Do you have suitable index(es) on the temp table?  Is the system actually
using it/them for your queries?
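
For instance, something along these lines (column and index names are only
placeholders), run once outside the batch job, and then check with EXPLAIN in
psql that the plans actually use the indexes:

    # one-time setup; column and index names are placeholders
    $dbh->do(q{CREATE INDEX big_table_key_idx ON big_table (key)});
    $dbh->do(q{CREATE INDEX temp_results_score_idx ON temp_results (score)});
    $dbh->do(q{ANALYZE big_table});   # refresh planner statistics

    # then, in psql, confirm the planner actually uses them:
    #   EXPLAIN SELECT * FROM big_table WHERE key = '...';
    #   EXPLAIN SELECT * FROM temp_results ORDER BY score DESC LIMIT 1;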

            regards, tom lane

Re: Performance Problems

From:
Tom Lane
Date:
Another thought...

Alex Paulusberger <alexp@meta-bit.com> writes:
> The whole process loops 4,500 times.
> For every loop
> - a temp table is cleared

How exactly are you clearing the temp table?  DELETE FROM isn't a good
plan because you'll still have dead tuples in there.  You could do a
DELETE FROM and then VACUUM, but I'd suggest TRUNCATE instead.
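
In other words (taking "temp_results" as a placeholder name for your temp table):

    $dbh->do(q{DELETE FROM temp_results});     # leaves 10-15 dead tuples behind on every loop
    # ...versus...
    $dbh->do(q{TRUNCATE TABLE temp_results});  # reclaims the space at once, but won't run
                                               # inside a transaction block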

            regards, tom lane

Re: Performance Problems

From:
Alex Paulusberger
Date:
Tom,
thanks. I do use DELETE FROM, since TRUNCATE is not an option inside
transaction blocks, and within DBI autocommit has to be turned off when
connecting to the DB.

But maybe you are right, and the overhead of not being able to truncate
the tables is bigger than the gain from using transaction blocks.

Regards
Alex

Tom Lane wrote:

>Another thought...
>
>Alex Paulusberger <alexp@meta-bit.com> writes:
>
>>The whole process loops 4,500 times.
>>For every loop
>>- a temp table is cleared
>>
>
>How exactly are you clearing the temp table?  DELETE FROM isn't a good
>plan because you'll still have dead tuples in there.  You could do a
>DELETE FROM and then VACUUM, but I'd suggest TRUNCATE instead.
>
>            regards, tom lane
>



Re: Performance Problems

From:
nikolaus@dilger.cc
Date:
Alex,

databases are designed for set operations, i.e. they process
multiple rows at a time.  The process you describe is
inherently inefficient: looping through the whole table one
row at a time, with a DBI call for each.  This results in many
unnecessary table scans and lots of overhead for your several
thousand DBI calls.

So you would get the greatest performance improvement if
you could rewrite your logic in a way that completely
eliminates the loop.  Write one query that creates the
whole temp table in one shot, then write a second query
to insert into the results table.  Depending on the
complexity of your calculations you may need to create
your own SQL functions and add them to your database.
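
For example, something roughly like the following (all table, column, and
function names are invented; DISTINCT ON is a PostgreSQL extension that keeps
only the first row of each group, so the ORDER BY plays the role of your
per-loop sort):

    # one statement instead of 4,500 round trips; every name here is invented
    $dbh->do(q{
        INSERT INTO results (driver_id, score)
        SELECT DISTINCT ON (d.id)
               d.id, my_calc(b.score)          -- my_calc: a user-defined SQL function
        FROM   driver_table d
        JOIN   big_table    b ON b.key = d.key
        ORDER  BY d.id, my_calc(b.score) DESC
    });
    $dbh->commit;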

Regards,
Nikolaus Dilger

On Fri, 23 August 2002, Alex Paulusberger wrote:

> I run Postgres 7.2.1 on a SunFire, 1 GB memory.
> I run a Perl batch job (DBI, same machine) that does the following.
> [...]
> The whole process loops 4,500 times.
> For every loop
> - the temp table is cleared
> - 10-15 records are inserted into the temp table
> - one record is selected from the temp table and inserted into the
>   results table.
> [...]
> Initially I am able to evaluate 5-6 records a second and insert them
> into the results table, but towards the end of the evaluation the
> figure drops to under 2. [...] What I don't quite understand is why
> the system slows down, since the process steps are the same. Could
> Perl be the reason?

Re: Performance Problems

From:
"Thomas A. Lowery"
Date:
Alex,

        You're able to alter the AutoCommit value during processing:

        $dbh->{AutoCommit} = 0;                           # start a transaction block
        $dbh->do( qq{insert into yucky values ('A')} );
        $dbh->commit;                                     # end it

        $dbh->{AutoCommit} = 1;                           # leave transaction mode...

        $dbh->do( qq{truncate table yucky} );             # ...so TRUNCATE is allowed here

        $dbh->{AutoCommit} = 0;                           # back into a transaction block
        $dbh->do( qq{insert into yucky values ('A')} );
        $dbh->commit;

        Do you need to use a temp database table? If so, have you
        looked at using a memory-resident table (DBD::AnyData)?

        Another thing I've found that helps is to cache reference or lookup data.
        Memoize is an easy way to cache.  This only works for data that doesn't
        change between queries (e.g. select x from y where z = 1 always returns 'F').
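
        A minimal sketch of the Memoize approach (the query and sub name are
        made up; it assumes an existing $dbh handle):

        use Memoize;

        # cache a lookup whose answer never changes during the run
        sub lookup_ref {
            my ($code) = @_;
            my ($val) = $dbh->selectrow_array(
                q{SELECT x FROM y WHERE z = ?}, undef, $code);
            return $val;
        }
        memoize('lookup_ref');   # repeated calls with the same $code skip the database entirely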


Tom



On Fri, Aug 23, 2002 at 11:21:22PM +0900, Alex Paulusberger wrote:
> Tom,
> thanks. I do use DELETE FROM, since TRUNCATE is not an option inside
> transaction blocks, and within DBI autocommit has to be turned off when
> connecting to the DB.
>
> But maybe you are right, and the overhead of not being able to truncate
> the tables is bigger than the gain from using transaction blocks.
>
> Regards
> Alex
>
> Tom Lane wrote:
>
> >Another thought...
> >
> >Alex Paulusberger <alexp@meta-bit.com> writes:
> >
> >>The whole process loops 4,500 times.
> >>For every loop
> >>- a temp table is cleared
> >>
> >
> >How exactly are you clearing the temp table?  DELETE FROM isn't a good
> >plan because you'll still have dead tuples in there.  You could do a
> >DELETE FROM and then VACUUM, but I'd suggest TRUNCATE instead.