Обсуждение: performance issues with DBI module when data too big

Поиск
Список
Период
Сортировка

performance issues with DBI module when data too big

От
"Nicolas Nolst"
Дата:
<div style="background-color:"><div><div><div></div><div></div><p>Hi all,<p>I have developped a perl script to populate
adatabase with two tables: sessions and actions.<p>the table actions contains the following columns: session_id, url,
timestamp.The column session_id references to the table sessions. <p>the table sessions contains the following columns:
session_id,remote_ip, phone_type, phone_number. The column session_id is serial.<p>The lines of the table actions which
arepart of the same session have the same session_id.<p>There are then more lines in the table actions than in the
tablesession.<p><br />To fill the two tables, I first need to know if the session already exists for a certain
phone_type,a certain remote_ip and a certain phone_number:<p>SELECT session_id FROM sessions WHERE (phone_number =
?) AND(remote_ip  = ?) AND (phone_type = ?) ORDER BY session_id;<p>I also need to apply a criteria to know if I have
to adda new entry in the table sessions or not:<p>SELECT (max(timestamp) + ?)<? FROM actions WHERE (session_id = ?);
 <p> <p>If the session already exists I add a line in the table actions with a INSERT<p>If the session doesn't exist or
ifthe criteria is true, I add a line in the table sessions with an INSERT, retrieve the session_id of the line just
addedwith the following request<p>SELECT session_id FROM sessions where (msisdn=?) AND (remote_ip=?) AND
(user_agent=?) ORDER BYsession_id DESC LIMIT 1<p>and the add with a INSERT a line in the table actions.<p>I have put
indexeson sessions(session_id), sessions(msisdn), actions(session_id).<p>My problem is that populating my database is
slower when the data gets bigger and the performance falls dramatically. I thought that is would be improve with my
indexesbut the problem still persists.<p>Could you please give me some clues that could solve this
issue.<p>Thanks.<p><br/> <br /><br />Nicolas Nolst <div></div><div></div><img
src="http://www.3dflags/World/Gif/belgium_gs.gif"/><div></div></div></div></div><br clear="all" /><hr />Join the
world�slargest e-mail service with MSN Hotmail. <a href="http://g.msn.com/1HM300901/158">Click Here</a><br /> 

Re: performance issues with DBI module when data too big

От
Jan Wieck
Дата:
Nicolas Nolst wrote:
[text/html is unsupported, treating like TEXT/PLAIN]

> [...]
> <P>My problem is that populating my database is slower  when the data gets bigger and the performance falls
dramatically.I thought that is would be improve with my indexes but the problem still persists.</P> 
> <P>Could you please give me some clues that could solve this issue.</P>

    Is your database frequently vacuumed and analyzed (and do you
    frequently read frequently asked questions)?

> <P>Thanks.</P>
> <P><BR> <BR><BR>Nicolas Nolst </P>

    And just for the record, do you prefer answers to  HTML  mail
    in Postscript, PDF, DVI or nothing at all?


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



Re: [ADMIN] performance issues with DBI module when data too big

От
Andrew Perrin
Дата:
Are you using the {AutoCommit => 0} argument to DBI->connect()?  If not,
do so, and then add a $dbh->commit; line when you're done with the
inserts. Should help a lot.

----------------------------------------------------------------------
Andrew J Perrin - http://www.unc.edu/~aperrin
Assistant Professor of Sociology, U of North Carolina, Chapel Hill
clists@perrin.socsci.unc.edu * andrew_perrin (at) unc.edu


On Mon, 3 Jun 2002, Nicolas Nolst wrote:

>
> Hi all,
>
> I have developped a perl script to populate a database with two tables:
> sessions and actions.
>
> the table actions contains the following columns: session_id, url,
> timestamp. The column session_id references to the table sessions. 
>
> the table sessions contains the following columns: session_id, remote_ip,
> phone_type, phone_number. The column session_id is serial.
>
> The lines of the table actions which are part of the same session have
> the same session_id.
>
> There are then more lines in the table actions than in the table session.
>
>
> To fill the two tables, I first need to know if the session already
> exists for a certain phone_type, a certain remote_ip and a certain
> phone_number:
>
> SELECT session_id FROM sessions WHERE (phone_number = ?) AND (remote_ip 
> = ?) AND (phone_type = ?) ORDER BY session_id;
>
> I also need to apply a criteria to know if I have to add a new entry in
> the table sessions or not:
>
> SELECT (max(timestamp) + ?)<? FROM actions WHERE (session_id = ?);  
>
>  
>
> If the session already exists I add a line in the table actions with a
> INSERT
>
> If the session doesn't exist or if the criteria is true, I add a line in
> the table sessions with an INSERT, retrieve the session_id of the line
> just added with the following request
>
> SELECT session_id FROM sessions where (msisdn=?) AND (remote_ip=?) AND
> (user_agent=?) ORDER BY session_id DESC LIMIT 1
>
> and the add with a INSERT a line in the table actions.
>
> I have put indexes on sessions(session_id), sessions(msisdn),
> actions(session_id).
>
> My problem is that populating my database is slower  when the data gets
> bigger and the performance falls dramatically. I thought that is would be
> improve with my indexes but the problem still persists.
>
> Could you please give me some clues that could solve this issue.
>
> Thanks.
>
>
>  
>
> Nicolas Nolst
>
> [belgium_gs.gif]
>
> ________________________________________________________________________________
> Join the world’s largest e-mail service with MSN Hotmail. Click Here
>
>