Обсуждение: Frequent 'deadlock detected' in 7.4 ... or just my bad code?

Поиск
Список
Период
Сортировка

Frequent 'deadlock detected' in 7.4 ... or just my bad code?

От
"Marc G. Fournier"
Дата:
G'day ...

  I've got a script that runs on all the servers that dump's IP traffic
data to a 7.4 database ... they all run at the same time, but I'm starting
to get the following on a reasonably regular basis:

ERROR:  deadlock detected at /usr/local/abin/ipaudit2ams.pl line 175.

  The code that is causing it, from the times I've been able to catch it,
is a simple update to the same table:

       $upd->execute( $traffic{$company_id}{$ip_id}{$port}, $company_id, $ip_id, $date ) || die $upd->errstr;

  Now, the scripts are wrap'd in a BEGIN/END ... if a file fails to be
loaded, I want the whole thing to rollback ... the deadlock itself, I'm
presuming, is because two servers are trying to update the same
$ip_id/$port/$company_id record, at the same time ...

  Now, reading the DEADLOCKS section at:

    http://www.postgresql.org/docs/7.4/static/explicit-locking.html

  This is to be expected ... but, other then breaking the transaction
itself into smaller chunks, or staggering the scripts run times ... is
there something I'm overlooking to eliminate this?  I could increase the
deadlock timeout, as an option as well ...



----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

Re: Frequent 'deadlock detected' in 7.4 ... or just my bad code?

От
Tom Lane
Дата:
"Marc G. Fournier" <scrappy@postgresql.org> writes:
>   Now, the scripts are wrap'd in a BEGIN/END ... if a file fails to be
> loaded, I want the whole thing to rollback ... the deadlock itself, I'm
> presuming, is because two servers are trying to update the same
> $ip_id/$port/$company_id record, at the same time ...

Actually, the problem is more likely that two servers try to update two
different rows in opposite orders.  It's not possible to deadlock when
only one lock is involved.

You could work around this by ensuring that all sessions update rows in
a consistent order; for instance, at the beginning of a transaction sort
your intended updates by primary key and then apply in that order.

            regards, tom lane

Re: Frequent 'deadlock detected' in 7.4 ... or just my bad

От
"Marc G. Fournier"
Дата:
On Mon, 5 Apr 2004, Tom Lane wrote:

> "Marc G. Fournier" <scrappy@postgresql.org> writes:
> >   Now, the scripts are wrap'd in a BEGIN/END ... if a file fails to be
> > loaded, I want the whole thing to rollback ... the deadlock itself, I'm
> > presuming, is because two servers are trying to update the same
> > $ip_id/$port/$company_id record, at the same time ...
>
> Actually, the problem is more likely that two servers try to update two
> different rows in opposite orders.  It's not possible to deadlock when
> only one lock is involved.
>
> You could work around this by ensuring that all sessions update rows in
> a consistent order; for instance, at the beginning of a transaction sort
> your intended updates by primary key and then apply in that order.

Actually, unless I'm mistaken about how hashes work in perl, the update
order for all servers is the same ... basically what happens is:

1. a traffic table is read in, and loaded into a hash table that is
   ordered by company_id, ip_id and port:

 $traffic{$ip_rec{$ip}{'company_id'}}{$ip_id}{$port} += $bytes1 + $bytes2;

2. a foreach loop is run on that resultant list to do the updates to the
   database:

 foreach $company_id ( keys %traffic ) {
   foreach $ip_id ( keys %{$traffic{$company_id}} ) {
     foreach $port ( keys %{$traffic{$company_id}{$ip_id}} ) {

   and the updates are done based on those 3 values, plus the byte value
of $traffic{$company_id}{$ip_id}{$port} ...

   Now, my first mistake may be that I'm mis-assuming that the hashes will
be read in a sorted order ... ?  If this is the case, though, then sort
order shouldn't be an issue, as all servers would be sorted the same way
...



----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

Re: Frequent 'deadlock detected' in 7.4 ... or just my bad

От
"Marc G. Fournier"
Дата:
On Mon, 5 Apr 2004, Tom Lane wrote:

> "Marc G. Fournier" <scrappy@postgresql.org> writes:
> > D'oh ... just tested my assumption, it was wrong ... *sigh*  okay, back
> > to the drawing board on the code ...
>
> Can't you just change
>
> >> foreach $company_id ( keys %traffic ) {
>
> to
>
> >> foreach $company_id ( sort keys %traffic ) {
>
> etc.

Ya, just saw the note from Matt on that also ... didn't realize it was
*that* simple ... was going to look into using the NULL Berkeley DB driver
... the O'Reilly Programming Perl book that I have, when you look into the
index at the back under Hashes: Automatic Sorting Of talks about the
Berkeley DB driver ... and that's it, nothing about simple sorting like
the above ... even if you look under 'Sorting: Hashes automatically', it
points to the same thing ...

Talk about over-complicating things :(


----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

Re: Frequent 'deadlock detected' in 7.4 ... or just my bad code?

От
Tom Lane
Дата:
"Marc G. Fournier" <scrappy@postgresql.org> writes:
> D'oh ... just tested my assumption, it was wrong ... *sigh*  okay, back
> to the drawing board on the code ...

Can't you just change

>> foreach $company_id ( keys %traffic ) {

to

>> foreach $company_id ( sort keys %traffic ) {

etc.

            regards, tom lane

Re: Frequent 'deadlock detected' in 7.4 ... or just my bad

От
"Matt Clark"
Дата:
> 1. a traffic table is read in, and loaded into a hash table that is
>    ordered by company_id, ip_id and port:
>
>  $traffic{$ip_rec{$ip}{'company_id'}}{$ip_id}{$port} += $bytes1 + $bytes2;
>
> 2. a foreach loop is run on that resultant list to do the updates to the
>    database:
>
>  foreach $company_id ( keys %traffic ) {
>    foreach $ip_id ( keys %{$traffic{$company_id}} ) {
>      foreach $port ( keys %{$traffic{$company_id}{$ip_id}} ) {
>
>    and the updates are done based on those 3 values, plus the byte value
> of $traffic{$company_id}{$ip_id}{$port} ...
>
>    Now, my first mistake may be that I'm mis-assuming that the hashes will
> be read in a sorted order ... ?  If this is the case, though, then sort
> order shouldn't be an issue, as all servers would be sorted the same way

The output of keys(%hash) is NOT ordered!  Try:

  foreach $company_id ( sort keys %traffic ) {
      foreach $ip_id ( sort keys %{$traffic{$company_id}} ) {
        foreach $port ( sort keys %{$traffic{$company_id}{$ip_id}} ) {


Matt


Re: Frequent 'deadlock detected' in 7.4 ... or just my bad

От
"Marc G. Fournier"
Дата:
D'oh ... just tested my assumption, it was wrong ... *sigh*  okay, back
to the drawing board on the code ...

On Mon, 5 Apr 2004, Marc G. Fournier wrote:

> On Mon, 5 Apr 2004, Tom Lane wrote:
>
> > "Marc G. Fournier" <scrappy@postgresql.org> writes:
> > >   Now, the scripts are wrap'd in a BEGIN/END ... if a file fails to be
> > > loaded, I want the whole thing to rollback ... the deadlock itself, I'm
> > > presuming, is because two servers are trying to update the same
> > > $ip_id/$port/$company_id record, at the same time ...
> >
> > Actually, the problem is more likely that two servers try to update two
> > different rows in opposite orders.  It's not possible to deadlock when
> > only one lock is involved.
> >
> > You could work around this by ensuring that all sessions update rows in
> > a consistent order; for instance, at the beginning of a transaction sort
> > your intended updates by primary key and then apply in that order.
>
> Actually, unless I'm mistaken about how hashes work in perl, the update
> order for all servers is the same ... basically what happens is:
>
> 1. a traffic table is read in, and loaded into a hash table that is
>    ordered by company_id, ip_id and port:
>
>  $traffic{$ip_rec{$ip}{'company_id'}}{$ip_id}{$port} += $bytes1 + $bytes2;
>
> 2. a foreach loop is run on that resultant list to do the updates to the
>    database:
>
>  foreach $company_id ( keys %traffic ) {
>    foreach $ip_id ( keys %{$traffic{$company_id}} ) {
>      foreach $port ( keys %{$traffic{$company_id}{$ip_id}} ) {
>
>    and the updates are done based on those 3 values, plus the byte value
> of $traffic{$company_id}{$ip_id}{$port} ...
>
>    Now, my first mistake may be that I'm mis-assuming that the hashes will
> be read in a sorted order ... ?  If this is the case, though, then sort
> order shouldn't be an issue, as all servers would be sorted the same way
> ...
>
>
>
> ----
> Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
> Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664
>

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

Re: Frequent 'deadlock detected' in 7.4 ... or just my bad

От
"Marc G. Fournier"
Дата:
That appears to have fixed it, thanks ... at least it hasn't happened in a
few hours, and it was happening at least once an hour previously ...


On Mon, 5 Apr 2004, Matt Clark wrote:

> > 1. a traffic table is read in, and loaded into a hash table that is
> >    ordered by company_id, ip_id and port:
> >
> >  $traffic{$ip_rec{$ip}{'company_id'}}{$ip_id}{$port} += $bytes1 + $bytes2;
> >
> > 2. a foreach loop is run on that resultant list to do the updates to the
> >    database:
> >
> >  foreach $company_id ( keys %traffic ) {
> >    foreach $ip_id ( keys %{$traffic{$company_id}} ) {
> >      foreach $port ( keys %{$traffic{$company_id}{$ip_id}} ) {
> >
> >    and the updates are done based on those 3 values, plus the byte value
> > of $traffic{$company_id}{$ip_id}{$port} ...
> >
> >    Now, my first mistake may be that I'm mis-assuming that the hashes will
> > be read in a sorted order ... ?  If this is the case, though, then sort
> > order shouldn't be an issue, as all servers would be sorted the same way
>
> The output of keys(%hash) is NOT ordered!  Try:
>
>   foreach $company_id ( sort keys %traffic ) {
>       foreach $ip_id ( sort keys %{$traffic{$company_id}} ) {
>         foreach $port ( sort keys %{$traffic{$company_id}{$ip_id}} ) {
>
>
> Matt
>
>

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

Re: Frequent 'deadlock detected' in 7.4 ... or just my bad

От
jseymour@LinxNet.com (Jim Seymour)
Дата:
"Marc G. Fournier" <scrappy@postgresql.org> wrote:
>
> On Mon, 5 Apr 2004, Tom Lane wrote:
>
> > "Marc G. Fournier" <scrappy@postgresql.org> writes:
> > > D'oh ... just tested my assumption, it was wrong ... *sigh*  okay, back
> > > to the drawing board on the code ...
> >
> > Can't you just change
> >
> > >> foreach $company_id ( keys %traffic ) {
> >
> > to
> >
> > >> foreach $company_id ( sort keys %traffic ) {
> >
> > etc.
>
> Ya, just saw the note from Matt on that also ... didn't realize it was
> *that* simple ... was going to look into using the NULL Berkeley DB driver
> ... the O'Reilly Programming Perl book that I have,...
[snip]

Kind of OT for this mailing list but...

What you just ran into is why I recommend to *anybody*, even the most
experienced of designers/engineers/coders/what-have-you, that they
start with "Learning Perl."  (Also an O'Reilly book.)  Saves no end of
grief ;).

IMO, "Programming Perl" is mainly useful after you already have a good
handle on the language.

Jim