swarm-support

Strange occurrence of NaN


From: Matthew Mitchell
Subject: Strange occurrence of NaN
Date: Wed, 17 May 2000 15:35:52 +1000 (EST)

Hi, 

I have been using Swarm version 1.4.1 (and previous versions) on Linux
(Red Hat 5.2) for some time without any problems. However, just recently I
have noticed NaNs occurring. I haven't noticed this before
(and I have been carefully checking for this type of thing), so I am
assuming that I must have started doing something I shouldn't, but for the
life of me I have no idea what.

The problem is with the following code:

tempStrength = tempStrength + ([agentModel getLearningRate] *
            ((rewardStrength + (discountRate * Qreturn)) - tempStrength));

After executing this the first time the value of tempStrength is NaN.
Subsequent executions of the method in other objects seem ok. I can't
see any obvious memory allocation problems that may cause this. 

I have placed the following debugging code (which simply checks all
the prior values) around the statement:

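        /* Format string split into adjacent literals; the explicit \n and
           padding reproduce the line break seen in the output below. */
        printf("\n 1. Node: %ld, tempStrength: %f, rewardStrength: %f, "
               "\n                Qreturn: %f learningRate: %f discountRate: %f",
               nodeNumber, tempStrength, rewardStrength,
               Qreturn, [agentModel getLearningRate], discountRate);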
            
        tempStrength = tempStrength + ([agentModel getLearningRate] *
            ((rewardStrength + (discountRate * Qreturn)) - tempStrength));

        printf("\n 2. Node: %ld, tempStrength: %f",nodeNumber, tempStrength);

Here is an extract of the output on two objects:

 1. Node: 11, tempStrength: 0.000000, rewardStrength: 0.000000, 
                Qreturn: 0.000000 learning: 0.200000 discount: 0.900000
 2. Node: 11, tempStrength: NaN

 1. Node: 19, tempStrength: 0.000000, rewardStrength: 0.000000, 
                Qreturn: 0.000000 learning: 0.200000 discount: 0.900000
 2. Node: 19, tempStrength: 0.000000

The first time this code is executed (for node 11) it results in
tempStrength having a value of NaN, but for node 19 the result is zero. In
both cases, the values used in the calculation are the same.
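
One way to narrow this down might be to test every operand with isnan()
just before the update, and to pass the learning rate in as a plain value
so that the number printed is exactly the number used in the calculation.
The following is only a minimal sketch in plain C, assuming all the
quantities are doubles. The function names q_update and
report_nan_operands are hypothetical and are not part of Swarm.

```c
#include <math.h>
#include <stdio.h>

/* Hypothetical helper: the same update as in the post, written as a
   function of plain double values. */
static double q_update(double strength, double learningRate,
                       double reward, double discountRate, double qReturn)
{
    return strength + learningRate *
           ((reward + discountRate * qReturn) - strength);
}

/* Hypothetical helper: reports every operand that is NaN on stderr and
   returns how many were found. %g is used instead of %f so that very
   small non-zero values are not displayed as 0.000000. */
static int report_nan_operands(long node, double strength,
                               double learningRate, double reward,
                               double discountRate, double qReturn)
{
    const char *names[5] = { "tempStrength", "learningRate",
                             "rewardStrength", "discountRate", "Qreturn" };
    double values[5];
    int i, bad = 0;

    values[0] = strength;
    values[1] = learningRate;
    values[2] = reward;
    values[3] = discountRate;
    values[4] = qReturn;

    for (i = 0; i < 5; i++) {
        if (isnan(values[i])) {
            fprintf(stderr, "Node %ld: %s is NaN before the update\n",
                    node, names[i]);
            bad++;
        } else {
            fprintf(stderr, "Node %ld: %s = %g\n", node, names[i], values[i]);
        }
    }
    return bad;
}
```

In the method itself this would mean storing [agentModel getLearningRate]
in a local double first, calling report_nan_operands with it, and then
passing the same local to q_update. If no operand is reported as NaN and
the result still is, the inputs themselves are probably not the cause.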

Since tempStrength is an instance variable, I thought I would change it to
a local variable (newValue) for the method to see what happens.  The code
is as follows: 

        newValue = tempStrength;
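        /* Format string split into adjacent literals; the explicit \n and
           padding reproduce the line break seen in the output below. */
        printf("\n 1. Node: %ld, tempStrength: %f, rewardStrength: %f, "
               "\n                Qreturn: %f learningRate: %f discountRate: %f, newValue: %f",
               nodeNumber, tempStrength, rewardStrength,
               Qreturn, [agentModel getLearningRate], discountRate, newValue);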
            
        newValue = newValue + ([agentModel getLearningRate] *
            ((rewardStrength + (discountRate * Qreturn)) - newValue));

        printf("\n 2. Node: %ld, tempStrength: %f, newValue: %f",
               nodeNumber, tempStrength, newValue);
        tempStrength = newValue;

Here is the result:

 1. Node: 11, tempStrength: 0.000000, rewardStrength: 0.000000, 
                Qreturn: 0.000000 learning: 0.200000 discount: 0.900000, 
newValue: 0.000000
 2. Node: 11, tempStrength: 0.000000, newValue: 0.000000

This appears to fix the problem, but in fact it doesn't.  If I
continue to run the simulation, the NaN for newValue for Node 11 appears
much later (it also appears for a couple of other nodes as well). Can
anyone suggest why this might be occurring?  I would have thought the type
of error which leads to something like this would eventually result in a
segmentation fault, but the simulation continues happily along, apparently
indefinitely.

I would appreciate it if anyone could make any suggestions as to what
may be happening here.

Thanks,

Matt




