Is it possible to track "value explosions" and out of range errors

Posted: Thu Jul 23, 2020 9:43 pm
by duytanph
Hello,

I am working on creating my own mechanism .mod file and running into large value errors such as:

Code: Select all

exp(1.86734e+184) out of range, returning exp(700)
No more errno warnings during this execution
Without getting into too much detail, the mechanism works completely fine without any "out of range" errors when running on a single processor without any parallelization.

However, once I start simulating a large neuronal network (across anywhere from 50 to 100 CPUs), I begin seeing multiple out of range errors. The main problem is that the errors do not specify which variable is "exploding", making it difficult to find the origin of the error. I've tried using h.Vector() to record the different variables, but it seems that only some instances of the mechanism (out of millions) are exploding, which makes it hard to pinpoint the problem.

Is there anything I can do to help specify or print out where these out of range values are actually originating from? Thanks.

Re: Is it possible to track "value explosions" and out of range errors

Posted: Fri Jul 24, 2020 3:55 pm
by ted
Interesting problem. How big does the network itself have to be (total # cells, total # synapses)? At what point (time) in the simulation does it occur?

The first thing to do is to see if there are any peculiarities about the NMODL code itself that might cause problems, and the quickest way to do that is for you to email the mod file to me
ted dot carnevale at yale dot edu

Hopefully that will be sufficient; if not, further diagnosis will be much more involved.

Re: Is it possible to track "value explosions" and out of range errors

Posted: Tue Jul 28, 2020 4:47 pm
by hines
The exp function is often used for channel rates, where the argument is some function of voltage. Since the argument here is 1.8...e184, the warning could be far downstream of whatever caused the argument to grow so large. It also sometimes occurs during temporary instability while the variable time step method is reducing dt to a numerically stable value, but I doubt you have cvode enabled during your simulations.

Without knowing precisely the best diagnostic sequence for your specific condition, I would start by determining the rank on which the warning occurred. This presumes that you are using a spike-coupled network and that you save the spike raster. With the raster and the rank, you can retrospectively simulate, as a single process on a desktop, the subnet that was on that rank, stimulating it with a PatternStim that feeds the raster into the cells of the subnet so that they reproduce exactly the state trajectories that led to the warning. From there you can use gdb and work backward to the original cause.
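As a rough illustration of the raster bookkeeping described above, here is a minimal pure-Python sketch. It assumes a round-robin distribution of cells (gid % nhost == rank), which may not match your network, and the helper names gids_on_rank and external_spikes are hypothetical, not part of NEURON. It only prepares time-sorted spike times and gids; how PatternStim treats spikes from cells that exist locally should be checked against the NEURON documentation.

```python
def gids_on_rank(all_gids, rank, nhost):
    """Return the set of gids owned by `rank`.

    ASSUMPTION: cells were distributed round-robin (gid % nhost == rank).
    Replace this with whatever rule your own model uses.
    """
    return {gid for gid in all_gids if gid % nhost == rank}


def external_spikes(raster, local_gids):
    """Select spikes whose source cell is NOT on the rank being replayed.

    `raster` is an iterable of (spike_time_ms, source_gid) pairs saved
    from the full parallel run. Returns two parallel lists (times, gids)
    sorted by time, a form that could then be copied into vectors and
    handed to a PatternStim in the single-process replay.
    """
    selected = sorted((t, gid) for t, gid in raster if gid not in local_gids)
    times = [t for t, _ in selected]
    gids = [gid for _, gid in selected]
    return times, gids


# Tiny made-up raster: 6 cells, original run on 2 ranks, replaying rank 1.
raster = [(5.0, 3), (1.0, 0), (2.5, 1), (4.0, 2), (3.0, 5)]
local = gids_on_rank(range(6), rank=1, nhost=2)
times, gids = external_spikes(raster, local)
```

The cells in local would be instantiated for real in the replay, while the (times, gids) pairs stand in for the rest of the network.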

Determining the rank requires rebuilding NEURON from source after modifying nrn/src/oc/math.c. Find line 48, which reads

Code: Select all

        }else if (x > 700) {
and add the following line directly after it:

Code: Select all

    hoc_execerror("hoc_exp with arg > 700", NULL);
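The idea behind this change, turning a clamped value plus a warning into a hard error so that the failing process identifies itself, can be sketched in plain Python. This is only an illustration and not NEURON's code: guarded_exp and its rank argument are hypothetical names, and the limit of 700 is taken from the warning text. In a real parallel run the rank would come from the ParallelContext id.

```python
import math

EXP_ARG_LIMIT = 700.0  # threshold quoted in the "out of range" warning


def guarded_exp(x, rank):
    """exp(x) that fails loudly instead of silently clamping.

    `rank` is supplied by the caller (hypothetical parameter) so that the
    error message says which process saw the oversized argument.
    """
    if x > EXP_ARG_LIMIT:
        raise OverflowError(
            f"exp arg {x!r} > {EXP_ARG_LIMIT} on rank {rank}"
        )
    return math.exp(x)
```

Because the error stops the run at the first oversized argument, the message points at the earliest detected occurrence rather than at a later symptom.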
A lot of the above may be unfamiliar usage for you (e.g. PatternStim; running a fake rank, nhost sim on a single process; and gdb) and if you would like further help, contact me by email at michael dot hines at yale dot edu.