Transition from serial simulation to parallel

General issues of interest both for network and
individual cell parallelization.

Moderator: hines

Sergey Aleksin

Post by Sergey Aleksin »

Hello everybody,

I'm new to NEURON, but I understand that in some cases it can run a simulation in parallel even though the code was written for serial execution, and that the only thing required is to set the proper options in the GUI. My questions are:
1) What number of threads should I specify in Tools->Parallel Computing if I am going to run the simulation on a cluster with multi-core CPUs? Should I provide a hostfile with the names of the cluster nodes and the number of CPUs per node?
2) Should I change any settings under Tools->Point Processes and Tools->Distributed Mechanisms in order to move from serial to parallel simulation?
3) Where can I find documentation for these menu items?
4) Is it possible to configure the software to use only processes (not threads) in order to avoid the "<name> is not thread safe" error?

Thanks in advance,
Sergey
ted
Site Admin
Posts: 6289
Joined: Wed May 18, 2005 4:50 pm
Location: Yale University School of Medicine

Re: Transition from serial simulation to parallel

Post by ted »

Actually, NEURON can be used for three different styles of parallel simulation. The first question is which of these is most appropriate for the particular modeling project.

1. Multithreaded simulation. This is most suitable for complex models of individual cells, where there are at least several thousand states that need to be integrated. It's very easy to implement, since the only requirement is that biophysical mechanisms be "thread safe". Basically this means avoiding situations in which code in different threads tries to write to the same global variable. In most cases the user's original code can be used without having to make any changes at all; on those occasions when changes are necessary, it is often sufficient to simply insert the directive
THREADSAFE
in the NEURON block of one or more NMODL files and then recompile them. Multithreaded simulation works with the GUI, so it is very convenient on a single PC or Mac. Speedup tends to be sublinear in the number of processors, and there is usually little benefit to having more than 12-16 threads.
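
To make "thread safe" concrete, here is a minimal Python sketch of the ownership pattern described above. It is an analogy only, not NMODL and not NEURON's own threading code: the function names, the decay equation, and the step size are all hypothetical. Each thread reads only its own slice of the states and writes only its own output slot, so no two threads ever write to the same variable.

```python
import threading

def advance(states, dt):
    """Advance one thread's private states by a forward-Euler decay step."""
    return [s - dt * s for s in states]

def run_multithreaded(states, nthreads, dt=0.025):
    # Split the states so that no two threads share an element.
    chunks = [states[i::nthreads] for i in range(nthreads)]
    results = [None] * nthreads  # one private output slot per thread

    def worker(i):
        # Thread safe: reads only chunks[i], writes only results[i].
        results[i] = advance(chunks[i], dt)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Reassemble the per-thread results in the original order.
    merged = [0.0] * len(states)
    for i, chunk in enumerate(results):
        merged[i::nthreads] = chunk
    return merged
```

The unsafe variant would have every thread accumulate into one shared global; that is the situation the THREADSAFE directive asserts a mechanism avoids. Python threads will not show a real speedup here, so the sketch illustrates only the data-ownership idea.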

2. "Bulletin board style." This is for embarrassingly parallel problems, that is, problems that require many runs of serial code. Examples include parameter space exploration. In "bulletin board style" parallel simulation, one processor is the "master" processor, and the other processors are "worker" processors. The master processor does this:

REPEAT
  post a "job" to the "bulletin board"
UNTIL all jobs have been posted
REPEAT
  if a result has been posted to the bulletin board, take it from the board
  else
    take a "job" from the bulletin board
    execute the job
    post the result to the bulletin board
UNTIL there are no more jobs or results on the bulletin board

"Worker" processors do this:

REPEAT
  take a job from the bulletin board
  execute it
  post the result back to the bulletin board
UNTIL no more jobs remain to be done

Speedup is proportional to the number of processors as long as there are enough jobs to keep all processors busy. This style requires some reorganization of the user's model code: change it so that a single function call launches a simulation run and the value returned by the function is the result of the run. This can be done purely with hoc, but Python is better if the simulation results are anything other than purely numerical values, since Python can return anything that is pickleable. Requires MPI. Read about the ParallelContext class in the Programmer's Reference:
http://www.neuron.yale.edu/neuron/stati ... arcon.html
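
The master/worker pseudocode above can be sketched in plain Python. This is only an illustration of the pattern: it uses standard-library threads and queues in place of MPI processes, it does not use NEURON's ParallelContext, and run_simulation is a hypothetical stand-in for "one function call launches a run and returns its result".

```python
import queue
import threading

def run_simulation(param):
    """Hypothetical stand-in for one serial simulation run."""
    return param * param

def worker(jobs, results):
    # Take jobs from the board until none remain, posting each result back.
    while True:
        try:
            job_id, param = jobs.get_nowait()
        except queue.Empty:
            return
        results.put((job_id, run_simulation(param)))

def master(params, nworkers=3):
    jobs, results = queue.Queue(), queue.Queue()
    # Post every job to the "bulletin board" before any worker starts.
    for job_id, param in enumerate(params):
        jobs.put((job_id, param))
    helpers = [threading.Thread(target=worker, args=(jobs, results))
               for _ in range(nworkers)]
    for t in helpers:
        t.start()
    worker(jobs, results)  # the master also executes jobs
    for t in helpers:
        t.join()
    # Collect the results and return them in job order.
    collected = {}
    while not results.empty():
        job_id, value = results.get()
        collected[job_id] = value
    return [collected[i] for i in range(len(params))]
```

Because each job is independent, the order in which workers finish does not matter; tagging each result with its job id is what lets the master reassemble them.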

3. Simulation of a model that is distributed over multiple processors. The model can be of a single cell or of a network. If a network model, coupling between cells can be via spikes and/or gap junctions. Balance is essential for good performance. If necessary, one or more cells can be split into multiple pieces and distributed over 2 or more processors to achieve balance. Requires MPI. Read about ParallelContext (see above) and also see these papers which are available from http://www.neuron.yale.edu/neuron/nrnpubs:
Migliore, M., Cannia, C., Lytton, W.W., Markram, H. and Hines, M.L. Parallel network simulations with NEURON. Journal of Computational Neuroscience 21:119-129, 2006.
Brette, R., Rudolph, M., Carnevale, T., Hines, M., Beeman, D., Bower, J.M., Diesmann, M., Goodman, P.H., Harris, F.C.J., Zirpe, M., Natschläger, T., Pecevski, D., Ermentrout, B., Djurfeldt, M., Lansner, A., Rochel, O., Vieville, T., Muller, E., Davison, A., El Boustani, S., and Destexhe, A. Simulation of networks of spiking neurons: a review of tools and strategies. J. Comput. Neurosci. 23:349-398, 2007.
Hines, M.L. and Carnevale, N.T. Translating network models to parallel hardware in NEURON. J. Neurosci. Methods 169:425-455, 2008.
Hines, M.L., Eichner, H. and Schuermann, F. Neuron splitting in compute-bound parallel network simulations enables runtime scaling with twice as many processors. Journal of Computational Neuroscience 25:203-210, 2008.
Hines, M.L., Markram, H. and Schuermann, F. Fully implicit parallel simulation of single neurons. Journal of Computational Neuroscience 25:439-448, 2008.
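
To illustrate why balance matters for a distributed model, here is a rough sketch that assigns whole cells to processors by estimated cost. It is not NEURON's load-balancing code: the greedy "largest cell first" heuristic, the function name, and the cost numbers are all assumptions chosen for illustration. The run time of a distributed simulation is set by the most heavily loaded processor, so the goal is to keep the per-processor loads as even as possible.

```python
import heapq

def balance(costs, nhost):
    """Greedy assignment: give the next-largest cell to the least-loaded host.

    costs maps a cell's gid to a hypothetical complexity estimate.
    Returns (assignment, loads), both keyed by host rank.
    """
    heap = [(0.0, rank) for rank in range(nhost)]
    heapq.heapify(heap)
    assignment = {rank: [] for rank in range(nhost)}
    # Largest cells first; ties broken by gid so the result is deterministic.
    for gid, cost in sorted(costs.items(), key=lambda kv: (-kv[1], kv[0])):
        load, rank = heapq.heappop(heap)
        assignment[rank].append(gid)
        heapq.heappush(heap, (load + cost, rank))
    loads = {rank: sum(costs[g] for g in gids)
             for rank, gids in assignment.items()}
    return assignment, loads
```

For example, balance({0: 8, 1: 5, 2: 4, 3: 3, 4: 2}, 2) places cells 0 and 3 on one host and cells 1, 2, and 4 on the other, for a load of 11 on each.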
Sergey Aleksin

Re: Transition from serial simulation to parallel

Post by Sergey Aleksin »

Hello Ted,

I studied the three models provided as the multisplit demo in the following package:
http://senselab.med.yale.edu/modeldb/Sh ... odel=97985
and noticed that none of the models uses synaptic connections (NetCon objects). Could you point me to an example in the database where multisplit and NetCon are combined?
The model I simulate is a single cell in which NetCon objects deliver a train of presynaptic stimuli (NetStim) to the dendrites. The stimuli are received by a point process with a NET_RECEIVE block that is inserted into the dendrites. For some reason I cannot launch the simulation with more than 1 process: a "Segmentation violation" error appears. The error disappears when I remove the connections between the target dendrites and the cell. Could you provide any help with this problem?
I know that the main purpose of NetCon is to create connections between cells, while the multisplit algorithm concerns a single cell. Could this be the root cause of my problem? If so, could you suggest any changes to my model?

Thanks,
Sergey
ted

Re: Transition from serial simulation to parallel

Post by ted »

Database + full text search today for models implemented with NEURON that contain the words netcon and multisplit generates 5 hits:

1 Cell splitting in neural networks extends strong scaling (Hines et al. 2008)
/splitcell/common/binfo.hoc
gid) sr.sec { pc.cell(spgid, new NetCon(&v(.5), nil), 0) //printf(... with that output port. proc multisplit() {localobj srout, cell $o1.preloc() srout

2 Fully Implicit Parallel Simulation of Single Neurons (Hines et al. 2008)
/multisplit/multisplit.hoc
cb = bi.bilist.object(0) nc = new NetCon(&v(.5), nil) cb.multisplit(nc,... 0) {execute("maxfactor = .3")} proc multisplit() {local c, cm localobj b,

3 Olfactory bulb cluster formation (Migliore et al. 2010)
/migliore2010/weightsave.hoc
synapse movie ncl = new List("NetCon") for i=0, ncl.count-1 { nc... pc.gid_exists(sgid)) {// correct even if multisplit // no way at present to

4 Synchrony by synapse location (McTavish et al. 2012)
/mctavish_syncbylocation/src/split.hoc
pc.cell($1 + 2*splitbit, new NetCon(&v(.5), nil), 0)... cpu // and connect pieces with multisplit // arg is the base gid

5 Large scale model of the olfactory bulb (Yu et al., 2013)
/YuEtAl2012/split.hoc
pc.cell($1 + 2*splitbit, new NetCon(&v(.5), nil), 0)... cpu // and connect pieces with multisplit // arg is the base gid

You can do this yourself by clicking on the ModelSearch link on ModelDB's home page, then selecting NEURON as the simulator and entering
multisplit netcon
in the text search field.
Sergey Aleksin

Re: Transition from serial simulation to parallel

Post by Sergey Aleksin »

Hello Ted,

I want to simulate a cell on a cluster where each machine has a single CPU with multiple cores. I borrowed the parallelization framework from the following model package:
https://senselab.med.yale.edu/modeldb/S ... odel=97985
The problem is that MPI communication becomes very slow when more than one process is running on a machine. It seems clear that multithreading should be used within each MPI process in order to use all the cores effectively. Unfortunately, I cannot find an example of such a model in ModelDB.
Do you know of any model in the database where MPI is combined with multithreading?
If not, is there any other way for NEURON to simulate a cell effectively on a cluster with multicore CPUs?

Thanks,
Sergey
hines
Site Admin
Posts: 1682
Joined: Wed May 18, 2005 3:32 pm

Re: Transition from serial simulation to parallel

Post by hines »

NEURON supports any combination of MPI, threads, multisplit, and parallel gap junctions. There have been some recent bug fixes for some of these combinations, so I recommend using the most recent repository version of the ansi or trunk branches.

My experience is that threads can approach MPI performance but never exceed it. Perhaps some of this is attributable to smart MPI developers, but I've noticed that every improvement in thread performance makes NEURON threads look more like MPI processes. Anyway, the lower performance of threads compared to MPI is a partial explanation of why there are not many mixed models.

Multisplit should be avoided unless it is clearly impossible to decently load balance using whole cells.
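
One rough way to judge whether whole cells can be balanced decently is to compare the largest cell against the ideal per-processor share. The sketch below is an assumption-laden illustration, not a NEURON facility: the function name and cost numbers are hypothetical, and what counts as "decent" is a judgment call. It computes an upper bound on parallel efficiency when cells cannot be split, using the fact that no processor can finish sooner than its largest cell or sooner than the ideal share of the total work.

```python
def whole_cell_efficiency_bound(costs, nhost):
    """Best-case parallel efficiency when cells cannot be split.

    costs is a list of hypothetical per-cell complexity estimates.
    """
    ideal = sum(costs) / nhost            # perfectly even share per host
    bottleneck = max(ideal, max(costs))   # the slowest host takes at least this long
    return ideal / bottleneck
```

With one dominant cell, for instance costs of [100, 10, 10, 10, 10] on 4 processors, the bound is 0.35, which suggests that splitting the large cell is worth considering. With eight equal cells on 4 processors the bound is 1.0, so whole-cell distribution is sufficient in principle.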