parallelized NEURON does not start with pnm.run()

General issues of interest both for network and
individual cell parallelization.

Moderator: hines

MBeining
Posts: 22
Joined: Thu Apr 03, 2014 8:18 am

parallelized NEURON does not start with pnm.run()

Post by MBeining » Thu Sep 07, 2017 6:39 am

I am very new to parallel NEURON. I am currently trying to set up a configuration in which 2 cells are simply run in parallel (no NetCons between them). This will change later.

However, when I run my code with mpiexec, the NEURON instances are stuck in an endless loop and nothing happens.
Here is the code (cells, point processes, etc. are initialized only on the hosts where they belong, using "if (pc.gid_exists(0))" etc.):

Code:

// ***** Initialize Variables *****
objref f
objref pnm,pc,nil,cvode,strf,tvec,cell,cellList,pp,ppList,con,conList,nilcon,nilconList,rec,recList,rect,rectList,playt,playtList,play,playList,APCrec,APCrecList,APC,APCList,APCcon,APCconList,thissec,thisseg,thisval,maxRa,maxcm 
 cellList = new List() // comprises all instances of cell templates, also artificial cells
 ppList = new List() // comprises all Point Processes of any cell
 conList = new List() // comprises all NetCon objects
 recList = new List() //comprises all recording vectors
 rectList = new List() //comprises all time vectors of recordings


// ***** Load standard libraries *****
io = load_file("stdgui.hoc")
io = xopen("lib_genroutines/genroutines.hoc")

// ***** Initialize parallel manager *****
load_file("netparmpi.hoc")
pnm = new ParallelNetManager(2)
pc = pnm.pc

pc.set_gid2node(0, 0)
pc.set_gid2node(1, 1)

// ***** Load cell morphologies and create artificial cells *****
io = xopen("init_cells.hoc")

// ***** Load mechanisms and adjust nseg *****
io = xopen("init_mech.hoc")


// ***** Place Point Processes *****
io = xopen("init_pp.hoc")


// ***** Define recording sites *****
io = xopen("init_rec.hoc")


// ***** Last settings *****
tstart = 0
tstop = 100+ 0.001 //advances one more step due to roundoff errors for high tstops
dt = 0.001000
steps_per_ms = 1000

// ***** Run NEURON *****
pnm.set_maxstep(10)
if (pc.id()==0){
pnm.prun()
}
print "here is node ", pc.id()


io = xopen("save_rec.hoc")

The "here is node x" message appears for all nodes but node 0. The recordings saved by save_rec.hoc are all zero, so I guess the simulation did not run. What am I doing wrong? I experimented with pc.barrier() and pc.runworker(), but I do not want to submit jobs; I simply want all hosts to run and, if there are synapses between them, to take those into account.
Thanks for any help

ted
Site Admin
Posts: 5057
Joined: Wed May 18, 2005 4:50 pm
Location: Yale University School of Medicine

Re: parallelized NEURON does not start with pnm.run()

Post by ted » Thu Sep 07, 2017 9:33 am

I can't see most of your code, and problems could lurk anywhere e.g. in the code that sets up recording to Vectors, so most of these comments are somewhat general.

1. When doing initial "does this run or not?" development and programming, save time by choosing tstop and dt values that produce short runs, e.g. 5 ms and 0.025 ms.
2. Most bugs are in user-written code, but sometimes user-written code that is "correct" will expose a bug in NEURON itself. There have been many releases of NEURON over time, and a lot of these include bug fixes. This is why it is always a good idea to make sure you are using a recent release of NEURON.
3. NEURON's internals for parallel simulation have been a particular focus of revision and improvement, but sometimes revisions introduce new bugs. This is why we often ask "what version are you using?" The easiest way to tell us is just to start NEURON, then copy the first line that its interpreter prints to the terminal and paste that into your message. It should look something like
NEURON -- VERSION 7.5 master (720143f) 2017-06-12
but the numbers are likely to differ.
4. You wrote: "when I run my code with mpiexec, the NEURON instances are stuck in an endless loop and nothing happens."
Do you mean that model execution continues endlessly? If that's the case, (1) how did you stop it? (2) output files will be empty if you force execution to stop before files are written.
5. The ParallelNetManager seemed like a good idea when it was first developed, but it turned out to be unnecessary, and most parallelized network models have employed ParallelContext directly. See the examples in
Hines, M.L. and Carnevale, N.T.
Translating network models to parallel hardware in NEURON.
J. Neurosci. Methods 169:425-455, 2008
preprint available from http://www.neuron.yale.edu/neuron/stati ... nm2008.pdf
source code available from ModelDB https://senselab.med.yale.edu/ModelDB as model entry 96444.
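
The short-run settings suggested in point 1 could be written as below. This is only a sketch: it assumes the standard run system (stdrun.hoc) is in use, and steps_per_ms = 40 is an added assumption chosen to match dt = 0.025 ms, not a value given in the post.

```hoc
{load_file("stdrun.hoc")}   // standard run system: defines tstop and steps_per_ms

tstop = 5            // ms, short run for "does this run or not?" testing
dt = 0.025           // ms
steps_per_ms = 40    // assumed 1/dt; with a mismatched value the run system may reset dt
```

Once the model runs and produces plausible output, tstop and dt can be set back to the values needed for the real simulation.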

Re: parallelized NEURON does not start with pnm.run()

Post by MBeining » Thu Sep 07, 2017 12:34 pm

Thank you Ted for the fast reply. I changed the code to work only with the ParallelContext, but the problem stays the same.
Here is the complete code:

Code:

// ***** Initialize Variables *****
strdef tmpstr,simfold // temporary string object
objref f
objref pc,nil,cvode,strf,tvec,cell,cellList,pp,ppList,con,conList,nilcon,nilconList,rec,recList,rect,rectList,playt,playtList,play,playList,APCrec,APCrecList,APC,APCList,APCcon,APCconList,thissec,thisseg,thisval,maxRa,maxcm 
 cellList = new List() // comprises all instances of cell templates, also artificial cells
 ppList = new List() // comprises all Point Processes of any cell
 recList = new List() //comprises all recording vectors
 rectList = new List() //comprises all time vectors of recordings
 nilconList = new List() //comprises all NULL object NetCons

// ***** Create empty cell object (parallel NEURON) *****
begintemplate emptyObject
endtemplate emptyObject
objref eObj
eObj = new emptyObject()

// ***** Load standard libraries *****
io = load_file("stdgui.hoc")
io = xopen("lib_genroutines/genroutines.hoc")

// ***** Initialize parallel manager *****
pc = new ParallelContext(2)


pc.set_gid2node(0, 0)
pc.set_gid2node(1, 1)
// ***** Load custom libraries *****


// ***** Load cell morphologies and create artificial cells *****
io = xopen("morphos/hocs/cell_testComp.hoc")
if (pc.gid_exists(0)) {
cell = new cell_testComp()
}else{cell = eObj}
io = cellList.append(cell)
if (pc.gid_exists(1)) {
cell = new cell_testComp()
}else{cell = eObj}
io = cellList.append(cell)
objref cell


// ***** Load mechanisms and adjust nseg *****
if (pc.gid_exists(0)) {
forsec cellList.o(0).allreg {
insert pas
}
}
if (pc.gid_exists(1)) {
forsec cellList.o(1).allreg {
insert pas
}
}

// ***** Place synapses, electrodes or other point processes *****
if (pc.gid_exists(0)) {
cellList.o(0).allregobj.o(0).sec{pp = new IClamp(0.000100)
pp.amp = 1.500000 
pp.del = 20.000000 
pp.dur = 50.000000 
}
io = ppList.append(pp)
}else{for (i=0;i<1;i=i+1) {io = ppList.append(eObj)}}
if (pc.gid_exists(1)) {
cellList.o(1).allregobj.o(0).sec{pp = new IClamp(0.000100)
pp.amp = 1.500000 
pp.del = 10.000000 
pp.dur = 50.000000 
}
io = ppList.append(pp)
}else{for (i=0;i<1;i=i+1) {io = ppList.append(eObj)}}
objref pp


// ***** Define recording sites *****
if (pc.gid_exists(0)) {
rec = new Vector(100001.000000)
rec.label("v at location 0.5000 of section 0 of cell 0")
io = rec.record(&cellList.o(0).allregobj.o(0).sec.v(0.500000),tvec)
io = recList.append(rec)


}else{for (i=0;i<1;i=i+1) {io = recList.append(eObj)}}
if (pc.gid_exists(1)) {
rec = new Vector(100001.000000)
rec.label("v at location 0.5000 of section 0 of cell 1")
io = rec.record(&cellList.o(1).allregobj.o(0).sec.v(0.500000),tvec)
io = recList.append(rec)


}else{for (i=0;i<1;i=i+1) {io = recList.append(eObj)}}
objref rec
objref rect



// ***** Last settings *****
tstart = 0
tstop = 100 + 0.001 //advances one more step due to roundoff errors for high tstops
dt = 0.001000
steps_per_ms = 1000


// ***** Run NEURON *****
pc.set_maxstep(10)
pc.psolve(tstop)

print "hallo", pc.id()

// Save recordings
proc save_rect() {strdef filnam
sprint(filnam,"%s/%s",simfold,$s1)
f = new File()
io = f.wopen(filnam)
io = rectList.o($2).printf(f, "%-20.20g\n")
io = f.close()
}

if (pc.gid_exists(0)) {
save_rec("cell0_sec0_loc0.5000_v.dat",0)
}
if (pc.gid_exists(1)) {
save_rec("cell1_sec0_loc0.5000_v.dat",1)
}


// ***** Make Matlab notice end of simulation *****
{pc.runworker()}
{pc.done()}
if (pc.id()==0){
f = new File()
io = f.wopen("readyflag")
io = f.close()
quit()
}

// *-*-*-*-* END *-*-*-*-*

The cell morphologies that are loaded are simple single-compartment cells:

Code:

begintemplate cell_testComp

proc celldef() {
  topol()
  subsets()
  geom()
  biophys()
  geom_nseg()
is_artificial = 0
}

public soma

public allregobj
public allreg
public alladendreg
public allaxonreg
public regsoma
public is_artificial

create soma[1]

proc topol_1() {
}
proc topol() {
  topol_1()
  basic_shape()
}

proc shape3d_1() {
  soma[0] {pt3dclear()
    pt3dadd(-0.0001, 0, 0, 1)
    pt3dadd(0, 0, 0, 1)
    pt3dadd(0, 0, 1, 1)
  }
}
proc basic_shape() {
  shape3d_1()
}

objref allreg, allregobj, alladendreg, allaxonreg, sec
objref regsoma
proc subsets() { local ward
  allregobj = new List()
  allreg = new SectionList()
  alladendreg = new SectionList()
  allaxonreg = new SectionList()
  regsoma = new SectionList()
  for ward = 0, 0 soma[ward] {
    regsoma.append()
    sec = new SectionRef()
    allregobj.append(sec)
    allreg.append()
  }
}
proc geom() {
}
proc geom_nseg() {
}
proc biophys() {
}
access soma
proc init() {
  celldef()
}

endtemplate cell_testComp

UPDATE: I fixed a bug (pc.psolve should of course run on all hosts), and now all NEURON instances close correctly. However, the vectors that are saved are still all zero.

Re: parallelized NEURON does not start with pnm.run()

Post by MBeining » Fri Sep 08, 2017 9:05 am

I found the problem. Without pnm.run(), I have to call stdinit() myself, of course ;-)

Re: parallelized NEURON does not start with pnm.run()

Post by ted » Fri Sep 08, 2017 10:04 am

Sometimes the easiest problems to fix can be the hardest to find.

What's the point of

Code:

{pc.runworker()}
{pc.done()}
near the end of the program? You're trying to execute a distributed simulation, i.e. one in which different parts of a model are being executed on different processors at the same time. pc.runworker() is for launching bulletin-board-style parallel simulation, in which each processor executes a separate simulation of the same model. Time to re-read the documentation of ParallelContext.

Re: parallelized NEURON does not start with pnm.run()

Post by MBeining » Fri Sep 08, 2017 12:04 pm

So true...

That is a leftover from the ParallelNetManager, whose documentation says:
"At any rate, before we quit we have to call it so that the master can tell all the workers to quit."

pnm.pc.runworker
pnm.pc.done
I also found similar lines in your ring network model on ModelDB
https://senselab.med.yale.edu/ModelDB/S ... hoc#tabs-2
so I thought this was necessary :-)

Re: parallelized NEURON does not start with pnm.run()

Post by ted » Fri Sep 08, 2017 2:43 pm

Ah, I should have scrolled up to examine more of your program than the bit that is visible in phpBB's little code window.

The parallelized code in the article uses
stdinit()
{pc.psolve(tstop)}
to start execution of the parallelized model.
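
For reference, here is a minimal self-contained sketch of that sequence. It is not the article's code: the passive one-compartment cell, the IClamp parameters, the short-run settings, and the names stim, vvec, and tvec are illustrative assumptions. Only the order stdinit() followed by pc.psolve(tstop), and the closing runworker/done idiom, come from the discussion here.

```hoc
{load_file("stdrun.hoc")}        // standard run system: stdinit(), tstop, steps_per_ms

objref pc, stim, vvec, tvec
pc = new ParallelContext()

// Illustrative model: every host builds its own passive one-compartment cell.
create soma
soma {
    L = 10
    diam = 10
    insert pas
    stim = new IClamp(0.5)
}
stim.del = 1     // ms
stim.dur = 3     // ms
stim.amp = 0.1   // nA

// Record membrane potential and time on this host.
vvec = new Vector()
tvec = new Vector()
soma { vvec.record(&v(0.5)) }
{ tvec.record(&t) }

tstop = 5            // ms, short test run
dt = 0.025           // ms
steps_per_ms = 40    // assumed 1/dt

{pc.set_maxstep(10)}
stdinit()            // initialize the model before integrating
{pc.psolve(tstop)}   // every host must execute this, not only host 0

printf("host %d of %d recorded %d samples\n", pc.id(), pc.nhost(), vvec.size())

{pc.runworker()}
{pc.done()}
quit()
```

Assuming the file is saved as sketch.hoc (a hypothetical name), it could be launched with something like "mpiexec -n 2 nrniv -mpi sketch.hoc"; each host should then report a non-empty recording.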

After execution has completed, the question is how to make the hosts report the spike times.
The code in the article simply dumps spike times to the terminal, but allowing all hosts to spill their results at the same time could easily produce a result similar to having multiple people typing simultaneously on one keyboard--gcharabaoges (garbage and chaos interleaved).
proc spikeout() prevents that by serializing the output, i.e. allowing only one host to report its results at a time. But how do you get each host to execute spikeout()? That's done with the idiom
statements
pc.runworker()
which makes each host execute statements; in this case statements is a call to proc spikeout().

In your program, statements consists of all the code between
{pc.psolve(tstop)}
and
{pc.runworker()}
And your program's statements make each host open a file with a unique name and write recorded data to that file. There's no way to produce data scrambling, so there's no absolute need to serialize output.

However, depending on your operating system, how nimble your hard drive is, and how much data must be written, I wonder if serializing data output might reduce the time needed to write the output files to disk.
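
If serialized file output is worth trying, it could follow the same pattern as the article's proc spikeout(): loop over the ranks and let only the host whose turn it is write, with a barrier after each turn. The sketch below is an assumption-laden illustration, not code from the article or from the program above: write_serialized(), the file name pattern, and the stand-in Vector rec are all hypothetical.

```hoc
objref pc, rec, f
strdef fname
pc = new ParallelContext()

// Stand-in for recorded data: three samples holding this host's rank.
rec = new Vector(3)
{ rec.fill(pc.id()) }

// Hypothetical helper: each host writes its own file, one host at a time.
proc write_serialized() { local rank
    pc.barrier()                          // wait until every host gets here
    for rank = 0, pc.nhost() - 1 {        // host 0 first, then 1, 2, ...
        if (rank == pc.id()) {
            sprint(fname, "v_host%d.dat", rank)   // illustrative file name
            f = new File()
            f.wopen(fname)
            rec.printf(f, "%-20.20g\n")
            f.close()
        }
        pc.barrier()                      // all hosts wait until this turn is over
    }
}

write_serialized()

{pc.runworker()}
{pc.done()}
quit()
```

Whether this is actually faster than letting all hosts write at once depends on the file system, so it would need to be timed on the machine in question.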

By the way,
{pc.runworker()}
{pc.done()}
have nothing to do with Matlab at all, so for the sake of clarity these two lines should be moved above that comment line.

Re: parallelized NEURON does not start with pnm.run()

Post by MBeining » Tue Sep 12, 2017 10:06 am

Thanks, Ted, for the detailed explanation! I'll change it =)
