Passing parameters using pack/post, etc.

Post by JBall » Thu Oct 21, 2010 1:44 pm

Hello,

I've been trying to get some model optimization going in parallel NEURON. Among other things, this requires passing numbers between nodes, both to set model parameters and to compare fitness scores for different parameter sets. As a starting point, the code below is meant to generate an initial matrix of parameters for n = pc.nhost different cell models and then send each vector of parameters out to its node (I've cut out the parameter generation code for simplicity here). Then a single simulation should run. As an initial test of the simulation, I print the number of spikes for each model to the screen.

The whole problem seems to be the way I unpack the messages I send using pack and post. If I comment out the block of code where I retrieve the messages, the whole thing runs fine. I suspect I have a conceptual misunderstanding of how the packing and posting procedure is supposed to work. Any help is very much appreciated.

Thanks!

Update: My mistake. Commenting out the section where I take and unpack the posted messages gets me past that block, but pc.psolve() then gives me a segmentation fault. To the best of my knowledge, the way I'm initializing things should create a copy of each cell and its related objects on every node, which can then be simulated in parallel. Is this not right?

Update 2(!): Again, my mistake: I'd left out stdinit(). I'm back to my previous state. If I leave out the message take/unpack process, the code runs fine and I get spike numbers for n = pc.nhost cells. If I try to take the parameters and use them, the simulation hangs.

Code:
{load_file("nrngui.hoc")}
{load_file("template.hoc")}


objref pc
{pc = new ParallelContext()}
cvode_local(1)
cvode.atol(1e-5)
cvode.maxstep(0.05)
cvode.cache_efficient(1)
tstop = 1000
v_init = -66



val = 0
nmems = pc.nhost()

objref r
{r = new Random(pc.time())}

// Setting up things to be simulated----------------

objref cell,nc[2],nil,tvec[2],stim

cell = new Celltemp()
cell.soma stim = new IClamp(0.5)
stim.amp = 0.3
stim.del = 100
stim.dur = 800

cell.soma nc = new NetCon(&cell.soma.v,nil)
cell.dend nc[1] = new NetCon(&cell.dend.v,nil)

tvec[0] = new Vector()
tvec[1] = new Vector()

nc.record(tvec[0])
nc[1].record(tvec[1])
//--------------------------------------------------


objref pars, params
pars = new Vector()
params = new Matrix(18,nmems)


proc initvals() {
    for i=0,nmems-1 {
        // generate matrix of initial parameters
    }
}



proc usevals() {

// Set cell values to passed vector

}



if (pc.id==0) {
    initvals()

    for i=0,nmems-1 {
        pars = params.getcol(i)
        pc.pack(i,pars)
        pc.post(i)
    }
}


for i=0,nmems-1 {
    if (pc.id==i) {
        pc.take(i)
        pc.unpack(&val,pars)

        usevals(pars)
    }
}

pc.barrier()


{pc.set_maxstep(0.05)}
{pc.psolve(tstop)}

pc.barrier()

for i=0, nmems-1 {
    if (i==pc.id) {
        printf("%g\t %g\t %d\n", tvec.size(),tvec[1].size(), pc.id)
    }
    pc.barrier()
}

{pc.runworker()}
{pc.done()}

Re: Passing parameters using pack/post, etc.

Post by hines » Sat Oct 30, 2010 12:26 pm

The fundamental problem here is the mixing of MPI parallelism, which puts all communication onto the shoulders of the model author, with BulletinBoard parallelism, which imposes a master/worker style of computation: the master submits tasks to the bulletin board, and the master and workers execute tasks taken from the bulletin board. The bottom line is that pc.post, pc.take, etc. only make sense AFTER pc.runworker is called. At that point only the master returns; the workers never return and only execute submitted tasks. Since you are using pc.barrier for synchronization (which can waste significant time through load imbalance if the simulation runs differ significantly in duration), the most general way to exchange information among the processors is via pc.alltoall
(see http://www.neuron.yale.edu/neuron/static/docs/help/neuron/neuron/classes/parcon.html#MPI )
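
For readers working in Python, the MPI-style route might look like the sketch below. It is only an illustration, not code from this thread: build_send_list and the placeholder parameter values are hypothetical, and it assumes the Python counterpart pc.py_alltoall, where each rank passes a list with one item per destination rank and gets back a list with one item per source rank.

```python
def build_send_list(rank, nhost, param_sets):
    """Return the per-destination list this rank hands to py_alltoall.

    Hypothetical helper: item i is what this rank sends to rank i.
    Only rank 0 has parameters to send; every other rank sends nothing.
    """
    if rank != 0:
        return [None] * nhost
    if len(param_sets) != nhost:
        raise ValueError("need exactly one parameter set per rank")
    return [list(p) for p in param_sets]


def main():
    # Imported here so the helper above can be used without NEURON installed.
    from neuron import h

    pc = h.ParallelContext()
    rank, nhost = int(pc.id()), int(pc.nhost())

    # Placeholder values standing in for the elided parameter generation.
    param_sets = [[float(i)] * 18 for i in range(nhost)]

    # Every rank must make this call; received[j] is the item sent by rank j.
    received = pc.py_alltoall(build_send_list(rank, nhost, param_sets))
    my_params = received[0]
    print(rank, my_params[:3])

    # A real script would set the cell parameters from my_params and call
    # pc.psolve here, before the usual exit sequence.
    pc.barrier()
    pc.runworker()
    pc.done()
    h.quit()
```

To try it, add a call to main() at the end of the file and launch with something like mpiexec -n 4 nrniv -mpi -python file.py. Because every rank takes part in the exchange, no bulletin board calls are needed before pc.runworker.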

Please take a look at
http://www.neuron.yale.edu/neuron/static/docs/help/neuron/neuron/classes/parcon.html#SubWorld
The ParallelContext has undergone three major conceptual extensions.
1. BulletinBoard only when first introduced. The master is in control and communicates with workers through the bulletin board AFTER pc.runworker is called.
2. Neuron network parallel support using global cell identifiers. A network is typically simulated PRIOR to pc.runworker, and {pc.runworker() pc.done() quit()} is executed to get a proper exit.
3. Introduction of subworlds. This combines neural network, MPI synchronized parallelization with Bulletin Board parallelization. The prototypical situation considered was parallel optimization of parallel networks. One creates a parallel network model that can be created, simulated, and destroyed as a single function call that returns some complicated value (in Python, typically a tuple of Objects), using function arguments to define the simulation (in Python, a tuple of parameters). If only one conceptual network is involved and only a few parameters differ between simulations, then the creation of the network can be factored out of the function and destruction can be avoided. I.e., f(param) only involves setting the changed parameters and running the (parallel) simulation.

Now that one has the concept of a parameterized function returning a value, one can easily use the bulletin board to implement an optimization algorithm. As long as the bulletin board is non-empty of tasks to do, all the workers and master will be 100% active.
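
A minimal Python sketch of this pattern, without subworlds, is given below. It is not the example code offered in this thread: make_params and efun are hypothetical stand-ins (a real efun would set the cell parameters, run the simulation, and score the result), and the sketch assumes the bulletin board calls pc.submit, pc.working, and pc.pyret with Python callables.

```python
import random


def make_params(n_sets, n_params=18, seed=0):
    """Hypothetical stand-in for the elided parameter generation."""
    rng = random.Random(seed)
    return [[rng.uniform(0.0, 1.0) for _ in range(n_params)]
            for _ in range(n_sets)]


def efun(set_id, params):
    """Hypothetical fitness function: returns (set_id, score).

    A real version would apply params to the model, run it, and compute
    a fitness score from the recorded spikes.
    """
    return set_id, sum(p * p for p in params)


def main():
    # Imported here so the helpers above can be used without NEURON installed.
    from neuron import h

    pc = h.ParallelContext()

    # Workers stay inside runworker and execute tasks; only the master returns.
    pc.runworker()

    # Master only from here on: post one task per parameter set.
    for set_id, params in enumerate(make_params(int(pc.nhost()))):
        pc.submit(efun, set_id, params)

    # Collect results in whatever order the tasks finish.
    scores = {}
    while pc.working():
        set_id, score = pc.pyret()
        scores[set_id] = score

    print(sorted(scores.items()))
    pc.done()
    h.quit()
```

To try it, add a call to main() at the end of the file and launch with something like mpiexec -n 4 nrniv -mpi -python file.py. An optimizer would submit a new parameter set each time a result comes back, which keeps the bulletin board non-empty.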

I have a hoc and a Python example of the use of subworlds. Send a request to michael dot hines at yale dot edu and let me know whether you want the Python version, the hoc version, or both. I believe the network is a ring of cells and the task return values are more or less useless, but it should give you a good analogy to work from. I.e., the fundamental concept is an "e = efun(args)" that can use pc.nhost processors to compute, and the number of these functions that can be computing simultaneously is pc.nhost_world/pc.nhost.
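
The pc.nhost_world/pc.nhost arithmetic can be checked with a small Python helper. The function name is hypothetical, and the sketch only does the division; what NEURON does with ranks left over when the division is not exact is not covered in this thread.

```python
def concurrent_evaluations(nhost_world, nhost):
    """Return (full_subworlds, leftover_ranks).

    nhost_world: total number of MPI ranks (pc.nhost_world in the post).
    nhost: ranks used by one efun evaluation (pc.nhost within a subworld,
           i.e. the size one would pass to pc.subworlds).
    """
    if nhost < 1 or nhost_world < nhost:
        raise ValueError("need 1 <= nhost <= nhost_world")
    return divmod(nhost_world, nhost)


# 16 ranks split into subworlds of 4 give 4 simultaneous evaluations.
full, leftover = concurrent_evaluations(16, 4)
print(full, leftover)
```

With 16 ranks and subworlds of 4, four efun calls can run at once with no ranks left over.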

