parallel and triple loop error

matiasm
Posts: 12
Joined: Sun Jan 30, 2011 7:33 pm


Post by matiasm »

Hi,
I am trying to run some work on a parallel system and after 2 days I have come to a very annoying roadblock. I am trying to run a ParallelContext() from inside a triple loop. It seems to work fine for the first loop then the system just hangs. I am not sure what is wrong. I have simplified the code below. Any help would be great.


load_file("nrngui.hoc")
load_file("mosinit.hoc")

objref pc1
pc1 = new ParallelContext()

objectvar clamp
xopen("cell_clamp.ses")

/////////////////////////////////
///////FUNCTION TESTING CONDITIONS UNDER DIFFERENT CURRENTS
func function1 (){
amplitude=0.001*$1

tstop = 200
dt=0.1
steps_per_ms=10
soma clamp = new IClamp(0.5)
clamp.del = 100
clamp.dur = 300
clamp.amp = 0.000-amplitude

init()
run()

///TEST VARIOUS THINGS HERE/////
return 1
}


//////
//WILL INITIALIZE 3 VARIABLES AND THEN INCREASE THEM 10 TIMES WHILE RUNNING THE ABOVE FUNCTION
forall ghdbar_hd = 7e-6
forall gbar_lva = 1e-6
forall gbar_nap = 1e-6

nstimcurrent=5

for ii=0, 10{
    for jj=0, 10{
        for kk=0, 10{

            pc1.runworker()
            for pp=0, nstimcurrent-1{
                pc1.submit("function1", pp)
            }

            while(pc1.working()){
                i=int(pc1.retval())
                printf("i=%g",i)
            }

            pc1.done()

            doEvents()
            forall ghdbar_hd =ghdbar_hd + 1e-5
            printf("inc hd")
        }
        doEvents()
        forall gbar_lva =gbar_lva + 1e-5
        printf("inc lva")
    }
    doEvents()
    forall gbar_nap =gbar_nap + 1e-5
    printf("inc nap")
}

I am not sure if I have done this properly. This runs for the first loop but then hangs. If run on a mac (not under MPI) it runs properly. I understand that that might not be any indication that it should run on a parallel system.
Thanks in advance

Matias

Post by matiasm »

I have now tried a different approach, but still no luck. Since calling pc.submit from inside the 3 loops doesn't work, I thought I could put it into a function that is called from inside the loop. Here is a simplified version of the code:

load_file("nrngui.hoc")
load_file("mosinit.hoc")

objref pc1
pc1 = new ParallelContext()

objectvar clamp
xopen("cell_clamp.ses")


/////////////////////////////////
///////FUNCTION TESTING CONDITIONS UNDER DIFFERENT CURRENTS
func function1 (){
amplitude=0.001*$1

tstop = 200
dt=0.1
steps_per_ms=10
soma clamp = new IClamp(0.5)
clamp.del = 100
clamp.dur = 300
clamp.amp = 0.000-amplitude

init()
run()

///TEST VARIOUS THINGS HERE/////
return 1
}

//////////////////////
///FUNCTION 2

nstimcurrent=2
func function2(){
pc1.runworker()
for pp=0, nstimcurrent-1{
pc1.submit("function1", pp)
}
while(pc1.working()){
i=int(pc1.retval())
printf("i=%g",i)
}
pc1.done()
return 1
}

/////////////////////////////
//WILL INITIALIZE 3 VARIABLES AND THEN INCREASE THEM 10 TIMES WHILE RUNNING THE ABOVE FUNCTION

nloop=10

NaP=1.5e-6
forall gbar_nap = NaP

for ii=0, nloop-1{
    forall ghdbar_hd=10e-13
    step_hd=10
    for jj=0, nloop-1{
        LVA=0.6e-4
        forall gbar_lva = LVA
        for kk=0, nloop-1{

            //CHECK THE TEXT FILE FOR VALID DATA POINTS
            if (tmp_vect.x[num] == 0){condition=0}else{condition=1}
            num=num+1
            //If CONDITION=1 START THE CURRENT LOOP FOR THIS NODE
            if (condition==1){
                /////SOME CODE////
                function2()
            }
            doEvents()
            forall ghdbar_hd =ghdbar_hd*step_hd
        }
        doEvents()
        forall gbar_lva =gbar_lva + step_lva

    }
    doEvents()
    forall gbar_nap =gbar_nap + step_nap
}

Again this works once, and then all the processors come back to execute the code inside the loops. Then it gets stuck. What is a better approach?
Thanks
Matias

Post by matiasm »

After a long process of trial and error I have managed to get this working. For people with similar problems, here is how I have done it. Also, sorry for the long posts with code; I didn't realise how to post code correctly before.


load_file("nrngui.hoc")
load_file("mosinit.hoc")

objref pc1
pc1 = new ParallelContext()

/////////////////////////////////
///////FUNCTION 1 TESTING CONDITIONS UNDER DIFFERENT CURRENTS

objectvar clamp
soma clamp = new IClamp(0.5)

func function1 (){
amplitude=0.001*$1

tstop = 10000
dt=0.1
steps_per_ms=10

clamp.del = 0
clamp.dur = 10000
clamp.amp = 0-amplitude
	///TEST VARIOUS THINGS
return 1
}

//////////////////////
///FUNCTION 2
nstimcurrent=20
func function2(){
xx=$1
for pp=0, nstimcurrent-1{
pc1.runworker()
pc1.submit("function1", pp, xx, LVA2, HD2, NAP2)
}

return 1
}



//////
//WILL INITIALIZE 3 VARIABLES AND THEN INCREASE THEM 10 TIMES WHILE RUNNING THE ABOVE FUNCTION

num=0
nloop=10

NaP=1.5e-6
forall gbar_nap = NaP
step_nap=7.5e-7

for ii=0, nloop-1{
    forall ghdbar_hd=10e-13
    step_hd=10
    for jj=0, nloop-1{
        LVA=0.6e-4
        step_lva=1.9e-5

        for kk=0, nloop-1{

            //CHECK LOOK UP TABLE FILE FOR VALID DATA POINTS

            if (tmp_vect.x[num] == 0){condition=0}else{condition=1}
            num=num+1
            //If CONDITION=1 START THE CURRENT LOOP FOR THIS NODE
            pc1.runworker()
            if (condition==1 && pc1.id()==0){
                ////SOME CODE
                function2(folder_num,gbar_lva, ghdbar_hd, gbar_nap)
                folder_num=folder_num+1
            }

            doEvents()
            LVA= gbar_lva + step_lva
            forall gbar_lva = LVA
        }   ///END LOOP KK
        doEvents()
        forall ghdbar_hd=ghdbar_hd*step_hd
    }   ////END LOOP JJ
    doEvents()
    NaP = gbar_nap + step_nap
    forall gbar_nap = NaP

}   //END LOOP ii

pc1.done()

This makes use of the command pc1.runworker() twice. The first call, inside the triple loop, makes the master host return. The if statement that follows ensures that only the master (pc1.id()==0) executes the code inside the loop and calls function 2. Inside function 2, pc1.runworker() is called again and the other hosts are each given a task, in this case function 1. Once these are complete, the loop advances again with only the host with pc1.id()==0 in control. I hope this helps someone.

Matias
ted
Site Admin
Posts: 6300
Joined: Wed May 18, 2005 4:50 pm
Location: Yale University School of Medicine

Post by ted »

For a clear example of bulletin board style parallelization, see the code block just below the phrase
"The simplest form of parallelization of a loop from the users point of view is"
at
http://www.neuron.yale.edu/neuron/stati ... arcon.html

Program organization should follow the outline presented below the phrase
"The basic organization of a simulation is:"
at the same URL. Note that pc.runworker() is called just one time.
matiasm wrote:This makes use of the command pc1.runworker() twice.
In a properly written program it is only necessary to call pc.runworker() once.
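
As a rough analogy only, the one-time setup described above can be modelled in plain Python with the standard library's concurrent.futures. This is not NEURON code: the thread pool merely stands in for the bulletin board, and `simulate`, `run_all`, and the tuple-shaped points are hypothetical names introduced for illustration. The pool is entered once around the whole sweep and left once at the end, in the same spirit as a single pc.runworker() and a single pc.done().

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def simulate(point, amp_index):
    """Hypothetical stand-in for one simulation run; returns a tag, not real results."""
    return (point, amp_index)


def run_all(points, n_amps, n_workers=4):
    """Post every task inside one pool context, then gather all results."""
    results = []
    # The pool is created once for the whole sweep, not once per loop iteration.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [pool.submit(simulate, p, a)
                   for p in points
                   for a in range(n_amps)]
        # Results arrive in completion order, so sort them before returning.
        for fut in as_completed(futures):
            results.append(fut.result())
    return sorted(results)
```

Calling `run_all([(0, 0, 0), (0, 0, 1)], 3)` would post six tasks and return six tagged results; nothing inside the loops creates or tears down the pool.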

Post by matiasm »

Thanks for your reply Ted. I had thoroughly read through the manual you mention, but I am still not sure how to get my program to work with only one pc.runworker call. Just a quick description of what my program does:

It loops through 10 values of 3 conductances (all up 1000 combinations) and at each point it searches through a look up table (made previously) to see if the point is a valid point. Once it reaches a valid point, it does 20 current stimulations at that point, ranging the current from 0 to -20pA. As far as I can tell, if I use only one pc.runworker, for example in the function that does the 20 current stimulations, the program will finish once it's done 20 simulations and will not move on to the next valid point in the look up table. Otherwise, if I have only one pc.runworker in the loop that searches through the look up table, I can't program it to go through 20 current stimulations at that point. Maybe the method I have used to tackle this problem is wrong, I'm not sure.
I have been looking at the code for subworlds; maybe this is a better approach. I haven't played around with it yet, but I will look into it soon.

Matias

Post by ted »

Unless I misunderstand what you're trying to do, I don't see how subworlds would be necessary for this
matiasm wrote:loops through 10 values of 3 conductances (all up 1000 combinations) and at each point it searches through a look up table (made previously) to see if the point is a valid point. Once it reaches a valid point, it does 20 current stimulations at that point ranging the current from 0 to -20pA.
The project could be parallelized to the level of individual simulations, or to the level of "batches" of simulations. In either case, the master iterates through all 1000 combinations of conductances and tests each combination for validity. If you decide to parallelize to the level of individual simulations, then for each valid combination of conductances, the master must post 20 separate tasks to the bulletin board (one for each current amplitude). When a worker picks up one of these tasks, it executes a single simulation run, then returns the results from that run. If you parallelize to the level of a batch of simulations, the master posts only the valid combinations of conductances. When a worker picks up one of these tasks, it runs 20 simulations, then returns the results of all 20. With either strategy there only has to be one execution of pc.runworker().
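
The two granularities can be compared by listing the tasks the master would post in each case. The following is a plain-Python sketch, not NEURON code, and it rests on assumptions: `is_valid` is a made-up rule standing in for the precomputed look up table, and `per_simulation_tasks`, `per_batch_tasks`, and `run_batch` are hypothetical names. Only the counts of 10 values per conductance and 20 amplitudes per point come from the thread.

```python
from itertools import product

N_VALUES = 10  # values per conductance, as described in the thread
N_AMPS = 20    # current amplitudes per valid point, as described in the thread


def is_valid(ii, jj, kk):
    """Hypothetical validity rule standing in for the look up table."""
    return (ii + jj + kk) % 2 == 0


def per_simulation_tasks():
    """One task per (valid conductance combination, current amplitude)."""
    return [(ii, jj, kk, amp)
            for ii, jj, kk in product(range(N_VALUES), repeat=3)
            if is_valid(ii, jj, kk)
            for amp in range(N_AMPS)]


def per_batch_tasks():
    """One task per valid combination; the worker loops over the amplitudes."""
    return [(ii, jj, kk)
            for ii, jj, kk in product(range(N_VALUES), repeat=3)
            if is_valid(ii, jj, kk)]


def run_batch(task):
    """What a worker would do with one batch task: expand it into N_AMPS runs."""
    ii, jj, kk = task
    return [(ii, jj, kk, amp) for amp in range(N_AMPS)]
```

Both lists cover exactly the same simulations. The per-simulation list is N_AMPS times longer, which gives finer load balancing at the cost of more traffic through the bulletin board, while the batch list keeps each worker busy for longer per task.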

Post by matiasm »

I think I understand what you mean. In the case of parallelising to individual current stimulations, would the format of this be as such:

pc.runworker()
for ii=0,10{
    for jj=0, 10{
        for kk=0, 10{

            /////master finds valid point////

            for xx=0, 20{
                ///for this point master submits 20 jobs to the bulletin board
            }
        }
    }
}
pc.done()

The problem I have found is that this style, for some reason, doesn't let me run in parallel. If I have pc.submit inside the three loops, it runs the simulations one at a time instead of using all the hosts available.

Matias

Post by matiasm »

Never mind, I have discovered my error in the last post: inside the 3 loops I also had a

while (pc.working){}

This has now been correctly placed outside the loop and it seems to be working.
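
Why the placement of the wait matters can be illustrated with a toy model in plain Python. This is not NEURON code and it makes assumptions: `Board` is a hypothetical stand-in for the bulletin board whose `working` method simply collects one outstanding task, and the exact position of the original wait inside the loops is assumed, since the thread does not show it. The number of tasks that are posted but not yet collected is an upper bound on how many hosts can be busy at once.

```python
class Board:
    """Toy bulletin board that records the peak number of outstanding tasks."""

    def __init__(self):
        self.pending = []
        self.peak = 0

    def submit(self, task):
        self.pending.append(task)
        self.peak = max(self.peak, len(self.pending))

    def working(self):
        """Collect one result; return False once nothing is outstanding."""
        if not self.pending:
            return False
        self.pending.pop(0)
        return True


def wait_inside(points, n_amps):
    """Drain the board right after every submit (assumed form of the bug)."""
    board = Board()
    for p in points:
        for a in range(n_amps):
            board.submit((p, a))
            while board.working():
                pass
    return board.peak


def wait_outside(points, n_amps):
    """Post every task first, then drain the board once after the loops."""
    board = Board()
    for p in points:
        for a in range(n_amps):
            board.submit((p, a))
    while board.working():
        pass
    return board.peak
```

In this model, draining inside the loops never leaves more than one task on the board, so extra hosts would have nothing to pick up. Draining once after the loops leaves every task available to the workers.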

Matias

Post by ted »

matiasm wrote:In the case of parallelising to individual current stimulations, would the format of this be as such:
Yes, but to be exact the innermost loop posts 21 jobs to the board.
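
The count of 21 follows from hoc's for statement including its upper bound. A small plain-Python check, with `hoc_for` as a hypothetical helper that mimics that inclusive range, makes the arithmetic explicit:

```python
def hoc_for(first, last):
    """Indices visited by the hoc loop `for i=first, last` (upper bound inclusive)."""
    return range(first, last + 1)


# Innermost loop of the outline, written as `for xx=0, 20`
jobs_per_point = len(hoc_for(0, 20))

# Three nested loops, each written as `for ...=0, 10`
points_visited = len(hoc_for(0, 10)) ** 3
```

By the same reasoning, loops written as 0, 19 and 0, 9 would give the intended 20 jobs per point and 1000 conductance combinations.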
matiasm wrote:inside the 3 loops I also had a

while (pc.working){}
That would indeed gum things up.