
Re: How to run the simulation in-between?

Posted: Tue Nov 10, 2015 7:38 pm
by breakwave922
Hines,

Thanks for helping me solve my SaveState() problem on the single cell model. Now I'm moving on to the network level and have some questions.
In my network, each cell receives random noise inputs based on its cell id; see the code below.

Code: Select all

begintemplate BgGen
proc init() {
...
r = new Random()
r.MCellRan4(1000*($6+12))
r.negexp(1)

noise_netstim = new NetStim()
noise_netstim.interval = 1000/noise_freq
noise_netstim.number = 1e100 //noise_total_length*noise_freq/1000
noise_netstim.start = noise_start
noise_netstim.noise = 1
noise_netstim.noiseFromRandom(r)
...
}
endtemplate BgGen
$6 is the cell id, used so that the Random instances of different cells get different sequence numbers.

After the simulation I used SaveState() to save the state, but I realized that in a parallel computing environment SaveState() works per node, and each node owns several cells. See the code below.

Code: Select all

proc savestate() {local i  localobj s, ss, f, rl
	s = new String()
	sprint(s.s, "svst.%04d", pc.id)
	f = new File(s.s)
	ss = new SaveState()
	ss.save()
	ss.fwrite(f, 0)
	f.close
	}
The sequence numbers of the Random instances, however, are per cell.
My question is: how can I reconcile the two? I can think of two ways. One is to use SaveState() per cell, but how would that be done? The other is to build a list of the Random instances of the cells on each node and printf their sequence numbers, one by one, into the state file.
Please let me know your thoughts.

Re: How to run the simulation in-between?

Posted: Tue Nov 10, 2015 8:09 pm
by hines
Instead of MCellRan4 use Random123. Instead of trying to encode the streams and sequence into a single integer, you get 3 integers to specify the stream and a full 34 bit sequence.
One of the integers should be the gid, another can encode the purpose of the stream.

Assuming all your streams are in a list rlist, after ss.fwrite(f, 0):

for i = 0, rlist.count()-1 {
    f.printf("%d\n", rlist.o(i).seq())
}
f.close()

Re: How to run the simulation in-between?

Posted: Wed Nov 11, 2015 1:59 am
by breakwave922
Hi, Hines,

Thanks for the reply.
Here is how I modified my code, following your last post:

I changed MCellRan4 to Random123, and both the gid and pc.id are now used to encode the stream and sequence. See the code below.

Code: Select all

begintemplate BgGen
proc init() {
...
r = new Random()
r.Random123(1000*($6+$7))
r.negexp(1)

noise_netstim = new NetStim()
noise_netstim.interval = 1000/noise_freq
noise_netstim.number = 1e100 //noise_total_length*noise_freq/1000
noise_netstim.start = noise_start
noise_netstim.noise = 1
noise_netstim.noiseFromRandom(r)
...
}
endtemplate BgGen
$6 is the cell id and $7 is the id of the node that the cell belongs to.

Then I created a list, and appended all the streams into it.

Code: Select all

randomlist.append(r)
In the saving part, here is what I did

Code: Select all

proc savestate() {local i  localobj s, ss, f, rl

	s = new String()
	sprint(s.s, "svst.%04d", pc.id)
	f = new File(s.s)
	ss = new SaveState()
	ss.save()
	ss.fwrite(f, 0)
	for i=0,randomlist.count-1 {
	f.printf("%g\n", randomlist.o(i).seq())
	}
	f.close
}
For the second part of the simulation, to read the state, here is what I did

Code: Select all

proc init() {
    finitialize()
    s = new String()
    sprint(s.s, "svst.%04d", pc.id)
    f = new File(s.s)
    ss = new SaveState()
    ss.fread(f, 0)
    randomlist = new List()
    for i=0,randomlist.count-1 {
        randomlist.o(i).seq(f.scanvar())
    }

    t=50
    if (cvode.active()) {
        cvode.re_init()
    } else {
        fcurrent()
    }
    frecord_init()
}
The first part of the simulation always matches, but the second part does not. I didn't check the voltage traces, but the spike timing data from 1000 cells already show a mismatch.
Where do you think the problem might come from?
Thank you.

Re: How to run the simulation in-between?

Posted: Wed Nov 11, 2015 7:13 am
by hines
I don't think it is ever useful to encode the processor rank (pc.id) in a random stream. That prevents reproducibility when running with different pc.nhost or different distributions of cells on ranks.
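
The effect of putting the rank into the stream id can be illustrated without NEURON. The following is a minimal Python sketch, not hoc and not NEURON's API: `stream_ids` is a hypothetical helper, and the round-robin placement `gid % nhost` is an assumption about how cells are distributed. It only compares the id triples that would be handed to the generator.

```python
def stream_ids(ngid, nhost, use_rank):
    """Return {gid: stream id triple} for a round-robin distribution of cells."""
    ids = {}
    for gid in range(ngid):
        rank = gid % nhost  # assumed round-robin placement of cells on ranks
        ids[gid] = (gid, rank, 0) if use_rank else (gid, 0, 0)
    return ids

# With the rank in the id, changing nhost changes some of the streams.
with_rank_2 = stream_ids(8, nhost=2, use_rank=True)
with_rank_4 = stream_ids(8, nhost=4, use_rank=True)
changed = [g for g in range(8) if with_rank_2[g] != with_rank_4[g]]
print(changed)  # [2, 3, 6, 7]

# With the gid only, every cell keeps the same id for any nhost.
print(stream_ids(8, 2, False) == stream_ids(8, 4, False))  # True
```

Cells whose id triple changes would draw different noise when the same model is run on a different number of hosts, which is the loss of reproducibility described above.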

Code: Select all

   ss.fread(f, 0)
   randomlist = new List()
   for i=0,randomlist.count-1 {
randomlist.o(i).seq(f.scanvar())
}
your randomlist is empty

Re: How to run the simulation in-between?

Posted: Wed Nov 11, 2015 1:32 pm
by breakwave922
Hines,

If I use the same number of nodes to run the simulation, maybe using pc.id in a random stream is not a problem? But you are right, running with a different pc.nhost would definitely cause problems. Could you please tell me what I should do about this, or point me to some reference code that I can look at?

About randomlist: yes, I also doubted the way I coded it. Since the second part of the simulation is in a different hoc file, how could it know the randomlist that I saved in the first part? I wrote it this way because it gave no error, but I was not aware that the list was empty. To avoid this, do you suggest merging the two hoc files into one?
Thanks.

Re: How to run the simulation in-between?

Posted: Wed Nov 11, 2015 3:10 pm
by hines
Just don't define the Random123 streams in terms of pc.id (leave that arg as 0). r.Random123(id1, id2, id3) means that two streams are statistically independent if any of the three args are different when the two streams are created. Typically the call is rlist.o(icell).Random123(gid_of_icell, 0, 0), i.e. one stream per cell and all streams statistically independent. Sometimes one wishes to have a large number of streams per cell, and then you can start to use the other args.

The model creation for the first and second parts should create identical models. That includes all the random streams. For restoring, you need to set the sequence number properly for each stream.
You undoubtedly have a named list of random instances which is identical for the first and second parts. Use the name of that list. You do not need to combine hoc files.
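
Why restoring the sequence number is enough can be sketched with a toy counter-based generator. This is Python using only the standard library, not hoc. `CounterStream` is a hypothetical stand-in built on SHA-256 and is not the Random123 algorithm. It only shares the property that matters here: the n-th value depends solely on the three ids and the position n.

```python
import hashlib
import struct

class CounterStream:
    """Toy counter-based stream: value n depends only on (id1, id2, id3, n)."""

    def __init__(self, id1, id2=0, id3=0):
        self.ids = (id1, id2, id3)
        self.seq = 0  # number of values drawn so far

    def next(self):
        key = struct.pack("<4q", *self.ids, self.seq)
        digest = hashlib.sha256(key).digest()
        self.seq += 1
        # Map the first 8 bytes of the hash to a float between 0 and 1.
        return int.from_bytes(digest[:8], "little") / 2**64

# Uninterrupted run: one stream for the cell with gid 7.
reference = CounterStream(7)
full_run = [reference.next() for _ in range(10)]

# Segmented run: draw 4 values, then save the position.
first = CounterStream(7)
part1 = [first.next() for _ in range(4)]
saved_seq = first.seq

# Second part: build the identical stream again, then restore the position.
second = CounterStream(7)
second.seq = saved_seq
part2 = [second.next() for _ in range(6)]

print(part1 + part2 == full_run)  # True
```

If `second.seq` were left at 0, the second part would replay the first four values instead of continuing, so the two halves would no longer match the uninterrupted run.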

Re: How to run the simulation in-between?

Posted: Wed Nov 11, 2015 5:45 pm
by breakwave922
Hines,

Here is what I did following your advice.
I removed pc.id from the Random123 call and now use three args instead of one. One of the three args is the cell's global id; the other two are zero.

Code: Select all

begintemplate BgGen
proc init() {
...
r = new Random()
r.Random123(1000*($6+12),0,0)
r.negexp(1)

noise_netstim = new NetStim()
noise_netstim.interval = 1000/noise_freq
noise_netstim.number = 1e100 //noise_total_length*noise_freq/1000
noise_netstim.start = noise_start
noise_netstim.noise = 1
noise_netstim.noiseFromRandom(r)
...
}
endtemplate BgGen
$6 gets the cell gid.

About the restoring part, I realized that I forgot to call restore(). Now it looks like this

Code: Select all

objref s, ss, f, randomlist
proc init() {
    finitialize()
    s = new String()
    sprint(s.s, "svst.%04d", pc.id)
    f = new File(s.s)
    ss = new SaveState()
    ss.fread(f, 0)
    randomlist = new List()
    for i=0,randomlist.count-1 {
        randomlist.o(i).seq(f.scanvar())
    }
    f.close
    ss.restore()
    t=50
    if (cvode.active()) {
        cvode.re_init()
    } else {
        fcurrent()
    }
    frecord_init()
}
The results are exciting now. Only 7 spikes out of more than 1400 are mismatched compared with the simulation without segmentation.
I'm still not sure what to do with this part of your advice:
"For restoring, you need to set the sequence number properly for each stream.
You undoubtedly have a named list of random instances which is identical for the first and second parts."
I guess the mismatch may be due to this?

Re: How to run the simulation in-between?

Posted: Wed Nov 11, 2015 5:59 pm
by hines
Your randomlist is still empty. Don't you have a list of BgGen instances?

No need for the 1000*($6+12) arg. Just $6 is sufficient.

Re: How to run the simulation in-between?

Posted: Wed Nov 11, 2015 6:27 pm
by breakwave922
Hines,

Sorry, I only just realized that randomlist is empty. Here is what I changed:

Code: Select all

for m = 0, 799{
    if(!pc.gid_exists(m)) { continue }				// Can't connect to target if it doesn't exist 
													// on the node ("continue") skips rest of code
	bggen[m] = new BgGen(3,0,tstop,30,dt,m,pc.id)
	cellid = pc.gid2cell(m)                     	// get GID object from ID	
	cellid.dend bg2LAPsyn[m] = new bg2pyr(0.9)
	bg2LAPsyn[m].initW = 7.0//6.3
randomlist.append(bggen[m].r)
Saving state part

Code: Select all

proc savestate() {local i  localobj s, ss, f, rl

	s = new String()
	sprint(s.s, "svst.%04d", pc.id)
	f = new File(s.s)
	ss = new SaveState()
	ss.save()
	ss.fwrite(f, 0)
	for i=0,randomlist.count-1 {
	f.printf("%g\n", randomlist.o(i).seq())
	}
	f.close
}

savestate()
But when I re-ran the simulation with this modification, I still got the same results.

Regarding the 1000*($6+12) arg, I just wanted to make sure the arg is always > 0, following your paper "Translating network models to parallel hardware in NEURON".

Re: How to run the simulation in-between?

Posted: Wed Nov 11, 2015 8:17 pm
by hines
You should try some simple debugging such as printing the sequences saved and restored.
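
One way to do that comparison is to collect the sequence numbers printed before saving and after restoring, then list where they differ. The following is a small Python sketch under that assumption. `first_mismatches` and the sample values are hypothetical and only illustrate the check.

```python
def first_mismatches(saved, restored, limit=5):
    """Return up to `limit` (index, saved, restored) tuples that differ."""
    diffs = []
    if len(saved) != len(restored):
        # A length difference usually means the two stream lists are not the same.
        diffs.append(("length", len(saved), len(restored)))
    for i, (a, b) in enumerate(zip(saved, restored)):
        if a != b and len(diffs) < limit:
            diffs.append((i, a, b))
    return diffs

# Made-up values: sequences printed at save time and after the restore.
saved = [2, 3, 2, 3]
restored = [0, 3, 0, 3]
print(first_mismatches(saved, restored))  # [(0, 2, 0), (2, 2, 0)]
```

An empty result means every stream was restored to the position it had when the state was saved.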

Re: How to run the simulation in-between?

Posted: Fri Nov 13, 2015 6:31 pm
by breakwave922
Hi, Hines,

Thanks for the help. Yes, I've been trying to debug this issue step-by-step.
I created a small network with only 100 cells. I printed out the sequences saved and restored.
But I'm not sure whether I did it correctly. Below is what I did.

For the first part of the simulation, I saved the sequences to a file:

Code: Select all

objref randomfile

randomfile = new File("randomfile")

if(pc.id==0){     
randomfile.wopen()
randomfile.close()
}

proc randomsave() { local i, rank
pc.barrier() // wait for all hosts to get to this point

for rank=0, pc.nhost-1 { // host 0 first, then 1, 2, etc.

if (rank==pc.id) {
for i=0, randomlist.count-1 {
randomfile.aopen()                               
randomfile.printf("%g\n", randomlist.o(i).seq())
randomfile.close()
}
}
pc.barrier() // wait for all hosts to get to this point
}
}

randomsave()
I got a file with a single column of 100 values, each either 2 or 3. As I'm not familiar with what is being saved, I'm not sure whether I did it correctly.

Then I moved on to the restore part, i.e. the second part of the simulation.
This is the restore part that I used in the second part:

Code: Select all

proc init() {
    finitialize()
    s = new String()
    sprint(s.s, "svst.%04d", pc.id)
    f = new File(s.s)
    ss = new SaveState()
    ss.fread(f, 0)
    randomlist = new List()
    for i=0,randomlist.count-1 {
        randomlist.o(i).seq(f.scanvar())
    }
    f.close
    ss.restore()

    t=50
    if (cvode.active()) {
        cvode.re_init()
    } else {
        fcurrent()
    }
    frecord_init()
}
As I couldn't do printf in the init(), I saved the second part's random seq in the same way as I did in the first part.

Code: Select all

objref randomfile_restore

randomfile_restore = new File("randomfile_restore")

if(pc.id==0){     //"wopen" once by node 0 to clear the contents of the file
randomfile_restore.wopen()
randomfile_restore.close()
}

proc randomrestore() { local i, rank
pc.barrier() // wait for all hosts to get to this point
for rank=0, pc.nhost-1 { // host 0 first, then 1, 2, etc.
if (rank==pc.id) {
for i=0, randomlistII.count-1 {
randomfile_restore.aopen()                           
randomfile_restore.printf("%g\n", randomlistII.o(i).seq())
randomfile_restore.close()
}
}
pc.barrier() // wait for all hosts to get to this point
}
}
randomrestore()
randomlistII is the list that I created in the second simulation.

The second simulation then produced the file randomfile_restore. As I understand it, randomfile and randomfile_restore should be identical, but when I compared them the values were different. Strangely, for this small network I could still match the spike data to the non-segmented run, whereas with the 1000-cell network I get a mismatch.

Also, whenever there is a mismatch, the mismatched spikes always occur more than 20 ms after the start of the second simulation. For example, in a 100 ms simulation with 0-50 ms as the first part and 50-100 ms as the second part, the mismatch always appears after 70 ms or even later. My thinking is that if the noise generator seq were not restored properly, the mismatch could appear right from the start of the second simulation.

There seem to be several issues; maybe they have the same cause?
Please let me know your thoughts. Thank you so much.

Re: How to run the simulation in-between?

Posted: Fri Nov 13, 2015 7:35 pm
by hines
The whole point of not closing the file on return from savestate.fread and savestate.fwrite is so you can read and write extra state (the random sequence numbers) at the end of those files.
That said, it is possible to make your approach work.
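
The "extra state at the end of the same file" pattern can be sketched independently of NEURON. The following Python example uses only the standard library. `save_all` and `load_all` are hypothetical helpers, and the byte string merely stands in for the opaque block that SaveState would write. The file layout here is an assumption made for the illustration and is not SaveState's format.

```python
import os
import struct
import tempfile

def save_all(path, state_blob, seqs):
    """Write the opaque state first, then one sequence number per stream."""
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(state_blob)))
        f.write(state_blob)
        f.write(struct.pack("<Q", len(seqs)))
        for seq in seqs:
            f.write(struct.pack("<Q", seq))

def load_all(path):
    """Read everything back in exactly the order used by save_all."""
    with open(path, "rb") as f:
        (size,) = struct.unpack("<Q", f.read(8))
        state_blob = f.read(size)
        (count,) = struct.unpack("<Q", f.read(8))
        seqs = [struct.unpack("<Q", f.read(8))[0] for _ in range(count)]
    return state_blob, seqs

rank = 3  # hypothetical process id, used only to build a per-process file name
path = os.path.join(tempfile.mkdtemp(), "svst.%04d" % rank)
save_all(path, b"opaque simulator state", [2, 3, 2, 3])
blob, seqs = load_all(path)
print(seqs)  # [2, 3, 2, 3]
```

Storing the number of streams alongside the values is a design choice of this sketch: on restore, comparing `len(seqs)` with the number of streams in the rebuilt model would immediately expose a stream list that is empty or built in a different order.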

Re: How to run the simulation in-between?

Posted: Fri Nov 13, 2015 11:20 pm
by breakwave922
Hi, Hines,
Yes, I didn't close the file on return from savestate.fread and savestate.fwrite. Please see the following code:

Code: Select all

ss.fwrite(f, 0)
	for i=0,randomlist.count-1 {
	f.printf("%g\n", randomlist.o(i).seq())
	}
	f.close
}

Code: Select all

proc init() {
    finitialize()
    s = new String()
    sprint(s.s, "svst.%04d", pc.id)
    f = new File(s.s)
    ss = new SaveState()
    ss.fread(f, 0)
    randomlist = new List()
    for i=0,randomlist.count-1 {
        randomlist.o(i).seq(f.scanvar())
    }
    f.close
    ss.restore()
    t=50
    if (cvode.active()) {
        cvode.re_init()
    } else {
        fcurrent()
    }
    frecord_init()
}
But how can we printf the seq numbers of different nodes into one file in init()? An additional question: can the saved state file be read on Windows with any tool? It is very difficult to debug when I don't know what is in the state file I saved.
Another question: can SaveState() save at a particular time point? That is, does SaveState() save only the state at that time point?
I read in the documentation that the "SaveState class can be an expensive object in terms of memory storage", see http://www.neuron.yale.edu/neuron/stati ... state.html
Does this mean that SaveState() saves all the information from the beginning of the simulation? As far as I know that is not necessary, because to resume a simulation only the state at the last time point is needed.
Thanks a lot.

Re: How to run the simulation in-between?

Posted: Sat Nov 14, 2015 9:08 am
by hines
The reason you use sprint(s.s, "svst.%04d", pc.id) is because there is a separate savestate file per process.

SaveState writes ONLY the information at time t. It is expensive because it doubles the size of memory needed by a model.

SaveState does NOT save all state trajectories.

Only SaveState can read files written by SaveState. The internal binary format is very complex.