How to run the simulation in-between?

Moderator: wwlytton

breakwave922

Re: How to run the simulation in-between?

Post by breakwave922 »

Hines,

Thanks for helping me solve my SaveState() problem on the single-cell model. I'm now moving on to the network level and have some questions.
In my network, each cell receives random noise input, seeded by its cell id; see the code below.

Code:

begintemplate BgGen
proc init() {
...
r = new Random()
r.MCellRan4(1000*($6+12))
r.negexp(1)

noise_netstim = new NetStim()
noise_netstim.interval = 1000/noise_freq
noise_netstim.number = 1e100 //noise_total_length*noise_freq/1000
noise_netstim.start = noise_start
noise_netstim.noise = 1
noise_netstim.noiseFromRandom(r)
...
}
endtemplate BgGen
$6 is the cell id, which is used to give each cell's Random instance a different sequence.

After the simulation I used SaveState() to save the state, but I realized that in a parallel computing environment SaveState() works per node, and each node owns several cells; see the code below.

Code:

proc savestate() {local i  localobj s, ss, f, rl
	s = new String()
	sprint(s.s, "svst.%04d", pc.id)
	f = new File(s.s)
	ss = new SaveState()
	ss.save()
	ss.fwrite(f, 0)
	f.close
	}
whereas the sequence number of the relevant Random instances is per cell.
My question is: how can I reconcile these two parts? I can think of two ways. One is to use SaveState() per cell, but how can we do that? The other is to create a list that holds each cell's Random instance on the same node and printf their sequence numbers into the state file.
Please let me know your thoughts.
hines
Site Admin
Posts: 1682
Joined: Wed May 18, 2005 3:32 pm

Post by hines »

Instead of MCellRan4 use Random123. Instead of trying to encode the streams and sequence into a single integer, you get 3 integers to specify the stream and a full 34 bit sequence.
One of the integers should be the gid, another can encode the purpose of the stream.

Assuming all your streams are in an rlist,
after ss.fwrite(f, 0)

for i = 0, rlist.count()-1 {
f.printf("%d\n", rlist.o(i).seq())
}
f.close()
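
Combined with the per-rank file naming used earlier in the thread, the save step might look like the following. This is only a sketch: it assumes `pc` is the existing ParallelContext and that `rlist` holds the Random instances of the cells on this rank. The `%.0f` format is an assumption, chosen because `%g` keeps only six significant digits and would truncate large sequence values.

```hoc
// Sketch: save SaveState data, then append one sequence number per stream.
// Assumes pc (ParallelContext) and rlist (List of Random) already exist.
proc savestate() { local i  localobj s, ss, f
    s = new String()
    sprint(s.s, "svst.%04d", pc.id)   // one state file per rank
    f = new File(s.s)
    ss = new SaveState()
    ss.save()
    ss.fwrite(f, 0)                   // 0: leave the file open after writing
    for i = 0, rlist.count()-1 {
        f.printf("%.0f\n", rlist.o(i).seq())  // extra state, in list order
    }
    f.close()
}
```

The restore step then has to read the values back in the same list order.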
breakwave922

Post by breakwave922 »

Hi, Hines,

Thanks for the reply.
Here is how I modified my code, following your last post.

I changed MCellRan4 to Random123, and both the gid and pc.id are now used to encode the stream and sequence. See the code below.

Code:

begintemplate BgGen
proc init() {
...
r = new Random()
r.Random123(1000*($6+$7))
r.negexp(1)

noise_netstim = new NetStim()
noise_netstim.interval = 1000/noise_freq
noise_netstim.number = 1e100 //noise_total_length*noise_freq/1000
noise_netstim.start = noise_start
noise_netstim.noise = 1
noise_netstim.noiseFromRandom(r)
...
}
endtemplate BgGen
$6 is the cell id and $7 is the id of the node that the cell belongs to.

Then I created a list, and appended all the streams into it.

Code:

randomlist.append(r)
In the saving part, here is what I did:

Code:

proc savestate() {local i  localobj s, ss, f, rl

	s = new String()
	sprint(s.s, "svst.%04d", pc.id)
	f = new File(s.s)
	ss = new SaveState()
	ss.save()
	ss.fwrite(f, 0)
	for i=0,randomlist.count-1 {
	f.printf("%g\n", randomlist.o(i).seq())
	}
	f.close
}
For the second part of the simulation, to read the state, here is what I did:

Code:

proc init() { 
finitialize()
s = new String()
	sprint(s.s, "svst.%04d", pc.id)
	f = new File(s.s)
	ss = new SaveState()
	ss.fread(f, 0)
	randomlist = new List()
	for i=0,randomlist.count-1 {
randomlist.o(i).seq(f.scanvar())
}

t=50
if (cvode.active()) {
cvode.re_init()
} else {
fcurrent()
}
frecord_init()
}
The first part of the simulation always matches, but the second part does not. I didn't check the voltage traces, but the spike timing data from 1000 cells alone already shows a mismatch.
Please let me know where you think the problem might come from.
Thank you.
hines

Post by hines »

I don't think it is ever useful to encode the processor rank (pc.id) in a random stream. That prevents reproducibility when running with different pc.nhost or different distributions of cells on ranks.

Code:

   ss.fread(f, 0)
   randomlist = new List()
   for i=0,randomlist.count-1 {
randomlist.o(i).seq(f.scanvar())
}
your randomlist is empty
breakwave922

Post by breakwave922 »

Hines,

If I use the same number of nodes to run the simulation, maybe using pc.id in a random stream is not a problem? But you are right, running with a different pc.nhost would definitely cause problems. Could you please let me know what I should do about this? Or can you point me to some reference code that I can look at?

About randomlist: yes, I also doubted the way I coded it. Because the second part of the simulation is in a different hoc file, how could it know the randomlist that I saved in the first part? I did it this way because it didn't give me an error, but I was not aware that the list was empty. To avoid this, do you suggest I merge the two hoc files into one?
Thanks.
hines

Post by hines »

Just don't define the Random123 streams in terms of pc.id (leave that arg as 0). r.Random123(id1, id2, id3) means that two streams are statistically independent if any of the three args differ when the two streams are created. Typically the call is rlist.o(icell).Random123(gid_of_icell, 0, 0), i.e. one stream per cell and all streams statistically independent. Sometimes one wishes to have a large number of streams per cell, and then you can start to use the other args.
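
As a rough illustration of the one-stream-per-cell pattern described here, the streams could be created and collected like this. The names `rlist` and `add_stream` are hypothetical and not taken from the poster's model.

```hoc
// Sketch: one Random123 stream per cell, keyed only by the cell's gid.
// rlist and add_stream are hypothetical names used for illustration.
objref rlist
rlist = new List()

proc add_stream() { localobj r   // $1: gid of the cell
    r = new Random()
    r.Random123($1, 0, 0)        // stream depends on gid only, never on pc.id
    r.negexp(1)
    rlist.append(r)              // keep it so its seq() can be saved later
}
```

Because no argument depends on the rank, a given cell gets the same stream regardless of pc.nhost or how cells are distributed.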

The model creation for the first and second parts should create identical models. That includes all the random streams. For restoring, you need to set the sequence number properly for each stream.
You undoubtedly have a named list of random instances which is identical for the first and second parts. Use the name of that list. You do not need to combine hoc files.
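
A restore procedure along these lines might look like the sketch below. It assumes the second hoc file has already rebuilt the model exactly as the first did, so that `randomlist` (the list name used earlier in the thread) is already populated with the same Random instances in the same order. The restart time of 50 is taken from the poster's code.

```hoc
// Sketch: restore SaveState data, then the per-stream sequence numbers.
// randomlist must already be filled by model setup; it is NOT re-created here.
proc init() { local i  localobj s, ss, f
    finitialize()
    s = new String()
    sprint(s.s, "svst.%04d", pc.id)   // same per-rank file that was saved
    f = new File(s.s)
    ss = new SaveState()
    ss.fread(f, 0)                    // 0: leave the file open for extra data
    for i = 0, randomlist.count()-1 {
        randomlist.o(i).seq(f.scanvar())  // same order as when written
    }
    f.close()
    ss.restore()
    t = 50                            // restart time used in the thread
    if (cvode.active()) {
        cvode.re_init()
    } else {
        fcurrent()
    }
    frecord_init()
}
```

The key difference from the earlier attempts is that the list is reused rather than replaced by a new, empty List, so the loop actually runs.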
breakwave922

Post by breakwave922 »

Hines,

Here is what I did following your advice.
I removed pc.id from Random123 and now use three args instead of one. One of the three args is the cell's global id; the other two are zero.

Code:

begintemplate BgGen
proc init() {
...
r = new Random()
r.Random123(1000*($6+12),0,0)
r.negexp(1)

noise_netstim = new NetStim()
noise_netstim.interval = 1000/noise_freq
noise_netstim.number = 1e100 //noise_total_length*noise_freq/1000
noise_netstim.start = noise_start
noise_netstim.noise = 1
noise_netstim.noiseFromRandom(r)
...
}
endtemplate BgGen
$6 gets the cell gid.

About the restoring part, I realized that I forgot to call restore(). Now it looks like this

Code:

objref s, ss,f,randomlist
proc init() { 
finitialize()
s = new String()
	sprint(s.s, "svst.%04d", pc.id)
	f = new File(s.s)
	ss = new SaveState()
	ss.fread(f, 0)
	randomlist = new List()
	for i=0,randomlist.count-1 {
randomlist.o(i).seq(f.scanvar())
}
        f.close
        ss.restore()
t=50
if (cvode.active()) {
cvode.re_init()
} else {
fcurrent()
}
frecord_init()
}
The results are encouraging now. Only 7 spikes out of more than 1400 mismatch when compared to the simulation without segmentation.
I'm still not sure what to do with your advice:
"For restoring, you need to set the sequence number properly for each stream.
You undoubtedly have a named list of random instances which is identical for the first and second parts."
I guess the remaining mismatch may be due to this?
hines

Post by hines »

Your randomlist is still empty. Don't you have a list of BgGen instances?

No need for the 1000*($6+12) arg. Just $6 is sufficient.
breakwave922

Post by breakwave922 »

Hines,

Sorry, I only just realized that randomlist is empty. Here is what I modified:

Code:

for m = 0, 799{
    if(!pc.gid_exists(m)) { continue }				// Can't connect to target if it doesn't exist 
													// on the node ("continue") skips rest of code
	bggen[m] = new BgGen(3,0,tstop,30,dt,m,pc.id)
	cellid = pc.gid2cell(m)                     	// get GID object from ID	
	cellid.dend bg2LAPsyn[m] = new bg2pyr(0.9)
	bg2LAPsyn[m].initW = 7.0//6.3
randomlist.append(bggen[m].r)
Saving state part:

Code:

proc savestate() {local i  localobj s, ss, f, rl

	s = new String()
	sprint(s.s, "svst.%04d", pc.id)
	f = new File(s.s)
	ss = new SaveState()
	ss.save()
	ss.fwrite(f, 0)
	for i=0,randomlist.count-1 {
	f.printf("%g\n", randomlist.o(i).seq())
	}
	f.close
}

savestate()
But when I re-ran the simulation with this modification, I still got the same results.

Regarding the 1000*($6+12) arg, I just wanted to make sure the arg is always > 0, following your paper "Translating network models to parallel hardware in NEURON".
hines

Post by hines »

You should try some simple debugging such as printing the sequences saved and restored.
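
One possible way to do this is a small helper that prints every stream's current sequence number, called once right after saving and once right after restoring. This is a sketch only: `print_seqs` is a hypothetical name, and `randomlist` and `pc` are assumed to exist as in the code earlier in the thread.

```hoc
// Sketch: print each stream's sequence number with a label and the rank.
// Call print_seqs("saved") after savestate() in the first run and
// print_seqs("restored") at the end of init() in the second run.
proc print_seqs() { local i   // $s1: label for this set of lines
    for i = 0, randomlist.count()-1 {
        printf("%s rank %d stream %d seq %.0f\n", $s1, pc.id, i, randomlist.o(i).seq())
    }
}
```

If the list is empty the helper prints nothing, which makes that case easy to spot.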
breakwave922

Post by breakwave922 »

Hi, Hines,

Thanks for the help. Yes, I've been trying to debug this issue step-by-step.
I created a small network with only 100 cells. I printed out the sequences saved and restored.
But I'm not sure whether I did it correctly. Below is what I did.

For the first part of the simulation, I saved the sequences to a file:

Code:

objref randomfile

randomfile = new File("randomfile")

if(pc.id==0){     
randomfile.wopen()
randomfile.close()
}

proc randomsave() { local i, rank
pc.barrier() // wait for all hosts to get to this point

for rank=0, pc.nhost-1 { // host 0 first, then 1, 2, etc.

if (rank==pc.id) {
for i=0, randomlist.count-1 {
randomfile.aopen()                               
randomfile.printf("%g\n", randomlist.o(i).seq())
randomfile.close()
}
}
pc.barrier() // wait for all hosts to get to this point
}
}

randomsave()
I got a file with a single column of 100 values, each either 2 or 3. As I'm not familiar with what is being saved, I'm not sure whether I did it correctly.

Then I moved on to the restore part, i.e. the second part of the simulation.
This is the restore part that I used in the second part:

Code:

proc init() { 
finitialize()
s = new String()
	sprint(s.s, "svst.%04d", pc.id)
	f = new File(s.s)
	ss = new SaveState()
	ss.fread(f, 0)
	randomlist = new List()
	for i=0,randomlist.count-1 {
randomlist.o(i).seq(f.scanvar())
}
        f.close
        ss.restore()

t=50
if (cvode.active()) {
cvode.re_init()
} else {
fcurrent()
}
frecord_init()
}
As I couldn't do printf in the init(), I saved the second part's random seq in the same way as I did in the first part.

Code:

objref randomfile_restore

randomfile_restore = new File("randomfile_restore")

if(pc.id==0){     //"wopen" once by node 0 to clear the contents of the file
randomfile_restore.wopen()
randomfile_restore.close()
}

proc randomrestore() { local i, rank
pc.barrier() // wait for all hosts to get to this point
for rank=0, pc.nhost-1 { // host 0 first, then 1, 2, etc.
if (rank==pc.id) {
for i=0, randomlistII.count-1 {
randomfile_restore.aopen()                           
randomfile_restore.printf("%g\n", randomlistII.o(i).seq())
randomfile_restore.close()
}
}
pc.barrier() // wait for all hosts to get to this point
}
}
randomrestore()
randomlistII is the list that I created in the second simulation.

Then I got the randomfile_restore file from the second simulation. As I understand it, randomfile and randomfile_restore should be identical, but when I compared them the values were different. Strangely, for this small network the spike data still matched the unsegmented run, whereas once the network size went up to 1000 cells I got a mismatch. Also, whenever there was a mismatch, the mismatched spikes always appeared more than 20 ms after the start of the second simulation. For example, in a 100 ms simulation where 0-50 ms is the first part and 50-100 ms is the second part, the mismatch always starts after 70 ms or later. I would have thought that if the noise generator seq were not restored properly, the mismatch could appear right from the start of the second simulation.
There seem to be several issues; maybe they have the same cause?
Please let me know your thoughts. Thank you so much.
hines

Post by hines »

The whole point of not closing the file on return from savestate.fread and savestate.fwrite is so you can read and write extra state (the random sequence numbers) at the end of those files.
That said, it is possible to make your approach work.
breakwave922

Post by breakwave922 »

Hi, Hines,
Yes, I didn't close the file on return from savestate.fread and savestate.fwrite. Please see the following code:

Code:

ss.fwrite(f, 0)
	for i=0,randomlist.count-1 {
	f.printf("%g\n", randomlist.o(i).seq())
	}
	f.close
}

Code:

proc init() { 
finitialize()
s = new String()
	sprint(s.s, "svst.%04d", pc.id)
	f = new File(s.s)
	ss = new SaveState()
	ss.fread(f, 0)
	randomlist = new List()
	for i=0,randomlist.count-1 {
randomlist.o(i).seq(f.scanvar())
}
        f.close
        ss.restore()
t=50
if (cvode.active()) {
cvode.re_init()
} else {
fcurrent()
}
frecord_init()
}
But how can we printf the seq numbers of different nodes into one file in init()? An additional question: can the saved state file be read on Windows with any tool? I find it very difficult to debug when I don't know what is in the state file I saved.
Another question: can SaveState() save at a particular time point? That is, does SaveState() save only the state at that time point?
I read in the documentation that the "SaveState class can be an expensive object in terms of memory storage"; see http://www.neuron.yale.edu/neuron/stati ... state.html
Does this mean that SaveState() saves all the information from the beginning of the simulation? As far as I know this shouldn't be needed, because to restore the simulation only the state at the last time point is required.
Thanks a lot.
hines

Post by hines »

The reason you use sprint(s.s, "svst.%04d", pc.id) is because there is a separate savestate file per process.

SaveState writes ONLY the information at time t. It is expensive because it doubles the size of memory needed by a model.

SaveState does NOT save all state trajectories.

Only SaveState can read files written by SaveState. The internal binary format is very complex.