Misalignment of Spike data
Posted: Fri Nov 07, 2014 10:13 pm
by guntu
I am trying to run a 100-cell network in parallel mode on 30 nodes. I can get the spike data (spike time, cell_ID), but it is misaligned, as shown below:
25289.5 88
173909.6 95
15821.5 88
38750.6 74
19580.9 72
26563.8 95
16970.7 95
67205.0 21
38142.4 92
173918.8 53 3632.6 42
22061.0 70
38750.9 92
15050.3 56
15757.2 89
25107.6 99
27930.4 17
25098.0 81
173934.9 84
Some of the spike times are written side by side on the same line. This happens in many places, so I cannot correct it manually. Below is the piece of code that saves the spike data. Please let me know what I can change to get the output aligned.
Code:
objref savet
savet = new File()
savet.wopen("data")
proc spikeout() { local i, rank
  pc.barrier() // wait for all hosts to get to this point
  for rank=0, pc.nhost-1 { // host 0 first, then 1, 2, etc.
    if (rank==pc.id) {
      for i=0, tvec.size-1 {
        savet.aopen("data")
        savet.printf("%7.1f\t %d\n", tvec.x[i], idvec.x[i])
        //savet.close()
      }
    }
    pc.barrier() // wait for all hosts to get to this point
  }
}
spikeout()
savet.close()
{pc.runworker()}
{pc.done()}
Re: Misalignment of Spike data
Posted: Sat Nov 08, 2014 1:06 am
by ted
Maybe it's because of all the aopen() calls in the innermost loop.
Code:
for rank=0, pc.nhost-1 { // host 0 first, then 1, 2, etc.
  if (rank==pc.id) {
    for i=0, tvec.size-1 {
      savet.aopen("data")
      savet.printf("%7.1f\t %d\n", tvec.x[i], idvec.x[i])
      //savet.close()
    }
  }
  pc.barrier() // wait for all hosts to get to this point
}
What's wrong with doing that one level higher, like this?
Code:
for rank=0, pc.nhost-1 { // host 0 first, then 1, 2, etc.
  if (rank==pc.id) {
    savet.aopen("data")
    for i=0, tvec.size-1 {
      savet.printf("%7.1f\t %d\n", tvec.x[i], idvec.x[i])
    }
    savet.close()
  }
  pc.barrier() // wait for all hosts to get to this point
}
Or maybe it's sufficient to wopen the output file once, and forget about all the aopens entirely, like this:
Code:
savet.wopen("data")
for rank=0, pc.nhost-1 { // host 0 first, then 1, 2, etc.
  if (rank==pc.id) {
    for i=0, tvec.size-1 {
      savet.printf("%7.1f\t %d\n", tvec.x[i], idvec.x[i])
    }
  }
  pc.barrier() // wait for all hosts to get to this point
}
savet.close()
COMMENT: I now think it's best to append data, as illustrated in ftest3.hoc in this thread
Writing output files from parallel simulations
viewtopic.php?f=28&t=3230
Re: Misalignment of Spike data
Posted: Wed Dec 03, 2014 8:14 pm
by breakwave922
Hi, Ted,
I'm confused by this saving procedure. Since this is parallel code, what happens if two nodes try to write spike times to the data file at the same time? Or can that never happen?
Thanks in advance.
Re: Misalignment of Spike data
Posted: Thu Dec 04, 2014 12:43 pm
by ted
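breakwave922 wrote: what will happen if two nodes trying to write the spiking time into the data file at the same time?
Can't happen. This for loop serializes output: on each pass only the host whose pc.id equals rank executes the printf calls, and the pc.barrier() at the end of the pass holds every other host until that host has finished them.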
Code:
for rank=0, pc.nhost-1 { // host 0 first, then 1, 2, etc.
  if (rank==pc.id) {
    for i=0, tvec.size-1 {
      savet.printf("%7.1f\t %d\n", tvec.x[i], idvec.x[i])
    }
  }
  pc.barrier() // wait for all hosts to get to this point
}
Re: Misalignment of Spike data
Posted: Fri Dec 05, 2014 5:28 pm
by breakwave922
Hi, Ted,
Thanks for your reply. It is clear to me now.
I have another question about this code. Before appending spike information to the "data" file, we call savet.wopen("data") without testing pc.id, so I would expect every node to create its own file, giving one data file per node. Is that right? In practice, though, we got only one data file, and it contained all the information we wanted. Am I wrong in thinking that, on a parallel platform, NEURON executes every statement of the script on every node unless the statement is guarded by a test of pc.id? What is the difference between the master (pc.id == 0) and the workers (pc.id > 0) when they execute the script?
Thanks in advance.
Re: Misalignment of Spike data
Posted: Sun Dec 07, 2014 4:58 pm
by ted
Good question. Apparently what happens is that each host flushes and closes the file that it wrote, and the last host to close is the one that determines what remains in the file. Consider this example:
Code:
objref pc
pc = new ParallelContext()
objref fil
{
  fil = new File("ofil.dat")
  if (pc.id==0) {
    print "ftest4.hoc--on a shared filesystem machine"
    print "only one host's output (the last host to flush and close the file) will remain in the output file."
  }
  if (pc.id==0) { // test for existence of output file, delete if found
    if (fil.ropen()==1) fil.unlink()
  }
  pc.barrier() // wait for all hosts to get to this point
  fil.wopen()
  for rank=0,pc.nhost-1 { // host 0 first, then 1, 2, etc.
    if (rank==pc.id) {
      printf("I am %d of %d\n", pc.id, pc.nhost)
      fil.printf("I am %d of %d\n", pc.id, pc.nhost)
    }
    pc.barrier() // wait for all hosts to get to this point
  }
  fil.close()
}
{pc.runworker()}
{pc.done()}
quit()
When executed with
mpiexec -n 4 nrniv -mpi ftest4.hoc
stdout shows this
I am 0 of 4
I am 1 of 4
I am 2 of 4
I am 3 of 4
but ofil.dat contains only this
I am 0 of 4
I stand corrected on my earlier suggestion: the proper strategy is to use aopen so that each host's data are appended to the file, as in ftest3.hoc in this thread
Writing output files from parallel simulations
viewtopic.php?f=28&t=3230
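For readers who don't want to follow the link, here is a minimal sketch of the append strategy. It is not ftest3.hoc itself, only an illustration pieced together from the code earlier in this thread. It assumes a shared file system, and it assumes that tvec and idvec hold this host's spike times and cell IDs; the sketch creates them as empty Vectors only so that it runs on its own.

```hoc
// Sketch only, not ftest3.hoc. Assumes a shared file system.
objref pc, savet, tvec, idvec
pc = new ParallelContext()
savet = new File()
// Placeholders: in a real model these are filled with this host's
// spike times and cell IDs during the run, before spikeout() is called.
tvec = new Vector()
idvec = new Vector()

proc spikeout() { local i, rank
  // Host 0 alone creates (or empties) the output file, exactly once.
  if (pc.id == 0) {
    savet.wopen("data")
    savet.close()
  }
  pc.barrier() // nobody appends before the file has been reset
  for rank=0, pc.nhost-1 { // host 0 first, then 1, 2, etc.
    if (rank==pc.id) {
      savet.aopen("data") // open once per host, in append mode
      for i=0, tvec.size-1 {
        savet.printf("%7.1f\t %d\n", tvec.x[i], idvec.x[i])
      }
      savet.close() // flush this host's lines before the next host's turn
    }
    pc.barrier() // wait for all hosts to get to this point
  }
}

spikeout()
{pc.runworker()}
{pc.done()}
quit()
```

The two points that differ from the code in the first post are that only host 0 calls wopen, and that each host opens the file in append mode once, outside the inner loop, and closes it before the barrier.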
Re: Misalignment of Spike data
Posted: Tue Dec 09, 2014 4:07 pm
by breakwave922
Thanks, Ted.
So basically the file system is shared, which means there is only one shared file rather than one file per host. Am I correct?
Re: Misalignment of Spike data
Posted: Tue Dec 09, 2014 4:25 pm
by ted
On my PC running Linux, and on the Neuroscience Gateway Portal, the file system is shared. That's also most likely the case on individual Macs running OS X and PCs running MSWin. You'll want to check on your own hardware, just to make sure.
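One way to run that check is sketched below. This is only an illustration under stated assumptions: "fscheck.tmp" is a hypothetical scratch file name, and the test relies on every host being started in the same working directory. Host 0 writes a small marker file, and then every host reports whether it can open that file for reading.

```hoc
// Sketch only: "fscheck.tmp" is a hypothetical scratch file name.
objref pc, fil
pc = new ParallelContext()
fil = new File("fscheck.tmp")
{
  if (pc.id == 0) { // only host 0 creates the marker file
    fil.wopen()
    fil.printf("written by host 0\n")
    fil.close()
  }
  pc.barrier() // the file has been written before anyone looks for it
  if (fil.ropen() == 1) {
    fil.close()
    printf("host %d of %d sees the file\n", pc.id, pc.nhost)
  } else {
    printf("host %d of %d does NOT see the file\n", pc.id, pc.nhost)
  }
  pc.barrier() // every host has looked before the file is removed
  if (pc.id == 0) { fil.unlink() }
}
{pc.runworker()}
{pc.done()}
quit()
```

If every host reports that it sees the file, the hosts most likely share a file system. A host that does not see it would be writing to its own private directory, so its output would have to be collected separately. On a network file system a freshly written file can take a moment to become visible to other machines, so a single negative report is worth rechecking.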