Code:
objref pc
pc = new ParallelContext()
// . . . many statements and procedures later . . .
proc spikeout() { local i, rank
  pc.barrier() // wait for all hosts to get to this point
  if (pc.id==0) printf("\ntime\t cell\n") // print header once
  for rank=0, pc.nhost-1 { // host 0 first, then 1, 2, etc.
    if (rank==pc.id) {
      // the elements of the tvec and idvec Vectors
      // are the time of each spike
      // and the gid of the cell that generated it, respectively
      for i=0, tvec.size-1 {
        printf("%g\t %d\n", tvec.x[i], idvec.x[i])
      }
    }
    pc.barrier() // wait for all hosts to get to this point
  }
}
spikeout()
However, sometimes it is desirable to explicitly write results to a file, to save results to two or more files, or to save results in binary format. This raises questions such as
* does each host have its own file system, or do all hosts share the same file system?
* if the file system is shared, and the aim is to write all results to the same file, how can this be done without each host overwriting what was written by another host?
To answer these questions, I ran some tests on my own desktop PC under Linux and on the Neuroscience Gateway Portal (NSG, http://www.nsgportal.org/). Here's what I found out.
First: is the file system shared?
On my PC the answer had to be yes, but I wasn't sure about the NSG. To find out, I wrote and executed ftest1.hoc
Code:
objref pc
pc = new ParallelContext()
strdef nom
objref fil
{
  if (pc.id==0) print "ftest1.hoc--generates nhost output files"
  sprint(nom,"f%d.dat",pc.id)
  printf("I am %d of %d, nom is %s\n", pc.id, pc.nhost, nom)
  fil = new File(nom)
  fil.wopen()
  fil.printf("I am %d of %d, nom is %s\n", pc.id, pc.nhost, nom)
  fil.close()
}
{pc.runworker()}
{pc.done()}
quit()
mpiexec -n 2 nrniv -mpi ftest1.hoc
produced files called f0.dat and f1.dat which contained
I am 0 of 2, nom is f0.dat
and
I am 1 of 2, nom is f1.dat
respectively. So on my PC, each host saw the same file system. No surprise, but it's a nice sanity check.
Executing ftest1.hoc on the NSG with two cores, each on a different node, produced the same result. This means that each host saw the same file system, even if the host was on a different node. So the NSG has a shared file system too.
And if that's the case, it should be possible to produce a program in which file output generated by one host is overwritten by file output generated by another. To this end I wrote ftest2.hoc
Code:
objref pc
pc = new ParallelContext()
objref fil
{
  if (pc.id==0) {
    print "ftest2.hoc--on a shared filesystem machine"
    print "generates one output file that is overwritten by each host"
  }
  fil = new File("ofil.dat")
  fil.wopen()
  printf("I am %d of %d\n", pc.id, pc.nhost)
  fil.printf("I am %d of %d\n", pc.id, pc.nhost)
  fil.close()
}
{pc.runworker()}
{pc.done()}
quit()
mpiexec -n 4 nrniv -mpi ftest2.hoc
produced a single file called ofil.dat which contained
I am 1 of 4
which is what would happen if every host were writing to the same file system, each one overwriting whatever already existed on disk, with host 1 finishing last. I repeated this a few times, and occasionally a different host was the last one, but none of these runs produced an ofil.dat that contained more than one line of text. A run on NSG with 4 cores on 1 node produced similar results.
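The overwriting itself does not depend on MPI. It is what one would expect whenever an existing file is reopened for writing, since that truncates the file. Here is a rough illustration in plain Python rather than hoc (it is not NEURON code; the names write_truncating and simulate_overwrite are hypothetical, and a sequential loop stands in for hosts that finish in some arbitrary order). Mode "w" is assumed to behave like wopen().

```python
import os
import tempfile

def write_truncating(path, rank, nhost):
    """Hypothetical stand-in for one host doing wopen / printf / close."""
    # Mode "w" truncates an existing file before writing to it.
    with open(path, "w") as f:
        f.write("I am %d of %d\n" % (rank, nhost))

def simulate_overwrite(order, nhost):
    """Simulated hosts write in the given finishing order; return the file's lines."""
    path = os.path.join(tempfile.mkdtemp(), "ofil.dat")
    for rank in order:
        write_truncating(path, rank, nhost)
    with open(path) as f:
        return f.read().splitlines()

# In this made-up ordering host 1 finishes last, so only its line survives.
print(simulate_overwrite([0, 2, 3, 1], 4))
```

Whatever order is passed in, the file ends up holding only the line written by the last host in the list.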
So that confirmed the conjecture that the NSG's file system is shared, but it raised a new, important question: how can each host's file output be kept from interfering with the output of every other host? In particular:
How to make all hosts write nondestructively to the same output file?
The trick is to make each host append its output to the same file. "But what if there is already a file with the same name that contains results of a previous simulation?" Of course, one must first test for the existence of such a file, and delete it if found. And that's what ftest3.hoc does
Code:
objref pc
pc = new ParallelContext()
objref fil
{
  fil = new File("ofil.dat")
  if (pc.id==0) {
    print "ftest3.hoc--on a shared filesystem machine"
    print "generates one output file to which each host appends data"
  }
  if (pc.id==0) { // test for existence of output file, delete if found
    if (fil.ropen()==1) fil.unlink()
  }
  pc.barrier() // wait for all hosts to get to this point
  for rank=0,pc.nhost-1 { // host 0 first, then 1, 2, etc.
    if (rank==pc.id) {
      printf("I am %d of %d\n", pc.id, pc.nhost)
      fil.aopen()
      fil.printf("I am %d of %d\n", pc.id, pc.nhost)
      fil.close()
    }
    pc.barrier() // wait for all hosts to get to this point
  }
}
{pc.runworker()}
{pc.done()}
quit()
mpiexec -n 4 nrniv -mpi ftest3.hoc
produced an ofil.dat that contained these lines
I am 0 of 4
I am 1 of 4
I am 2 of 4
I am 3 of 4
which is exactly as expected. And any ofil.dat that already exists is deleted before a new ofil.dat is generated. I repeated this test on NSG using a total of 4 cores (2 cores on each of 2 nodes), and got the same result.
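The pattern used by ftest3.hoc (one host removes any stale file, then the hosts append strictly in rank order) can also be sketched in plain Python. This is only an illustration, not NEURON code: the function name append_in_rank_order is hypothetical, and the sequential loop stands in for the turns that pc.barrier() enforces between hosts.

```python
import os
import tempfile

def append_in_rank_order(path, nhost):
    """Hypothetical sketch: remove stale output, then append one line per rank."""
    # Stands in for host 0 deleting an old output file before anyone writes.
    if os.path.exists(path):
        os.remove(path)
    # Each iteration stands in for one host's turn between barriers.
    for rank in range(nhost):
        with open(path, "a") as f:  # mode "a" appends, assumed to match aopen()
            f.write("I am %d of %d\n" % (rank, nhost))
    with open(path) as f:
        return f.read().splitlines()

path = os.path.join(tempfile.mkdtemp(), "ofil.dat")
with open(path, "w") as f:
    f.write("results of a previous simulation\n")  # stale file from an earlier run
for line in append_in_rank_order(path, 4):
    print(line)
```

The stale line from the earlier run does not appear in the output, and the four new lines come out in rank order.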