www.neuron.yale.edu

Posted: **Fri Nov 07, 2014 10:13 pm**

I am trying to run a 100-cell network in parallel mode with 30 nodes. I am able to get the spike data (spike time, cell_ID), but it is misaligned, as shown below:

25289.5 88
173909.6 95
15821.5 88

38750.6 74
19580.9 72
26563.8 95
16970.7 95
67205.0 21
38142.4 92
173918.8 53 3632.6 42
22061.0 70
38750.9 92

15050.3 56
15757.2 89
25107.6 99
27930.4 17
25098.0 81
173934.9 84

Some of the spike times are written side by side, and this occurs in so many places that I cannot correct it manually. I am attaching the piece of code where I save the spike data. Please let me know what I can change to get them aligned.

Code:

objref savet
savet = new File()
savet.wopen("data")

proc spikeout() { local i, rank
	pc.barrier() // wait for all hosts to get to this point
	for rank=0, pc.nhost-1 { // host 0 first, then 1, 2, etc.
		if (rank==pc.id) {
			for i=0, tvec.size-1 {
			savet.aopen("data")	
			savet.printf("%7.1f\t %d\n", tvec.x[i], idvec.x[i])
			//savet.close()
			}
		}
		pc.barrier() // wait for all hosts to get to this point
	}
}
spikeout()
savet.close()

{pc.runworker()}
{pc.done()}

Posted: **Sat Nov 08, 2014 1:06 am**

Maybe it's because of all the aopen() calls in the innermost loop.

Code:

	for rank=0, pc.nhost-1 { // host 0 first, then 1, 2, etc.
		if (rank==pc.id) {
			for i=0, tvec.size-1 {
			savet.aopen("data")	
			savet.printf("%7.1f\t %d\n", tvec.x[i], idvec.x[i])
			//savet.close()
			}
		}
		pc.barrier() // wait for all hosts to get to this point
	}

What's wrong with doing that one level higher, like this?

Code:

	for rank=0, pc.nhost-1 { // host 0 first, then 1, 2, etc.
		if (rank==pc.id) {
			savet.aopen("data")	
			for i=0, tvec.size-1 {
				savet.printf("%7.1f\t %d\n", tvec.x[i], idvec.x[i])
			}
			savet.close()
		}
		pc.barrier() // wait for all hosts to get to this point
	}

Or maybe it's sufficient to wopen the output file once, and forget about all the aopens entirely, like this:

Code:

	savet.wopen("data")
	for rank=0, pc.nhost-1 { // host 0 first, then 1, 2, etc.
		if (rank==pc.id) {
			for i=0, tvec.size-1 {
				savet.printf("%7.1f\t %d\n", tvec.x[i], idvec.x[i])
			}
		}
		pc.barrier() // wait for all hosts to get to this point
	}
	savet.close()

COMMENT: I now think it's best to append data, as illustrated in ftest3.hoc in the thread "Writing output files from parallel simulations" (viewtopic.php?f=28&t=3230).

Posted: **Wed Dec 03, 2014 8:14 pm**

Hi, Ted,

I'm confused by this saving procedure. Since the code runs in parallel, what will happen if two nodes try to write spike times into the data file at the same time? Or would this case never happen?
Thanks in advance.

Posted: **Thu Dec 04, 2014 12:43 pm**

breakwave922 wrote: what will happen if two nodes try to write spike times into the data file at the same time?

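Can't happen. This for loop serializes output: on each pass only the host whose pc.id equals rank writes, and pc.barrier() holds every host until that pass is finished.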

Code:

       for rank=0, pc.nhost-1 { // host 0 first, then 1, 2, etc.
          if (rank==pc.id) {
             for i=0, tvec.size-1 {
                savet.printf("%7.1f\t %d\n", tvec.x[i], idvec.x[i])
             }
          }
          pc.barrier() // wait for all hosts to get to this point
       }
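For readers who want to try the turn-taking idea without an MPI installation, here is a rough sketch in plain Python. It is not NEURON code: threads stand in for hosts, `threading.Barrier` plays the role of `pc.barrier()`, and a list stands in for the output file. The function name `serialized_output` and the dictionary `spikes_by_rank` are made up for this illustration.

```python
import threading


def serialized_output(nhost, spikes_by_rank):
    """Simulate nhost hosts that write their spikes one host at a time.

    spikes_by_rank maps a host id to a list of (time, cell_id) pairs.
    Returns the lines in the order they were written.
    """
    barrier = threading.Barrier(nhost)  # stand-in for pc.barrier()
    lines = []  # stand-in for the shared output file

    def host(my_id):
        for rank in range(nhost):  # host 0 first, then 1, 2, etc.
            if rank == my_id:
                # Only the host whose turn it is writes during this pass.
                for t, gid in spikes_by_rank[my_id]:
                    lines.append(f"{t:7.1f}\t {gid}")
            # Nobody starts the next pass until every host has arrived,
            # so the writer of this pass is finished before the next begins.
            barrier.wait()

    threads = [threading.Thread(target=host, args=(i,)) for i in range(nhost)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return lines
```

Calling `serialized_output(3, {0: [(1.5, 10)], 1: [(0.5, 11), (2.0, 12)], 2: []})` returns host 0's line first, then host 1's two lines, no matter how the threads are scheduled. Note that the lines are grouped by host, not sorted by spike time.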

Posted: **Fri Dec 05, 2014 5:28 pm**

Hi, Ted,
Thanks for your reply. It is clear to me now.
About this code, I have another question. Before we append spike information to the "data" file, we call savet.wopen("data") without testing pc.id, so I would expect every node to create its own file, which would mean there are as many data files as nodes, right? But in practice there was only one data file, and it contained all the information we wanted. Am I wrong in thinking that, on a parallel computing platform, NEURON scripts are executed on every node unless the code tests pc.id? What is the difference between the master (pc.id==0) and the workers (pc.id>0) in executing the script?
Thanks in advance.

Posted: **Sun Dec 07, 2014 4:58 pm**

Good question. Apparently what happens is that each host flushes and closes the file that it wrote, and the last host to close is the one that determines what remains in the file. Consider this example:

Code:

objref pc
pc = new ParallelContext()
objref fil
{
  fil = new File("ofil.dat")
  if (pc.id==0) {
    print "ftest4.hoc--on a shared filesystem machine"
    print "only one host's output (the last host to flush and close the file) will remain in the output file."
  }
  if (pc.id==0) { // test for existence of output file, delete if found
    if (fil.ropen()==1) fil.unlink()
  }
  pc.barrier() // wait for all hosts to get to this point
  fil.wopen()
  for rank=0,pc.nhost-1 { // host 0 first, then 1, 2, etc.
    if (rank==pc.id) {
      printf("I am %d of %d\n", pc.id, pc.nhost)
      fil.printf("I am %d of %d\n", pc.id, pc.nhost)
    }
    pc.barrier() // wait for all hosts to get to this point
  }
  fil.close()
}
{pc.runworker()}
{pc.done()}
quit()

When executed with
mpiexec -n 4 nrniv -mpi ftest4.hoc
stdout shows this
I am 0 of 4
I am 1 of 4
I am 2 of 4
I am 3 of 4
but ofil.dat contains only this
I am 0 of 4

I correct my previous error: the proper strategy is to use aopen so that each host's data are appended to the file, as in ftest3.hoc in the thread "Writing output files from parallel simulations" (viewtopic.php?f=28&t=3230).
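The difference between the two strategies can be reproduced on a single machine with ordinary file handles. The following is a plain-Python sketch, not NEURON code, and it rests on assumptions: several handles opened with mode "w" stand in for the hosts that each called wopen, the close order is chosen by hand (in a real MPI run it is not under your control), and the helper names `overwrite_demo` and `append_demo` are invented for this example.

```python
import os
import tempfile


def overwrite_demo(path, nhost):
    """Every simulated host opens the same file for writing, as with wopen."""
    handles = [open(path, "w") for _ in range(nhost)]  # each open truncates
    for rank, fh in enumerate(handles):
        fh.write(f"I am {rank} of {nhost}\n")  # buffered, each at offset 0
    # Assumption: host 0 happens to close last, so its buffer lands last.
    for fh in reversed(handles):
        fh.close()
    with open(path) as fh:
        return fh.read()


def append_demo(path, nhost):
    """Each simulated host, in turn, opens for append, writes, and closes."""
    if os.path.exists(path):
        os.remove(path)  # start from a clean file, as host 0 does with unlink
    for rank in range(nhost):
        with open(path, "a") as fh:
            fh.write(f"I am {rank} of {nhost}\n")
    with open(path) as fh:
        return fh.read()


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "ofil.dat")
    print(repr(overwrite_demo(target, 4)))
    print(repr(append_demo(target, 4)))
```

In the write-mode case every handle starts at offset 0, so whichever handle flushes last overwrites what the others wrote; because all the lines here have the same length, only one line survives. In the append case each host adds to the end of whatever is already there, so all four lines remain.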

Posted: **Tue Dec 09, 2014 4:07 pm**

Thanks, Ted.
So basically, it's a shared file system, so there is only one shared file rather than nhost separate files. Am I correct?

Posted: **Tue Dec 09, 2014 4:25 pm**

On my PC running Linux, and on the Neuroscience Gateway Portal, the file system is shared. That's also most likely the case on individual Macs running OS X and PCs running MSWin. You'll want to check on your own hardware, just to make sure.

Misalignment of Spike data
