Neuron and Myrinet

Thomas Wennekers
Posts: 5
Joined: Sun May 28, 2006 6:28 am
Location: Plymouth / UK

Neuron and Myrinet

Post by Thomas Wennekers » Sun May 28, 2006 6:39 am

Hi all

Has anybody managed to compile Neuron with MPI for Myrinet and/or LAM-MPI?

The following code snippet from src/nrnmpi/nrnmpi.c seems to restrict usage to MPICH and Ethernet networks only:

#if !ALWAYS_CALL_MPI_INIT
    /* this is not good. depends on mpirun adding at least one
       arg that starts with -p4 but that probably is dependent
       on mpich and the use of the ch_p4 device. We are trying to
       work around the problem that MPI_Init may change the working
       directory and so when not invoked under mpirun we would like to
       NOT call MPI_Init.
    */
    {
        int i, b;
        b = 0;
        for (i=0; i < *pargc; ++i) {
            if (strncmp("-p4", (*pargv)[i], 3) == 0) {
                b = 1;
                break;
            }
        }
        if (!b) {
            nrnmpi_use = 0;
            return;
        }
    }
#endif
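
To make it easier to see what this check actually tests, here is a minimal, self-contained sketch of the same argument scan. It is not NEURON's code: has_p4_arg is a hypothetical name, and the only thing taken from the snippet is the strncmp prefix test against "-p4". A launcher that does not append such an argument would make the scan return 0, in which case the snippet above sets nrnmpi_use = 0 and returns.

```c
#include <string.h>

/* Hypothetical helper (not part of NEURON): returns 1 if any
 * command-line argument starts with "-p4", the prefix the snippet
 * above looks for, and 0 otherwise. */
static int has_p4_arg(int argc, char **argv) {
    int i;
    for (i = 0; i < argc; ++i) {
        if (strncmp("-p4", argv[i], 3) == 0) {
            return 1;
        }
    }
    return 0;
}
```

Called with the argument list of a run started without any "-p4..." argument, the helper returns 0, which is the case that disables MPI in the original code.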


Does anybody know a workaround?

Thanks
Thomas


PS: Simply disabling the above code is not a workaround. It results in the errors below on my laptop (used for testing), which runs Kubuntu and LAM-MPI, with or without iv. [doNotify() somehow links to hoc_notify_iv() in src/oc/hoc_init.c, so I thought iv could be the cause.]

mpirun -np 2 nrniv test0.hoc
NEURON -- Version 5.8 2005-10-7 13:46:29 Main (85)
by John W. Moore, Michael Hines, and Ted Carnevale
Duke and Yale University -- Copyright 1984-2005

hello from id 1 on beatrix

nrnmpi_init(): numprocs=2 myid=0
NEURON -- Version 5.8 2005-10-7 13:46:29 Main (85)
by John W. Moore, Michael Hines, and Ted Carnevale
Duke and Yale University -- Copyright 1984-2005

hello from id 0 on beatrix

0
oc: Resource temporarily unavailable
0 nrniv: errno set during call of doNotify
0 in test0.hoc near line 9
0 for i=1, 1000000 doNotify()
^
oc: Resource temporarily unavailable
0 nrniv: errno set during call of doNotify
0 in test0.hoc near line 9
0 ^
oc: Resource temporarily unavailable
0 nrniv: errno set during call of doNotify
0 in test0.hoc near line 9
0 ^
oc: Resource temporarily unavailable
0 nrniv: errno set during call of doNotify
0 in test0.hoc near line 9
0 ^
oc: Resource temporarily unavailable
No more errno warnings during this execution
0 nrniv: errno set during call of doNotify
0 in test0.hoc near line 9
0 ^
errno set 6666 times on last execution
bbs_msg_cnt_=1 bbs_poll_cnt_=6667 bbs_poll_=93
0


The same problem appears on a Beowulf cluster with MPICH and Myrinet.

The Neuron version is 5.8, but the newest version on the Neuron website, 5.9.??, contains the same code fragment. I haven't tried to compile and test it yet, but I am fairly sure the same problem would occur.

Again, thanks for hints!

Best
Thomas

hines
Site Admin
Posts: 1577
Joined: Wed May 18, 2005 3:32 pm

Post by hines » Sun May 28, 2006 5:27 pm

The problem is not the #if !ALWAYS_CALL_MPI_INIT code section. In fact, everything is working in your example except for the benign but annoying error messages. They may just go away if you upgrade to 5.9. If not, let me know. Usually the problem traces to failing to reset errno=0 after a call to an MPI function.
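
To illustrate the errno point, here is a minimal sketch in plain C. It does not use MPI or NEURON code: fake_mpi_probe is a hypothetical stand-in for a library call that succeeds but leaves errno set, and probe_and_clear_errno is a hypothetical wrapper showing the reset. EAGAIN is used because its message on Linux is "Resource temporarily unavailable", the text in the log above.

```c
#include <errno.h>

/* Hypothetical stand-in for an MPI call that succeeds but leaves
 * errno set as a side effect. Not a real MPI function. */
static int fake_mpi_probe(void) {
    errno = EAGAIN;  /* "Resource temporarily unavailable" on Linux */
    return 0;        /* the call itself reports success */
}

/* Hypothetical wrapper: clear the stale errno right after the call so
 * that later code checking errno does not blame an unrelated function. */
static int probe_and_clear_errno(void) {
    int rc = fake_mpi_probe();
    errno = 0;
    return rc;
}
```

Without the reset, any later code that inspects errno after an unrelated call (doNotify in the log above) would see the stale value and report an error even though nothing failed.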


5.9 compilation fails

Post by Thomas Wennekers » Tue May 30, 2006 10:46 am

Hi

Thanks for the info.

Compilation of version 5.9 fails with

-----------------------------
.....
mpicxx -g -O2 -o .libs/ivoc nrnmain.o ivocmain.o classreg.o datapath.o ocjump.o symdir.o ../oc/nocable.o ../oc/modlreg.o ../oc/.libs/libocxt.so ../oc/.libs/liboc.so -L/usr/X11R6/lib64 -lX11 ./.libs/libivoc.so ../nrnmpi/.libs/libnrnmpi.so ../memacs/.libs/libmemacs.so ../mesch/.libs/libmeschach.so ../gnu/.libs/libneuron_gnu.so /usr/local/Cluster-Apps/nrn/nrn-5.9/mx-iv-mpi-gcc/iv/x86_64/lib/libIVhines.so /usr/lib64/libstdc++.so -lreadline -lncurses -ldl -lm -Wl,--rpath -Wl,/usr/local/Cluster-Apps/nrn/nrn-5.9/mx-iv-mpi-gcc//x86_64/lib -Wl,--rpath -Wl,/usr/local/Cluster-Apps/nrn/nrn-5.9/mx-iv-mpi-gcc/iv/x86_64/lib
./.libs/libivoc.so: undefined reference to `ListImpl_best_new_count(long, unsigned int, unsigned int)'
collect2: ld returned 1 exit status
make[3]: *** [ivoc] Error 1
make[3]: Leaving directory `/home/thomas/soft/nrn-5.9/src/ivoc'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/home/thomas/soft/nrn-5.9/src'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/thomas/soft/nrn-5.9'
make: *** [all] Error 2
----------------------------------

Configuration complains that varargs is deprecated, but that is unlikely to be the reason.

The above error is the first one during compilation (apart from several size mismatches in variables).

Full logs of configuration and make are here: www.pion.ac.uk/~thomas/open

Best wishes
Thomas


Post by hines » Tue May 30, 2006 11:00 am

Reinstall InterViews (get it from the v5.9 directory, the alpha directory, or the link on the install page of the web site). Note that if you are running with MPI, the entire GUI is turned off.
In that case InterViews is unnecessary and you can configure with the --without-x option.


--without-iv worked

Post by Thomas Wennekers » Tue May 30, 2006 12:43 pm

Hi

Configuration without InterViews worked. I still had to switch off the above code snippet in nrnmpi.c by hand. It looks like everything compiled properly (apart from a few warnings). Simple example programs ran without errors. The spurious error messages I had with version 5.8 have vanished.

Thanks!

Thomas


Post by hines » Tue May 30, 2006 12:58 pm

You can avoid changing the .h file by hand by using the configuration argument
always_call_mpi_init=yes
(There is no -- before the name since it is just an environment variable.)
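
As a rough sketch of what such a setting changes at compile time: the code below assumes the build ends up defining ALWAYS_CALL_MPI_INIT as 0 or 1 (how configure passes the value along is not shown in this thread). should_call_mpi_init and its saw_p4_arg parameter are hypothetical names used only for illustration.

```c
/* Assumption: the build defines ALWAYS_CALL_MPI_INIT as 0 or 1.
 * Default to 0 here so the sketch compiles on its own. */
#ifndef ALWAYS_CALL_MPI_INIT
#define ALWAYS_CALL_MPI_INIT 0
#endif

/* Hypothetical decision function: returns 1 when MPI_Init should be
 * called. saw_p4_arg is nonzero if a "-p4..." argument was found. */
static int should_call_mpi_init(int saw_p4_arg) {
#if ALWAYS_CALL_MPI_INIT
    (void)saw_p4_arg;        /* the argument check is compiled out */
    return 1;
#else
    return saw_p4_arg != 0;  /* fall back to the "-p4" heuristic */
#endif
}
```

Compiling the sketch with -DALWAYS_CALL_MPI_INIT=1 makes the function return 1 regardless of the arguments, which corresponds to removing the #if !ALWAYS_CALL_MPI_INIT block by hand.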


Neuron & Intel & Myrinet

Post by Thomas Wennekers » Tue May 30, 2006 2:13 pm

Hi again

Just for info:

Successfully compiled Neuron version 5.9 with Intel compilers for Myrinet, but without InterViews.

Lots of warnings, basically related to missing prototypes.

One bug: in oc/ocbbs.cpp I had to replace "nrnmpi_nhost" with "nrnmpi_numprocs". I am not entirely sure whether that just made the bug more subtle or actually fixed it. Nonetheless, compilation ran through afterwards and simple tests succeeded (src/parallel/test0.hoc and the example from the ParallelNetworkManager manpage).

static double nhost(void* v) {
#if defined(HAVE_STL)
    OcBBS* bbs = (OcBBS*)v;
    return double(bbs->nhost());
#else
    // return nrnmpi_nhost;
    return nrnmpi_numprocs;
#endif
}

Config and make logs are here: www.pion.ac.uk/~thomas/open

Regards,
Thomas


Post by hines » Tue May 30, 2006 2:48 pm

We are going to have to straighten that out. That is a piece of code that is never supposed to be compiled because HAVE_STL is required nowadays.
Let's deal with this by email. Please send your src/parallel/bbsconf.h file to michael.hines@yale.edu.
