Fast NEURON on a budget

mattmc88

Fast NEURON on a budget

Post by mattmc88 »

I have about $4000 to spend on a computer (or cluster) to run NEURON. Any suggestions on what to buy to get the fastest performance out of NEURON?

I've been looking at the Mac Pro, a Dell workstation running XP, or an HP workstation running XP or Linux.

Requirements:

1) This is something I'm looking to get (and use) very soon, so I want to make sure NEURON can take advantage of the machine pretty much right away. Can I expect a real speedup from a quad-core, or from a machine with two quad-cores?

2) I'd like to get a machine that is upgradeable, so that I can add more/faster processors/memory over time.

3) 64-bit or 32-bit OS?

4) How much RAM will I want?

5) Which is better: more CPUs, or a higher clock speed per CPU?

6) Is i7 worth the wait? Are Xeon quadcores worth the money?

7) I'd really like to be able to use the GUI!!!

NOTE: I'm mainly going to be running optimization routines on model-to-data fits with single cell models using a simulated annealing routine.

Any and all advice would be GREATLY appreciated!

Thanks!!

Matt
ted
Site Admin
Posts: 5810
Joined: Wed May 18, 2005 4:50 pm
Location: Yale University School of Medicine

Re: Fast NEURON on a budget

Post by ted »

> Mac PRO, a Dell workstation running XP, or an HP workstation running XP or Linux
I'd suggest sticking with whatever OS you're most familiar with. Your clock cycles are more precious than those of any silicon CPU.

The first question is what kind of parallelization you need: execution of multiple simulations at the same time (each processor simulates its own model), or distributed (parallel) execution of individual simulations (each processor simulates a different part of the model). The former is well-suited to optimization problems; the latter is most beneficial when the model cell is large and complex (anatomical detail, mechanisms that are described by kinetic schemes and/or differential equations).

NEURON can handle either type of parallelization. On multicore standalone workstations, NEURON 7.x allows distributed multithreaded execution, which lets you continue to use the GUI and does not require revision of hoc code. NMODL code, however, may need to be changed to make it thread safe.
> quadcore, or a machine with two quadcores
I have not heard of any comparisons between specific platforms or architectures that address this issue. It depends on the technical details of how shared memory is implemented.
> I'd like to get a machine that is upgradeable, so that I can add more/faster processors/memory over time.
My guess is you'll pay through the nose for that. Historically the upgrades for consumer hardware that give the most bang for the buck are adding main RAM, and switching to bigger/faster hard drive(s).
> 64-bit or 32-bit OS?
Under MSWin, NEURON relies on Cygwin. As I understand it, there is no 64-bit Cygwin yet, so forget about 64-bit MSWin. I don't know about OS X. 64-bit Linux is available, and is a good bet if you're already a Linux user and plan to use the machine principally for simulations.
> How much RAM will I want?
As much as you can get. Always.
> Which is better, more CPUs, or more clockspeed/CPU?
Depends on whether you need distributed execution or not. More CPUs help with distributed execution of large models.
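A rough way to put numbers on this trade-off is Amdahl's law, which is general and not specific to NEURON: if a fraction p of the run time parallelizes perfectly over n cores, the ideal speedup is 1 / ((1 - p) + p / n). The sketch below compares a machine with more cores against one with fewer but faster cores. The function names, the parallel fractions, and the clock ratio are made-up illustration values, not measurements of any real model or CPU.

```python
def amdahl_speedup(parallel_fraction: float, n_cores: int) -> float:
    """Ideal speedup when `parallel_fraction` of the work scales over `n_cores`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)


def relative_throughput(parallel_fraction, n_cores, clock_ratio=1.0):
    """Speedup relative to one core at the baseline clock (toy model, assumes
    run time scales inversely with clock speed)."""
    return clock_ratio * amdahl_speedup(parallel_fraction, n_cores)


if __name__ == "__main__":
    # Hypothetical comparison: 8 baseline cores vs. 4 cores clocked 20% faster.
    for p in (0.5, 0.9, 0.99):
        many = relative_throughput(p, n_cores=8, clock_ratio=1.0)
        fast = relative_throughput(p, n_cores=4, clock_ratio=1.2)
        print(f"p={p:.2f}: 8 cores -> {many:.2f}x, 4 faster cores -> {fast:.2f}x")
```

In this toy model the faster clock wins when only half of the work parallelizes, and the extra cores win once most of it does. Real results also depend on memory bandwidth and cache behavior, which the formula ignores.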
> Is i7 worth the wait?
What speedup does it promise, and at what price? The first release of any new hardware commands a premium that is readily paid by those with big needs and big budgets (think commercial applications in engineering and finance), but is likely to have serious bugs that are discovered later.
> Are Xeon quadcores worth the money?
Maybe somebody who has used one can tell you.
mattmc88

Re: Fast NEURON on a budget

Post by mattmc88 »

Thanks Ted. I'm always impressed with the speed, detail, and expertise of your replies.

As for the type of simulation I am doing: multi-compartment models derived from Neurolucida reconstructions, with "passive" biophysics. I do plan to add active conductances with kinetic schemes at a later point. So I guess my modeling falls into your "latter" category. Still, it seems to me that with the simulated annealing optimization routine it should be possible to run multiple simulations at the same time (i.e., within each temperature step) while the routine explores the space. I don't know if this is possible with Andrew Davison's simulated annealing routine (the one I am using). Do you?

As far as the "technical details of how shared memory is implemented," what should I look for, and what should I avoid?

So 64-bit Linux will perform simulations faster than 32-bit Linux?

As I understand it, a big change is that i7 gets rid of the bus interfacing the RAM with the processor. Wikipedia says this:

________
FSB is replaced by QuickPath interface. Motherboards must use a chipset that supports QuickPath. As of November 2008, only the Intel X58 does this.
On-processor memory controller: the memory is directly connected to the processor.
Three channel memory: each channel can support one or two DDR3 DIMMs. Motherboards for Core i7 have four (3+1) or six DIMM slots instead of two or four, and DIMMs should be installed in sets of three, not two.
Support for DDR3 only.
"Turbo Boost" technology allows the cores to intelligently "overclock" themselves to 133 MHz or 266 MHz over the design clock speed so long as the CPU's thermal requirements are still met.
Single-die device: all four cores, the memory controller, and all cache are on a single die.
Re-implemented Hyper-threading. Each of the four cores can process two threads simultaneously, so the processor appears to the OS as eight CPUs. This feature was present in the older Netburst architecture but was dropped in Core.
On-die, shared, inclusive 8MB L3 cache.
Only one QuickPath interface: not intended for multi-processor motherboards.
45nm process technology.
731M transistors.
Sophisticated power management can place unused cores in a zero-power mode.
"Turbo mode" can increase the speed of one or two cores by 400 MHz when the other cores are turned off.
________


Can 7.x utilize combined multithreaded, multicore, and multiple CPU execution? For example, a computer with two quadcore processors, or even two of these i7 processors? What I am envisioning is a single machine with 2-4 dual- or quad-core processors. That way, each multithreaded multicore processor could run a multi-compartment simulation, quickly, and multiple simulations could be run at a time (one on each processor).

Unfortunately, I can't afford to "experiment" with this. I really need it to work.

Thanks again!!

Matt
ted

Re: Fast NEURON on a budget

Post by ted »

> I don't know if this is possible with Andrew Davison's sim. anneal. routine
If you want each processor to execute its own simulation, you will have to recode his optimizer to take advantage of the Linda-style master-worker parallelization that is supported by NEURON's ParallelContext class--see http://www.neuron.yale.edu/neuron/stati ... arcon.html
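The master-worker idea can be sketched without NEURON. The example below is a generic illustration built on Python's standard library. It is not ParallelContext and it is not Andrew Davison's routine. At each temperature step the master proposes several candidate parameter sets, workers evaluate them concurrently, and the master applies the Metropolis acceptance rule to the best candidate. This is a simple greedy variant rather than a statistically exact parallel annealer. The names `toy_cost`, `anneal_step`, `anneal`, and every numeric setting are hypothetical. A thread pool keeps the sketch self-contained; real simulations are CPU-bound, so an actual speedup would need separate processes or NEURON's own parallel facilities.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def toy_cost(params):
    """Stand-in for a model-to-data error; a real one would run a simulation."""
    return sum((p - 1.0) ** 2 for p in params)


def anneal_step(current, current_cost, temperature, cost, pool, rng,
                n_candidates=4, step_size=0.1):
    """One temperature step: evaluate several proposals at once, keep at most one."""
    candidates = [
        [p + rng.gauss(0.0, step_size) for p in current]
        for _ in range(n_candidates)
    ]
    # Workers evaluate the candidates concurrently; map preserves input order.
    costs = list(pool.map(cost, candidates))
    best_cost, best = min(zip(costs, candidates), key=lambda pair: pair[0])
    # Metropolis rule: always accept improvements, sometimes accept worse moves.
    delta = best_cost - current_cost
    if delta <= 0 or rng.random() < math.exp(-delta / temperature):
        return best, best_cost
    return current, current_cost


def anneal(start, cost, n_steps=200, t0=1.0, cooling=0.97, n_workers=4, seed=0):
    """Geometric cooling schedule driven by a single master loop."""
    rng = random.Random(seed)
    params, params_cost = list(start), cost(start)
    temperature = t0
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for _ in range(n_steps):
            params, params_cost = anneal_step(
                params, params_cost, temperature, cost, pool, rng,
                n_candidates=n_workers)
            temperature *= cooling
    return params, params_cost


if __name__ == "__main__":
    fitted, error = anneal([0.0, 0.0, 0.0], toy_cost)
    print(fitted, error)
```

All random proposals are drawn in the master, so a given seed reproduces the same run no matter how the workers are scheduled. The cost function is the only part that would need to call into a simulator.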

Multithreaded execution on shared memory architecture hardware distributes each simulation over multiple processors. It requires no changes to source code (other than to ensure that NMODL code is thread safe--see http://www.neuron.yale.edu/neuron/stati ... ml#Threads).
> 64-bit linux will perform simulations faster than 32-bit linux?
64-bit can accommodate larger models. One user reports faster execution (viewtopic.php?f=5&t=1434), but without further details about the nature of the models or the simulations it is difficult to know what to make of that.
> Can 7.x utilize combined multithreaded, multicore, and multiple CPU execution?
Yes. Multiple threads can be used even on single processor machines (although they will execute serially; any speedup will be from cache optimization).
> For example, a computer with two quadcore processors
Yes, but an important question is whether each group of 4 CPUs can only share memory among themselves, or whether all 8 CPUs share a common memory--a detail that should be discoverable from the CPU maker's documentation. If the former, each group of 4 CPUs will handle a separate cell; if the latter, all 8 CPUs will handle a single cell.

You may want to read this:
Hines, M.L., Markram, H. and Schuermann, F.
Fully implicit parallel simulation of single neurons.
Journal of Computational Neuroscience
(preprint available from http://www.neuron.yale.edu/neuron/bib/nrnpubs.html)
mattmc88

Re: Fast NEURON on a budget

Post by mattmc88 »

With Core i7-based systems commercially released last week, I thought I would reopen the question: what system, on the market right now, gives the most bang for the buck running NEURON simulations? Is anyone else who is looking at buying a workstation/PC at the moment able to offer suggestions? I've seen some pretty exciting benchmark results with the i7 processor. Has anyone tried running NEURON on an i7-based machine?

Thanks!

Matt