Linux Clusters


Akash Rastogi
11-09-2007, 00:23
Anyone have ideas for running Linux distros on a cluster? My friend and I just wanted to set one up for fun, but what should I install on it? I was thinking Debian or Gentoo. :confused: Any suggestions? Thanks.

11Mort11
12-12-2007, 17:14
maybe you can install it on a computer

cbhl
12-12-2007, 21:31
Heh, clustering is always fun -- even as proof-of-concept.

What kind of cluster are you planning on doing?

The choice of distribution often depends on how familiar you are with various software/packaging mentalities (rpm, deb, emerge, etc.), how willing you are to tinker with the innards (i.e. source code, compile flags, etc.), and how much time you have.

High-Performance Computing (HPC) clusters often come in two flavours: openMosix/Mosix (for Linux 2.4, most of the code was kept in the kernel) and Beowulf/MPI/etc. (where the clustering is usually done per-application).

openMosix/Mosix code for Linux 2.6 is not fully functional, last I checked, so I wouldn't recommend its use (and AFAIK the 2.4 code is no longer easily found on newer distributions; only on the occasional outdated Knoppix variant).

As for Beowulf/MPI/etc. clusters, those require the "clustering" to be mostly application-level, which means they're only useful for select tasks, e.g. 3D render farms.
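To make "application-level" a bit more concrete: the application itself has to decide how the work gets divided between nodes. Here is a minimal Python sketch of that idea for a render farm. It is not taken from any particular clustering package; the function name `split_frames` and the frame numbers are made up for illustration.

```python
def split_frames(first, last, workers):
    """Split the inclusive frame range [first, last] into contiguous
    chunks, at most one chunk per worker node.

    Hypothetical helper for illustration only: a real MPI or render-farm
    setup would hand these chunks to its own job scheduler.
    """
    total = last - first + 1
    if total <= 0 or workers <= 0:
        return []
    # Every worker gets `base` frames; the first `extra` workers get one more.
    base, extra = divmod(total, workers)
    chunks = []
    start = first
    for i in range(workers):
        size = base + (1 if i < extra else 0)
        if size == 0:
            break  # more workers than frames: the rest stay idle
        chunks.append((start, start + size - 1))
        start += size
    return chunks


if __name__ == "__main__":
    # Example: frames 1..10 spread over 3 nodes.
    for node, (lo, hi) in enumerate(split_frames(1, 10, 3)):
        print(f"node {node}: frames {lo}-{hi}")
```

Each node then renders only its own chunk, which is why this style of clustering only pays off for tasks that split cleanly into independent pieces.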

High Availability (HA) clusters are used where machines need to take over for each other when one fails (e.g. two web servers; one starts serving requests when the other goes down) -- tools like heartbeat are useful for this. This kind of cluster seems to have the most actively maintained code, if only because threading on SMP (a.k.a. dual-/tri-/quad-core) machines is slowly starting to replace the standard "cluster over network"/message-passing mentality found in the other clustering methods.
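The failover idea can be sketched in a few lines of Python. To be clear, this is not how the heartbeat tool itself is implemented or configured; it only illustrates the general principle (each node reports in periodically, and the highest-priority node that has reported recently is the one that serves). The class name `HeartbeatMonitor`, the node names, and the 5-second timeout are all assumptions for the example.

```python
import time


class HeartbeatMonitor:
    """Track the last heartbeat from each node and pick the active one.

    Illustrative sketch only; real HA software also has to deal with
    split-brain, fencing, and moving IP addresses between machines.
    """

    def __init__(self, nodes, timeout=5.0, clock=time.monotonic):
        self.nodes = list(nodes)   # listed in priority order
        self.timeout = timeout     # seconds of silence before a node counts as down
        self.clock = clock         # injectable so the logic can be tested
        self.last_seen = {}

    def beat(self, node):
        """Record that `node` just reported in."""
        self.last_seen[node] = self.clock()

    def is_alive(self, node):
        seen = self.last_seen.get(node)
        return seen is not None and self.clock() - seen <= self.timeout

    def active_node(self):
        """Return the highest-priority live node, or None if all are down."""
        for node in self.nodes:
            if self.is_alive(node):
                return node
        return None


if __name__ == "__main__":
    monitor = HeartbeatMonitor(["web1", "web2"])
    monitor.beat("web1")
    monitor.beat("web2")
    print("serving from:", monitor.active_node())
```

If web1 stops calling `beat()` for longer than the timeout while web2 keeps reporting, `active_node()` starts returning web2, which is the "one takes over when the other goes down" behaviour described above.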

In terms of distributions, I've heard many good things about ROCKS (http://www.rocksclusters.org/) which is based on Red Hat Linux -- apparently it makes clustering relatively easy. (I've never tried it in its entirety personally, but I have used some of the software which is used within the distribution.)

Debian is interesting because it has a very conservative release strategy -- packages in stable may be months (if not years) old. However, the ease of use of "apt-get" and related graphical tools (e.g. synaptic and adept) makes it an attractive choice. Configuring X requires some work on many machines, though. Clustering, as far as I know, isn't part of any default package set, so you'll need to do some work finding a clustering system (you'll likely need third-party packages, programs compiled from source, a kernel recompile, or a combination of these). Various HOWTOs can be found via Google, although some of the ones I saw were quite dated (e.g. 2005).

Gentoo recommends you build everything from source. This will take time and a significant amount of disk space (compared to e.g. a direct binary installation provided by Debian), and if done wrong can lead to a lot of cryptic messages and erratic behaviour. Done right, however, it results in a very speedy and customized installation. Again, various HOWTOs can be found by a Google search. (Note: I have not personally used Gentoo before. I have heard many very good things, and equally many very bad things about it -- YMMV (your mileage may vary).)

Disclaimer: I personally haven't used Debian since I switched the last of my machines to Ubuntu earlier this year. I have never used Gentoo, but I personally know people who have and I've done some reading on it. The last time I tried to do "clustering", I used a combination of HTTP, XML-RPC, a custom-written Perl server, and a Python client to render frames in Blender.
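For anyone curious what an XML-RPC render queue along those lines might look like, here is a rough sketch using only Python's standard library (`xmlrpc.server` and `xmlrpc.client`). It is not the setup described above (that one used a Perl server); the class `FrameQueue`, the functions `make_server` and `run_worker`, the port, and the `render` callable are all hypothetical names chosen for the example.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer


class FrameQueue:
    """Hands out frame numbers to workers, one at a time."""

    def __init__(self, first, last):
        self._pending = list(range(first, last + 1))
        self._done = set()
        # SimpleXMLRPCServer handles one request at a time, so the lock
        # only matters if the server is swapped for a threaded one.
        self._lock = threading.Lock()

    def next_frame(self):
        """Return the next frame to render, or -1 when none are left."""
        with self._lock:
            return self._pending.pop(0) if self._pending else -1

    def mark_done(self, frame):
        """Record a finished frame; return how many are finished so far."""
        with self._lock:
            self._done.add(frame)
            return len(self._done)


def make_server(queue, host="127.0.0.1", port=8000):
    """Expose the queue's public methods over XML-RPC (assumed host/port)."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_instance(queue)
    return server


def run_worker(url, render):
    """Pull frames from the server at `url` until it runs out.

    `render` is a caller-supplied function taking a frame number; what it
    actually does (e.g. launching a renderer) is up to the caller.
    """
    proxy = xmlrpc.client.ServerProxy(url)
    rendered = []
    while True:
        frame = proxy.next_frame()
        if frame < 0:
            break
        render(frame)
        proxy.mark_done(frame)
        rendered.append(frame)
    return rendered
```

On the machine holding the queue you would call `make_server(FrameQueue(1, 250)).serve_forever()`, and on each render node something like `run_worker("http://<server-address>:8000", my_render_function)`. Because workers pull frames rather than being assigned them, faster machines naturally end up rendering more of the job.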

Akash Rastogi
13-12-2007, 10:09
11Mort11 wrote: "maybe you can install it on a computer"

lol, I know it can be installed on 1 computer, but that defeats the purpose of a cluster now, doesn't it?

EDIT: Thanks for all your info Michael. I think I'm going to keep Debian in mind but also keep looking at other distros.

AustinSchuh
13-12-2007, 13:56
cbhl wrote: "Disclaimer: I personally haven't used Debian since I switched the last of my machines to Ubuntu earlier this year."

I currently use Debian, and besides the occasional application that is way out of date, I love it. (I solve this for most peripheral applications by downloading the source package from testing and compiling it for stable.)

I have not done any cluster stuff using Debian though. I am interested in hearing about what you end up doing and how it goes. Keep us posted.

cbhl
13-12-2007, 15:49
Akash Rastogi wrote: "EDIT: Thanks for all your info Michael. I think I'm going to keep Debian in mind but also keep looking at other distros."

Good luck. I used Debian for a few years and loved it. The only reason I switched to Ubuntu was because I suddenly came across maybe three or four computers, each with different monitors, and I was getting kind of tired of configuring X individually on each one. :D ;)