I’m back from a 3-day training course on Linux clusters, which was pretty exciting. Here are the main topics that were covered:
- Virtualization with Xen
- Sharing data with GFS / GNBD
- Clusters with RedHat Cluster Suite
- Load Balancing with Linux Virtual Server (LVS)
Xen
Xen is virtualization software, pretty similar to VMware. Since each student had only a single workstation, we used it to simulate multiple servers for load balancing and high availability.
Live migration was demonstrated (moving a running VM from one physical machine to another without disrupting service), which is quite impressive (yes, VMware ESX does this too, where it is known as VMotion).
GFS / GNBD
GFS is a filesystem designed for use in a cluster environment. Basically, a GNBD (Global Network Block Device) server exports its storage (disk, partition, logical volume, whatever) over the network as a block device (just like iSCSI), and that block device holds a filesystem. The filesystem could be pretty much anything, but if you plan to mount it on multiple servers concurrently, ext3 won’t do (you’ll get corrupted data and metadata), because it assumes a single host has exclusive access to the device. GFS, on the other hand, is specifically designed for concurrent access, which makes it suitable for a cluster environment.
A GNBD block device exported over the network is then attached to multiple servers, each of which mounts the GFS filesystem locally.
Performance is much higher than with NFS (since the device is seen as a local block device, each client can manage its I/O cache efficiently).
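
To make the corruption problem more concrete, here is a toy Python model of two nodes allocating blocks on the same shared device. It is only a sketch: `SharedDisk`, `Node` and both allocation methods are names made up for illustration, and `threading.Lock` merely stands in for the cluster-wide lock manager a filesystem like GFS relies on. It does not reflect how GFS or ext3 are actually implemented.

```python
import threading


class SharedDisk:
    """Stands in for the exported block device: holds the on-disk allocation bitmap."""

    def __init__(self, n_blocks):
        self.bitmap = [False] * n_blocks


class Node:
    """A server that has attached the shared device (hypothetical model)."""

    def __init__(self, name, disk):
        self.name = name
        self.disk = disk
        # A non-cluster filesystem reads metadata once and caches it privately.
        self.cached_bitmap = list(disk.bitmap)

    def allocate_uncoordinated(self):
        """Allocate from the private cache, unaware of what other nodes did."""
        block = self.cached_bitmap.index(False)
        self.cached_bitmap[block] = True
        self.disk.bitmap[block] = True
        return block

    def allocate_coordinated(self, lock):
        """Take the cluster-wide lock, then read the current bitmap before allocating."""
        with lock:
            block = self.disk.bitmap.index(False)
            self.disk.bitmap[block] = True
            return block


def demo_uncoordinated():
    disk = SharedDisk(8)
    a, b = Node("node-a", disk), Node("node-b", disk)
    # Both nodes believe the first block is free and hand it out.
    return a.allocate_uncoordinated(), b.allocate_uncoordinated()


def demo_coordinated():
    disk = SharedDisk(8)
    a, b = Node("node-a", disk), Node("node-b", disk)
    lock = threading.Lock()  # assumption: stand-in for a distributed lock manager
    return a.allocate_coordinated(lock), b.allocate_coordinated(lock)


if __name__ == "__main__":
    print("uncoordinated:", demo_uncoordinated())
    print("coordinated:  ", demo_coordinated())
```

In the uncoordinated case both nodes end up writing to the same block, which is exactly the kind of collision that destroys data and metadata; in the coordinated case each node gets its own block.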
RedHat Cluster Suite
RedHat Cluster Suite is a set of tools for implementing high availability. Basically, you define your services, your cluster nodes, and what to do should a failure happen.
Cluster Suite will then try to restore service on its own (for example by starting the failed service on another node).
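
The idea of "define services, define nodes, decide what happens on failure" can be sketched in a few lines of Python. This is a simplified illustration, not Cluster Suite's configuration format or logic: `Service`, `failover_order`, `place` and `node_failed` are hypothetical names, and the policy (try nodes in a fixed order) is just one possible choice.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Service:
    name: str
    failover_order: List[str]          # hypothetical policy: nodes to try, in order
    running_on: Optional[str] = None


def place(service: Service, alive: Dict[str, bool]) -> Optional[str]:
    """Start the service on the first live node of its failover order."""
    for node in service.failover_order:
        if alive.get(node, False):
            service.running_on = node
            return node
    service.running_on = None          # no live node left: service stays down
    return None


def node_failed(node: str, alive: Dict[str, bool],
                services: List[Service]) -> Dict[str, Optional[str]]:
    """Mark a node as dead and relocate the services that were running on it."""
    alive[node] = False
    moved = {}
    for svc in services:
        if svc.running_on == node:
            moved[svc.name] = place(svc, alive)
    return moved


def demo():
    alive = {"node1": True, "node2": True, "node3": True}
    services = [
        Service("httpd", ["node1", "node2", "node3"]),
        Service("mysql", ["node2", "node3"]),
    ]
    initial = {svc.name: place(svc, alive) for svc in services}
    after_node1 = node_failed("node1", alive, services)
    after_node2 = node_failed("node2", alive, services)
    return initial, after_node1, after_node2


if __name__ == "__main__":
    for step in demo():
        print(step)
```

A real cluster manager has much more to deal with (detecting the failure reliably, making sure the failed node has really stopped touching shared storage, and so on), so take this only as a picture of the relocation step.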
Linux Virtual Server
LVS is what you need to implement load balancing. LVS is the basic networking layer for this (let’s say it manages the connection tables). You can use the CLI tools to set the parameters (add/remove nodes, set their weights and so on), but you want this to happen automatically.
This is where you’ll need another set of tools such as Piranha. It provides a web interface to define services, nodes, which algorithm to use to distribute load among nodes, how to check the health of services and so on, as well as a daemon which continuously checks the health of the real servers. It adjusts the LVS parameters on the fly, and LVS then distributes incoming connections to the nodes of your cluster accordingly.
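
Here is a rough Python sketch of how the two layers fit together: a scheduler that distributes connections according to weights, and a health checker that changes those weights behind its back. The scheduler follows the classic weighted round-robin scheme (one of the algorithms LVS offers); everything else is an assumption for illustration, including the names `WeightedRoundRobin` and `apply_health_checks`, the server names, and the `is_healthy` callback, which in real life would be something like a TCP connect or an HTTP request.

```python
from functools import reduce
from math import gcd


class WeightedRoundRobin:
    """Pick servers in proportion to their weights (classic weighted round-robin)."""

    def __init__(self, weights):
        self.weights = weights   # shared dict: the health checker edits it in place
        self.i = -1              # index of the last server considered
        self.cw = 0              # current weight threshold

    def next_server(self):
        names = list(self.weights)
        values = [self.weights[n] for n in names]
        if not names or max(values) == 0:
            return None          # no server is able to take connections
        step = reduce(gcd, values)
        while True:
            self.i = (self.i + 1) % len(names)
            if self.i == 0:
                self.cw -= step
                if self.cw <= 0:
                    self.cw = max(values)
            if values[self.i] >= self.cw:
                return names[self.i]


def apply_health_checks(weights, configured, is_healthy):
    """Give failed servers weight 0; restore the configured weight otherwise."""
    for name, weight in configured.items():
        weights[name] = weight if is_healthy(name) else 0


def demo():
    configured = {"web1": 4, "web2": 3, "web3": 2}   # hypothetical real servers
    weights = dict(configured)
    lb = WeightedRoundRobin(weights)
    before = [lb.next_server() for _ in range(9)]

    down = {"web2"}                                  # pretend web2 stops answering
    apply_health_checks(weights, configured, lambda name: name not in down)
    after = [lb.next_server() for _ in range(6)]
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print("all healthy:", before)
    print("web2 down:  ", after)
```

The point of the design is that the scheduler never checks anything itself: it only reads the weights, and the checker only writes them, which is roughly the split between LVS and the tools that drive it.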
These are some of the basic tools you can use to build highly available or load-balanced infrastructures. Their features are very close to those of commercial products, which makes them a credible alternative.
One thing I regret: LVS/Piranha and Cluster Suite are two distinct worlds, so you basically do most of the work twice (defining services, servers, how to check health). It would be really nice to have a single set of tools to manage both high availability and load balancing!