Cluster Configuration System
The Cluster Configuration System (CCS) manages the cluster configuration and provides
configuration information to other cluster components in a Red Hat Cluster. The CCS daemon runs
on each cluster node and ensures that the cluster configuration file on every node is up to date.
When the operator modifies cluster.conf, the local CCS daemon broadcasts the new cluster.conf
file, and the CCS daemons on the other cluster nodes replace their copies with the new one. Also,
when the CCS daemon starts up, it broadcasts its cluster.conf to determine whether it needs to be
replaced by a newer version in use on other nodes. The cluster configuration file
(/etc/cluster/cluster.conf) is an XML file that describes the cluster characteristics and is stored
locally on all nodes in the cluster.
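For illustration, a minimal cluster.conf skeleton for a two-node RHEL5 cluster is sketched below.
The cluster name, node names, and node IDs are placeholders; a real configuration also defines
fence devices, and the config_version attribute must be incremented each time the file is changed
so that CCS can recognize the newer copy.

    <?xml version="1.0"?>
    <cluster name="examplecluster" config_version="1">
      <!-- two_node="1" lets a two-node cluster reach quorum with one vote -->
      <cman two_node="1" expected_votes="1"/>
      <clusternodes>
        <clusternode name="node1" nodeid="1">
          <fence/>  <!-- fence methods omitted in this sketch -->
        </clusternode>
        <clusternode name="node2" nodeid="2">
          <fence/>
        </clusternode>
      </clusternodes>
      <fencedevices/>
    </cluster>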
SAN failure
Loss of the storage system makes data unavailable and causes most services to fail over to an
alternate node. Therefore, all storage systems must have redundant controllers and power
supplies. Multiple paths to shared storage are required so that the loss of storage path
connectivity does not require a failover between nodes but instead causes a failover to the
redundant path on the same node. If a node has only a single path to shared storage, then any
failure in that path may cause all packages relying on that shared storage to fail over to another
node in the cluster.
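On RHEL5, multiple paths to shared storage are commonly managed by device-mapper-multipath.
As a quick sketch (assuming the multipathd service is installed and configured), path redundancy
can be checked from the shell:

    # Verify the multipath daemon is running
    service multipathd status
    # List each multipath device with its active and standby paths
    multipath -ll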
Since Red Hat Cluster does not fence a node when it loses access to storage, another
mechanism must be used to force a package failover when Serviceguard is running. In a
recommended Serviceguard configuration there are dual paths to storage, so two failures are
necessary to lose access to storage. For customers who want their systems to survive that dual
failure, the disk monitor can be used in all the packages on a dual cluster. This ensures that when
a node has failed FC link(s), the packages on that node are moved to other adoptive nodes.
However, if Serviceguard attempts to move a package back to such a failed node before the
failure is fixed, the package will fail again and move on to other adoptive nodes. For more
information, refer to the manual Using High Availability Monitors available at
http://docs.hp.com.
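One way the disk monitor is typically wired in is as a package service in the legacy package
control script. The fragment below is a sketch only: the service name, the monitored device, and
the cmresserviced command path are assumptions to be verified against the Serviceguard for
Linux documentation.

    # Illustrative fragment of a legacy package control script (e.g. pkg.cntl)
    SERVICE_NAME[0]="pkg1_disk_monitor"                 # assumed service name
    SERVICE_CMD[0]="$SGSBIN/cmresserviced /dev/sdc1"    # assumed monitor command and device
    SERVICE_RESTART[0]=""                               # no restart: a path failure halts the package

With SERVICE_RESTART left empty, a monitor failure halts the package on the node with the failed
FC link, allowing it to start on an adoptive node that still has access to storage.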
After the FC links are restored, the failed node must be rebooted in order to restore its GFS
mount points.
Cluster management in a Dual Cluster
Automatic cluster startup at node startup
Red Hat cluster can be configured to start up at boot time by enabling the “cman”, “clvmd”,
and “gfs” services through the chkconfig command, as shown below. With these services enabled,
a node attempts to join the existing Red Hat cluster at boot time and mounts the GFS file system.
Similarly, in a Serviceguard cluster, setting AUTOSTART_CMCLD to 1 causes the “cmcluster”
service to start on a node at boot. By starting the “cmcluster” service, a node automatically joins
the existing Serviceguard cluster at boot time.
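A minimal sketch of enabling automatic startup on each node follows. The location of the file
holding AUTOSTART_CMCLD is an assumption that depends on where Serviceguard is installed
(commonly $SGCONF/cmcluster.conf).

    # Enable the Red Hat cluster services at boot time
    chkconfig cman on
    chkconfig clvmd on
    chkconfig gfs on

    # Enable Serviceguard autostart by setting the following in cmcluster.conf:
    # AUTOSTART_CMCLD=1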
At Serviceguard package startup time, the package control script verifies that the GFS file system
is mounted and, if it is not, mounts it before starting the application. For the GFS file system to be
mounted, the Red Hat cluster services must be up on that node. Hence it is recommended that the
Red Hat cluster services cman, clvmd, and gfs are started at node startup time. This ensures that
the GFS file system is available when Serviceguard packages start.
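As a minimal sketch of that check, the customer-defined run commands in a legacy control script
could test the mount before the application is launched. The logical volume and mount point
names below are placeholders for this example.

    # Hypothetical customer_defined_run_cmds fragment: mount GFS if not already mounted
    GFS_DEV="/dev/vg_gfs/lv_gfs"   # assumed CLVM logical volume holding the GFS file system
    GFS_MNT="/mnt/gfs"             # assumed GFS mount point
    if ! grep -q " ${GFS_MNT} gfs " /proc/mounts; then
        mount -t gfs "${GFS_DEV}" "${GFS_MNT}" || exit 1
    fi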