Optimizing Failover Time in a Serviceguard Environment, June 2007
The time needed for this step is determined by RAC, and the user cannot directly change it.
The process when failover is caused by a package failure
These are the steps in a failover that is caused by a package failure (rather than a node failure).
Figure 3. Steps in a failover caused by package failure—standard Serviceguard implementation
[Diagram phases, in order. Serviceguard component of failover time: resource failure detection, package determination. Application-dependent failover time: resource recovery (VG, FS, IP), application recovery.]
Note: Diagram is not to scale.
The steps in the Serviceguard component of failover are:
• Resource failure detection—Serviceguard notices that a monitored service or resource is down.
• Package determination—Serviceguard decides whether, and where, to restart the packages.
With standard Serviceguard implementations, the steps in the application-dependent component are:
• Resource recovery—Using the Package Control Scripts, Serviceguard makes the resources available
to the packages.
• Application recovery—The applications or processes that were moved to a new node are
restarted.
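The four steps above can be read as a simple additive model: the total failover time is the sum of the Serviceguard component and the application-dependent component. The following Python sketch only illustrates that bookkeeping. The Phase class, the failover_breakdown function, and every duration are hypothetical placeholders, not measurements or Serviceguard interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    seconds: float       # placeholder duration, not a measured value
    serviceguard: bool   # True if the step belongs to the Serviceguard component


def failover_breakdown(phases):
    """Sum the phase durations per component and overall."""
    sg = sum(p.seconds for p in phases if p.serviceguard)
    app = sum(p.seconds for p in phases if not p.serviceguard)
    return {"serviceguard": sg, "application": app, "total": sg + app}


# Steps of a package-failure failover in a standard Serviceguard
# implementation. All durations are invented for illustration only.
standard = [
    Phase("resource failure detection", 2.0, True),
    Phase("package determination", 1.0, True),
    Phase("resource recovery (VG, FS, IP)", 10.0, False),
    Phase("application recovery", 30.0, False),
]

breakdown = failover_breakdown(standard)
print(breakdown)
```

Replacing the placeholder durations with values measured on a real cluster shows which component dominates, and therefore where tuning effort is likely to pay off.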
With Serviceguard Extension for RAC, failure of a RAC package does not trigger any failover action
from Serviceguard. If there is a failure, such as a database instance crash, the Serviceguard cluster
will keep running without re-forming, so there is no Serviceguard component to the failover.
Oracle RAC re-forms a new membership and performs database recovery.
Figure 4. Steps in a failover caused by package failure—Serviceguard Extension for RAC implementation
[Diagram phases, in order, both within the application-dependent failover time: group membership reconfiguration, RAC reconfiguration and database recovery.]
Note: Diagram is not to scale.
With RAC, the two application-dependent steps of failover are different from the steps with standard
Serviceguard implementations. They are:
• Group membership reconfiguration—If there is a change in membership, RAC will start
reconfiguration.