HP Matrix Operating Environment 7.
Legal Notices © Copyright 2005-2012 Hewlett-Packard Development Company, L.P. Confidential computer software. Valid license from HP required for possession, use, or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor’s standard commercial license. The information contained herein is subject to change without notice.
Contents
Preface 5
Publishing history 5
1 Overview 6
2 Installing and configuring Matrix recovery management 8
Installation and configuration overview
5 Issues, limitations and suggested actions 41
Limitations 41
No integrated support for 3PAR remote copy in asynchronous mode 41
No automatic synchronization of configuration between sites
Preface
The HP Matrix Operating Environment 7.0 Recovery Management User Guide contains information on installation, configuration, testing, and troubleshooting HP Matrix Operating Environment recovery management (Matrix recovery management).
Publishing history
The latest publication date and part number indicate the current edition.
Table 1 Publishing History
Publication Date | Part Number | Edition | Changes
February 2012 | 5900–2035 | 1 | This is the initial release of the HP Matrix Operating Environment 7.
1 Overview
Matrix recovery management is a component of the HP Matrix Operating Environment. Matrix recovery management provides disaster recovery protection for logical servers configured and managed by HP Matrix OE visualization. Logical servers that are included in a Matrix recovery management configuration are referred to as DR Protected logical servers. A DR Protected logical server can run on a physical machine (c-Class blade) or on a virtual machine hosted by a hypervisor.
• Supports HP P9000 Continuous Access Software (formerly known as HP StorageWorks Continuous Access XP) storage replication in synchronous, asynchronous, and asynchronous journal mode. • Supports HP 3PAR remote copy in synchronous mode. • Supports integration with the remote failover features of user defined storage adapters for storage types other than HP P6000, HP P9000, or HP 3PAR.
2 Installing and configuring Matrix recovery management
Installation and configuration overview
The following overview of Matrix recovery management installation and configuration includes links to information on each step in the process.
1. Confirm that all Matrix recovery management installation and configuration prerequisites have been met. See “Installation and configuration prerequisites” (page 8) for more information.
2.
Networking setup It is assumed that networking links are present between the Local Site and the Remote Site. You can use Matrix recovery management in a variety of networking configurations, but it is important that you take note of the following Matrix recovery management networking configuration parameters: • Matrix recovery management assumes that the Local and Remote Sites operate in a mode with DR protected workloads running simultaneously at both sites.
feature. For example, on the Local Site CMS, exclude addresses from 00-21-5A-9B-00-00 to 00-21-5A-9B-FF-FF, and on the Remote Site CMS exclude addresses from 00-21-5A-9C-00-00 to 00-21-5A-9C-FF-FF. Storage setup Matrix recovery management depends on storage array replication to enable failover of logical servers. It is assumed that storage replication links are present between the Local Site and the Remote Site.
• Storage Replication Group name given for the boot and data LUNs of the logical servers that will be part of the same Recovery Group, for example, the HP P6000 DR Group name. NOTE: HP P6000, HP P9000, and User Defined Storage Replication Groups must use the same Storage Replication Group name at the Local Site and the Remote Site. NOTE: HP 3PAR remote copy Storage Replication Groups will have different names at the Local Site and the Remote Site.
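The two naming notes above can be turned into a quick pre-configuration check. The following is a minimal Python sketch, not part of Matrix recovery management: the storage type labels ("P6000", "P9000", "USER_DEFINED", "3PAR") and the function name are hypothetical, and the check only compares the names you supply; it does not query any array.

```python
# Hypothetical labels for the storage types named in the notes above.
SAME_NAME_TYPES = {"P6000", "P9000", "USER_DEFINED"}
DIFFERENT_NAME_TYPES = {"3PAR"}


def check_srg_names(storage_type, local_name, remote_name):
    """Return a list of problems; an empty list means the names follow the notes."""
    if not local_name or not remote_name:
        return ["a Storage Replication Group name is required at both sites"]
    if storage_type in SAME_NAME_TYPES:
        # P6000, P9000, and User Defined groups must use the same name at both sites.
        if local_name != remote_name:
            return ["%s groups must use the same name at both sites" % storage_type]
        return []
    if storage_type in DIFFERENT_NAME_TYPES:
        # 3PAR remote copy groups are expected to have different names at each site.
        if local_name == remote_name:
            return ["3PAR remote copy groups are expected to differ between sites"]
        return []
    return ["unknown storage type: %s" % storage_type]
```

A call such as check_srg_names("P6000", "DRGroup1", "DRGroup1") returns an empty list, while mismatched P6000 or P9000 names return one problem string.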
on HP P9000 RAID Manager Software to manage P9000 storage replication. HP P9000 RAID Manager Software instances and configuration files must be configured to manage various device groups that are configured in Matrix recovery management. For more information, refer to:
• HP P9000 Cluster Extension Software documentation available at: http://h20000.www2.hp.com - click on Manuals, then go to Storage –> Storage Software –> Storage Replication Software –> HP Cluster Extension Software.
Creating and installing a User Defined storage adapter Matrix recovery management provides a User Defined storage adapter interface specification to enable one-step Matrix recovery management failover capability for storage types that are supported by Matrix OE, but not yet integrated with Matrix recovery management.
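The adapter commands specified in this section are invoked with key=value arguments (for example, validatesms.cmd sms_name=EMC_SMS1 sms_username=admin). As an illustration only, the Python sketch below shows one way an adapter implementation might parse such arguments before doing its own storage-specific work. The function names are hypothetical, the full argument list for each command is not reproduced here, and nothing in this sketch is taken from the interface specification itself.

```python
def parse_adapter_args(argv):
    """Split key=value tokens into a dict; reject tokens that are not key=value."""
    args = {}
    for token in argv:
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError("expected key=value, got %r" % token)
        args[key] = value
    return args


def missing_keys(args, required):
    """Return the required keys that were not supplied, in the order given."""
    return [key for key in required if key not in args]
```

For the example invocation above, parse_adapter_args(["sms_name=EMC_SMS1", "sms_username=admin"]) yields a dictionary with the keys sms_name and sms_username. How an adapter reports success or failure back to Matrix recovery management is defined by the interface specification, not by this sketch.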
User Defined storage adapter interface specification
The following three commands are defined in the User Defined storage adapter interface specification:
• validatesms.cmd: validates a Storage Management Server during configuration.
• validatesrg.cmd: validates a Storage Replication Group during configuration.
• failoversrg.cmd: fails over a Storage Replication Group during Recovery Group activation.
Command line arguments:
• For validatesms.
Example invocations of Matrix recovery management User Defined adapter implementation
• During Storage Management Server configuration in Matrix recovery management:
/STORAGE/EMC/validatesms.cmd sms_name=EMC_SMS1 sms_username=admin
• During Storage Replication Group configuration in Matrix recovery management:
/STORAGE/EMC/validatesrg.
NOTE: You cannot change the datastore of a VM hosted logical server while it is being managed by Matrix recovery management. To change the datastore, first remove the VM hosted logical server from the Matrix recovery management configuration, then use the Logical Servers Activate operation in the Tools menu of the Visualization tab to change the datastore.
NOTE: Do not activate the recovery logical servers at this time. During the Matrix recovery management configuration process, the recovery logical servers are further configured by the configuration import at the Remote Site, and there will be an opportunity to activate and deactivate them at that time.
• Recovery Groups: Create or import Recovery Groups, edit or delete existing Recovery Groups, view Recovery Group configuration details.
• Jobs: Monitor Job progress, cancel Jobs in progress, delete completed Jobs, restart failed Jobs, view Job and Sub Job logs.
The Matrix recovery management online help system and tooltips provide answers to questions you may have while using the graphical user interface.
5. From the Sites tab, create an export file at the Local Site. For information on export and import parameters, see “About Matrix recovery management export and import” (page 19).
6. From the Sites tab at the Remote Site, import the Local Site Matrix recovery management configuration. For more information, see “About Matrix recovery management export and import” (page 19).
7. Test the recovery logical servers.
• Storage Replication Group information associated with activated Recovery Groups in the exportconfig file is imported, if the importing site has no Storage Replication Group configuration. • If the importing site already contains Storage Replication Group information: ◦ If the Storage Replication Group is included in another Recovery Group that belongs to the same Recovery Group Set, the import will be allowed.
3 Testing and failover operations This chapter contains sections on testing Recovery Groups, planned failovers and unplanned failovers using the Activate... and Deactivate... operations. Testing Recovery Groups There are two ways to test Recovery Groups: • Test individual Recovery Groups using Maintenance Mode • Perform a planned failover to test all Recovery Groups (see “Planned failover” (page 22) for more information) This section focuses on testing individual Recovery Groups using Maintenance Mode.
6. At the Remote Site, take the Recovery Group out of Maintenance Mode using the Disable Maintenance Mode button in the Matrix recovery management Recovery Groups tab.
7. At the Local Site, repeat the storage failover, rescan, and refresh (if needed) sequences, then activate the logical servers in the Recovery Group using Matrix OE visualization.
NOTE: A successful activate or deactivate operation ensures that all of the Recovery Groups within a Recovery Group Set are in the same state (enabled or disabled). However, certain operations (for example, a Recovery Group edit to change site preference) may result in some Recovery Groups within a Recovery Group Set being enabled and others being disabled.
NOTE: Matrix recovery management is able to prevent split-brain from occurring during an unplanned failover, by regulating the auto-power configuration of managed nodes (whether virtual or physical) that are assigned to DR Protected logical servers so that they do not automatically power up after an outage. If, for example, a site loses power and site failover is invoked, the site where the power outage occurred will not resume running the DR Protected logical servers when power is restored.
4 Dynamic workload movement with CloudSystem Matrix: Fluid movement between physical and virtual resources for flexibility and cost-effective recovery The HP Matrix Operating Environment facilitates the fluid movement of workloads between dissimilar servers within a site and across sites. Workloads can be moved between physical servers and virtual machines and between dissimilar physical servers.
hosted by a smaller set of servers. Recovery time objectives require that the moves be achieved quickly and automatically, as in the previous example. Capabilities and limitations Using the tools and procedures described in this chapter you can: • Configure and manage a logical server that can perform physical to virtual cross-technology movements within the datacenter. • Configure and manage a DR-protected logical server that can be failed over across data centers in a cross-technology movement.
• The network name used by an ESX Host must match the network name used in the Virtual Connect Enterprise Manager (VCEM) configuration, as displayed in the following screens:
• When moving a logical server between physical and virtual servers within a site, the following server IDs are not preserved:
◦ Network MAC addresses
◦ Server/Initiator WWNs (on the virtual machine, the storage adapter is a virtual SCSI controller)
◦ Logical Serial Number
◦ Logical UUID
• In a DR configuration, the site that you configure first must have both physical servers and virtual machine hosts available so that a logical server can be configured to run on both types of servers and tested.
The procedures for enabling movement across different physical servers documented in this chapter are supported for managed systems specified as supported by the HP Matrix Operating Environment, with the following restriction:
• Matrix recovery management, the component of the HP Matrix Operating Environment that provides disaster recovery across sites, does not support Integrity managed nodes.
3. Configure the logical server for activation on both physical and VM host targets Modify the logical server configuration as follows: • In the Create logical server: identity screen, set the portability group of the logical server to the portability group created in Step 2 above. • In the Create logical server: storage screen, select a VM data store (this data store will be used to store VM configuration information).
b. At the Local Site, create a Recovery Group containing the logical server. Export the Matrix recovery management configuration to a file.
c. Deactivate the logical server at the Local Site and fail over the array Replication Group to the Remote Site. Perform VM host rescan and Matrix OE visualization refresh procedures to ensure that the VM configuration data store is accessible to the logical server configuration. Create a portability group as you did at the Local Site.
Configuring logical servers for movement between dissimilar physical servers
The HP Matrix Operating Environment provides the ability to fine-tune the list of failover targets that are considered most suitable for activating a DR Protected logical server. The ability to modify target attributes is useful to ensure a successful failover.
The command line interface for PISA is described below. The options are mutually exclusive. PISA will run on supported versions of Windows only, and requires that the user be a member of the Administrator user group.
Usage: hppisa
    -h, -?, -help     Show this information
    -e, -enable       Enable the LSI driver
    -d, -disable      Disable the LSI driver
Once these changes are made, the OS image can be moved back and forth between physical servers and virtual machines.
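If PISA is called from a preparation script, the mutually exclusive options can be enforced by accepting exactly one action. The Python sketch below is illustrative only: it uses the -h, -e, and -d options listed above, and it assumes that hppisa is on the PATH of a supported Windows system and that the script runs with Administrator rights. The function names are hypothetical, and the meaning of the tool's exit code is not documented here, so the sketch returns it without interpreting it.

```python
import subprocess

# Option letters taken from the usage text above; one action per invocation.
HPPISA_OPTIONS = {"help": "-h", "enable": "-e", "disable": "-d"}


def build_hppisa_command(action, executable="hppisa"):
    """Build the argument list for a single, mutually exclusive PISA action."""
    if action not in HPPISA_OPTIONS:
        raise ValueError("action must be one of: %s" % ", ".join(sorted(HPPISA_OPTIONS)))
    return [executable, HPPISA_OPTIONS[action]]


def run_hppisa(action):
    """Run PISA and return its exit code (assumes hppisa is on the PATH)."""
    completed = subprocess.run(build_hppisa_command(action))
    return completed.returncode
```

Keeping command construction separate from execution lets the argument list be checked on any system, while run_hppisa is only meaningful where PISA is installed.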
You can also create user-defined portability groups that extend the portability of a logical server to unlike technologies, for example, to move logical servers between a Virtual Connect physical server and a VMware ESX virtual machine host. User-defined portability groups are defined by selecting Modify -> logical server Portability Groups in Matrix OE visualization. If you have selected one or more targets in Matrix OE visualization, they are then presented as potential targets.
Give the portability group a name and an optional description. The name will be used when defining logical servers. The set of Group Types is selected automatically based on the targets inserted into the portability group. Valid combinations of targets include:
• A single Virtual Connect Domain Group (VCDG)
• A set of ESX Hypervisors
• A set of Hyper-V Hypervisors
• A set consisting of a single VCDG plus a set of ESX Hypervisors.
The portability group for any logical server can easily be seen by clicking the View Movable logical server details icon in Matrix OE visualization. The details for this logical server will then be displayed. Logical servers can be made portable through techniques described earlier in this chapter. NOTE: Only you can determine if the provisioned operating system within a logical server truly performs as desired on a variety of platforms.
Storage definition
Storage can be defined via Storage Pool entries or Storage Entries tied directly to a logical server. Cross-technology logical servers require their storage to be SAN-based. This approach uses the normal SAN-boot method within Virtual Connect and leverages ESX Raw Device Mapping (RDM) technology, which presents boot and data LUNs directly to the virtual machine. When defining storage for a portable logical server, you must select SAN Storage Entry.
Moving between technologies Activation and movement of cross-technology logical servers is accomplished in the same way as with standard logical servers. However, the Unlike Move operation is used for cross-technology logical servers when an activate or move operation is about to be performed on a server with a different underlying technology from its previous target host. Targets for a logical server are selected from that logical server's portability group.
Target attributes You can track where a logical server has been successfully activated or moved in the past using logical server target attributes. Target attributes provide a greater number of “most suitable” targets where you can activate or move a logical server. You can view or modify target attributes on a logical server by selecting the logical server and then clicking on Modify -> logical server Target Attributes -> Manage.
Setting failover target type preference During a site failover, for every logical server that has been configured with disaster protection, Matrix recovery management activates a similarly configured peer logical server at the recovery site. For this purpose, Matrix recovery management interacts with the Matrix OE logical server management capability to determine a list of appropriate available targets, and chooses the most suitable target to activate the logical server.
5 Issues, limitations and suggested actions
Issues and limitations of this release are listed below. The following categories are used:
Limitations: Limitations of the implemented functions and features of this release.
Major issues: Issues that may significantly affect functionality and usability in this release.
Minor issues: Issues that may be noticeable but do not have a significant impact on functionality or usability.
Suggested actions:
• For ESX3.X hosts, set Lvm.DisallowSnapshotLun to 0 using Virtual Center → Configuration → Advanced Settings.
• For ESX4.X hosts, to mount the Local Site datastore with an existing signature, refer to the chapter titled Mount a VMFS Datastore with an Existing Signature in the ESX Configuration Guide Update 1, ESX 4.0, vCenter Server 4.0 available at: http://www.vmware.com/pdf/vsphere4/r40_u1/vsp_40_u1_esx_server_config.
managing the HP P9000 Device groups. Each HP P9000 Continuous Access Software Storage Replication Group configured in Matrix recovery management will be managed by one HP P9000 RAID Manager Software instance at each site. Suggested actions: There is no workaround for this issue.
6 Troubleshooting This chapter is divided into three sections: • “Configuration troubleshooting” (page 44) • “Matrix recovery management troubleshooting” (page 49) • “Matrix recovery management log files” (page 53) Configuration troubleshooting To troubleshoot Matrix recovery management configuration operations, take note of any on-screen error messages, then review this section for relevant information. You can also view the mxdomainmgr log files for additional information.
• Unable to add or edit HP P6000 Storage Replication Group
Possible causes include:
◦ Unable to obtain Storage Replication Group information from Command View servers to validate the Storage Replication Group information provided by the user.
• Unable to add or edit HP P9000 Storage Replication Group
Possible causes include:
◦ The Storage Replication Group is not configured to be managed by the RAID manager instances.
• No configuration operation can be run
Possible causes include:
◦ An Activate... or Deactivate... operation is in progress.
◦ Another configuration operation may be in progress.
• Unable to import Storage Management Servers as part of import operation
Possible causes include:
◦ The Storage Management Server was not discovered in the HP Matrix Operating Environment user interface.
Error message: Error: Invalid storage manager username and/or domain name.
Cause: The sign-in credentials for the server stored in the HP Matrix Operating Environment user interface do not include the user name specified as part of the Storage Management Server configuration operation in Matrix recovery management.
Action: For the server specified, ensure that the sign-in credentials stored in the HP Matrix Operating Environment user interface include the credentials for the user name specified.
Error message: Unable to add CLX credentials for this HP 3PAR storage system.
Cause: The encrypted password file for the corresponding HP 3PAR storage system is incorrect, or the password for the storage system has been changed and the system can no longer be authenticated using the existing password file.
Action: Ensure the correct password file is present in the /storage/3par/conf directory where Matrix recovery management is installed for both the local and remote HP 3PAR storage systems.
Error message: Import Failed.
Cause: Possible causes include an invalid import file, the HP Logical Server Automation service not running, or one or more recovery logical servers being in an active state.
Action: Ensure that a valid file exported from the Local Site is used to import the Matrix recovery management configuration at the Remote Site. Confirm that the HP Logical Server Automation service is running on the CMS.
For a failed Job, click the check-box next to the Job Id to get detailed information about the associated Sub Jobs. A site Job contains a Sub Job for each Recovery Group. Similarly, each Recovery Group has Sub Jobs for its Storage Replication Group and logical server, respectively.
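The Job hierarchy described above (a site Job with one Sub Job per Recovery Group, each with Sub Jobs for its Storage Replication Group and logical server) can be pictured as a small tree. The Python sketch below is a reading aid only: the class, field names, and status strings are hypothetical and do not reflect how Matrix recovery management stores Jobs. It walks the tree and lists the lowest-level Sub Jobs that failed, which is usually where the detailed logs are most useful.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Job:
    name: str
    status: str = "completed"  # hypothetical status values: "completed" or "failed"
    sub_jobs: List["Job"] = field(default_factory=list)


def failed_sub_jobs(job, path=()):
    """Return the paths of failed Jobs that have no Sub Jobs of their own."""
    here = path + (job.name,)
    found = []
    if job.status == "failed" and not job.sub_jobs:
        found.append(" / ".join(here))
    for sub in job.sub_jobs:
        found.extend(failed_sub_jobs(sub, here))
    return found


# Example tree: a site Job with two Recovery Group Sub Jobs (names are made up).
site_job = Job("Site failover", "failed", [
    Job("RG1", "failed", [Job("SRG1"), Job("LS1", "failed")]),
    Job("RG2", "completed", [Job("SRG2"), Job("LS2")]),
])
```

For the example tree, failed_sub_jobs(site_job) returns a single entry, "Site failover / RG1 / LS1", pointing at the logical server Sub Job in the first Recovery Group.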
NOTE: Restarting the Job will only retry Sub Jobs that previously failed; servers associated with completed Jobs or Sub Jobs are already up and running and will not be impacted. IMPORTANT: If correcting the problem that caused the Job to fail included re-configuration of logical server(s), before you restart the Job, go to the Recovery Groups tab and delete the Recovery Group(s) that contain the re-configured logical server(s).
• Matrix recovery management job failed because of unlocatable logical server in Matrix OE logical server management.
Possible causes include:
◦ A logical server managed by Matrix recovery management was removed from Matrix OE logical server management, before it was unmanaged in Matrix recovery management.
Matrix recovery management log files There are several log files available with detailed information that you can view to help identify the sources of Matrix recovery management failover or failback problems: • For errors that occur during the initial Matrix recovery management configuration steps, check the mxdomainmgr(0).log file located in the logs directory where HP Systems Insight Manager is installed on the system.
7 Support and other resources Information to collect before contacting HP Be sure to have the following information available before you contact HP: • Software product name • Hardware product model number • Operating system type and version • Applicable error message • Third-party hardware or software • Technical support registration number (if applicable) How to contact HP Use the following methods to contact HP technical support: • See the Contact HP Worldwide website: http://www.hp.
Warranty information
HP will replace defective delivery media for a period of 90 days from the date of purchase. This warranty applies to all HP Insight Management products.
HP authorized resellers
For the name of the nearest HP authorized reseller, see the following sources:
• In the United States, see the HP U.S. service locator website at: http://www.hp.com/service_locator
• In other locations, see the Contact HP worldwide website at: http://welcome.hp.com/country/us/en/wwcontact.
• Matrix recovery management White Papers
Matrix recovery management white papers are available at: http://www.hp.com/go/matrixoe/docs
• Matrix recovery management Online Help
The Matrix recovery management online help system provides information on operations that are performed from the Matrix recovery management user interface. It is accessible from the Matrix recovery management user interface and from the Help menu on the HP Matrix Operating Environment home page.
Glossary
CMS: HP Systems Insight Manager (HP SIM) Central Management Server. A system in the management domain that executes the HP SIM software. All central operations within HP SIM are initiated from this system.
consistency group: Consistency groups are an important property of asynchronous mode volumes. A consistency group is a group of LUNs that need to be treated the same from the perspective of data consistency (I/O ordering).
split-brain: Split-brain occurs when two or more instances of the same application are active simultaneously, possibly leading to data corruption.
Storage Management Servers: As part of the Matrix recovery management configuration process, servers that manage HP P6000 storage devices and servers that manage HP P9000 storage devices must be defined. These servers are referred to as Storage Management Servers.