HP Global Workload Manager 7.1 User Guide

Abstract

This document presents an overview of the techniques and tools available for using Global Workload Manager servers (or gWLM). It exposes you to the essentials and allows you to quickly get started with gWLM. This document is intended to be used by HP Matrix Operating Environment system administrators, application administrators, and other technical professionals involved with data center operations, administration, and planning.
© Copyright 2004, 2012 Hewlett-Packard Development Company, L.P. Legal Notice Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor’s standard commercial license. The information contained herein is subject to change without notice.
Contents
1 Overview
    gWLM Overview
    Benefits of using gWLM
    Comparison of PRM, WLM, and gWLM features
5 Additional configuration and administration tasks
    Manually adjusting CPU resources
    Manually adjusting memory resources
    Setting aside space for historical data
    Rare incompatibility with virtual partitions
    Workloads in gWLM do not follow associated Serviceguard packages
    Host name aliases are not supported
    Making a configuration change to a large SRD is slow
1 Overview This chapter provides an overview of gWLM, including benefits, key concepts and terms, and the gWLM management model. gWLM Overview gWLM allows you to centrally define resource-sharing policies that you can use across multiple HP servers. Using these policies can increase system utilization and facilitate controlled sharing of system resources. In addition, gWLM provides both real-time and historical monitoring of the resource allocation.
Concepts and terms for using gWLM Here are some concepts and terms to know when using gWLM: Workload The collection of processes executing within a single compartment. The compartment can be an nPartition (npar), a virtual partition (vPar), a Virtualization Services Platform (VSP), a virtual machine provided by HP Integrity Virtual Machines (hpvm), a processor set (pset), or a Fair Share Scheduler (fss) group. gWLM manages a workload by adjusting the system resource allocations for its compartment.
Mode Two modes are available: advisory and managed. Advisory mode allows you to see what CPU resource requests gWLM would make for a workload—without actually affecting resource allocation. Advisory mode is not available for SRDs containing virtual machines, psets, or fss groups due to the nature of these compartments. Use this mode when creating and fine-tuning your policies.
Undeploy Disable gWLM’s management of resources in a specified SRD. If an SRD is in managed mode, undeploying stops the migration of system resources among workloads in the SRD. If the SRD is in advisory mode, gWLM no longer provides information on what requests would have been made. The gWLM management model gWLM enables utility computing across a data center by providing resource-sharing policies that you centrally create and monitor.
For more information on these system divisions, visit:
• HP Matrix Operating Environment website: http://www.hp.com/go/matrixoe/integrity
• “Technical Documentation website for HP Matrix Operating Environment”: http://www.hp.com/go/matrixoe/docs
• The “Global Workload Manager” topic and the glossary in the online help for gWLM, available in gWLM’s graphical interface in System Insight Manager.

gWLM manages resources based on the following model:
1. You define an SRD by:
a.
Table 1 Default weights by policy type (continued)

Policy type    Default weight
               (You cannot deploy an SRD where all the workloads with fixed policies are not satisfied.)
Utilization    1
OwnBorrow      Equal to its owned value
Custom         1

NOTE: To ensure CPU resource allocations behave as expected for OwnBorrow policies, the sum of the CPU resources owned cannot exceed the number of cores in the SRD.
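The constraint in the NOTE above can be checked before an SRD is deployed. The following Python sketch is illustrative only: the function name, the workload names, and the dictionary of owned values are hypothetical and are not part of gWLM or its command-line interface.

```python
def owned_total_fits(owned_cores_by_workload, srd_core_count):
    """Return True when the summed owned values do not exceed the SRD's cores."""
    total_owned = sum(owned_cores_by_workload.values())
    return total_owned <= srd_core_count


# Hypothetical workloads with OwnBorrow policies in an 8-core SRD.
owned = {"sales": 3, "finance": 2, "test": 2}
print(owned_total_fits(owned, 8))   # 3 + 2 + 2 = 7 owned cores, within 8

# Raising one owned value so the total reaches 9 violates the constraint.
owned["test"] = 4
print(owned_total_fits(owned, 8))
```

A check like this only validates the arithmetic; it says nothing about how gWLM itself reports an oversubscribed configuration.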
2. Initialize the CMS by running the vseinitconfig command. For more information, see vseinitconfig(1M).
3. Decide which systems will be your managed nodes, then install the gWLM agent software on those systems. (The agent software is free, but it is functional only for a limited time. For unlimited use, purchase the agent license to use, or LTU.)
4. On each managed node, start the gWLM agent daemon gwlmagent.
Table 2 Where to find additional information (continued)

To...: Learn about using gWLM with HP Serviceguard.
See...: The HP Matrix Operating Environment documentation website at http://www.hp.com/go/matrixoe/docs

To...: Learn more about nPars, vPars, virtual machines, and psets.
See...:
• HP Integrity Virtual Machines website: www.hp.com/go/hpux-vpars-docs
• HP Matrix Operating Environment documentation website: http://www.hp.com/go/matrixoe/docs
• The HP Matrix Operating Environment for HP-UX product website: http://www.
2 Configuring gWLM to manage workloads This chapter describes the various aspects of configuring gWLM to effectively manage the resources for your workloads. Policy types You can define several types of policies to instruct gWLM how to manage the resources for your workloads. These types are: Fixed Allocates a fixed (constant) amount of CPU resources to a workload’s compartment. gWLM satisfies these policies before attempting to satisfy any other type of policies.
gWLM provides conditional policies that can detect the following HP Serviceguard conditions: • SgReducedClusterCapacity: Detects whether any cluster members are missing from the HP Serviceguard cluster associated with the host of the workload. • SgNonPrimaryPackagePresent: Detects whether any HP Serviceguard package that is active on the host of the workload does not have the host configured as its primary node.
Choosing a policy type How do you decide which policy type to use? Table 3 answers this question for several common use cases. The section following the table helps you decide between using an OwnBorrow policy or a utilization policy. Table 3 Choosing a policy type If... Use the following type of policy... You want gWLM to allocate a constant amount of CPU resources to a workload. Fixed You have your own metric by which you want gWLM to manage a workload.
Combining the different policy types Each workload in an SRD must have a policy. Starting with gWLM A.02.00.00.07, you can use any combination of the policy types within an SRD. Seeing how gWLM will perform without affecting the system gWLM provides an advisory mode that allows you to see how gWLM will approximately respond to a given SRD configuration—without putting gWLM in charge of your system’s resources. Using this mode, you can safely gain a better understanding of how gWLM works.
NOTE: You must be logged in as root on the systems where you run the mxstart, gwlmcmsd, and gwlmagent commands mentioned below. In System Insight Manager, you must be logged in as root or have authorizations for “All Tools” or “Matrix OE All tools.”
Common uses for gWLM gWLM is a powerful tool that allows you to manage your systems in numerous ways. The following sections explain some of the more common tasks that gWLM can do for you. Fixing the amount of CPU resources a workload gets gWLM allows you to give a workload a fixed amount of CPU resources. This fixed amount is in the form of a set amount of CPU resources given to an npar, a vpar, a VSP, a virtual machine, a pset, or an fss group.
Resizing a workload’s npar, vpar, VSP, virtual machine, pset, or fss group as needed To ensure a workload gets the CPU resources it needs—while also allowing resource sharing when possible—gWLM provides OwnBorrow policies. With such a policy, you indicate the amount of CPU resources a workload should own. The workload is then allocated this owned amount of CPU resources—when it needs it.
Changing from advisory mode to managed mode Advisory mode allows you to see what CPU resource requests gWLM would make for a workload—without actually affecting resource allocation. (Advisory mode is not available for SRDs containing virtual machines, psets, or fss groups due to the nature of these compartments.) Managed mode, however, allows gWLM to automatically adjust the resource allocations for your defined workloads.
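The restriction on advisory mode can be expressed as a simple membership test. This is a minimal sketch, assuming each compartment in an SRD is described by one of the short type labels used in this guide (npar, vpar, hpvm, pset, fss); the function and constant names are hypothetical and do not correspond to any gWLM interface.

```python
# Compartment types for which this guide says advisory mode is not available.
ADVISORY_UNSUPPORTED = {"hpvm", "pset", "fss"}


def advisory_mode_available(compartment_types):
    """Return True if no compartment type in the SRD rules out advisory mode."""
    return ADVISORY_UNSUPPORTED.isdisjoint(compartment_types)


# An SRD made up only of nPars and vPars can use advisory mode.
print(advisory_mode_available(["npar", "vpar"]))

# An SRD that contains fss groups can be used only in managed mode.
print(advisory_mode_available(["npar", "fss"]))
```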
where hostname represents the hostname of the CMS.
3. From the System Insight Manager menu bar, select: Tools→HP Matrix OE visualization... and then click the Shared Resource Domain tab.
4. From the HP Matrix OE visualization menu bar, select: Policy→Create gWLM Policy...
5. Edit the settings, selecting a policy type and specifying the required values and optional values as desired.
6. Click OK.

Editing a policy

A policy instructs gWLM how to manage a workload’s resources.
where hostname represents the hostname of the CMS.
3. From the System Insight Manager menu bar, select: Tools→HP Matrix OE visualization... and then click the Shared Resource Domain tab.
4. Select the shared resource domain containing the workload for which you want to change the policy.
5. Select the workload for which you want to change the policy.
6. From the HP Matrix OE visualization menu bar, select: Policy→Change Associated gWLM Policy...
1. Ensure that System Insight Manager, the gWLM CMS daemon or service (gwlmcmsd), and all the gWLM agents (gwlmagent) are still running, as explained in the section “Setting up gWLM (initial setup steps)” (page 21).
2. Connect to System Insight Manager by pointing your web browser to: http://hostname:280 where hostname represents the hostname of the CMS.
3. Associate fixed policies with all workloads that you want to unmanage that are based on nPars or vPars.
NOTE: When gWLM manages a VSP, it sets the OLSTARPOLICY policy parameter of the VSP to GUEST. The status of the OLSTARPOLICY policy parameter does not change even after the SRD is undeployed. Quick Link Option In the previous procedure, instead of selecting an SRD and using the HP Matrix OE visualization menu bar, you can find the Details table for the SRD and click the Undeploy SRD link.
3 Monitoring workloads and gWLM

This chapter describes how to monitor workloads and gWLM.

Monitoring workloads

There are several methods for monitoring workloads, as described below.

High-Level view

To see a high-level view of the performance of your SRDs and workloads:
1. From the System Insight Manager menu bar, select: Tools→HP Matrix OE visualization...
2. Click the Shared Resource Domain tab.
Monitoring gWLM from the command line There are several command-line tools for monitoring gWLM. These commands are added to the path during installation. On HP-UX systems, the commands are in /opt/gwlm/bin/. On Microsoft Windows systems, the commands are in C:\Program Files\HP\Virtual Server Environment\bin\gwlm\ by default. However, a different path might have been selected at installation.
Table 4 gWLM log files (continued)

Log for         Location
                Windows: C:\Program Files\HP\Virtual Server Environment\logs\gwlm.log.0
gwlm command    HP-UX: /var/opt/gwlm/gwlmcommand.log.0
                Windows: C:\Program Files\HP\Virtual Server Environment\logs\gwlmcommand.log.0

NOTE: On systems running Windows, log files are in C:\Program Files\HP\Virtual Server Environment\logs\ by default. However, a different path might have been selected at installation. The name of the current log always ends in .log.0.
Viewing HP Systems Insight Manager events

gWLM allows you to configure a number of events you can monitor through System Insight Manager. Set these events in System Insight Manager as follows:
1. From the System Insight Manager menu bar, select: Tools→HP Matrix OE visualization...
2. Click the Shared Resource Domain tab.
4 Security This chapter highlights several security items you should be aware of. General security topics The following items are a few general topics on security: • HP provides the HP-UX Bastille product, available from http://software.hp.com at no charge, for enhancing system security. • You can secure gWLM’s communications as explained in the following section. • System Insight Manager allows you to create user roles with different levels of privileges.
3. Set the following properties as desired:
• oracle.net.encryption_types_client
• oracle.net.crypto_checksum_types_client
For more information on these properties, read their associated comments in the gwlmcms.properties file.
4. Ensure the Oracle listener and port being used by System Insight Manager are configured to accept secure communication for the encryption and checksum types specified in the previous step.
5 Additional configuration and administration tasks This chapter covers various configuration and administration tasks. Manually adjusting CPU resources When an SRD is created, it has a certain number of cores. gWLM manages the SRD using the same number of cores. If the SRD—or a policy used in the SRD—is configured to use Temporary Instant Capacity (TiCAP), gWLM can automatically activate that additional capacity to meet policies.
gWLM cannot take advantage, even temporarily, of resources added by:
• Adjustments to entitlements for virtual machines.
• Changes to a virtual machine's number of virtual CPUs while gWLM is managing the virtual machine.
• Creation or deletion of a pset using psrset on a system where gWLM is managing pset compartments.
• Performing online cell operations using parolrad.
• Enabling and disabling Hyper-Threading.

To make use of these additional resources using the gWLM command-line interface:
1.
Setting cache size for historical configuration data

The gWLM CMS daemon or service (gwlmcmsd) maintains a cache of historical configuration data of workloads. If there is a large number of workloads on the CMS, reduce the size of the cache by setting the com.hp.gwlm.cms.cachesize property to a lower value so that gwlmcmsd does not run out of heap space. The com.hp.gwlm.cms.cachesize property is part of the gwlmcms.properties file. The gwlmcms.
If you are using Transact-SQL to create the maintenance plan, use the following SQL statements:

USE gwlm
GO
ALTER INDEX ALL ON gwlm_config_wkld REBUILD
GO

Setting gWLM properties

gWLM provides two properties files that allow you to control various gWLM behaviors. One file is for the CMS daemon or service, and the other is for use on all the managed nodes. Read the files for information on the behaviors they control.

CMS properties

The CMS properties are in /etc/opt/gwlm/conf/gwlmcms.
# Specify the size (in MB) and number of files to use
# for logging. For a single file of unlimited size, set
# logFileSize to negative one (logFileSize=-1).
# Otherwise, total log file size is
#     logFileSize * logNFiles
#
com.hp.gwlm.util.Log.logFileSize = 20
com.hp.gwlm.util.Log.logNFiles = 3
#
# Support for automatic database statistics gathering. These properties
# control how often row-level statistics are gathered from the database in
# order to optimize performance.
#
# com.hp.gwlm.cms.db.analyze.
#
# Support for real-time graphing properties.
#
# viewport:
#   The size of the displayed real-time graph (in minutes).
#
# refresh:
#   The refresh rate of the real-time graphs and tables (in seconds).
#
com.hp.gwlm.ui.monitor.viewport = 20
com.hp.gwlm.ui.monitor.refresh = 15
#
# Support for securing Oracle communication.
#
# com.hp.gwlm.jdbc.oracle.secure:
#   Whether communication with Oracle server is secure or not. Possible
#   values are 'on' and 'off'. Default is off.
#
# oracle.net.
# SEVERE
# WARNING
# INFO
# CONFIG
# FINE
# FINER
# FINEST
# When you set the level, you will see messages only from that level and
# the levels that are more severe. So, the SEVERE level produces the fewest
# messages, while the FINEST level includes messages from all seven levels.
#
com.hp.gwlm.util.Log.logLevel = INFO
#
# Specify the size (in MB) and number of files to use
# for logging. For a single file of unlimited size, set
# logFileSize to negative one (logFileSize=-1).
com.hp.gwlm.node.port = portY
to both properties files:
• gwlmcms.properties
  On HP-UX, this file is in /etc/opt/gwlm/conf/. On Windows, it is in C:\Program Files\HP\Virtual Server Environment\conf\. (The given Windows path is the default; however, a different path might have been selected at installation.)
• /etc/opt/gwlm/conf/gwlmagent.properties
The portX and portY values cannot be the same value. The com.hp.gwlm.cms.
# secure operating environment.)
#
# NOTE: GWLM_CMS_START=0 prevents automatic use at boot of
# HP Matrix OE visualization and
# HP Capacity Advisor.
GWLM_CMS_START=0
# Set GWLM_AGENT_START to 1 to have the init process start the gWLM agent
# daemon. (HP recommends setting this variable to 1 only when used in a
# secure operating environment.)
GWLM_AGENT_START=0
# Set GWLM_HOME to the location where gWLM is installed.
# Default is /opt/gwlm.
the CMS to have the CMS re-deploy its view of the SRD. If the CMS cannot be contacted, the SRD in the deployed.config file is deployed as long as all nodes agree. In general, when an SRD is disrupted by a node’s going down, by a CMS's going down, or by network communications issues, gWLM attempts to reform the SRD. gWLM maintains the concept of a cluster for the nodes in an SRD. In a cluster, one node is a master and the other nodes are nonmasters.
# /opt/gwlm/bin/gwlmagent --restart

Manually clearing an SRD

If gWLM is unable to reform an SRD, you can manually clear the SRD, as described in the following section.

Clearing an SRD of A.02.50.00.04 (or later) agents

The following command is an advanced command for clearing an SRD. The recommended way to remove a host from management is typically the gwlm undeploy command. Starting with A.02.50.00.
Nesting partitions

gWLM allows you to form SRDs consisting of various compartment types. This ability provides flexibility in dividing your complex. For example, you can divide your complex as shown in Figure 2. The complex has five nPars, two of which are divided into vPars. One nPar is hosting virtual machines, the fourth nPar is not divided, and the fifth nPar is hosting vPars.
Changing the gWLM resource allocation interval The frequency of gWLM’s changes in the CPU resource allocations is an attribute of the SRDs. Once you create an SRD, you can change how often gWLM adjusts the CPU resource allocations of the workloads in that SRD using either of the methods discussed in the following sections.
processes, each consuming an entire logical CPU, the reported utilization depends on where those processes are. If the processes are on only two cores, the utilization is 50% (2/4). With the processes distributed across all four cores though, each process can consume an entire core, resulting in a utilization of 100%. When fss groups are being used, gWLM disables Hyper-Threading for the default pset, where fss groups are created, to optimize workload performance.
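The two utilization figures above can be reproduced with a small calculation. This Python sketch is a simplified model, not gWLM's actual measurement: it assumes four cores with Hyper-Threading enabled (two logical CPUs per core) and four busy processes, and it counts a core as fully used once any of its logical CPUs is busy. The function name and the core numbering are hypothetical.

```python
def core_utilization(process_cores, total_cores):
    """Fraction of cores that have at least one busy logical CPU."""
    busy_cores = set(process_cores)
    return len(busy_cores) / total_cores


# Four processes packed onto two cores (both logical CPUs of cores 0 and 1).
print(core_utilization([0, 0, 1, 1], 4))   # 0.5, the 50% (2/4) case

# The same four processes spread across all four cores.
print(core_utilization([0, 1, 2, 3], 4))   # 1.0, the 100% case
```

The point of the model is that the same amount of work yields different reported utilization depending only on placement.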
Figure 3 shows a management LAN in which the hosts are known as mgmtA, mgmtB, mgmtC, and mgmtD. With this management LAN, gWLM can manage the hosts in a single SRD. Complete the following procedure to set up gWLM to manage such hosts in an SRD: 1. For each host in the management LAN that you want to manage in an SRD: a. Edit the /etc/opt/gwlm/conf/gwlmagent.properties file to include the following property: com.hp.gwlm.security.
This issue is most often a concern when a host is connected to both of the following items: • A corporate LAN/WAN via one network interface card and IP address • A second, private internal network and private IP address for communicating with a certain other set of hosts (such as cluster members) Global Workload Manager attempts to detect and report network configuration issues that can cause undesirable behavior, but in some cases this detection occurs in a context that can be reported only into a log
[mysystem#2] > nslookup mysystem
Trying DNS
Name: mysystem.mydomain.com
Address: 15.11.100.17

c. Verify that /etc/hosts has the same name configured for the address. Note that the first name should be the fully qualified domain name, and any aliases are listed afterward.

[mysystem#3] > grep 15.11.100.17 /etc/hosts
15.11.100.17 mysystem.mydomain.com mysystem

d. Verify that the reverse lookup of the IP address returns the same fully qualified domain name as configured in /etc/hosts.
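The forward and reverse lookup comparison in these steps can also be scripted. The following Python sketch is an illustration under stated assumptions, not an HP-supplied tool: it uses the standard socket module, which goes through the system resolver and so may consult /etc/hosts, DNS, or both depending on the host's name service configuration. The function name is hypothetical, and the lookup functions are parameters so the comparison logic can be exercised without network access.

```python
import socket


def reverse_matches(fqdn, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Return True if the reverse lookup of fqdn's address yields fqdn again."""
    address = forward(fqdn)                      # name -> IPv4 address string
    primary_name, _aliases, _addresses = reverse(address)
    # Host names are case-insensitive, so compare in lower case.
    return primary_name.lower() == fqdn.lower()


# Example with stand-in lookups that mirror the sample output in this section.
consistent = reverse_matches(
    "mysystem.mydomain.com",
    forward=lambda name: "15.11.100.17",
    reverse=lambda addr: ("mysystem.mydomain.com", ["mysystem"], [addr]),
)
print(consistent)
```

On a real host, calling reverse_matches("mysystem.mydomain.com") with the default arguments performs live lookups; a False result suggests the kind of name mismatch this procedure is meant to catch.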
6 Support and other resources This chapter contains support information and the available resources for the HP Global Workload Manager (gWLM) servers.
Warranty information HP will replace defective delivery media for a period of 90 days from the date of purchase. This warranty applies to all Insight software products. HP authorized resellers For the name of the nearest HP authorized reseller, see the following sources: • In the United States, see the HP U.S. service locator web site: http://www.hp.com/service_locator • In other locations, see the Contact HP worldwide web site: http://welcome.hp.com/country/us/en/wwcontact.
A Compatibility with agents

The gWLM A.7.1.0* CMS runs on HP-UX 11i v2 (B.11.23), HP-UX 11i v3 (B.11.31), and Microsoft Windows systems. It works with the following versions of the agents:
• gWLM A.03.00.00.05: HP-UX 11i v1, HP-UX 11i v2, HP-UX 11i v3
• gWLM A.03.00.01.05: HP-UX 11i v1, HP-UX 11i v2, HP-UX 11i v3
• gWLM A.04.00.07: HP-UX 11i v1, HP-UX 11i v2, HP-UX 11i v3
• gWLM A.04.01.00.*: HP-UX 11i v1, HP-UX 11i v2, HP-UX 11i v3
• gWLM A.6.0.0.
B Global Workload Manager A.7.1.0* known issues This appendix contains the limitations and known issues for the Global Workload Manager (gWLM) A.7.1 release. Limitations The following are limitations for Global Workload Manager. Cannot manage VSP if both VMs and vPars exist If a VSP has Integrity VMs and vPars configured on it, the VSP cannot be managed as HPVMs or vPars. However, you can manage the VSP as an nPar.
• The properties files for gwlmagent and gwlmcmsd are parsed as English, regardless of the locale setting. So, be careful of using commas where English would use periods. • Some items are always in English: ◦ Start-up messages from gwlmagent and gwlmcmsd ◦ Log files ◦ Messages from initial configuration Unable to manage partitions with inactive cells or deconfigured cores gWLM does not support management of partitions with either inactive cells or deconfigured cores.
Rare incompatibility with virtual partitions

Depending on workload characteristics, gWLM can migrate CPU resources rapidly. This frequent migration can potentially, although very rarely, produce a race condition, causing the virtual partition to crash. It can also produce a panic, resulting in one or more of the following messages:

No Chosen CPU on the cell-cannot proceed with NB PDC.

or

PDC_PAT_EVENT_SET_MODE(2) call returned error

Workaround

Upgrading to vPars A.03.04 resolves this issue.
Workaround The following options are available as workarounds: Option 1 For systems managed by gWLM that are running HP-UX 11i v3, install the patches PHCO_36126 and PHSS_36078. (These patches are included in the September 2007 Operating Environment Update Release.) A fix to EMS hardware monitors is available with the September 2007 Operating Environment Update Release. Even with these patches and fixes, there is still one event generated for each change in CPU count.
agents will not be able to restore SRD operations. Also, System Insight Manager events will be generated to report the validation failure. Workaround There are two workarounds: • Update to vPars A.04.00 or later. • Update your configurations so that psets are not nested in virtual partitions. Information error during shutdown You might see a message similar to the following: Information Error during shutdown.
Warning: gwlmagent cimserver error, icapd down, or icap out of compliance. First restart cimserver. Make sure icapd is running. If this error happens again, consult gwlmagent man page for steps to return to compliance.

Workaround

Verify that no partition changes (parmodify) are pending. If partition changes are pending, restart the system. Then, either consult iCAP documentation or contact HP to return the iCAP system to compliance.
Workaround

HP recommends regular database backups, which make it easier to recover from such a scenario. After consulting the database vendor's guidelines, choose the option that best suits the situation:
• Back up the log.
• Free disk space so that the log can automatically grow.
• Move the log file to a disk drive with sufficient space.
• Increase the Maximum Size of the log file.
• Add a new log file to the database on a different disk that has sufficient space.
Workaround Do not assign CPUs using cell specifications. Consider assigning CPUs to the virtual partitions using a hardware path. Alternatively, to use cell-local processors, update to vPars A.04.04 on HP-UX 11i v2 (B.11.23) or to vPars A.05.01 on HP-UX 11i v3 (B.11.31). CMS is slow to respond The CMS is slow to respond. Workaround Time a gwlm list command on the CMS. If it takes more than 10 seconds, perform the following steps: 1. In the file /etc/opt/gwlm/conf/gwlmcms.
You can find the nPartition Provider at the following locations: • The quarterly AR CD starting May 2005 • The Software Depot website: http://software.hp.com Modifying Java while gWLM is running gWLM does not support any actions (including the use of update-ux) that remove, overwrite, or otherwise modify the version of Java that gWLM is using in a managed node or CMS that is part of a deployed SRD.
Error trying to deploy SRD, mysystem.vpar.000 to mysystem2.mydomain.com.
SRD, mysystem2.fss.000 is already deployed.
Only one SRD is allowed to be deployed.

Workaround

Undeploy the SRD using the --force option with the gwlm undeploy command, and restart gwlmagent on the managed node.
Workaround To maintain the process placements across redeploys, use gWLM's application records or user records when creating or editing your workload definitions in gWLM. Sizes/allocations less than policy minimums for Virtual Machines The sizes or allocations for virtual machines in a deployed SRD can appear to be less than their policy minimums. Workaround Wait a few minutes, since it can take several minutes for gWLM to recognize a virtual machine transition between the states of off and on.
Configuration of agent and CMS not synchronized Occasionally, a gWLM agent and the gWLM CMS disagree on whether an SRD is actually deployed. This can occur when you use Ctrl-C to interrupt a gwlm deploy or gwlm undeploy command. It can also occur if there are errors saving a gWLM configuration; the configuration is deployed and then saved to the gWLM configuration repository. If the deploy occurs but the save fails, the gWLM agent sees the SRD as deployed; however, the CMS sees the SRD as undeployed.
Workaround Turn off the VM or vPar and redeploy the SRD that contained it.