Setup Guide


Supporting Fabric OS 8.0.1

CONFIGURATION GUIDE

Brocade Monitoring and Alerting Policy Suite

Configuration Guide

53-1004121-01

22 April 2016

Summary of content (158 pages)

PAGE 2
© 2016, Brocade Communications Systems, Inc. All Rights Reserved. Brocade, Brocade Assurance, the B-wing symbol, ClearLink, DCX, Fabric OS, HyperEdge, ICX, MLX, MyBrocade, OpenScript, VCS, VDX, Vplane, and Vyatta are registered trademarks, and Fabric Vision is a trademark of Brocade Communications Systems, Inc., in the United States and/or in other countries. Other brands, products, or service names mentioned may be trademarks of others.
PAGE 3
Contents
Preface (page 7)
Document conventions
PAGE 4
MAPS Elements and Categories (page 31)
MAPS structural elements
PAGE 5
Adding missing ports to a dynamic group (page 99)
Removing ports from a group (page 99)
D_Port monitoring
PAGE 6
Brocade FCIP monitoring parameters and groups (page 138)
Quality of Service monitoring example (page 140)
IPEXT monitoring
PAGE 7
Preface
∙ Document conventions (page 7)
∙ Brocade resources (page 8)
∙ Contacting Brocade Technical Support
PAGE 8
Preface
Convention | Description
... | Repeat the previous element, for example, member[member...].
\ | Indicates a “soft” line break in command examples. If a backslash separates two lines of a command input, enter the entire command at the prompt without the backslash.
Notes, cautions, and warnings
Notes, cautions, and warning statements may be used in this document. They are listed in the order of increasing severity of potential hazards.
PAGE 9
Preface
If you have purchased Brocade product support directly from Brocade, use one of the following methods to contact the Brocade Technical Assistance Center 24x7.
Online | Preferred method of contact for non-urgent issues.
Telephone | Required for Sev 1-Critical and Sev 2-High issues.
E-mail | support@brocade.
PAGE 11
About This Document
∙ Supported hardware and software (page 11)
∙ What's new in this document (page 12)
∙ MAPS commands altered in this release
PAGE 12
About This Document
Brocade Gen 6 Directors
∙ Brocade X6-4 Director
∙ Brocade X6-8 Director
Fabric OS support for the Brocade Analytics Monitoring Platform (AMP) device depends on the specific version of the software running on that platform. Refer to the AMP Release Notes and documentation for more information.
What's new in this document
This document includes new and modified information for the Fabric OS 8.0.1 release of MAPS.
PAGE 13
About This Document
∙ Buffer credit zero counter monitoring on page 117
∙ Latency state clearing on page 117
∙ Port toggling support on page 120
∙ Scalability limit monitoring on page 131
∙ MAPS monitoring for Extension platforms on page 138
∙ I/O latency monitoring on page 110
∙ Zoned device ratio monitoring on page 117
∙ Updating monitoring policies for devices with four PSUs on page 145
∙ F_Port default Port Health monitoring thresholds on page 151
∙ SFP monitoring thresholds on page
PAGE 14
About This Document
NOTE Obsolete rules are automatically removed from user-defined policies, and they will be removed from default policies in future releases.
The following table lists the rules that have been made obsolete for non-FCIP platforms.
PAGE 15
About This Document The following table lists the rules that have been made obsolete for non-extension platforms. Obsolete rules are automatically removed from user-defined policies, and they will be removed from default policies in future releases.
PAGE 16
About This Document
Removed group | Removed from these platforms
ALL_BE_PORTS | Removed from platforms that do not support BE_Ports
ALL_CORE_BLADES, ALL_SW_BLADES, ALL_SLOTS | Removed from systems that are not chassis-based.
Deprecated rules
The following rules have been deprecated in this release, but they are still present in policies in order to support backward compatibility. However, they will be obsoleted and replaced in future releases.
PAGE 17
About This Document
TABLE 7 MAPS-related terminology
Term | Description
Action | An activity, such as RASlog, performed by MAPS if a condition defined in a rule evaluates to true.
ASIC | Application-Specific Integrated Circuit; the chip within a switch or blade that controls its operation, including packet switching and routing.
Back-end port | Port that connects a core switching blade to a port or application blade (and vice versa).
PAGE 19
Monitoring and Alerting Policy Suite Overview
∙ MAPS overview (page 19)
∙ MAPS license requirements
PAGE 20
Monitoring and Alerting Policy Suite Overview
MAPS activation
The Fabric Vision license must be activated (enabled) in Fabric OS for you to use the full set of MAPS options. In addition, to reuse the Fabric Watch thresholds, you must enable MAPS in Fabric OS 7.3.x before upgrading to Fabric OS 8.0.1.
Keep the following in mind when activating MAPS:
∙ Activating MAPS activates it on all logical switches.
∙ On any given chassis there can be multiple logical switches.
PAGE 21
Monitoring and Alerting Policy Suite Overview
∙ Small form-factor pluggable (SFP) transceivers on simulated mode (SIM) ports cannot be monitored using MAPS.
∙ If an event occurs before the dashboard starts monitoring (such as an SCN or an alert), then the event might not be shown in the dashboard.
Refer to Monitoring Flow Vision Flow Monitor data with MAPS on page 107 for additional details about monitoring Flow Vision flows.
PAGE 22
Monitoring and Alerting Policy Suite Overview
Upgrading if MAPS is installed with a Fabric Vision license
If MAPS is installed and enabled on the switch, then MAPS will continue to monitor the switch afterwards with no change in operation. MAPS will be enabled with the same active policy that was previously in force on each logical switch and will continue to monitor the fabric based on that policy.
PAGE 23
Monitoring and Alerting Policy Suite Overview
∙ Downgrading is not allowed if there is any quarantined port in the logical group ALL_QUARANTINED_PORTS. Before you can downgrade the switch firmware, you must clear the ports from the quarantined state using the sddquarantine --clear slot/port or all command.
∙ When downgrading from Fabric OS 8.0.
PAGE 24
Monitoring and Alerting Policy Suite Overview
Features that do not require a Fabric Vision license
Some features are monitored by MAPS even when the Fabric Vision license is not active.
Monitors that do not require a Fabric Vision license
The following features are monitored by both the unlicensed and licensed versions of MAPS:
∙ Switch status policies
∙ Switch resource changes
∙ FPI monitoring
For more information, refer to MAPS commands that do not require a Fabric Vision license on page 24.
PAGE 25
MAPS Setup and Operation
∙ Initial MAPS setup (page 25)
∙ Monitoring across different time windows (page 28)
∙ Setting the active MAPS policy to a default policy
PAGE 26
MAPS Setup and Operation Example of activating MAPS without activating a Fabric Vision license The following example shows the results when MAPS is automatically enabled without having the Fabric Vision license installed or activated.
PAGE 27
MAPS Setup and Operation
2. To take advantage of MAPS functionality, perform the following:
a) Enter mapspolicy --show --summary to display a list of default policies.
b) Use mapspolicy --enable default_policy_name to enable one of the default policies.
NOTE If you have installed a Fabric Vision license, then you should use the conservative, aggressive, or moderate policies. Use the base policy only for basic monitoring, similar to using MAPS without a license.
PAGE 28
MAPS Setup and Operation
Quickly monitoring a switch with predefined policies
You can use MAPS to quickly start monitoring your switch with one of the predefined policies delivered with MAPS. Perform the following steps to quickly monitor a switch with a predefined policy:
1. Connect to the switch and log in using an account with admin permissions.
2. Enter mapspolicy --enable followed by the name of the policy you want to enable. You must include an existing policy name in this command.
PAGE 29
MAPS Setup and Operation Both of the following cases could indicate potential issues in the fabric. Configuring rules to monitor these conditions allows you to correct issues before they become critical. In the following example, the definition for crc_severe specifies that if the change in the CRC counter in the last minute is greater than 5, it must trigger an e-mail alert and SNMP trap. This rule monitors for the severe condition.
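The crc_severe condition described above can be sketched in Python. This only illustrates the evaluation logic and is not MAPS code: the Rule class, the evaluate function, and the action labels "email" and "snmp" are hypothetical names chosen for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    threshold: int   # trigger when the counter change exceeds this value
    actions: tuple   # actions to perform when the rule triggers


def evaluate(rule, previous_count, current_count):
    """Return the rule's actions if the counter change exceeds the threshold."""
    delta = current_count - previous_count
    return list(rule.actions) if delta > rule.threshold else []


# Hypothetical stand-in for the crc_severe definition: a change of more
# than 5 in the last minute triggers an e-mail alert and an SNMP trap.
crc_severe = Rule("crc_severe", threshold=5, actions=("email", "snmp"))
```

With this sketch, a CRC counter that moves from 100 to 106 within the minute triggers both actions, while a move from 100 to 105 triggers none, because the condition is strictly "greater than 5".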
PAGE 30
MAPS Setup and Operation The following example sets “dflt_moderate_policy” as the active MAPS policy, and then displays the list of policies and names the active policy.
PAGE 31
MAPS Elements and Categories
∙ MAPS structural elements (page 31)
∙ MAPS monitoring categories
PAGE 32
MAPS Elements and Categories
In addition to being able to set alerts and other actions based on these categories, the MAPS dashboard displays their status. Refer to MAPS dashboard overview on page 79 for information on using the MAPS dashboard.
Port Health
The Port Health category monitors port statistics and takes action based on the configured thresholds and actions. You can configure thresholds per port type and apply the configuration to all ports of the specified type.
PAGE 33
MAPS Elements and Categories
TABLE 11 Port Health category parameters (continued)
Monitored parameter | Description
SFP current (CURRENT) | The amperage supplied to the SFP transceiver in milliamps (mA). Current area events indicate hardware failures.
SFP receive power (RXP) | The power of the incoming laser in microwatts (µW). This is used to help determine if the SFP transceiver is in good working condition. If the counter often exceeds the threshold, the SFP transceiver is deteriorating.
PAGE 34
MAPS Elements and Categories
FRU Health
The FRU Health category enables you to define rules for field-replaceable units (FRUs). The following table lists the monitored parameters in this category. Possible states for all FRU measures are faulty, inserted, on, off, and out.
TABLE 13 FRU Health category parameters
Monitored parameter | Description
Power supplies (PS_STATE) | State of a power supply has changed.
Fans (FAN_STATE) | State of a fan has changed.
Blades (BLADE_STATE) | State of a slot has changed.
PAGE 35
MAPS Elements and Categories
TABLE 15 Fabric State Changes category parameters (continued)
Monitored parameter | Description
Fabric reconfigurations (FAB_CFG) | Tracks the number of fabric reconfigurations. These occur when the following events happen: two fabrics with the same domain ID are connected; two fabrics are joined; an E_Port or VE_Port goes offline; a principal link segments from the fabric.
E_Port downs (EPORT_DOWN) | Tracks the number of times that an E_Port or VE_Port goes down.
PAGE 36
MAPS Elements and Categories
TABLE 16 Switch Resource category parameters
Monitored parameter | Description
Temperature (TEMP) | The ambient temperature inside the switch in degrees Celsius. Temperature sensors monitor the switch in case the temperature rises to levels at which damage to the switch might occur.
Flash (FLASH_USAGE) | The available compact flash space, calculated by comparing the percentage of flash space consumed with the configured high threshold value.
PAGE 37
MAPS Elements and Categories
TABLE 18 FCIP Health category parameters (continued)
Monitored parameter | Description
FCIP circuit packet loss (CIR_PKTLOSS) | The percentage of the total number of packets that have had to be retransmitted.
FCIP QoS utilization (UTIL) | The percentage of FCIP circuit QoS groups utilization.
FCIP packet loss (PKTLOSS) | The percentage of the total number of packets that have had to be retransmitted in each QoS level. This applies to each FCIP QoS group only.
PAGE 38
MAPS Elements and Categories
For more information on Fabric Performance Impact monitoring, refer to Fabric performance impact monitoring using MAPS on page 113.
Switch Status Policy
The Switch Status Policy category lets you monitor the health of the switch by defining the number of types of errors that transition the overall switch state into a state that is not healthy.
PAGE 39
MAPS Elements and Categories
packets with a server. Web servers use the certificate during the encryption of data. The certificate binds the identity of a user, computer, or service to a public key by providing information about the subject of the certificate, the validity of the certificate, and applications and services that can use the certificate. Version 3 certificates support the following fields, which have been supported since X.509 version 1:
∙ Version: Gives the version of the certificate.
PAGE 40
MAPS Elements and Categories
Certificate monitor rule creation
Creating rules for the certificate monitoring system is similar to creating rules for any other security monitoring system. You can enable default policies to monitor the certificates or create custom policy rules to monitor them. The following example defines a rule to send an alert when a certificate is about to expire.
PAGE 41
MAPS Groups, Conditions, Rules, and Policies
∙ MAPS groups overview (page 41)
∙ MAPS conditions
PAGE 42
MAPS Groups, Conditions, Rules, and Policies
ALL_QSFP |Yes |Sfp |0 |
ALL_CERTS |Yes |Certificate |0 |
ALL_OTHER_F_PORTS |Yes |Port |1 |12
ALL_E_PORTS |Yes |Port |9 |0,2-3,7,9,32,52,54-55
ALL_WWN |Yes |WWN |1 |1
ALL_PORTS |Yes |Port |64 |0-6
ALL_D_PORTS |Yes |Port |0
ALL_OTHER_SFP |Yes |Sfp |1
ALL_PS |Yes |Power Supply|2
ALL_FLASH |Yes |Flash |1
ALL_PIDS |Yes |Pid |12
ALL_25Km_16GLWL_SFP |Yes |Sfp |0
ALL_32GSWL_SFP |Yes |Sfp |1
ALL_32GLWL_SFP |Yes |Sfp |0
ALL_32GSWL_QSFP |Yes |Sfp |0
io_mon_ |No |Flow |1
PAGE 43
MAPS Groups, Conditions, Rules, and Policies
TABLE 22 Predefined MAPS groups
Predefined group name | Object type | Description
ALL_PORTS | FC Port | All ports in the logical switch.
ALL_BE_PORTS | N/A | All back-end ports in the physical switch.
ALL_D_PORTS | FC Port | All D_Ports in the logical switch.
ALL_E_PORTS | FC Port | All E_Ports and EX_Ports in the logical switch. This includes all the ports in E_Port and EX_Port trunks as well. This includes AE ports as well.
PAGE 44
MAPS Groups, Conditions, Rules, and Policies
TABLE 22 Predefined MAPS groups (continued)
Predefined group name | Object type | Description
SWITCH | Switch | Default group used for defining rules on parameters that are global for the whole switch level, for example, security violations or fabric health.
CHASSIS | Chassis | Default group used for defining rules on parameters that are global for the whole chassis, for example, CPU or flash.
ALL_EXT_GE_PORTS | GE ports | All GE ports in the chassis.
PAGE 45
MAPS Groups, Conditions, Rules, and Policies
∙ Group names are not case sensitive; My_Group and my_group are considered to be the same.
Creating a static user-defined group
MAPS allows you to create a monitorable group defined using a static definition, in which the membership is explicit and only changes if you redefine the group. As an example of a static definition, you could define a group called MY_CRITICAL_PORTS and specify its members as “2/1-10,2/15,3/1-20”.
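To make the static member list concrete, the following Python sketch expands a list such as “2/1-10,2/15,3/1-20” into individual slot/port entries. It is an illustration only and does not reflect how Fabric OS parses the list internally; the function name expand_members is hypothetical, and the sketch assumes every entry has the form slot/port or slot/first-last.

```python
def expand_members(member_list):
    """Expand 'slot/port' entries and 'slot/first-last' ranges into single ports."""
    ports = []
    for item in member_list.split(","):
        slot, _, port_part = item.strip().partition("/")
        first, _, last = port_part.partition("-")
        last = last or first  # a single port is a range of length one
        for port in range(int(first), int(last) + 1):
            ports.append(f"{slot}/{port}")
    return ports


members = expand_members("2/1-10,2/15,3/1-20")
```

For the example list, the expansion yields 31 explicit members: ten ports on slot 2, port 2/15, and twenty ports on slot 3. The membership stays fixed until the list itself is redefined.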
PAGE 46
MAPS Groups, Conditions, Rules, and Policies As an example of a dynamic definition, you could specify a port name or an attached device node WWN, and all ports which match the port name or device node WWN will be automatically included in this group. As soon as a port meets the criteria, it is automatically added to the group. As soon as it ceases to meet the criteria, it is removed from the group.
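The difference from a static group can be sketched in Python: membership is recomputed from the current port names each time it is needed instead of being stored. This is a sketch under assumptions, not the MAPS implementation; the function name dynamic_members, the sample port names, and the use of glob-style patterns for matching are all hypothetical.

```python
from fnmatch import fnmatchcase


def dynamic_members(port_names, pattern):
    """Return the ports whose current name matches the pattern."""
    return sorted(
        port for port, name in port_names.items() if fnmatchcase(name, pattern)
    )


# Hypothetical port-name table: port identifier -> configured port name.
port_names = {"2/1": "host_a", "2/2": "array_1", "2/3": "host_b"}
before = dynamic_members(port_names, "host_*")

# Renaming a port changes the result on the next evaluation.
port_names["2/3"] = "array_2"
after = dynamic_members(port_names, "host_*")
```

Here port 2/3 belongs to the group while its name matches and drops out as soon as it no longer does, with no change to the group definition.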
PAGE 47
MAPS Groups, Conditions, Rules, and Policies
NOTE The values for group_name and feature_name must match existing group and feature names. You can only specify one feature as part of a group definition.
3. Use the following commands to add or delete specific ports from the group. (You can also use these commands to modify the group membership of predefined groups.)
∙ To explicitly add ports to the group, enter logicalGroup --addmember group_name -members member_list.
PAGE 48
MAPS Groups, Conditions, Rules, and Policies The following example restores all the deleted members and removes the added members of the GOBLIN_PORTS group. First it shows the detailed view of the modified GOBLIN_PORTS group, then restores the membership of the group and then it shows the post-restore group details. Notice the changes in the MemberCount, Members, Added Members, and Deleted Members fields between the two listings.
PAGE 49
MAPS Groups, Conditions, Rules, and Policies The following example shows that the user-defined group GOBLIN_PORTS exists, deletes the group, and then shows that the group has been deleted.
PAGE 50
MAPS Groups, Conditions, Rules, and Policies
day: Samples used for comparison are one day apart.
hour: Samples used for comparison are one hour apart.
minute: Samples used for comparison are one minute apart.
none: A comparison is made between the real-time value and the configured threshold value.
∙ day specifies that the samples will be compared once a day.
∙ hour specifies that the samples are compared every hour.
∙ minute specifies that the samples are compared every minute.
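The effect of the timebase on what gets compared with the threshold can be sketched in Python. This is a simplified model, not MAPS code: the function name value_to_compare and the sample layout (a list of timestamp and counter pairs, oldest first) are assumptions made for the sketch.

```python
SECONDS = {"minute": 60, "hour": 3600, "day": 86400}


def value_to_compare(samples, timebase):
    """Return the value that a rule would compare against its threshold.

    samples: list of (timestamp_seconds, counter_value), oldest first.
    With timebase 'none' the latest value is used directly; otherwise the
    result is the change between the latest sample and the newest sample
    that is at least one timebase older. Returns None if no sample is old
    enough.
    """
    latest_time, latest_value = samples[-1]
    if timebase == "none":
        return latest_value
    cutoff = latest_time - SECONDS[timebase]
    earlier = [value for time, value in samples if time <= cutoff]
    return latest_value - earlier[-1] if earlier else None


samples = [(0, 10), (60, 12), (3600, 40)]
```

With these samples, the none timebase yields the raw value 40, the minute timebase yields 28 (40 minus the value 12 recorded at least a minute earlier), and the hour timebase yields 30.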
PAGE 51
MAPS Groups, Conditions, Rules, and Policies
TABLE 25 Monitors and supported timebases (continued)
Monitor Name | Day | Hour | Minute | None
Invalid Transmit Words | Yes | Yes | Yes | No
Sync Loss | Yes | Yes | Yes | No
Link Failure | Yes | Yes | Yes | No
Loss of Signal | Yes | Yes | Yes | No
Protocol Errors | Yes | Yes | Yes | No
Link Reset | Yes | Yes | Yes | No
C3 Time outs | Yes | Yes | Yes | No
State change | Yes | Yes | Yes | No
SFP Current | No | No | No | Yes
SFP Receive Power | No | No | No | Yes
SFP Transmit Power | No | No | No | Y
PAGE 52
MAPS Groups, Conditions, Rules, and Policies
TABLE 25 Monitors and supported timebases (continued)
Monitor Name | Day | Hour | Minute | None
Absent or faulty fans (BAD_FAN) | No | No | No | Yes
Flash usage (FLASH_USAGE) | No | No | No | Yes
Percentage of marginal ports (MARG_PORTS) | No | No | No | Yes
Percentage of error ports (ERR_PORTS) | No | No | No | Yes
Percentage of faulty ports (FAULTY_PORTS) | No | No | No | Yes
Faulty blades (FAULTY_BLADE) | No | No | No | Yes
Faulty WWN (WWN_DOWN) | No | No | No | Yes
Core blade monit
PAGE 53
MAPS Groups, Conditions, Rules, and Policies
∙ Port fencing and port decommissioning on page 56
∙ RASLog messages on page 59
∙ SFP marginal on page 60
∙ Slow Drain Device Quarantine on page 61
∙ MAPS SNMP traps on page 56
∙ Switch critical on page 61
∙ Switch marginal on page 61
∙ Port toggling on page 59
For each action, you can define a “quiet time” for most rules in order to reduce the number of alert messages generated. Refer to Quieting a rule on page 72 for details.
PAGE 54
MAPS Groups, Conditions, Rules, and Policies To disable all actions, enter mapsconfig --actions none. The keyword none cannot be combined with any other action. The following example shows that RASLog, e-mail, and fence notifications are not currently active actions on the switch, and then shows them being added to the list of allowed actions. NOTE Starting with 8.0.1, SW_CRITICAL and SW_MARGINAL notifications are enabled by default in a switch and they cannot be disabled by the user. Updated for 8.0.
PAGE 55
MAPS Groups, Conditions, Rules, and Policies Examples of e-mail alert enhancements The following example shows the enhancements for e-mail alerts from threshold-based rules. The enhanced information is labeled "Subject," "Group," and "Current Value.
PAGE 56
MAPS Groups, Conditions, Rules, and Policies
∙ Condition on which the event was triggered
∙ Monitoring service details and the measured value
MAPS SNMP traps
When specific events occur on a switch, SNMP generates a message (called a “trap”) that notifies a management station.
PAGE 57
MAPS Groups, Conditions, Rules, and Policies decommissioning and port fencing can only be configured for the port health monitoring systems for which decommissioning is supported. Port decommissioning cannot be configured by itself in a MAPS rule or action. It requires port fencing to be enabled in the same rule. If you attempt to create a MAPS rule or action that has port decommissioning without port fencing, the rule or action will be rejected.
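The dependency between the two actions can be expressed as a simple validation step. The Python sketch below is illustrative only: validate_actions is a hypothetical helper rather than a MAPS interface, and the fence and decom labels are borrowed from the mapsconfig --actions examples in this guide.

```python
def validate_actions(actions):
    """Reject an action list that requests decommissioning without fencing.

    actions: comma-separated action names, for example 'fence,decom'.
    Returns the normalized, sorted action names.
    """
    requested = {name.strip().lower() for name in actions.split(",")}
    if "decom" in requested and "fence" not in requested:
        raise ValueError(
            "port decommissioning requires port fencing in the same rule"
        )
    return sorted(requested)
```

A list such as fence,decom passes the check, whereas decom on its own (or combined only with other actions such as raslog) is rejected, mirroring the behavior described above.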
PAGE 58
MAPS Groups, Conditions, Rules, and Policies
The following example enables port fencing and port decommissioning for a switch and then displays the confirmation.
switch246:FID128:admin> mapsconfig --actions fence,decom
switch246:admin> mapsconfig --show
Configured Notifications: FENCE,DECOM
Mail Recipient: Not Configured
Paused members :
===============
PORT :
CIRCUIT :
SFP :
The following example makes port fencing and port decommissioning part of a rule and then displays the confirmation.
PAGE 59
MAPS Groups, Conditions, Rules, and Policies
The following example enables port fencing on a switch and then displays the confirmation.
switch1234:admin> mapsconfig --actions raslog,fence
switch1234:admin> mapsconfig --show
Configured Notifications: RASLOG,FENCE
Mail Recipient: Not Configured
Paused members :
===============
PORT :
CIRCUIT :
SFP :
The following example makes port fencing part of a rule and then displays the confirmation.
PAGE 60
MAPS Groups, Conditions, Rules, and Policies
TABLE 29 RASLog message category for state-based monitoring systems
Condition description | RASLog message category | Example
A rule with “==” or “!=” condition | Generates a WARNING (MAPS-1003) message. | LOSS_SIGNAL monitoring system
Exception: FPI monitoring for the IO_PERF_IMPACT state, where “==” and “!=” generate a WARNING (MAPS-1003) message, and “==” for the IO_FRAME_LOSS state generates a CRITICAL (MAPS-1001) message.
In Fabric OS 8.0.
PAGE 61
MAPS Groups, Conditions, Rules, and Policies
Port19: id (sw) Vndr: BROCADE Ser.No: JAA1152100181 Speed: 8,16,32_Gbps Health: Yellow
Port60: id 3/0 (sw) Vndr: BROCADE Ser.No: ZTA1151600009 Speed: 32_Gbps Health: Green
Port61: id 3/1 (sw) Vndr: BROCADE Ser.No: ZTA1151600009 Speed: 32_Gbps Health: Green
Port62: id 3/2 (sw) Vndr: BROCADE Ser.No: ZTA1151600009 Speed: 32_Gbps Health: Green
Port63: id 3/3 (sw) Vndr: BROCADE Ser.
PAGE 62
MAPS Groups, Conditions, Rules, and Policies
– You can have an active policy with no rules, but you must have an active policy.
– You cannot disable the active policy. You can only change the active policy by enabling a different policy.
Viewing policy values
You can display the values for a policy by using the mapspolicy --show policy_name |grep group_name command. The following example displays all the thresholds for host ports in the My_all_hosts_policy.
PAGE 63
MAPS Groups, Conditions, Rules, and Policies
∙ Modifying a default policy on page 67
∙ Creating a policy on page 65
∙ User-defined policies on page 63
MAPS automatically monitors the management port (Eth0 or Bond0), because the rule for Ethernet port monitoring is present in all four default policies. While the default policies cannot be modified, the management port monitoring rules can be removed from cloned policies.
PAGE 64
MAPS Groups, Conditions, Rules, and Policies
Working with MAPS policies
The following sections discuss viewing, creating, enabling, and modifying MAPS policies.
Viewing policy information
MAPS allows you to view the policies on a switch. You can use this command to show all policies, only a particular policy, or a summary. To view the MAPS policies on a switch, complete the following steps.
1. Connect to the switch and log in using an account with admin permissions.
2.
PAGE 65
MAPS Groups, Conditions, Rules, and Policies The following example shows an excerpted result of using the --show -all option. The entire listing is too long (over 900 lines) to include.
PAGE 66
MAPS Groups, Conditions, Rules, and Policies
2. Create or modify rules to configure the required thresholds in the new policy.
∙ To create a rule, enter mapsRule --create rule_name -group group_name -monitor ms_name -timebase timebase -op op_value -value value -action action -policy policy_name.
∙ To clone an existing rule, enter mapsRule --clone rule_name -name clone_rule_name.
∙ To modify existing rules, enter mapsRule --config rule_name parameters.
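The mapsRule --create syntax above has many positional pieces, so a small helper that assembles the command line can make the parameter order easier to see. The Python sketch below uses only the options named in the syntax line; the helper name build_create_rule and every argument value (check_crc, CRC, g, and so on) are illustrative assumptions, not values confirmed by this guide. Verify the monitor names and operators against the command reference for your Fabric OS release before using them.

```python
def build_create_rule(rule_name, group, monitor, timebase, op, value, action, policy):
    """Assemble a mapsRule --create command line from its parameters."""
    args = [
        "mapsRule", "--create", rule_name,
        "-group", group,
        "-monitor", monitor,
        "-timebase", timebase,
        "-op", op,
        "-value", str(value),
        "-action", action,
        "-policy", policy,
    ]
    for arg in args:
        # Each parameter must be a single token on the command line.
        if not arg or any(ch.isspace() for ch in arg):
            raise ValueError(f"argument must be one non-empty token: {arg!r}")
    return " ".join(args)


# All values below are placeholders chosen for illustration.
command = build_create_rule(
    "check_crc", "MY_CRITICAL_PORTS", "CRC", "hour", "g", 10, "raslog", "daily_policy"
)
```

The resulting string can be pasted at the switch prompt or sent over an existing management session; the sketch deliberately does not connect to a switch.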
PAGE 67
MAPS Groups, Conditions, Rules, and Policies The following example adds a rule to the policy named daily_policy, displays the policy, and then re-enables the policy so the change can become active.
PAGE 68
MAPS Groups, Conditions, Rules, and Policies
– defSWITCHSEC_DCC_4, defSWITCHSEC_FCS_0, defSWITCHSEC_FCS_2, defSWITCHSEC_FCS_4, defSWITCHSEC_HTTP_0
∙ Chassis monitoring rules: applicable to only chassis platforms.
– defALL_WWNWWN_FAULTY, defALL_WWNWWN_ON, defALL_WWNWWN_OUT, defCHASSISDOWN_CORE_1, defCHASSISDOWN_CORE_2, defCHASSISFAULTY_BLADE_1, defCHASSISHA_SYNC_0, defCHASSISWWN_DOWN_1
∙ Fixed-port switch monitoring rules: applicable to only fixed-port platforms.
PAGE 69
MAPS Groups, Conditions, Rules, and Policies
The following example shows the policy names associated with the rule name “Rule1”.
switch:admin> mapsrule --show Rule1
Rule Data:
----------
RuleName: Rule1
Action: Raslog, Fence, SNMP
Condition: Switch(SEC_IDB/Min>0)
Associated Policies: daily_policy, crc_policy
The following example shows the result of using the mapsrule --show -all command with the -concise option; this displays abbreviations instead of complete action names in the output.
PAGE 70
MAPS Groups, Conditions, Rules, and Policies Example of creating a rule to generate a RASLog message The following example creates a rule to generate a RASLog message if the CRC counter for a group of critical ports is greater than 10 in an hour. This rule is added to the daily_policy, and the daily_policy is re-enabled for the rule to take effect.
PAGE 71
MAPS Groups, Conditions, Rules, and Policies Changing one parameter The following example changes the timebase for a rule from minutes to hours.
PAGE 72
MAPS Groups, Conditions, Rules, and Policies Creating an exact clone The following example shows an existing rule, creates an exact clone of that rule and renames it, and then displays the new rule.
PAGE 73
MAPS Groups, Conditions, Rules, and Policies triggered, MAPS performs the configured and enabled actions for the rule the first time. Afterwards, if the rule is triggered again within the quiet time period, MAPS does not perform any of the actions until the quiet time expires. At that time, MAPS sends an update alert. This alert is the same as the initial alert, but it includes information about the number of times the rule was triggered in the interim.
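The quiet time behavior described above can be modeled as a small state machine. The Python sketch below is an interpretation of that description, not MAPS code: the QuietTimeRule class, its method names, and the alert strings are hypothetical, and it assumes that no update alert is sent when the rule was not triggered again during the quiet period.

```python
class QuietTimeRule:
    """Suppress repeat alerts for quiet_seconds after the first trigger."""

    def __init__(self, quiet_seconds):
        self.quiet_seconds = quiet_seconds
        self.quiet_until = None  # end of the current quiet period, if any
        self.suppressed = 0      # triggers counted during the quiet period

    def expire(self, now):
        """Close an elapsed quiet period and return any update alert."""
        if self.quiet_until is None or now < self.quiet_until:
            return []
        count = self.suppressed
        self.quiet_until = None
        self.suppressed = 0
        return [f"update: triggered {count} more time(s)"] if count else []

    def trigger(self, now):
        """Record a rule trigger and return the alerts to send."""
        alerts = self.expire(now)
        if self.quiet_until is not None:
            self.suppressed += 1  # still quiet: count it, send nothing
            return alerts
        self.quiet_until = now + self.quiet_seconds
        return alerts + ["alert"]
```

With a 120-second quiet time, a trigger at 0 seconds sends the initial alert, triggers at 30 and 90 seconds are only counted, and calling expire at 120 seconds produces a single update alert reporting two further triggers.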
PAGE 74
MAPS Groups, Conditions, Rules, and Policies
TABLE 30 Minimum quiet time values for monitoring systems that support a time base of NONE (continued)
Monitoring system | Minimum quiet time (seconds)
CURRENT | 360
RXP | 360
TXP | 360
PWR_HRS | 360
BAD_TEMP | 60
BAD_PWR | 60
BAD_FAN | 60
The following example shows a rule that includes a 120-second quiet time period.
PAGE 75
MAPS Groups, Conditions, Rules, and Policies
The following example shows that the rule port_test_rule35 exists in test_policy_1, deletes the rule from that policy using the -force keyword, and then verifies that the rule has been deleted from the policy.
PAGE 76
MAPS Groups, Conditions, Rules, and Policies
Specifying multiple e-mail addresses for alerts
The following example specifies multiple e-mail addresses for e-mail alerts on the switch, and then displays the settings. It assumes that you have already correctly configured and validated the e-mail server.
switch:admin> mapsconfig --emailcfg -address admin1@mycompany.com, admin2@mycompany.
PAGE 77
MAPS Groups, Conditions, Rules, and Policies
There is no confirmation of this action.
3. Optional: Enter relayconfig --show. This displays the configured e-mail server host address and domain name.
The following example configures the relay host address and relay domain name for the switch, and then displays it.
switch:admin> relayconfig --config -rla_ip 10.70.212.168 -rla_dname "mail.brocade.com"
switch:admin> relayconfig --show
Relay Host: 10.70.212.168
Relay Domain Name: mail.brocade.
PAGE 79
MAPS Dashboard
∙ MAPS dashboard overview (page 79)
∙ Viewing the MAPS dashboard (page 83)
∙ Clearing MAPS dashboard data
PAGE 80
MAPS Dashboard Summary Report section The Summary Report section has two subsections, the Category report and the Rules Affecting Health report. The Category report subsection collects and summarizes the various switch statistics monitored by MAPS into multiple categories, and displays the current status of each category since midnight, and the status of each category for the past seven days.
PAGE 81
∙ Although a rule might be triggered multiple times within a given hour, only the timestamp of the latest violation is stored.
∙ However, each individual violation of a rule is reflected in the rule count for that category and in the repeat count for that rule.
For example, if the same rule was triggered 12 times in one hour, the repeat count value (shown as Repeat Count in the following example) for that rule will be 12, but only the timestamp for the last occurrence is displayed.
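The bookkeeping described here, one counter per rule plus only the most recent timestamp, can be sketched as follows. The function name summarize, the rule name example_rule, and the timestamps are hypothetical; the sketch only mirrors the counting behavior stated above and says nothing about how MAPS stores the data internally.

```python
from datetime import datetime

def summarize(violations):
    """Map rule name -> (repeat_count, latest_timestamp)."""
    summary = {}
    for rule, ts in violations:
        count, latest = summary.get(rule, (0, ts))
        # Every violation is counted, but only the newest timestamp is kept.
        summary[rule] = (count + 1, max(latest, ts))
    return summary

# Twelve violations of one hypothetical rule, five minutes apart within one hour.
events = [("example_rule", datetime(2016, 2, 4, 21, m)) for m in range(0, 60, 5)]
result = summarize(events)
print(result["example_rule"])
```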
PAGE 82
MAPS Dashboard The following output extract shows a sample History Data section. (output truncated) 4.
PAGE 83
Viewing the MAPS dashboard
The MAPS dashboard allows you to monitor the switch status. There are three primary views: a summary view, a detailed view (which includes historical data), and a history-only view.
To view the status of the switch as seen by MAPS, complete the following steps.
1. Connect to the switch and log in using an account with admin permissions.
2. Enter mapsdb --show followed by the scope parameter: all, history, or details.
PAGE 84
MAPS Dashboard The following example shows a typical result of entering mapsdb --show all.
PAGE 85
MAPS Dashboard UTIL(%) BN_SECS(Seconds) 7/0(29.10) 7/8(16.74) 2/11(38.46) 7/31(37.39) 7/0(29.11) 7/8(9.77) - 7/8(1.89) 2/11(3.38) 7/31(3.20) 7/0(2.33) 7/8(1.01) - 7/0(5.00) 7/8(4.02) 2/11(7.53) 7/31(6.68) 7/0(4.90) 7/8(2.
PAGE 86
MAPS Dashboard The following example displays the general status of the switch (MARGINAL) and lists the overall status of the monitoring categories for the current day (measured since midnight) and for the last seven days. If any of the categories are shown as being “Out of range”, the last five rules that caused this status are listed. If a monitoring rule is triggered, the corresponding RASLog message appears under Rules Affecting Health of the dashboard.
PAGE 87
MAPS Dashboard Sub-flow rule violation summaries In the MAPS dashboard you can view a summary of all sub-flows that have rule violations. When a rule is triggered, the corresponding RASLog rule trigger appears in the “Rules Affecting Health” sub-section of the dashboard as part of the Traffic Performance category. In this category, the five flows or sub-flows with the highest number of violations since the previous midnight are listed.
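Selecting the five entries with the most violations is a simple top-N ranking. The Python sketch below illustrates the idea with hypothetical flow names and counts; it is not derived from MAPS internals.

```python
from collections import Counter

def top_violators(flow_names, limit=5):
    """Return the `limit` flows with the most recorded violations."""
    return Counter(flow_names).most_common(limit)

# Hypothetical violation log: flow_1 violated 6 times, flow_2 5 times, and so on.
log = []
for n in range(1, 7):
    log.extend([f"flow_{n}"] * (7 - n))

top_five = top_violators(log)
print(top_five)
```

With six flows in the log, the flow with the fewest violations (flow_6) is left out of the result.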
PAGE 88
MAPS Dashboard The following example shows the detailed switch status. The status includes the summary switch status, plus port performance data for the current day (measured since midnight). If a monitoring rule is triggered, the corresponding RASLog message appears under the summary section of the dashboard. The column headings in the example have been edited slightly so as to allow the example to display clearly.
PAGE 89
MAPS Dashboard |2 | | Fabric State|5 Changes(8) | | | | | |3 | | | |defALL_PSPS |STATE_FAULTY | |defSWITCHEPORT_ |DOWN_1 | | | | |defSWITCHEPORT_ |DOWN_1 | | |02/04/16 21:32:16|Power Supply 3 |FAULTY | | | | |Power Supply 4 |FAULTY |02/04/16 21:29:02|Switch |6 Ports | | | | |Switch |2 Ports | |Switch |8 Ports | |Switch |4 Ports | |Switch |2 Ports |02/04/16 20:58:31|Switch |2 Ports | | | | |Switch |2 Ports | |Switch |2 Ports | | | | | | | | | | | | | 4 History Data: ================================= Sta
PAGE 90
NOTE The output of the mapsdb --show history command differs depending on the platform on which you run it. On fixed-port switches, ports are shown in port index format; on chassis-based platforms, ports are shown in slot/port format. The values are expressed in kilo (k), mega (m), and giga (g) units.
To view a summarized history of the switch status, complete the following steps.
1. Connect to the switch and log in using an account with admin permissions.
2.
PAGE 91
MAPS Dashboard Viewing data for a specific time window Detailed historical data provides the status of the switch for a specific time window. This is useful if, for example, users are reporting problems on a specific day or time. The same port-display patterns apply to viewing detailed historical data as for ordinary historical data. To view detailed historical data about a switch, complete the following steps. 1. Connect to the switch and log in using an account with admin permissions. 2.
PAGE 92
MAPS Dashboard Fabric Performance Impact (2) |1 | | |1 | |defALL_PORTS_ |LATENCY_CLEAR | |defALL_PORTS_IO_ |FRAME_LOSS |02/05/16 06:53:02|E-Port 12/27|IO_LATENCY_ | | | |CLEAR | | | | | |02/05/16 06:52:02|E-Port 12/27|IO_FRAME_LOSS| | | | | 4 History Data: =============== Stats(Units) Current 02/04/16 --/--/---/--/---/--/-------------------------------------------------------------------------------CRC(CRCs) ITW(ITWs) LOSS_SYNC(SyncLoss) LF LOSS_SIGNAL(LOS) PE(Errors) STATE_CHG LR C3TXTO(Timeouts)
PAGE 93
MAPS Dashboard TX(%) UTIL(%) BN_SECS(Seconds) 5/16(1.58) 5/1(1.57) 5/3(1.57) 5/2(1.56) 1/23(10.73) 1/19(10.66) 12/24(4.66) 12/25(4.39) 4/18(3.78) 4/17(3.78) 12/26(3.41) 12/27(3.14) 5/16(1.97) 5/17(1.97) 5/18(1.96) 5/2(1.96) 5/63(1.96) 8/33(1.80) 5/1(1.77) 8/34(1.30) 8/32(1.25) 5/3(1.21) 8/35(1.21) 5/61(1.20) 5/19(1.17) 1/19(11.73) 4/17(8.90) 1/23(5.36) 12/24(4.27) 12/25(4.16) 12/26(3.64) 12/27(3.52) 4/18(3.21) 5/16(1.78) 5/17(1.77) 5/18(1.77) 5/63(1.77) 5/2(1.76) 8/33(1.69) 5/1(1.67) 8/34(1.49) 8/32(1.
PAGE 94
MAPS Dashboard - 10/ge5(2) 10/ge6(2) 10/ge7(2) 10/ge8(2) 10/ge9(2) 10/ge1(2) 10/ge11(2) - - - The following example displays historical port performance data for four hours on a chassis-based platform. The History Data section has been trimmed so that the output will display correctly here. Normally there would be additional days of data.
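The History Data entries follow a port(value) pattern, with the port written either as slot/port or as a bare port index. A minimal Python sketch for pulling such entries out of a line of output is shown below. The regular expression and the function name parse_entries are assumptions based only on the samples in this guide, and values carrying a k, m, or g unit suffix are not handled.

```python
import re

# Optional "slot/" prefix, then a port token, then a numeric value in parentheses.
TOKEN = re.compile(r"(?:(\d+)/)?(\w+)\(([\d.]+)\)")

def parse_entries(text):
    """Return (slot, port, value) tuples; slot is None in port index format."""
    entries = []
    for slot, port, value in TOKEN.findall(text):
        entries.append((int(slot) if slot else None, port, float(value)))
    return entries

chassis = parse_entries("7/0(29.10) 7/8(16.74) - 10/ge5(2)")
fixed = parse_entries("12(3.4)")
print(chassis)
print(fixed)
```

Placeholder dashes and column headings such as UTIL(%) do not match the pattern and are skipped.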
PAGE 95
MAPS Dashboard Clearing MAPS dashboard data To delete the stored data from the MAPS dashboard, enter mapsdb --clear. This command is useful if you want to see only the data logged after you have made a change to a switch (or a rule). The dashboard is also cleared if either a reboot or an HA failover happens. To clear the stored dashboard data from a switch, complete the following steps. 1. Connect to the switch and log in using an account with admin permissions. 2.
PAGE 97
Port Monitoring Using MAPS
∙ Monitoring groups of ports using the same conditions (page 97)
∙ Port monitoring using port names (page 97)
∙ Port monitoring using device WWNs
PAGE 98
For more information on creating dynamic user-defined groups, refer to User-defined groups on page 44.
Port monitoring using device WWNs
Fabric OS allows you to monitor ports that are connected to a device whose World Wide Name (WWN) follows a certain pattern. This WWN pattern can then be used as part of the criteria for identifying a group. There is no limit on the number of ports that can be in a group.
PAGE 99
Adding missing ports to a dynamic group
You can add ports that might not have been included automatically to a predefined group (for example, ALL_HOST or ALL_TARGET) or to a user-defined dynamic group. For dynamic groups, you can specify any of the following:
∙ A single port
∙ Multiple ports separated by commas
∙ A range in which the IDs are separated by a hyphen
You can create dynamic groups using either port names or WWNs, but you cannot use both in a single group definition.
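To make the member list formats concrete, the following Python sketch expands a list written as single ports, comma-separated ports, and hyphenated ranges into individual port numbers. It assumes plain port index numbers with hyphen-separated ranges (as the removal procedure describes); the function name expand_members is hypothetical, and slot/port notation is not handled.

```python
def expand_members(member_list):
    """Expand a string such as '2,5-8,11' into [2, 5, 6, 7, 8, 11]."""
    ports = []
    for item in member_list.split(","):
        item = item.strip()
        if "-" in item:
            # A range: both ends are included.
            first, last = (int(x) for x in item.split("-", 1))
            ports.extend(range(first, last + 1))
        elif item:
            ports.append(int(item))
    return ports

print(expand_members("5"))
print(expand_members("2,5-8,11"))
```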
PAGE 100
Port Monitoring Using MAPS 2. Enter logicalGroup --delmember group_name -members member_list. You can specify either a single port, or specify multiple ports as either individual IDs separated by commas, or a range where the IDs are separated by a hyphen. 3. Optional: Enter logicalGroup --show group_name to confirm that the named ports are no longer part of the group. The following example removes port 5 from the ALL_TARGET_PORTS group, and then shows that it is no longer a member of that group.
PAGE 101
The mapsdb --show command shows any errors or rule violations that occur during diagnostic tests on a D_Port.
PAGE 102
When running the portdporttest --show port_number command to see details for a 32 Gbps QSFP, the output appears similar to the results for a 16 Gbps QSFP, except the Electric loopback and Optical loopback are skipped.
switch:admin> portdporttest --show 48
D-Port Information:
===================
Port: 48
Remote WWNN: 10:00:00:27:f8:f0:26:41
Remote port index: 52
Mode: Manual
No.
PAGE 103
Port Monitoring Using MAPS 2015/06/29-21:40:02, [MAPS-1003], 48, SLOT 6 FID 128, WARNING, dcx_178, BE Port 1/14, Condition=ALL_BE_PORTS(CRC/5MIN>10), Current Value:[CRC,125 CRCs (Conn. port 5/182)], RuleName=defALL_BE_PORTSCRC_5M_10, Dashboard Category=BE Port Health. For more information on back-end health monitoring, refer to Back-end Health on page 33 and Back-end port monitoring thresholds on page 148.
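A MAPS-1003 message like the one shown above carries the condition, current value, rule name, and dashboard category as labeled fields, so it can be split apart with a regular expression. The Python sketch below is based only on the message layout visible in this guide; the pattern and the function name parse_raslog are assumptions and may need adjusting for other message types.

```python
import re

RASLOG = re.compile(
    r"^(?P<timestamp>\S+), \[(?P<msg_id>[A-Z]+-\d+)\],.*?"
    r"Condition=(?P<condition>.+?), "
    r"Current Value:\[(?P<value>.*?)\], "
    r"RuleName=\s*(?P<rule>[^,]+), "
    r"Dashboard Category=(?P<category>.+?)\.?$"
)

def parse_raslog(line):
    """Return the labeled fields as a dict, or None if the line does not match."""
    match = RASLOG.match(line.strip())
    return match.groupdict() if match else None

sample = (
    "2015/06/29-21:40:02, [MAPS-1003], 48, SLOT 6 FID 128, WARNING, dcx_178, "
    "BE Port 1/14, Condition=ALL_BE_PORTS(CRC/5MIN>10), "
    "Current Value:[CRC,125 CRCs (Conn. port 5/182)], "
    "RuleName=defALL_BE_PORTSCRC_5M_10, Dashboard Category=BE Port Health."
)
fields = parse_raslog(sample)
print(fields["rule"], "|", fields["category"])
```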
PAGE 104
Port Monitoring Using MAPS Gigabit Ethernet port monitoring NOTE Gigabit Ethernet port monitoring can be performed on the following devices: ∙ 7840 switch ∙ SX6 ∙ FX8-24 blades Fabric OS allows you to monitor GE ports in a switch and receive counter errors reported by ASIC drivers as RASLog, SNMP, and email alerts. This reporting helps you identify the nature of FCIP and IP Extension traffic errors at the Level 2 (L2) link layer. The Ethernet MAC counters are maintained on a 1/10/40 GigE port basis.
PAGE 105
GE port monitoring CRC rule creation
Rule creation for Gigabit Ethernet port monitoring is similar to the procedure for creating rules for other ports. You can enable default policies to monitor the counters or create custom rules and policies to monitor the GE ports. The following example creates a CRC rule for a GE port.
PAGE 107
Monitoring Flow Vision Flows with MAPS
∙ Monitoring Flow Vision Flow Monitor data with MAPS (page 107)
∙ Monitoring traffic performance (page 109)
∙ Monitoring learned flows
PAGE 108
Monitoring Flow Vision Flows with MAPS The following example illustrates the flow-monitoring steps. The first command line creates the flow (called myflow_22 for this example), the second command line imports it, and the third command line displays the members of the logical groups. The fourth command line creates a rule for the group and the fifth command line enables the flow with the new rule active.
PAGE 109
Monitoring Flow Vision Flows with MAPS 1. Enter logicalgroup --show to confirm that the flow was correctly imported into MAPS. 2. Define a MAPS rule using the mapsrule --create command (for the supported timebases) and add it to a policy. Refer to MAPS rules overview on page 52 for information on creating and using rules. 3. Enter mapspolicy --enablepolicy policy_name to activate the policy. The following example illustrates the steps to add monitoring flows after importing.
PAGE 110
Monitoring frames for a specified set of criteria
In the following example, MAPS uses the flow “abtsflow” to watch for frames in a flow going through port 128 that contain SCSI ABORT sequence markers.
switch246:admin> flow --create abtsflow -feature mon -ingrport 128 -frametype abts
switch246:admin> mapsconfig --import abtsflow
You can then define rules for this flow (group), and then re-enable the policy so they take effect.
PAGE 111
The Gen 6 IO Insight metrics are only available with a Flow Vision flow, and they can only be monitored by MAPS when the flow is imported into MAPS.
PAGE 112
1. Create the flow.
flow --create ios_host_flow -srcdev 041900 -dstdev 041b00 -ingrport 25 -fea mo
2. Import the flow using the mapsconfig command.
mapsconfig --import ios_host_flow
After importing the flow, MAPS automatically monitors the flow if the rule has already been created.
3. Create the rule.
PAGE 113
Fabric performance impact monitoring using MAPS
∙ MAPS latency monitoring (page 113)
∙ MAPS and Bottleneck Detection (page 119)
∙ Slow Drain Device quarantining
PAGE 114
Fabric performance impact monitoring using MAPS The following example shows first the MAPS dashboard, displaying the IO_PERF_IMPACT report and then the IO_LATENCY_CLEAR report. The dashboard has been edited to show only Section 3. The back-slash character (\) in the following examples indicates a break inserted because the output is too long to display here as a single line. switch:admin> mapsdb --show (output truncated) 3.
PAGE 115
Frame timeout latency monitoring
MAPS monitors individual ports for Class 3 frame timeout errors (C3TXTO). When a timeout is detected on a port, MAPS reports it by setting the port state to IO_FRAME_LOSS and posting a RASLog message containing the number of frames that have timed out. This state is also reported on the MAPS dashboard. The following example displays a typical RASLog entry for this condition.
PAGE 116
For determining transient queue latency, MAPS has two predefined threshold states: IO_PERF_IMPACT and IO_FRAME_LOSS. The IO_PERF_IMPACT state is set for a port when latency is between the predefined low threshold and high threshold values; the IO_FRAME_LOSS state is set for a port when latency is greater than the predefined high threshold value. The following example displays typical RASLogs created when IO_FRAME_LOSS and IO_PERF_IMPACT states are set.
PAGE 117
Fabric performance impact monitoring using MAPS Buffer credit zero counter monitoring Buffer credit zero counter increments are indirect indications of latency; they indicate when frames were not transmitted through a port due to a delay in receiving R_RDY frames. MAPS monitors this latency using a sliding window algorithm applied over a preset time period. This allows MAPS to monitor the frame delay over multiple window sizes with a different threshold for each time window.
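The idea of checking one delay metric against several window sizes, each with its own threshold, can be sketched in Python as follows. This is an illustrative model only: the class name, the per-second sampling, the use of a simple average, and the window sizes and thresholds (1, 5, and 10 seconds at 70, 50, and 30 percent) are all assumptions, not the values or the algorithm MAPS actually uses.

```python
from collections import deque

class SlidingWindowMonitor:
    """Check one per-second delay metric against several window sizes."""

    def __init__(self, thresholds):
        # thresholds: {window_seconds: average percentage that triggers}
        self.thresholds = thresholds
        # Keep only as many samples as the largest window needs.
        self.samples = deque(maxlen=max(thresholds))

    def add(self, value):
        """Add one per-second sample; return the windows whose average crossed."""
        self.samples.append(value)
        recent = list(self.samples)
        crossed = []
        for window, limit in sorted(self.thresholds.items()):
            if len(recent) >= window:
                average = sum(recent[-window:]) / window
                if average >= limit:
                    crossed.append(window)
        return crossed

monitor = SlidingWindowMonitor({1: 70, 5: 50, 10: 30})
results = [monitor.add(40) for _ in range(10)]
print(results[-1])
```

A steady 40 percent delay never crosses the short-window thresholds, but once ten samples have accumulated it crosses the lower threshold attached to the 10-second window. This is why a different threshold per window lets both short spikes and sustained moderate delay be detected.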
PAGE 118
Fabric performance impact monitoring using MAPS MAPS uses the following rules to support this monitoring.
PAGE 119
Fabric performance impact monitoring using MAPS MAPS and Bottleneck Detection Starting with Fabric OS 8.0.0, bottleneck detection functionality has been replaced by Fabric Performance Impact (FPI) monitoring; the legacy bottleneck monitoring feature is obsolete. The MAPS dashboard displays the stuck virtual channel (VC) on any port. It also identifies the ports on which bottlenecks are seen, and then it sorts them based on the number of seconds that they exceeded the bottleneck threshold.
PAGE 120
Fabric performance impact monitoring using MAPS BN_SECS(Seconds) - - - - - - - 4.
PAGE 121
The following example defines a rule that toggles a port offline for 180 seconds (3 minutes) when the device latency impact state of the port is IO_PERF_IMPACT.
switch:admin> mapsrule --config toggle_rule -group DB_PORTS -monitor DEV_LATENCY_IMPACT -timebase none -op eq -value IO_PERF_IMPACT -action TOGGLE -tt 180
When a port has been toggled by a MAPS rule, TOGGLE appears as a notification action in the output of the mapsconfig and mapsdb commands.
PAGE 122
FIGURE 1 Slow-drain data flow
To remedy this, Brocade created Slow Drain Device Quarantine (SDDQ), a feature which, in conjunction with Quality of Service (QoS) monitoring, allows MAPS to identify a slow-draining device and quarantine it. MAPS does this by automatically moving all traffic destined to the F_Port that is connected to the slow-draining device to a low-priority VC, so that the traffic in the original VC does not experience backpressure.
PAGE 123
Fabric performance impact monitoring using MAPS Slow Drain Device Quarantine licensing For Slow Drain Device Quarantining (SDDQ) to take effect, the Fabric Vision license must be installed on the switch where the slow draining device is detected, as well as on the switch where the quarantine action is to occur. Intermediate switches do not need the Fabric Vision license for this feature to work, but they must have QoS enabled on all ISLs.
PAGE 124
Fabric performance impact monitoring using MAPS shift to the new priority because it is still marked as a slow-draining flow. It will be kept in the low-priority VC until you remove it from that VC. ∙ If device latency is due to Class 3 timeouts (C3TXTO) and the active C3TXTO rule has port fencing as an action, then the port fencing (disabling) may be performed first and quarantining will not occur.
PAGE 125
Confirming the slow-draining status of a device
You can use the nsshow, nscamshow, and nodefind commands to verify that a device is slow-draining. The following example shows the output of the nsshow command where there is a slow-draining device. The identifying line has been called out in this example. If there were no slow-draining device, the line would not appear.
PAGE 126
Fabric performance impact monitoring using MAPS The following example shows the output of the nscamshow command where there is a slow-draining device. The identifying line has been called out in this example. If there was not a slow-draining device, the line would not appear.
PAGE 127
Fabric performance impact monitoring using MAPS The following example shows the offline quarantined local ports and the online quarantined device information across the fabric.
PAGE 128
Fabric performance impact monitoring using MAPS The first part of the following example uses the sddquarantine --show command to display the offline quarantined local ports and the online quarantined device information across the fabric. The second part shows using the sddquarantine --clear command to remove the ports from quarantine.
PAGE 129
Fabric performance impact monitoring using MAPS The sddquarantine --show command output displays the list of successfully quarantined ports as well as those that were identified as problematic but were not quarantined, so that you can take any necessary measures to recover these ports. For example, you might choose to disable the ports, or (after correcting what is causing the slow drain) restore the ports.
PAGE 131
Other MAPS monitoring capabilities
∙ Scalability limit monitoring (page 131)
∙ MAPS Service Availability Module
PAGE 132
Layer 2 fabric device connection monitoring
A pure Layer 2 fabric is a collection of Fibre Channel switches and devices that does not participate in a metaSAN. In such a fabric, rules for device counts are calculated as a percentage of the total number of devices. For example, a Layer 2 fabric with 5500 devices logged in is using 92 percent of the maximum limit of 6000 devices for a Layer 2 fabric.
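The 92 percent figure comes from dividing the logged-in device count by the limit: 5500 / 6000 is about 91.7 percent, which rounds to 92. A short Python check of that arithmetic follows; the function name is hypothetical, and rounding to the nearest whole percent is an assumption about how the figure is presented.

```python
def percent_of_limit(count, limit):
    """Device count as a whole-number percentage of the fabric limit."""
    return round(100 * count / limit)

print(percent_of_limit(5500, 6000))
```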
PAGE 133
Other MAPS monitoring capabilities The following example shows a typical RASLog entry for exceeding the threshold for the number of Fibre Channel routers in the Backbone fabric: 2014/05/27-17:02:00, [MAPS-1003], 14816, SLOT 4 | FID 20, WARNING, switch_20, Switch, Condition=SWITCH(BB_FCR_CNT>12), Current Value:[BB_FCR_CNT,13], RuleName= defSWITCHBB_FCR_CNT_12, Dashboard Category=Fabric State Changes.
PAGE 134
Other MAPS monitoring capabilities NOTE NPIV monitoring is not High Availability (HA)-capable. As a consequence, if there is a reboot or an HA failover, existing NPIV logins are not preserved, and new ones are assigned on a first-come, first-served basis. MAPS supports monitoring all F_Ports in a switch for the number of NPIV logins as part of scalability limit monitoring.
PAGE 135
∙ Scalability limit monitoring (using L2_DEVCNT_PER) occurs only at midnight. Therefore, if a switch is moved from being a part of the Layer 2 fabric to being a part of the edge fabric, the device count metrics (how many devices are in the fabric) will not change until the next midnight.
∙ The “LSAN-imported device” metric is only monitored in switches that are a part of a Backbone fabric.
PAGE 136
Other MAPS monitoring capabilities Rule for Fibre Channel router count in Backbone fabric In the following example, when the maximum limit of 12 Fibre Channel routers in the Backbone fabric is reached, MAPS reports the threshold violation using a RASLog message.
PAGE 137
Using only “--show”
When you enter simply mapsSam --show, the report lists the following information for each port:
∙ Port Number
∙ Port type
– DIS (disabled port)
– DIA (D_Port)
– DP (persistently disabled port)
– E (E_Port)
– F (F_Port)
– G (G_Port)
– T (Trunk port)
– TF (F_Port trunk)
– U (U_Port)
NOTE The MAPSSAM report does not include the health status of Gigabit Ethernet (GbE) ports.
∙ Total up time: Percentage of time the port was up.
PAGE 138
Other MAPS monitoring capabilities Using “--show memory” The following example shows the output for mapsSam --show memory. switch:admin> mapssam --show memory Showing Memory Usage: Memory Usage : 22.0% Used Memory : 225301k Free Memory : 798795k Total Memory : 1024096k Using “--show flash” The following example shows the output for mapsSam --show flash.
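Returning to the --show memory output above: the three counters are consistent with each other (used plus free equals total), and the usage percentage is the used value divided by the total. The Python sketch below parses the figures from that sample and recomputes the percentage. The function name parse_memory and the regular expression are assumptions based only on the sample text.

```python
import re

def parse_memory(text):
    """Extract the k-suffixed counters from mapsSam --show memory output."""
    pairs = re.findall(r"(Used|Free|Total) Memory\s*:\s*(\d+)k", text)
    return {name.lower(): int(value) for name, value in pairs}

output = ("Memory Usage : 22.0% Used Memory : 225301k "
          "Free Memory : 798795k Total Memory : 1024096k")
mem = parse_memory(output)
usage = round(100 * mem["used"] / mem["total"], 1)
print(mem["used"] + mem["free"] == mem["total"], usage)
```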
PAGE 139
Other MAPS monitoring capabilities TABLE 36 Use of Brocade FCIP monitoring groups as metrics Parameter Groups where the parameter is used as a metric State change (STATE_CHG) ALL_TUNNELS Percent utilization (UTIL) ALL_CIRCUIT_HIGH_QOS, ALL_CIRCUIT_MED_QOS, ALL_CIRCUIT_LOW_QOS, ALL_CIRCUIT_F_QOS, ALL_TUNNELS, ALL_TUNNEL_HIGH_QOS, ALL_TUNNEL_MED_QOS, ALL_TUNNEL_LOW_QOS, and ALL_TUNNEL_F_QOS Percentage of packets lost in transmission (PKTLOSS) ALL_CIRCUIT_HIGH_QOS, ALL_CIRCUIT_MED_QOS, ALL_CIRCUIT_LOW_
PAGE 140
Quality of Service monitoring example
The following MAPS rule states that when the packet loss percentage for ALL_CIRCUIT_HIGH_QOS group members becomes greater than or equal to 0.5 in a given minute, a RASLog entry will be posted.
switch:admin> mapsrule --create urule -group ALL_CIRCUIT_HIGH_QOS -monitor PKTLOSS -t min -op ge -value .5 -action raslog
When the rule is triggered, the corresponding RASLog messages appear under the summary section of the dashboard.
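The way such a rule condition is evaluated (measured value, comparison operator, threshold) can be illustrated with a small Python sketch. Only the two operators that appear in this guide's examples, ge and eq, are included; the function name rule_triggers and the sample measurements are hypothetical.

```python
import operator

# Only the comparison operators that appear in the examples in this guide.
OPS = {"ge": operator.ge, "eq": operator.eq}

def rule_triggers(measured, op, threshold):
    """Return True when the measured value satisfies the rule condition."""
    return OPS[op](measured, threshold)

# Packet loss percentage against the rule "-op ge -value .5".
print(rule_triggers(0.5, "ge", 0.5))
print(rule_triggers(0.4, "ge", 0.5))
```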
PAGE 141
Other MAPS monitoring capabilities TABLE 38 Brocade IPEXT monitoring parameters and groups (continued) Monitor Logical groups ALL_CIRCUIT_IP_MED_QOS ALL_CIRCUIT_IP_LOW_QOS IP_UTIL ALL_TUNNELS ALL_CIRCUITS Percentage of packets lost in transmission (PKTLOSS) ALL_CIRCUITS Round-trip time in milliseconds (RTT) ALL_CIRCUITS Variance in RTT in milliseconds (Jitter) ALL_CIRCUITS IPEXT rule creation The following example shows when circuit Qos utilization for ALL_CIRCUIT_IP_HIGH_QOS group members is gre
PAGE 142
Other MAPS monitoring capabilities Circuit IP monitors dashboard output sample Category |Repeat |Rule Name |Execution Time |Object |Triggered | (Rule Count)|Count | | | |Value(Units) | ---------------------------------------------------------------------------------FCIP Health |8 |ipcktjitter |08/10/15 09:11:01|Port/Cir 24/0 |0 % | (34) | | | |Port/Cir 24/0 |0 % | | | | |Port/Cir 24/0 |0 % | | | | |Port/Cir 24/0 |0 % | | | | |Port/Cir 24/0 |0 % | |8 |ipcktrtt |08/10/15 09:11:01|Port/Cir 24/0 |1 Millisecond
PAGE 143
– The sendmail process cannot be forked by the cron job script due to a memory problem, and therefore, the command cannot be successfully executed.
– Other internal software errors.
Examples of RASLog messages
The following is an example of the RASLog which is generated when the sendmail command fails to send an e-mail from the switch.
2016/02/01-19:49:00, [MAPS-1206], 1468, SLOT 6 CHASSIS, INFO, dcx_178, A MAPS notification sent from the switch to abc@brocade.
PAGE 144
Dashboard output example for fan air-flow direction monitoring
The following is an example of dashboard output when the fan air-flow direction rule has been triggered:
mapspolicy --show test_1
Policy Name: test_1
Rule Name |Condition |Actions |
-------------------------------------------------------------------------------------------
test_rule_air_flow_60 | chassis(FAN_AIRFLOW_MISMATCH/none==TRUE) |raslog,sw_marginal |
mapsdb --show
1 Dashboard Informat
PAGE 145
RASLog example for fan air-flow direction monitoring
The following is an example of a RASLog message sent when MAPS detects a mismatch in the air-flow direction on the switch.
2016/01/12-17:39:30, [MAPS-1003], 1507, FID 128, WARNING, sw128_top, Chassis, Condition=CHASSIS(FAN_AIRFLOW_MISMATCH==TRUE), Current Value:[ FAN_AIRFLOW_MISMATCH,TRUE], RuleName=test_rule_air_flow_1, Dashboard Category=Switch Resource.
PAGE 146
setting to CHASSIS(BAD_PWR/none>=1). When you change a policy, you must enter all the values for the policy, even if you are changing only one value. In this example, the -policy name is fw_active_policy, which you noted earlier in step 3.
switch:admin> mapsrule --config fw_CHASSISBAD_PWRCrit_3 -policy fw_active_policy -monitor BAD_PWR -group CHASSIS -timebase none -op ge -value 1 -action SW_CRITICAL
Associated Policies: fw_active_policy
5.
PAGE 147
MAPS Threshold Values
∙ Viewing monitoring thresholds (page 147)
∙ Back-end port monitoring thresholds (page 148)
∙ Fabric state change monitoring thresholds
PAGE 148
MAPS Threshold Values Back-end port monitoring thresholds All Back-end port monitors support the Minute, Hour, Day, and Week timebases. The following table lists the errors MAPS monitors for on back-end ports, the trigger thresholds, and the default actions to be taken when the threshold is crossed.
PAGE 149
Extension monitoring thresholds
These FCIP monitors support Minute, Hour, Day, and Week timebases: Tunnel state change, Tunnel throughput, Tunnel QoS throughput, Tunnel QoS Packet loss, FCIP Circuit State Changes, FCIP Circuit Utilization, FCIP Packet loss, FCIP Circuit Round Trip Time, and FCIP connection variance. The following tables list the default monitoring thresholds for Fibre Channel over IP (FCIP) criteria used by MAPS.
PAGE 150
MAPS Threshold Values Port Health monitoring thresholds All Port Health monitoring thresholds used by MAPS are triggered when they exceed the listed value. For thresholds that have both an upper value and a lower value, the threshold is triggered when it exceeds the upper value or drops below the lower value. All thresholds other than RXP, TXP, and Utilization percentage are measured per minute. The RXP, TXP, and Utilization percentage thresholds are measured per hour.
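The trigger logic for a threshold that has an upper value and, optionally, a lower value can be written as a short Python sketch. The function name out_of_range and the sample bounds are hypothetical and are not taken from the default threshold tables.

```python
def out_of_range(value, upper, lower=None):
    """True when value exceeds the upper bound or drops below the lower bound."""
    if value > upper:
        return True
    return lower is not None and value < lower

# Hypothetical statistic with bounds 10..90.
print(out_of_range(95, upper=90, lower=10))
print(out_of_range(50, upper=90, lower=10))
print(out_of_range(5, upper=90, lower=10))
```

A value exactly equal to a bound does not trigger in this sketch, because the text says the threshold must be exceeded (or the value must drop below the lower bound).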
PAGE 151
MAPS Threshold Values TABLE 45 Conservative policy default D_Port Port Health monitoring threshold values and actions Monitoring statistic Unit Threshold Actions CRC Errors (defALL_D_PORTSCRC_3) Min 3 EMAIL, SNMP, RASLOG Invalid Transmit Words (defALL_D_PORTSITW_3) Min 3 EMAIL, SNMP, RASLOG Link Failure (defALL_D_PORTSLF_3) Min 3 EMAIL, SNMP, RASLOG Sync Loss (defALL_D_PORTSLOSS_SYNC_3) Min 3 EMAIL, SNMP, RASLOG CRC Errors (defALL_D_PORTSCRC_H90) Hour 90 EMAIL, SNMP, RASLOG Invalid
PAGE 152
MAPS Threshold Values TABLE 47 Default Host F_Port Port Health monitoring threshold values and actions (continued) Monitoring statistic Host F_Port monitoring threshold values by policy Actions Aggressive Moderate Conservative Low: 0 Low: 10 Low: 21 Low: EMAIL, SNMP, RASLOG High: 2 High: 20 High: 40 High: EMAIL, SNMP, FENCE, DECOM Invalid Transmit Words (ITW) Low: 15 Low: 21 Low: 41 Low: EMAIL, SNMP, RASLOG High: 20 High: 40 High: 80 High: EMAIL, SNMP, FENCE, DECOM Link Reset (LR) L
PAGE 153
MAPS Threshold Values TABLE 49 Default non-F_Port Port Health monitoring threshold values and actions Non-F_Port monitoring threshold values by policy Monitoring statistic Aggressive Moderate Conservative Actions C3 Time out (C3TX_TO) N/A N/A N/A N/A CRC Errors (CRC) Low: 0 Low: 10 Low: 21 Low: EMAIL, SNMP, RASLOG High: 2 High: 20 High: 40 High: EMAIL, SNMP, FENCE, DECOM Invalid Transmit Words (ITW) Low: 15 Low: 21 Low: 41 Low: EMAIL, SNMP, RASLOG High: 20 High: 40 High: 80 High:
PAGE 154
MAPS Threshold Values TABLE 51 Default security monitoring thresholds and actions Security monitoring threshold values by policy Monitoring statistic Aggressive Moderate Conservative Actions DCC violations 0 2 4 RASLOG, SNMP, EMAIL HTTP violation 0 2 4 RASLOG, SNMP, EMAIL Illegal command 0 2 4 RASLOG, SNMP, EMAIL Incompatible security DB 0 2 4 RASLOG, SNMP, EMAIL Login violations 0 2 4 RASLOG, SNMP, EMAIL Invalid certifications 0 2 4 RASLOG, SNMP, EMAIL No-FCS 0 2 4 R
PAGE 155
MAPS Threshold Values Quad SFPs and all other SFP monitoring threshold defaults The following table lists the default threshold and actions for Quad SFPs (QSFPs) and all other SFPs.
PAGE 156
MAPS Threshold Values Switch status policy monitoring thresholds The only timebase the Switch status monitors support is the “None” timebase. The following tables list the default switch status policy monitoring thresholds used by MAPS. All threshold conditions are absolute and actions are triggered when the reported value is greater than or equal to the threshold value.
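Because the switch status thresholds are absolute and trigger on greater than or equal, a Marginal/Critical pair such as 1/2 maps a reported value to a state in a simple tiered check. The Python sketch below uses the BAD_FAN pair of 1/2 that appears in the Moderate policy table; the function name switch_status is hypothetical.

```python
def switch_status(value, marginal, critical):
    """Map a reported value to the switch status action it would trigger."""
    if value >= critical:
        return "SW_CRITICAL"
    if value >= marginal:
        return "SW_MARGINAL"
    return None

# Absent or faulty fans (BAD_FAN) with Marginal/Critical thresholds of 1/2.
for bad_fans in (0, 1, 2):
    print(bad_fans, switch_status(bad_fans, marginal=1, critical=2))
```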
PAGE 157
MAPS Threshold Values TABLE 57 Moderate policy default Switch Status monitoring thresholds and actions (continued) Monitoring statistic Switch status threshold values (Marginal/ Critical) Actions Critical: SW_CRITICAL, SNMP, EMAIL Absent or faulty fans (BAD_FAN) 1/2 Marginal: SW_MARGINAL, SNMP, EMAIL Flash usage (FLASH_USAGE) 90 RASLOG, SNMP, EMAIL Percentage of marginal ports (MARG_PORTS) 6/10 Marginal: SW_MARGINAL, SNMP, EMAIL Percentage of error ports (ERR_PORTS) 6/10 Percentage of faulty
PAGE 158
MAPS Threshold Values TABLE 58 Conservative policy default Switch Status monitoring thresholds and actions (continued) Monitoring statistic Switch status threshold values (Marginal/ Critical) Actions Critical: SW_CRITICAL, SNMP, EMAIL Core blade monitoring (DOWN_CORE) DCX, DCX+: 1/2 HA Sync (HA_SYNC) DCX, DCX+, : sync=0 Marginal: SW_MARGINAL, SNMP, EMAIL Critical: SW_CRITICAL, SNMP, EMAIL SW_MARGINAL, SNMP, EMAIL Traffic Performance thresholds The following Traffic Performance monitors support the