VRTX Chassis Alert Management Techniques
This white paper describes the logging and alerting mechanisms in the chassis that administrators rely on to monitor and control a VRTX Chassis.
Author(s): Anto Jesurajan, Arun Muthaiyan, Michael Brundridge, Sheshadri P.R.
Executive summary
This white paper explains the logging and alerting features available in the VRTX Chassis Management Controller (CMC). The CMC logs events to the Chassis Log, the SEL log, a remote syslog, and the LCD. It can also be configured to send email and SNMP alerts. With the Remote System Logging feature, the CMC supports remote logging and alerting, which lets administrators debug and monitor events without being physically present at the system.
Contents
Introduction
Terminology
Logging Types
Chassis Events
Chassis Alert Enablement
Introduction
Logging informs and alerts administrators about chassis events that are abnormal and require attention. Logging or alerting can occur through one or more of the following:
- CMC non-volatile memory, which in turn is reflected in the health of the chassis, based on the severity of the event.
- LCD (where the messages appear on the LCD display).
- Remote management station, depending on the configuration, such as Remote System Logging (syslog).
Logging Types
The CMC supports the logging techniques shown in Figure 1. In addition to logging, it supports alerting techniques such as SNMP and email.
Figure 1. VRTX Logging Techniques
Chassis Events
Critical, Warning, and Informational events are monitored by the VRTX Chassis Management Controller (CMC) firmware and logged to the Chassis Log and the SEL log. They can also be filtered to send email alerts, SNMP traps, or Remote Syslog messages.
Figure 2.
Chassis Alert Enablement
Chassis event alerts can be configured using the Web interface (see Figure 2), the RACADM Command Line Interface (CLI), or WS-MAN to send an alert for any registered event. Events can be configured to send alerts through email, SNMP trap, or Remote Syslog. No alerts are sent until the Enable Chassis Event Alerts option is enabled.
Format                          Description
SeqNumber                       The chassis log index.
Message ID                      A combination of the Agent ID and a Message Number that is unique within that Agent ID.
Severity                        Critical, Informational, or Warning.
Timestamp                       The time of the event.
Message Arg 1 … Message Arg N   Arguments passed to the message.
Table 1 – SEL Record Format lists the format of the SEL. For more information about the SEL format, refer to the IPMI Specification.
Byte   Field       Description
1-2    Record ID   ID used for SEL Record access. The Record ID values 0000h and FFFFh have special meaning in the Event Access commands and must not be used as Record ID values for stored SEL Event Records.
13     Event Dir | Event Type
       Event Dir [7]: 0b = Assertion event. 1b = Deassertion event.
       Event Type [6:0]: Type of trigger for the event, for example, critical threshold going high or state asserted. It also indicates the class of the event, for example, discrete, threshold, or OEM. The Event Type field is encoded using the Event/Reading Type Code. See section 42.1, Event/Reading Type Codes, of the IPMI Specification.
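The bit layout of byte 13 can be illustrated with a short sketch. The function name `decode_event_dir_type` is hypothetical, and the class ranges (01h threshold, 02h-0Ch generic discrete, 6Fh sensor-specific, 70h-7Fh OEM) are an assumption recalled from the IPMI Event/Reading Type Code table; verify them against the IPMI Specification before relying on them. This is not how the CMC firmware parses the SEL, only one way to read the field.

```python
def decode_event_dir_type(byte13: int) -> dict:
    """Split SEL record byte 13 into its Event Dir and Event Type fields."""
    if not 0 <= byte13 <= 0xFF:
        raise ValueError("byte 13 must be a value from 0x00 to 0xFF")

    deassertion = bool(byte13 & 0x80)  # bit 7: 0b = assertion, 1b = deassertion
    event_type = byte13 & 0x7F         # bits 6:0: Event/Reading Type Code

    # Assumed class ranges (check against the IPMI Event/Reading Type Code table).
    if event_type == 0x01:
        event_class = "threshold"
    elif 0x02 <= event_type <= 0x0C:
        event_class = "generic discrete"
    elif event_type == 0x6F:
        event_class = "sensor-specific discrete"
    elif 0x70 <= event_type <= 0x7F:
        event_class = "OEM"
    else:
        event_class = "unspecified or reserved"

    return {
        "deassertion": deassertion,
        "event_type": event_type,
        "event_class": event_class,
    }
```

For example, a byte value of 81h has bit 7 set and a type code of 01h, so the sketch reports a deassertion of a threshold event.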
SEL logs are parsed and displayed in a user-readable format.
Sensor Class: threshold
Event Data 1
[7:6] - 00b = unspecified byte 2
        01b = trigger reading in byte 2
        10b = OEM code in byte 2
        11b = sensor-specific event extension code in byte 2
[5:4] - 00b = unspecified byte 3
        01b = trigger threshold value in byte 3
        10b = OEM code in byte 3
        11b = sensor-specific event extension code in byte 3
[3:0] - Offset from Event/Reading Code for threshold event.
Sensor Class: OEM
Event Data 1
[7:6] - 00b = unspecified byte 2
        01b = previous state and/or severity in byte 2
        10b = OEM code in byte 2
        11b = reserved
[5:4] - 00b = unspecified byte 3
        01b = reserved
        10b = OEM code in byte 3
        11b = reserved
[3:0] - Offset from Event/Reading Type Code
Event Data 2
[7:4] - Optional OEM code bits or offset from 'Severity' Event/Reading Type Code (0Fh if unspecified).
[3:0] - Optional OEM code or offset from Event/Reading Type Code for previous event state (0Fh if unspecified).
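The Event Data 1 bit fields listed above for the threshold and OEM sensor classes can be decoded as in the following minimal sketch. The function name `decode_event_data1` and the lookup tables are hypothetical helpers written for illustration; they only restate the field meanings given in this section.

```python
# Meaning of Event Data 1 bits [7:6] (contents of byte 2), per sensor class.
BYTE2_MEANING = {
    "threshold": ("unspecified", "trigger reading", "OEM code",
                  "sensor-specific event extension code"),
    "oem": ("unspecified", "previous state and/or severity", "OEM code",
            "reserved"),
}

# Meaning of Event Data 1 bits [5:4] (contents of byte 3), per sensor class.
BYTE3_MEANING = {
    "threshold": ("unspecified", "trigger threshold value", "OEM code",
                  "sensor-specific event extension code"),
    "oem": ("unspecified", "reserved", "OEM code", "reserved"),
}


def decode_event_data1(value: int, sensor_class: str) -> dict:
    """Decode Event Data 1 for the 'threshold' or 'oem' sensor class."""
    if not 0 <= value <= 0xFF:
        raise ValueError("Event Data 1 must be a value from 0x00 to 0xFF")
    if sensor_class not in BYTE2_MEANING:
        raise ValueError("sensor_class must be 'threshold' or 'oem'")

    byte2_code = (value >> 6) & 0x3  # bits 7:6
    byte3_code = (value >> 4) & 0x3  # bits 5:4
    offset = value & 0x0F            # bits 3:0

    return {
        "byte2": BYTE2_MEANING[sensor_class][byte2_code],
        "byte3": BYTE3_MEANING[sensor_class][byte3_code],
        "offset": offset,
    }
```

With a threshold sensor, a value of 57h decodes to a trigger reading in byte 2, a trigger threshold value in byte 3, and an offset of 7.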
POP3 (Post Office Protocol, version 3) is a mail retrieval protocol used on the mail server. It retains the email on the receiving end until the recipient reads it. Administrators can use POP3 to receive mail from the CMC. IMAP is another popular email retrieval protocol that can also be used. Administrators can also use mail servers that rely on retrieval programs other than POP3 or IMAP. Figure 3.
1. For the SMTP (Email) Server configuration, enter the SMTP server details using either the dot-separated format (for example, 140.25.122.31) or the DNS name.
2. For Modify Source Email Name, configure the desired originator email address for the alert, or leave it blank to use the default email originator. The default value is cmc@[IP_address], where [IP_address] is the IP address of the CMC.
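Before pointing the CMC at an SMTP server, it can help to confirm from a management station that the server accepts mail from the originator address the CMC will use. The sketch below is one way to do that with the Python standard library. The function names, the subject line, the destination address, and port 25 are assumptions for illustration; only the cmc@[IP_address] default originator comes from this paper.

```python
import smtplib
from email.message import EmailMessage


def default_originator(cmc_ip: str) -> str:
    """Return the default source email name described above: cmc@[IP_address]."""
    return f"cmc@{cmc_ip}"


def build_test_alert(cmc_ip: str, destination: str, source_name: str = "") -> EmailMessage:
    """Build a test message; a blank source_name falls back to the default originator."""
    msg = EmailMessage()
    msg["From"] = source_name or default_originator(cmc_ip)
    msg["To"] = destination
    msg["Subject"] = "CMC alert path test"  # hypothetical subject line
    msg.set_content("Test message to verify the SMTP server used for CMC email alerts.")
    return msg


def send_test_alert(smtp_server: str, msg: EmailMessage, port: int = 25) -> None:
    """Send the message through the SMTP server (port 25 is an assumption)."""
    with smtplib.SMTP(smtp_server, port, timeout=10) as server:
        server.send_message(msg)
```

Calling `send_test_alert("140.25.122.31", build_test_alert("192.168.0.120", "admin@example.com"))` would attempt delivery through the example server address used in step 1; the CMC address and the destination shown here are placeholders.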
Figure 4. Email Alert Settings
The SMTP (Email) Server settings can also be set using the RACADM command line as follows:
1. To set the SMTP (Email) Server address, run the command
    racadm config -g cfgRemoteHosts -o cfgRhostsSmtpServerIpAddr 192.168.0.152
   OR
    racadm config -g cfgRemoteHosts -o cfgRhostsSmtpServerIpAddr domain.name
2. To set Modify Source Email Name, run the command
    racadm config -g cfgAlerting -o cfgAlertingSourceEmailName user@home.com
3. To set Email Alert Destinations, run the command a.
SNMP Trap
An SNMP trap is a type of alert, similar to an email alert, that complies with the SNMP protocol. The SNMP trap is received by a management station or console.
SNMP components
- SNMP Manager
- SNMP managed devices
- SNMP agent
- Management information database, referred to as the Management Information Base (MIB)
Figure 5.
SNMP Manager
A manager, or management system, is a separate entity responsible for communicating with network devices that implement an SNMP agent. It is typically a computer that runs one or more network management systems.
MIB files define the set of queries that an SNMP Manager can send to the agent. The agent collects this data locally and stores it as defined in the MIB. The SNMP Manager is therefore aware of the standard and private queries available for every type of agent.
Figure 6. SNMP Flow Diagram
SNMP trap destinations and agent settings can be set or retrieved through the RACADM command line interface as follows:
1. To set the SNMP Trap Destination address, run the command
    racadm config -g cfgTraps -o cfgTrapsAlertDestIpAddr 143.169.25.
    racadm getconfig -g cfgOobSnmp
To change any of the settings on the Web page, click Chassis Overview > Alerts > Traps Settings. (Example of Chassis Event Page). To perform this task, you must have the Chassis Configuration Administrator privilege.
Figure 7. SNMP Trap Settings
Remote Syslog
Remote syslog logging is a standard for computer data logging. It separates the chassis that generates messages from the management station that stores them and from the software that reports and analyzes them.
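To check that a management station is actually receiving remote syslog traffic, a test message can be sent to it from any host. The sketch below builds a message in the traditional BSD syslog layout (priority value, timestamp, host name, tag, text), where the priority is the facility multiplied by 8 plus the severity. The function names, the host name, and the tag are hypothetical, and the layout of the messages the CMC itself emits is not described in this paper, so treat this only as a connectivity test.

```python
import socket
from datetime import datetime

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def syslog_pri(facility: int, severity: int) -> int:
    """Compute the syslog priority value: facility * 8 + severity."""
    if not 0 <= facility <= 23:
        raise ValueError("facility must be 0..23")
    if not 0 <= severity <= 7:
        raise ValueError("severity must be 0..7")
    return facility * 8 + severity


def build_syslog_message(facility: int, severity: int, when: datetime,
                         hostname: str, tag: str, text: str) -> bytes:
    """Build a BSD-style syslog datagram; the day of month is space-padded."""
    timestamp = (f"{MONTHS[when.month - 1]} {when.day:2d} "
                 f"{when.hour:02d}:{when.minute:02d}:{when.second:02d}")
    line = f"<{syslog_pri(facility, severity)}>{timestamp} {hostname} {tag}: {text}"
    return line.encode("ascii", "replace")


def send_syslog_message(server: str, payload: bytes, port: int = 514) -> None:
    """Send one datagram over UDP to the syslog server."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (server, port))
```

For instance, facility 16 (local0) with severity 6 (informational) yields a priority of 134. Passing the resulting payload to `send_syslog_message` with the address and port of the management station should produce a new entry in its log if the listener is enabled.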
Figure 8. Remote Syslog Settings
Remote Syslog settings can be set or retrieved through the RACADM command line interface as follows:
1. To enable remote syslog, run the command
    racadm config -g cfgRemoteHosts -o cfgRhostsSyslogEnable 1
2. To set the remote syslog port, run the command
    racadm config -g cfgRemoteHosts -o cfgRhostsSyslogPort 514
3. To set the remote syslog host, run the command
    racadm config -g cfgRemoteHosts -o cfgRhostsSyslogServer1 10.35.0.122
4.
Enabling the r-syslog in Linux
Remote Syslog can be enabled on a Linux management station by adding the -r option in the syslog configuration file. The configuration must be reloaded after the change, which requires a restart of the syslog service. The syslog server listens on the port configured on the management station for any syslog messages, and then logs them to the storage repository configured in the syslog server configuration file.
Learn more
Visit Dell.