This section presents general guidelines that explain the errors that can appear in an event log and what actions to perform when these errors occur.
Controller event logging provides a mechanism for user processes and kernel modules to report significant events and error conditions. Events are recorded in a log file stored on disk, both on the management station and on the local flash disk. The local event log is a circular 2 MB buffer; no limit is set on the size of the event log on the management station.
All events are forwarded to the management station that manages the SAN. The management station can forward events as SNMP traps and email. Events that call out failed FRUs will contain the FRU part number as well as the enclosure and controller serial number.
If the IBM ServeRAID Manager is connected to the management station, the event is presented to the user on the main screen event viewer. Some events have associated HTML help that offers the user specific information about the event and, if needed, detailed instructions for resolving the issue that caused it.
When the IBM ServeRAID Manager is started, events are generated for all currently known problems and presented to the user on the main screen event viewer. When the IBM ServeRAID Manager detects that a logical or physical component has failed, the icons of all affected logical and physical components are overlaid with a warning or fatal icon. This allows the user to quickly drill down to the source of the problem. The information recorded for an event includes a long and a short textual description of the error, a severity level, a time stamp, and details of the originator.
The event log is a persistent log file stored on disk, which can be interrogated and searched to retrieve historical event information. It is implemented as a circular buffer, so the oldest events are overwritten when the log is full.
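The overwrite behavior of a circular log can be illustrated with a small sketch. This is not the controller's implementation; the class name `CircularEventLog`, the byte-based accounting, and the tiny 24-byte capacity used in the demonstration are assumptions chosen only to make the eviction visible.

```python
from collections import deque


class CircularEventLog:
    """Toy byte-budgeted log: when full, the oldest entries are dropped."""

    def __init__(self, capacity_bytes=2 * 1024 * 1024):
        self.capacity_bytes = capacity_bytes  # 2 MB by default, as for the local log
        self.used_bytes = 0
        self.entries = deque()

    def append(self, text):
        size = len(text.encode("utf-8"))
        if size > self.capacity_bytes:
            raise ValueError("entry is larger than the whole log")
        # Evict from the oldest end until the new entry fits.
        while self.used_bytes + size > self.capacity_bytes:
            oldest = self.entries.popleft()
            self.used_bytes -= len(oldest.encode("utf-8"))
        self.entries.append(text)
        self.used_bytes += size


# Hypothetical messages and an artificially small capacity for illustration.
log = CircularEventLog(capacity_bytes=24)
for message in ["fan failed", "disk removed", "disk added"]:
    log.append(message)
print(list(log.entries))
```

With this capacity the third message does not fit alongside the first two, so the oldest message is discarded and only the two most recent remain.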
Event logs for all connected enclosures are also accessible from the management station, where they are stored in a text file. In addition, the management station event log can be accessed by the IBM ServeRAID Manager and displayed to the user. The event logging module can be configured through an administrative interface to define the forwarding actions to be taken and to set the event log size.
This is the most important set of logs that you can capture. It will be forwarded to IBM support for in-depth analysis. You can analyze the event log, the error log, and the controller configuration profile.
From ServeRAID Manager, right-click the desired enclosure and select "Save support archive". The following files are saved:
RaidEvt.log - The event log
RaidEvtX.log - The controller event log, where X is either A or B
RaidErr.log - The error log
RaidCfg.log - The subsystem configuration file: the configuration profile for that enclosure
diagnostics.tgz - A compressed file containing binary and text files for Engineering analysis
Note: The support archive may take 5 to 10 minutes to save.
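Before sending the archive to support, it can be useful to confirm that the expected files are present. The following is a minimal sketch, assuming you already have a list of the file names in the saved archive; the function name `missing_archive_files`, the case-insensitive comparison, and the rule that at least one controller log (A or B) must be present are assumptions, not documented behavior.

```python
# File names taken from the list above.
EXPECTED_FILES = ["RaidEvt.log", "RaidErr.log", "RaidCfg.log", "diagnostics.tgz"]
CONTROLLER_LOGS = ["RaidEvtA.log", "RaidEvtB.log"]


def missing_archive_files(present_names):
    """Return the expected support-archive files that are not in present_names."""
    # Compare case-insensitively, since Windows file names are not case-sensitive.
    present = {name.lower() for name in present_names}
    missing = [name for name in EXPECTED_FILES if name.lower() not in present]
    # Assumption: at least one controller event log (A or B) should exist.
    if not any(name.lower() in present for name in CONTROLLER_LOGS):
        missing.append("RaidEvtA.log or RaidEvtB.log")
    return missing


# Hypothetical listing of a saved archive that lacks the error log.
listing = ["RaidEvt.log", "RaidEvtA.log", "RaidCfg.log", "diagnostics.tgz"]
print(missing_archive_files(listing))
```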
From the management station, collect the following files from the C:\WINDOWS\Temp directory (or C:\winnt\temp):
mgmtservice.log
mgmtservice.log.old (if exists)
These two XML-based files record all of the events generated by communication between the controller and the Management Agent running in the ServeRAID Manager GUI. Whenever communication is lost or a critical alert is generated, the information is stored in this log.
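Collecting these files can be scripted. The sketch below copies whichever of the two files exist from the first candidate directory that contains them; the function name `collect_management_logs` and the destination folder are hypothetical, and the script assumes it runs on the management station with read access to the Temp directory.

```python
import shutil
from pathlib import Path

# File names as listed above; the ".old" file may not exist.
LOG_NAMES = ["mgmtservice.log", "mgmtservice.log.old"]


def collect_management_logs(source_dirs, dest_dir):
    """Copy the management service logs from the first directory that has them."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for directory in source_dirs:
        for name in LOG_NAMES:
            candidate = Path(directory) / name
            if candidate.is_file():
                target = dest / name
                shutil.copy2(candidate, target)  # copy2 also keeps the timestamps
                copied.append(target)
        if copied:
            break  # do not mix files from two different Temp directories
    return copied
```

A call such as `collect_management_logs([r"C:\WINDOWS\Temp", r"C:\winnt\temp"], r"C:\support_logs")` would try both locations in order; `C:\support_logs` is only an example destination.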
From ServeRAID Manager, click the Event button on the toolbar. The event log is displayed. You can save the log from the "File" menu of the dialog. The file Event.txt is saved in the ServeRAID install folder (typically C:\Program Files\IBM\ServeRAID Manager). This log contains all the communication and events generated between the enclosure and the management station.
This log contains the same information as RaidEvt.log.
These logs (RaidEvtX.log, where X is either A or B) can be viewed in the folder where ServeRAID Manager is installed. They record local RAID events generated by the onboard ServeRAID controller and by the TCP/IP listening service for the management service. From an external storage perspective, they indicate only when connections were established and dropped by the listening service.
This is the configuration profile for the enclosure.
From the ServeRAID console, right-click the host management station and select "Save Printable Configuration". The resulting file, RaidExt1.log, can be found in the ServeRAID install folder.
This file contains the ACL (Access Control List) and the logical drive, array, and controller information. A common cause of logical drives not being discovered is an improperly set up ACL, so check the logical drive assignment information in the ACL. Another potential cause of undiscovered drives is LUN failover to the alternate controller (a change in array ownership).
This log is the same as the RaidCfg.log that is saved in the Support archive file.
The event levels that can be logged by the ServeRAID Manager are:
■ INFORMATION
■ WARNING
■ ERROR
Table-1 lists the fields in each event record.
IBM ServeRAID Manager can filter on event types to display the frequency of failures.
Field | Description |
Id | Unique ID |
event_id | Identifies type of event |
timestamp | Date and time |
sequence | Sequence number |
level | Event level |
active | Event is active? |
source | Source identifier |
info | Originator specific info |
fru_ids | FRU ids of failing kit |
short_text | Short text describing event |
long_text | Detailed event description |
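The fields listed in Table-1 map naturally onto a record type, and counting records by event type gives the kind of failure-frequency view described above. This is an illustrative sketch only: the Python types chosen for each field, the function name `failure_frequency`, and the sample event_id values and timestamps are assumptions, not the product's actual storage format.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class EventRecord:
    # Field names follow Table-1; the types are assumed for illustration.
    id: int
    event_id: int
    timestamp: str
    sequence: int
    level: str  # "INFORMATION", "WARNING" or "ERROR"
    active: bool
    source: str
    info: str = ""
    fru_ids: List[str] = field(default_factory=list)
    short_text: str = ""
    long_text: str = ""


def failure_frequency(events, levels=("WARNING", "ERROR")):
    """Count how often each event_id occurs among events at the given levels."""
    return Counter(e.event_id for e in events if e.level in levels)


# Hypothetical sample records.
events = [
    EventRecord(1, 100, "2005-01-01 10:00:00", 1, "ERROR", True, "controller A"),
    EventRecord(2, 200, "2005-01-01 10:05:00", 2, "INFORMATION", False, "controller A"),
    EventRecord(3, 100, "2005-01-01 10:10:00", 3, "ERROR", True, "controller B"),
    EventRecord(4, 300, "2005-01-01 10:15:00", 4, "WARNING", True, "enclosure"),
]
print(failure_frequency(events).most_common())
```

INFORMATION events are excluded by default so that the counts reflect only warnings and errors; pass a different `levels` tuple to change the filter.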