Cisco UCS Monitoring


Overview

The Cisco Unified Computing System (Cisco UCS) is a data center platform that unites compute, network, storage access, and virtualization into a cohesive system designed to reduce total cost of ownership (TCO) and increase business agility.

Applications Manager monitors Cisco UCS environments, letting you track key performance indicators (KPIs) of the applications and the system. It collects Cisco UCS data in real time and presents it in an easy-to-understand dashboard, which helps you identify why your system has deviated from its ideal performance.

Creating a new monitor

Follow the steps given below to create a new Cisco UCS monitor:

  1. Click the New Monitor link. Select Cisco UCS under the Converged Infrastructure category.
  2. Specify the Display Name.
  3. Enter the Hostname / IP address of the server on which Cisco UCS Manager is running.
  4. Specify the Port at which Cisco UCS Manager is running. The default port is 80.
  5. Select the SSL is enabled option if Cisco UCS Manager is to be accessed over an SSL port.
  6. Enter the credentials (username and password) of the Cisco UCS Manager for authentication, or enable the Select from Credential list option and choose the required credentials from the Credential Manager list.
  7. Specify the Timeout value in seconds.
  8. Specify the Polling Interval in minutes.
  9. Choose the Monitor Group with which you want to associate the Cisco UCS monitor from the combo box (optional). You can associate the monitor with multiple groups.
  10. Click Add Monitor(s). This discovers the Cisco UCS Manager from the network and starts monitoring it.
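
Before adding the monitor, it can help to confirm that the values entered in steps 3 to 6 describe a reachable Cisco UCS Manager. The sketch below is not how Applications Manager itself connects. It assumes the UCS Manager XML API is exposed at the `/nuova` path and accepts an `aaaLogin` request, and the helper names `build_endpoint` and `build_login_body` are hypothetical. It only builds the request; it does not send it.

```python
from xml.sax.saxutils import quoteattr


def build_endpoint(host: str, port: int = 80, ssl_enabled: bool = False) -> str:
    """Return the URL a client would POST to (assumes the XML API path /nuova)."""
    scheme = "https" if ssl_enabled else "http"
    return f"{scheme}://{host}:{port}/nuova"


def build_login_body(username: str, password: str) -> str:
    """Return an aaaLogin request body with the attribute values safely quoted."""
    return (
        f"<aaaLogin inName={quoteattr(username)} "
        f"inPassword={quoteattr(password)} />"
    )


if __name__ == "__main__":
    # Example values only; substitute the hostname, port, and credentials
    # you plan to enter in the New Monitor form.
    print(build_endpoint("ucs.example.com", 443, ssl_enabled=True))
    print(build_login_body("admin", "secret"))
```

Sending the body to the endpoint with any HTTP client and checking for a successful login response would confirm the hostname, port, SSL setting, and credentials before they are saved in the monitor.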

Monitored Parameters

Go to the Monitors Category View by clicking the Monitors tab. Click Cisco UCS under the Converged Infrastructure table. This displays the Cisco UCS bulk configuration view, which is organized into three tabs:

  • The Availability tab shows the availability history for the past 24 hours or 30 days.
  • The Performance tab shows the health status and events for the past 24 hours or 30 days.
  • The List view lets you perform bulk admin configurations.
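
The availability history is, in essence, the share of polls in a time window during which the monitor was up. A minimal sketch of that calculation follows, assuming each poll is recorded as a timestamp with an up/down flag. This record layout and the function name `availability_percent` are illustrative and are not the product's data model.

```python
from datetime import datetime, timedelta


def availability_percent(polls, now, window=timedelta(hours=24)):
    """Percentage of polls inside the window that reported the monitor as up.

    polls: iterable of (timestamp, is_up) pairs. Returns None when the window
    contains no polls, so "no data" is not confused with 0% availability.
    """
    recent = [is_up for ts, is_up in polls if now - window <= ts <= now]
    if not recent:
        return None
    return 100.0 * sum(recent) / len(recent)


if __name__ == "__main__":
    now = datetime(2024, 1, 2, 12, 0)
    polls = [
        (now - timedelta(hours=30), False),  # outside the 24-hour window
        (now - timedelta(hours=3), True),
        (now - timedelta(hours=2), False),
        (now - timedelta(hours=1), True),
        (now, True),
    ]
    print(availability_percent(polls, now))  # 3 of 4 polls up
```

Passing `window=timedelta(days=30)` gives the 30-day figure from the same data.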

Clicking a monitor in the list takes you to the Cisco UCS monitor dashboard, which has 9 tabs:

Overview

This tab provides details about the overall count of all the components present in the UCS system.

Parameter | Description
Server Response Time:
Response time | The response time of the Cisco UCS Manager. (ms)
Components:
Total number of Chassis Servers | The total number of chassis servers in the UCS system.
Total number of Rack Mount Servers | The total number of rack mount servers in the UCS system.
Total number of Fabric Extenders | The total number of fabric extenders in the UCS system.
Total number of Fabric Interconnects | The total number of fabric interconnects in the UCS system.
Total number of Processor Units | The total number of processor units in the UCS system.
Total number of Adaptor Units | The total number of adaptor units in the UCS system.
Total number of I/O Modules | The total number of input-output modules in the UCS system.

Chassis

This tab provides details about the performance metrics of various chassis available in the UCS system.

Parameter | Description
Chassis:
Chassis Name | The name of the chassis.
Chassis Server Count | The number of chassis servers present in the chassis.
I/O Module Count | The number of input-output modules present in the chassis.
Fan Count | The number of fans available in the chassis.
Power Unit Count | The number of power supply units (PSUs) available for the chassis.
Chassis Operational Status | The operational status of the chassis. (Operable / Degraded)
Chassis Server:
Name | The name of the blade server.
Chassis Name | The name of the chassis.
Model | The model name of the blade server.
Operability | The operability condition of the blade server.
Power | The power state of the blade server. (On / Off)
Adaptor count | The number of adaptors available.
Network Interface Cards | The number of Network Interface Cards present.
Memory and CPU:
Name | The name of the blade server.
Chassis Name | The name of the chassis.
Core count | The number of CPU cores available.
Core Enabled | The number of CPU cores that are enabled.
CPU count | The number of CPUs available.
Thread count | The total number of threads available in the CPUs.
Available Memory | The amount of memory available in the server. (GB)
Total Memory | The total amount of memory allocated to the server. (GB)
Available Memory % | The percentage of memory available in the server. (%)
Utilized Memory % | The percentage of memory utilized by the server. (%)
Motherboard Power:
Name | The name of the blade server.
Chassis Name | The name of the chassis.
Power Consumed | The amount of power currently being consumed by the motherboard. (Watts)
Max Power Consumed | The maximum amount of power consumed by the motherboard. (Watts)
Min Power Consumed | The minimum amount of power consumed by the motherboard. (Watts)
Input Current | The input current currently flowing to the motherboard. (Amperes)
Max Input Current | The maximum input current received by the motherboard. (Amperes)
Min Input Current | The minimum input current received by the motherboard. (Amperes)
Input Voltage | The input voltage currently being fed to the motherboard. (Volts)
Max Input Voltage | The maximum voltage received by the motherboard. (Volts)
Min Input Voltage | The minimum voltage received by the motherboard. (Volts)
Motherboard Temperature:
Name | The name of the blade server.
Chassis Name | The name of the chassis.
Front Temperature | The temperature value indicated by the front-panel temperature sensor. (Celsius)
Rear Temperature | The temperature value indicated by the rear-panel temperature sensor. (Celsius)
Rear Temperature Left | The temperature value indicated by the left rear-panel temperature sensor. (Celsius)
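
The two memory percentages in the Memory and CPU group follow directly from Available Memory and Total Memory. The sketch below shows the arithmetic under the assumption that utilized memory is simply total minus available; the function name `memory_percentages` is hypothetical, and the product may round or derive these values differently.

```python
from typing import Tuple


def memory_percentages(available_gb: float, total_gb: float) -> Tuple[float, float]:
    """Return (Available Memory %, Utilized Memory %) rounded to two decimals."""
    if total_gb <= 0:
        raise ValueError("total memory must be positive")
    if not 0 <= available_gb <= total_gb:
        raise ValueError("available memory must be between 0 and total memory")
    available_pct = round(100.0 * available_gb / total_gb, 2)
    # The two percentages are complementary by construction.
    return available_pct, round(100.0 - available_pct, 2)


if __name__ == "__main__":
    # A server with 128 GB total and 96 GB available.
    print(memory_percentages(96, 128))
```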

Rack Mount

This tab provides details about the performance metrics of various rack mount servers available in the UCS system.

Parameter | Description
Rack Mount Server:
Name | The name of the rack server.
Model | The model name of the rack server.
Operability | The operability condition of the rack server.
Power | The power state of the rack server. (On / Off)
Adaptor count | The number of adaptors available.
Network Interface Cards | The number of Network Interface Cards available.
Memory and CPU:
Name | The name of the rack server.
Core Count | The number of CPU cores available.
Core Enabled | The number of CPU cores enabled.
CPU Count | The number of CPUs available.
Thread Count | The total number of threads available in the CPUs.
Available Memory | The amount of memory available in the server. (GB)
Total Memory | The total amount of memory allocated to the server. (GB)
Available Memory % | The percentage of memory available in the server. (%)
Utilized Memory % | The percentage of memory utilized by the server. (%)
Motherboard Power:
Name | The name of the rack server.
Power Consumed | The amount of power currently being consumed by the motherboard. (Watts)
Max Power Consumed | The maximum amount of power consumed by the motherboard. (Watts)
Min Power Consumed | The minimum amount of power consumed by the motherboard. (Watts)
Input Current | The input current currently flowing to the motherboard. (Amperes)
Max Input Current | The maximum input current received by the motherboard. (Amperes)
Min Input Current | The minimum input current received by the motherboard. (Amperes)
Input Voltage | The input voltage currently being fed to the motherboard. (Volts)
Max Input Voltage | The maximum voltage received by the motherboard. (Volts)
Min Input Voltage | The minimum voltage received by the motherboard. (Volts)
Motherboard Temperature:
Name | The name of the rack server.
Front Temperature | The temperature value indicated by the front-panel temperature sensor. (Celsius)
Rear Temperature | The temperature value indicated by the rear-panel temperature sensor. (Celsius)
Ambient Temperature | The ambient temperature value of the motherboard. (Celsius)
IO Hub1 Temperature Right | The temperature value of I/O Hub1. (Celsius)
IO Hub2 Temperature Right | The temperature value of I/O Hub2. (Celsius)
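
With several temperature sensors reported per server, a common use of these metrics is to flag the sensors that exceed a limit. The sketch below illustrates that check on a plain dictionary of readings. The function name `sensors_over_limit`, the sample values, and the 45 °C limit are all assumptions chosen for illustration, not product defaults.

```python
def sensors_over_limit(readings, limit_celsius):
    """Return names of sensors whose reading exceeds the limit, hottest first.

    readings: mapping of sensor name to temperature in Celsius. A value of
    None means the sensor did not report and is skipped.
    """
    hot = {
        name: value
        for name, value in readings.items()
        if value is not None and value > limit_celsius
    }
    return sorted(hot, key=hot.get, reverse=True)


if __name__ == "__main__":
    sample = {
        "Front Temperature": 28.0,
        "Rear Temperature": 47.5,
        "Ambient Temperature": None,  # sensor did not report
        "IO Hub1 Temperature": 52.0,
    }
    print(sensors_over_limit(sample, limit_celsius=45.0))
```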

Fabric Interconnect

This tab provides details about the performance metrics of various Fabric Interconnects available in the UCS system.

Parameter | Description
Fabric Interconnect:
Name | The name of the Fabric Interconnect (FI).
Fan | The number of fans associated with the FI.
Power Supply Unit Count | The number of power supply units (PSUs) available in the FI.
Fabric Interconnect Software:
Name | The name of the Fabric Interconnect (FI).
Available Memory | The amount of memory available in the server. (GB)
Total Memory | The total amount of memory allocated to the server. (GB)
Cached Memory | The amount of cached memory of the server. (GB)
Available Memory % | The percentage of memory available in the server. (%)
CPU utilization % | The current CPU utilization of the server. (%)
Fabric Interconnect Power Unit (PSU):
Name | The name of the PSU.
Fabric Interconnect | The name of the Fabric Interconnect (FI).
Power Consumed | The amount of power currently being consumed by the PSU. (Watts)
Max Power Consumed | The maximum amount of power consumed by the PSU. (Watts)
Min Power Consumed | The minimum amount of power consumed by the PSU. (Watts)
Input Current | The input current currently flowing to the PSU. (Amperes)
Max Input Current | The maximum input current received by the PSU. (Amperes)
Min Input Current | The minimum input current received by the PSU. (Amperes)
Input Voltage | The input voltage currently being fed to the PSU. (Volts)
Max Input Voltage | The maximum voltage received by the PSU. (Volts)
Min Input Voltage | The minimum voltage received by the PSU. (Volts)
Fabric Extender:
Name | The name of the Fabric Extender (FEX).
Fan | The number of fans associated with the FEX.
I/O Module | The number of I/O modules present in the FEX.
Power Supply Unit Count | The number of power supply units (PSUs) present in the FEX.

Processors

This tab provides details about the performance metrics of various processors available in the UCS system.

Parameter | Description
Processors:
Name | The name of the processor.
Equipment | The equipment in which the processor is present.
Model | The model name of the processor.
Speed | The speed of the processor.
Core Count | The number of cores available.
Core Enabled | The number of cores enabled.
Thread Count | The number of threads available.
CPU Temperature | The current temperature value of the CPU.
CPU Input current | The present input current value of the CPU.

Fans

This tab provides details about the performance metrics of various fans available in the UCS system.

Parameter | Description
Fan Module:
Name | The name of the fan module.
Equipment | The equipment in which the fan module is present.
Fans | The number of fans available in the module.
Thermal Condition | The thermal condition of the fan module.
Fan Module Power | The power state of the fan module. (On / Off)
Fan Module Operability | The operability of the fan module.
Fans:
Name | The name of the fan.
Fan Module | The name of the fan module.
Equipment | The equipment in which the fan module is present.
Model | The model name of the fan.
Thermal Condition | The thermal condition of the fan.
Fan Power | The power state of the fan. (On / Off)
Fan Operability | The operability of the fan.
Drive Percentage | The drive performance of the fan. (%)
Speed | The speed of the fan. (RPM)
Max Speed | The maximum speed of the fan. (RPM)
Min Speed | The minimum speed of the fan. (RPM)

I/O Module

This tab provides details about the performance metrics of various I/O modules available in the UCS system.

Parameter | Description
I/O Module:
Name | The name of the I/O module.
Equipment | The equipment in which the I/O module is present.
Model | The model name of the I/O module.
Thermal Condition | The thermal condition of the I/O module.
Operability | The operability of the I/O module.

Ports

This tab provides details about the performance metrics of various ports available in the UCS system.

Parameter | Description
Ethernet ports:
Name | The name of the Ethernet port.
Equipment | The equipment in which the Ethernet port is present.
Mac Address | The MAC address of the Ethernet port.
Interface Role | The interface role of the Ethernet port.
Interface Type | The interface type of the Ethernet port.
Ethernet port status | The operability status of the Ethernet port.
Ethernet Admin State | The admin state of the Ethernet port.
Slot ID | The slot ID associated with the Ethernet port.
Operational Speed | The operational speed of the Ethernet port.
Backplane ports:
Name | The name of the backplane port.
Equipment | The equipment in which the backplane port is present.
Slot ID | The slot ID associated with the backplane port.
Mac Address | The MAC address of the backplane port.
Interface Role | The interface role of the backplane port.
Interface Type | The interface type of the backplane port.
BackPlane Port status | The operability status of the backplane port.
BackPlane Admin State | The admin state of the backplane port.
Fabric ports:
Name | The name of the fabric port.
Equipment | The equipment in which the fabric port is present.
Slot ID | The slot ID associated with the fabric port.
Mac Address | The MAC address of the fabric port.
Interface Role | The interface role of the fabric port.
Interface Type | The interface type of the fabric port.
Fabric Port status | The operability status of the fabric port.
Fabric Admin State | The admin state of the fabric port.
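
Because each port reports both an admin state and an operability status, comparing the two is a quick way to spot ports that should be working but are not. The sketch below assumes port records are dictionaries and that the state values are the strings `"enabled"` and `"up"`. Those strings, the field names, and the function name `ports_needing_attention` are assumptions for illustration; check the values your dashboard actually shows.

```python
def ports_needing_attention(ports):
    """Return names of ports that are administratively enabled but not operationally up."""
    return [
        port["name"]
        for port in ports
        if port["admin_state"] == "enabled" and port["status"] != "up"
    ]


if __name__ == "__main__":
    sample_ports = [
        {"name": "eth-1/1", "admin_state": "enabled", "status": "up"},
        {"name": "eth-1/2", "admin_state": "enabled", "status": "down"},
        {"name": "eth-1/3", "admin_state": "disabled", "status": "down"},
    ]
    # Only eth-1/2 is enabled yet not up; eth-1/3 is down on purpose.
    print(ports_needing_attention(sample_ports))
```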

Faults

This tab provides details about the faults that are present in the UCS system.

Parameter | Description
Fault Statistics:
Critical Faults | The number of faults that are of Critical severity.
Major Faults | The number of faults that are of Major severity.
Minor Faults | The number of faults that are of Minor severity.
Warning Faults | The number of faults that are of Warning severity.
Faults between consecutive polls:
Fault Code | The fault code that describes the fault.
Fault Id | The ID of the fault that occurred.
Type | The severity type of the fault that occurred. (Critical / Major / Minor / Warning)
Fault Affected object | The hardware object that was affected by the fault.
Fault Cause | The cause of the fault.
Fault Created Time | The time at which the fault was created.
Last transition Time | The time at which the state of the fault last changed.
Fault Description | The description of the fault that occurred.
Show All Faults | Displays all the faults that are currently present in the system.
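
The two groups in this tab can be pictured as two views over the same fault list: a count per severity, and the subset of faults that appeared since the previous poll. The sketch below illustrates both. It assumes each fault is a dictionary with `type` and `created` fields, and that "between consecutive polls" means the Fault Created Time falls after the previous poll and up to the current one. The field names, function names, and that interpretation are assumptions, not the product's documented behavior.

```python
from collections import Counter
from datetime import datetime

SEVERITIES = ("Critical", "Major", "Minor", "Warning")


def fault_statistics(faults):
    """Count faults per severity, always reporting all four severities."""
    counts = Counter(fault["type"] for fault in faults)
    return {severity: counts.get(severity, 0) for severity in SEVERITIES}


def faults_between_polls(faults, previous_poll, current_poll):
    """Return faults created after the previous poll, up to and including the current one."""
    return [f for f in faults if previous_poll < f["created"] <= current_poll]


if __name__ == "__main__":
    faults = [
        {"id": 1, "type": "Major", "created": datetime(2024, 1, 1, 9, 0)},
        {"id": 2, "type": "Critical", "created": datetime(2024, 1, 1, 10, 3)},
        {"id": 3, "type": "Major", "created": datetime(2024, 1, 1, 10, 4)},
    ]
    print(fault_statistics(faults))
    new = faults_between_polls(
        faults, datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 5)
    )
    print([f["id"] for f in new])
```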