What is your first reaction to a network fault? Logging in directly to the device is not a recommended way to start troubleshooting an alert. Giving every team member access to all the devices in the network carries risks and does not scale as teams expand. That said, you want the team equipped with utilities that let them quickly troubleshoot network performance problems. Instant access to network troubleshooting tools helps isolate a fault quickly and start on the actions needed to resolve it as early as possible.
Network monitoring and diagnostic tools cover the core monitoring requirements such as bandwidth, availability and usage. The specifics to look for in an efficient network monitoring tool are extensive support for common protocols (SNMP, WMI, CLI) and technologies (NetFlow, sFlow, jFlow and packet sniffing). Configurable alert notifications, reporting capabilities, and customizable dashboards are features to keep in mind for ease of use and accessibility. Understanding the basic requirements and factoring in the essentials is necessary when choosing the right network monitoring solution. Apart from the features, there are several more critical aspects to keep in mind when selecting a network monitoring tool.
Network monitoring tools have the potential to incorporate Artificial Intelligence and Machine Learning, as both thrive on data. With machine learning, network monitoring tools can adapt to the networking environment and provide suggestions based on the data available.
Possibilities of network monitoring with AI and ML:
With rapid development in artificial intelligence, automation is at a tipping point. In network monitoring, automation lets monitoring tools react when thresholds or a set of rules or criteria are met. With automation, the monitoring tool can automatically detect and troubleshoot problems (proactive monitoring), send alert notifications, and provide suggestions for better network performance and maintenance based on usage and priority.
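To make the idea of reacting to thresholds concrete, here is a minimal sketch of rule-based alerting. The `Rule` class, the metric names, and the limits are hypothetical and chosen only for illustration; they do not describe how OpManager implements its rules.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Rule:
    """Hypothetical threshold rule: fires when a metric exceeds its limit."""
    metric: str
    limit: float
    severity: str


def evaluate(rules: List[Rule], sample: Dict[str, float]) -> List[str]:
    """Return one alert message for every rule whose limit is exceeded."""
    alerts = []
    for rule in rules:
        value = sample.get(rule.metric)
        # Metrics missing from the sample are skipped rather than treated as zero.
        if value is not None and value > rule.limit:
            alerts.append(
                f"{rule.severity}: {rule.metric}={value} exceeds {rule.limit}"
            )
    return alerts


# Illustrative rules and one polled sample (all values are made up).
rules = [
    Rule("cpu_percent", 90.0, "critical"),
    Rule("memory_percent", 80.0, "warning"),
]
sample = {"cpu_percent": 95.5, "memory_percent": 42.0}
alerts = evaluate(rules, sample)
```

In a real deployment the alert list would feed a notification channel; here it is simply returned so the logic stays easy to inspect.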
Benefits of Automation in network monitoring:
A common problem faced by IT operators and system administrators is the lack of visibility. A network monitoring tool that provides comprehensive, detailed and consolidated visibility into the various aspects of the network, with the flexibility to choose what you want to see, helps you stay on top of your network. The information from the various tools must be presented on a common screen with at-a-glance charts and intuitive graphs. A network monitoring tool can also be enhanced to perform more advanced operations, so support for add-ons and integrations is important for monitoring a wider range of the network.
Network scalability is an important aspect of selecting an efficient network monitoring tool, now that networks are constantly growing. A network monitoring tool is described as scalable when it adapts to the changing needs or demands of the business or its users. Scalability helps a network stay on par with increased productivity, trends, changing needs and new adaptations, and ensures that overall network performance does not degrade significantly as the network grows in size.
The following tools help you perform first and second level troubleshooting based on the nature of the network fault, making OpManager robust enough to be chosen as an enterprise network monitoring tool.
When you receive a 'device down' alert, the first condition that you might want to assess is if the device is reachable. From the device snapshot page in OpManager, do an instant ping and check for response. You can troubleshoot further using the other network monitoring tools if a ping to the device fails, or if the response time is very high.
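As an illustration of the reachability check described above, the sketch below interprets the summary text printed by a command-line `ping`. It assumes the common Linux/macOS summary format ("N% packet loss" and a "min/avg/max" round-trip line); the function name and the 200 ms "slow" cutoff are hypothetical, and this is not how OpManager's built-in ping works.

```python
import re


def assess_ping(output: str, slow_ms: float = 200.0) -> str:
    """Classify ping summary text as unreachable, slow, reachable or unknown."""
    loss = re.search(r"(\d+(?:\.\d+)?)% packet loss", output)
    if loss is None:
        return "unknown"  # summary line not found, so do not guess
    if float(loss.group(1)) >= 100.0:
        return "unreachable"
    # The average round-trip time is the second value after the '=' sign.
    rtt = re.search(r"=\s*[\d.]+/([\d.]+)/", output)
    if rtt is not None and float(rtt.group(1)) > slow_ms:
        return "slow"
    return "reachable"


# Sample summary in the assumed format (values are made up).
sample = (
    "4 packets transmitted, 4 received, 0% packet loss, time 3005ms\n"
    "rtt min/avg/max/mdev = 310.1/352.4/401.9/33.2 ms"
)
status = assess_ping(sample)
```

A "slow" or "unreachable" result is the cue to move on to the other tools, such as Traceroute.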
When you troubleshoot a device down alert using Ping and the device fails to respond, you can run a Traceroute to determine whether the device is unreachable because of a failure in the path. Trace the route from OpManager to the destination device, check the number of hops to the monitored device and spot the exact point of delay or outage. Again, this serves as first level troubleshooting and, based on the response, you can switch to other monitoring tools to drill down into a fault.
This tool shows the port-wise connectivity of devices to the network switches, which is necessary when troubleshooting high traffic issues. The Switch Port Mapper is a network monitoring tool that gives you the MAC address, IP address and DNS names of the devices connected to the switch.
When there is high system resource utilization, an instant check on the current resource performance helps in assessing how severe the performance impact is. An unattended resource crunch can lead to severe downtime. Let's assume you receive a threshold violation alert for memory utilization on a critical server. The first step would be to determine whether it is a transient spike or has been that way for some time. OpManager's real-time network monitoring tool comes in handy in such cases. The administrator can instantly view the current values and resolve the issue quickly.
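Spotting the point of delay or outage in a trace can be sketched as follows. The input is a hypothetical list of per-hop round-trip times in milliseconds, with `None` for hops that did not reply; the function name and the 100 ms jump threshold are assumptions made for this example, not OpManager behavior.

```python
from typing import List, Optional, Tuple


def locate_problem(
    hop_rtts: List[Optional[float]], jump_ms: float = 100.0
) -> Tuple[str, Optional[int]]:
    """Return ('outage_after_hop', n), ('delay_at_hop', n) or ('ok', None).

    hop_rtts[i] is the round-trip time of hop i + 1, or None for no reply.
    """
    # If the final hop never replied, report the last hop that did (0 if none).
    if hop_rtts and hop_rtts[-1] is None:
        last_ok = 0
        for hop, rtt in enumerate(hop_rtts, start=1):
            if rtt is not None:
                last_ok = hop
        return ("outage_after_hop", last_ok)

    previous = 0.0
    for hop, rtt in enumerate(hop_rtts, start=1):
        if rtt is None:
            continue  # some routers do not answer probes; skip them
        if rtt - previous > jump_ms:
            return ("delay_at_hop", hop)
        previous = rtt
    return ("ok", None)


# Made-up traces: a latency jump at hop 3, and a path that dies after hop 2.
delayed = locate_problem([1.2, 5.0, 180.3, 182.0])
broken = locate_problem([1.2, 5.0, None, None])
```

Comparing each hop against the previous responding hop, rather than against a fixed limit, points at the hop where latency was introduced instead of flagging every hop behind it.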
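The distinction between a transient spike and a sustained condition can be sketched as a simple check over recent samples. The function name, the sample window, and the 80 percent "sustained" fraction are hypothetical choices for illustration only.

```python
from typing import Sequence


def classify_violation(
    samples: Sequence[float], threshold: float, sustained_fraction: float = 0.8
) -> str:
    """Classify recent utilization samples against a threshold."""
    if not samples:
        return "no data"
    above = sum(1 for value in samples if value > threshold)
    if above == 0:
        return "normal"
    # Call it sustained when most of the window is above the threshold.
    if above / len(samples) >= sustained_fraction:
        return "sustained"
    return "transient spike"


# Made-up memory utilization readings (percent) from the last few polls.
spike = classify_violation([40, 42, 95, 41, 43], threshold=90)
sustained = classify_violation([92, 95, 97, 91, 96], threshold=90)
```

A sustained result would justify moving on to process-level diagnostics, while a single spike may only need watching.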
This network monitoring tool is very specific to server performance monitoring. Use this tool to launch the list of Top 10 processes by CPU or Memory utilization from the device snapshot page. This option lets you terminate the offending process immediately and avert a server crash.
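Ranking processes by resource usage, as described above, amounts to a top-N selection. The sketch below uses made-up process records with hypothetical field names (`name`, `cpu_percent`, `memory_percent`); it shows only the ranking step, not how the data is collected from a server.

```python
import heapq
from typing import Dict, List


def top_processes(
    processes: List[Dict], key: str = "cpu_percent", n: int = 10
) -> List[Dict]:
    """Return the n processes with the highest value for the given key."""
    return heapq.nlargest(n, processes, key=lambda proc: proc[key])


# Illustrative process snapshot (all values are made up).
snapshot = [
    {"name": "backup", "cpu_percent": 12.0, "memory_percent": 4.1},
    {"name": "db", "cpu_percent": 71.5, "memory_percent": 38.0},
    {"name": "web", "cpu_percent": 9.3, "memory_percent": 52.7},
]
by_cpu = [proc["name"] for proc in top_processes(snapshot, "cpu_percent", n=2)]
by_memory = [proc["name"] for proc in top_processes(snapshot, "memory_percent", n=2)]
```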
The MIB Browser tool is a complete SNMP MIB Browser that enables loading and browsing MIBs and allows you to perform all SNMP-related operations. You can also view, operate on, and set the data available through the SNMP agent running on a managed device. The built-in trap viewer lets you view all incoming traps, including those from devices that are not managed in OpManager. The MIB Browser is a comprehensive network monitoring tool for troubleshooting all SNMP-related monitoring issues.
Check out this article for steps to troubleshoot an SNMP monitor.
The Syslog Viewer in OpManager lets you view the syslog packets sent by the devices to the OpManager server. This network tool helps an administrator find out whether the monitored devices are correctly forwarding messages to the configured syslog server (the OpManager server in this case). You can choose to monitor specific syslog messages by configuring syslog monitoring rules that match specific criteria.
Establish a CLI session with Unix devices to troubleshoot quickly. You might want to execute some CLI commands on the device to check what is causing high CPU utilization and decide to terminate a process or kill a service to free up the resource. This tool serves as a first and second level troubleshooting utility, as it lets you act immediately on certain alerts using CLI commands.
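To illustrate what a rule that matches specific criteria might check, the sketch below decodes the priority value at the start of a syslog message (priority = facility × 8 + severity, as defined by the syslog protocol) and filters on severity and a substring. The function names and the rule shape are hypothetical and are not OpManager's rule format.

```python
import re
from typing import Optional, Tuple


def parse_priority(message: str) -> Optional[Tuple[int, int]]:
    """Return (facility, severity) from a leading '<PRI>' field, else None."""
    match = re.match(r"<(\d{1,3})>", message)
    if match is None:
        return None
    # PRI = facility * 8 + severity, so divmod by 8 recovers both parts.
    return divmod(int(match.group(1)), 8)


def matches_rule(message: str, max_severity: int, contains: str = "") -> bool:
    """True when severity is at least as urgent as max_severity and text matches.

    Lower severity numbers are more urgent (0 = emergency, 7 = debug).
    """
    parsed = parse_priority(message)
    if parsed is None:
        return False
    _, severity = parsed
    return severity <= max_severity and contains in message


# Sample message in the classic BSD syslog format.
msg = "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8"
facility, severity = parse_priority(msg)
is_match = matches_rule(msg, max_severity=3, contains="failed")
```

Here priority 34 decodes to facility 4 and severity 2 (critical), so a rule asking for severity 3 (error) or more urgent matches.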
Similar to CLI sessions for the Unix-based devices, you can authenticate to remote Windows devices from OpManager using the Remote Desktop Connection tool and perform the allowed operations on the device.
Yet another option to check if a device is reachable: a quick way to know whether a device, your web server for instance, is responding to an HTTP/HTTPS request. Newer network devices come with a built-in GUI for accessing the device, and you can connect to that GUI over HTTP/HTTPS.
Access OpManager anytime, from anywhere, using the new smartphone GUI. This lets you visualize your infrastructure, act on alerts, and drill down to the root cause of a problem without having to be physically present in your server room to resolve a fault. Here is a short video on the network monitoring tasks this tool can perform.