Chapter 1: Introduction

Log management basics

What are logs?

Logs serve as a digital trail of events that occur within your IT environment, providing a comprehensive overview of what's happening in real time. They are records of events that occur in a computer system, such as software or hardware errors, system events, security events, user activity, and network activity. They are stored in a file format that can be easily parsed and analyzed. Logs can be generated by various sources such as applications, operating systems, databases, network devices, and security devices.
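
To make "easily parsed" concrete, here is a minimal sketch of turning one log line into structured fields. The line layout, the field names, and the sample line are assumptions made for illustration; real log formats vary by product.

```python
import re
from datetime import datetime
from typing import Optional

# Hypothetical layout for illustration: "<ISO timestamp> <LEVEL> <source>: <message>"
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+(?P<source>[^:]+):\s+(?P<message>.*)$"
)

def parse_line(line: str) -> Optional[dict]:
    """Return the fields of one log line, or None if it does not match."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    # Convert the timestamp text into a datetime so it can be compared and sorted.
    record["timestamp"] = datetime.fromisoformat(record["timestamp"])
    return record

# Made-up sample line, not taken from any real system.
sample = "2024-01-15T10:32:07 ERROR payment-service: Database connection timed out"
record = parse_line(sample)
print(record["level"], record["source"], record["message"])
```

Once a line is split into fields like these, it can be filtered, counted, or correlated with records from other sources.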

Logs are an essential component for businesses of all sizes. They are used to track the performance and health of IT systems, troubleshoot issues, comply with regulations, and improve security. This chapter covers the basics of IT log management: what logs are, the types of logs, the benefits of log management, the challenges involved in managing logs, and the tools used to manage them.

Types of logs

| Log type | Origin | Uses |
| --- | --- | --- |
| Access log | When a user clicks on a link or makes a request to access a server. | To track activity in an IT environment. |
| Application log | Comes from applications and contains information about their behavior, errors, and usage. | To track and debug an application's operations. |
| System log | Comes from the operating system and contains information about system events, such as the start-up and shutdown of the system, hardware errors, and software errors. | To record communications about programs and system functions. |
| Security log | Comes from security devices such as firewalls, intrusion detection systems, and anti-virus software. | To analyze security-related events, such as attempted attacks, access violations, and malware detections. |
| Network log | Comes from network devices such as routers, switches, and load balancers. | To analyze network traffic such as packet drops, latency, and bandwidth usage. |

Additionally, custom logs, such as garbage collection logs, are generated based on specific events. Furthermore, developers can create entirely new, custom logs and define the log types and fields.

Why log management?

Imagine you're the CTO of a rapidly growing IT organization. You must ensure that the critical applications supporting your company's operations run seamlessly to provide customers with the best possible experience. However, you must also stay informed about the intricate technical events unfolding behind the scenes so that you can identify and resolve issues quickly. This is where log management comes into play.

A well-established log management system provides valuable insights into how your systems are functioning and helps your IT teams troubleshoot issues. Each activity that occurs in your IT environment generates logs that allow you to dive into the nuances of the process.

Log management can provide several benefits for an organization:

Troubleshooting

Logs can help you identify and troubleshoot issues in IT systems, reducing downtime and improving system availability.

Security

IT security teams can leverage logs to detect and respond to security threats, such as malware injections and attempted attacks.

Compliance

IT teams can use logs to demonstrate compliance with regulatory requirements, such as PCI DSS and HIPAA.

Performance

Logs can help application teams to optimize system performance, such as identifying bottlenecks and improving resource utilization.

Challenges in log management

Log management can be challenging due to the sheer volume of logs generated by different IT components. Some of the main challenges include:

Volume

Large IT systems generate a significant volume of logs, making it difficult to manage and analyze them manually.

Complexity

Logs are generated from different sources and in different formats, making it difficult to consolidate and analyze them.

Security

Logs contain sensitive information, such as user credentials and network topology, making it essential to protect them from unauthorized access.

Establishing a well-structured log management system can help organizations tackle these challenges and leverage the full potential of logs.
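
As a rough illustration of the complexity challenge, the sketch below maps lines in two different formats onto one shared set of fields. Both input formats, the JSON keys (`ts`, `severity`, `msg`), and the host names are hypothetical; a real pipeline would need one such mapping per source.

```python
import json

def normalize(raw: str, source: str) -> dict:
    """Map one raw log line onto a shared set of fields."""
    raw = raw.strip()
    if raw.startswith("{"):
        # JSON-formatted line with assumed keys "ts", "severity", and "msg".
        data = json.loads(raw)
        return {
            "source": source,
            "timestamp": data["ts"],
            "level": data["severity"].upper(),
            "message": data["msg"],
        }
    # Plain-text line assumed to be "<timestamp> <level> <message>".
    timestamp, level, message = raw.split(" ", 2)
    return {
        "source": source,
        "timestamp": timestamp,
        "level": level.upper(),
        "message": message,
    }

# Made-up lines from two imaginary sources.
lines = [
    ('{"ts": "2024-01-15T10:32:07", "severity": "error", "msg": "disk full"}', "db-01"),
    ("2024-01-15T10:32:09 WARN link flapping on port 3", "switch-02"),
]
consolidated = [normalize(raw, source) for raw, source in lines]
for entry in consolidated:
    print(entry)
```

With every record in the same shape, the consolidated list can be stored and analyzed in one place regardless of where each line came from.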

Tools used in log management

Tools are necessary to create a comprehensive log management system. Several tools are available for log management, ranging from open-source solutions to enterprise-grade software. Some of the commonly used tools include:

Log analysis tools

These tools analyze logs to identify patterns, anomalies, and trends, helping administrators to troubleshoot issues and improve system performance.

Log collection tools

These tools collect logs from different IT components, consolidate them, and store them in a central repository.

Log search tools

These tools enable administrators to search logs based on different criteria, such as time range, severity, and source.

Log visualization tools

These tools present logs in graphical formats, such as charts and dashboards, making it easier to understand log data.
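
The search criteria mentioned above (time range, severity, and source) can be sketched as a simple filter over structured records. This is only an illustration of the idea, not how any particular product works; the severity ranking and the sample records are assumptions.

```python
from datetime import datetime

# Assumed severity ranking; real tools define their own levels.
SEVERITY_ORDER = {"DEBUG": 0, "INFO": 1, "WARN": 2, "ERROR": 3}

def search(records, start=None, end=None, min_level="DEBUG", source=None):
    """Yield records that match every criterion that was supplied."""
    threshold = SEVERITY_ORDER[min_level]
    for record in records:
        if start is not None and record["timestamp"] < start:
            continue
        if end is not None and record["timestamp"] > end:
            continue
        if SEVERITY_ORDER[record["level"]] < threshold:
            continue
        if source is not None and record["source"] != source:
            continue
        yield record

# Made-up records for illustration only.
records = [
    {"timestamp": datetime(2024, 1, 15, 9, 0), "level": "INFO",
     "source": "web-01", "message": "request served"},
    {"timestamp": datetime(2024, 1, 15, 10, 30), "level": "ERROR",
     "source": "db-01", "message": "connection refused"},
    {"timestamp": datetime(2024, 1, 15, 11, 45), "level": "WARN",
     "source": "db-01", "message": "slow query"},
]

hits = list(search(records, start=datetime(2024, 1, 15, 10, 0),
                   min_level="WARN", source="db-01"))
for hit in hits:
    print(hit["timestamp"], hit["level"], hit["message"])
```

A production search tool would query an index rather than scan every record, but the criteria combine in the same way.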

What to expect from this e-book

This e-book features ManageEngine's log management journey, from having a disparate logging system to creating an efficient and semi-automated log management system that empowers our IT environment. You'll learn about how we manage and leverage our logs to improve our IT operations, plus real-world use cases to provide context and insights.

ManageEngine's log management journey in a nutshell

2012 - Present

Semi-automated log management system

Highlights

  • Dedicated modules for collection, storage and processing
  • Improved automation, security and efficiency
  • Better compliance and reliability of logs

Scope for improvement

  • Complete automation of log management
  • Incorporate log management into IT observability

2010 - 2012

Centralized log management

Highlights

  • Logs stored on a central server instead of the local server machines; no memory overload on local servers
  • Organized and safer access controls

Challenges

  • Memory overload in central server
  • Room for improvement in security and reliability

2000 - 2010

Disparate logging

Highlights

  • Logs stored in the local server machines
  • No proper access controls

Challenges

  • Memory overload in local server machines
  • Increased security risk due to poor access controls
  • Poor reliability of logs during audits

Figure 1: ManageEngine’s log management journey in a nutshell

Early days: Disparate logging

In the beginning, our customer base was small, and we stored logs in respective application servers:

Application server 1

Logs of application server 1

Application server 2

Logs of application server 2

And so on...

Figure 2: Disparate logging during the early days

As our organization grew, we faced the following challenges in managing the logs effectively.

  • The rising log data consumed more storage space, leading to recurring memory overload and poor application performance. We dealt with this by configuring alerts that fired when the memory threshold was reached, after which we would delete or purge the log files. This method was both time-consuming and inefficient.
  • There was a lack of security while accessing the logs. Developers had to access the servers directly to retrieve logs, increasing the security risk to our IT environment. Despite thorough background checks on all developers, there was always a risk of a disgruntled developer misusing sensitive information with the frequent access they had to production servers.
  • The third challenge was the reliability of logs, which came under scrutiny during IT audits, as developers had direct access to the servers and could tamper with the logs.
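
To give a feel for the threshold-and-purge routine described in the first challenge, here is a hedged sketch. It is not the script we used at the time; the function name, the size budget, and the choice to delete the oldest files first are all assumptions made for illustration.

```python
import os
import tempfile
from pathlib import Path
from typing import List

def purge_oldest_logs(log_dir: Path, max_total_bytes: int) -> List[Path]:
    """Delete the oldest *.log files until the directory fits the budget."""
    files = sorted(log_dir.glob("*.log"), key=lambda p: p.stat().st_mtime)
    total = sum(p.stat().st_size for p in files)
    deleted = []
    for path in files:
        if total <= max_total_bytes:
            break  # back under the threshold, nothing more to purge
        total -= path.stat().st_size
        path.unlink()
        deleted.append(path)
    return deleted

# Demonstration in a throwaway directory with three 100-byte files.
with tempfile.TemporaryDirectory() as tmp:
    log_dir = Path(tmp)
    for age, name in enumerate(["old.log", "mid.log", "new.log"]):
        path = log_dir / name
        path.write_bytes(b"x" * 100)
        os.utime(path, (1_000 + age, 1_000 + age))  # explicit, increasing mtimes
    removed = purge_oldest_logs(log_dir, max_total_bytes=250)
    remaining = sorted(p.name for p in log_dir.glob("*.log"))

print([p.name for p in removed], remaining)
```

Even in this small form the drawback is visible: the purge destroys log data, and someone still has to respond to the alert and run it on each server.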

We re-evaluated our log management strategy to tackle these challenges and establish a more effective system.

The next step forward: Centralized logging

In 2010, we took a significant step forward by introducing a centralized logging system. With this new system, we no longer stored logs on individual application servers. Instead, we collected logs from all sources and stored them in a single, dedicated server. This approach helped us address many problems we faced with disparate logging, including security and reliability concerns.
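
The idea of forwarding logs from every application server to one place can be sketched with Python's standard `logging` module. This is an in-memory illustration only: `CentralStore` and `ForwardingHandler` are hypothetical names, and a real deployment would send records over the network (for example with the standard library's `logging.handlers.SysLogHandler` or `SocketHandler`) rather than append to a list.

```python
import logging

class CentralStore:
    """Stand-in for the dedicated log server (hypothetical, in-memory)."""
    def __init__(self):
        self.entries = []

    def receive(self, host: str, line: str) -> None:
        self.entries.append((host, line))

class ForwardingHandler(logging.Handler):
    """Forward each formatted record to the central store instead of a local file."""
    def __init__(self, store: CentralStore, host: str):
        super().__init__()
        self.store = store
        self.host = host

    def emit(self, record: logging.LogRecord) -> None:
        self.store.receive(self.host, self.format(record))

store = CentralStore()
for host in ("app-server-1", "app-server-2"):
    logger = logging.getLogger(host)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records out of the root logger
    handler = ForwardingHandler(store, host)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.info("service started")

for host, line in store.entries:
    print(host, line)
```

Because the application servers no longer keep the logs themselves, access can be granted and audited on the central store alone.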

Centralized log server

Figure 3: Centralized logging introduced at ManageEngine

The central logging server provided a secure and streamlined way for developers to access logs without directly accessing the individual application servers. The central security team, responsible for monitoring and auditing access to secure servers, now had a more reliable system in place to monitor all activities on the logging server.

However, as our user base expanded, we encountered newer challenges. The increased number of users logging in to the central server resulted in higher memory usage and subsequent memory overload. Even though central logging reduced the risk of tampering, we still had room for improvement in terms of security and scalability of our log management system.

Current approach: A semi-automated log management engine

Our IT leaders were determined to find a solution that would put an end to memory overloads, security risks, and cumbersome log audits. Consequently, in 2012, we established an enhanced logging system—the result of years of hard work, innovation, and continual improvement.

Let's dive into the details of our system and how it solved the challenges we once faced.

Overcoming the memory overload challenge

As explained earlier, we faced frequent memory overloads when logs were stored directly on the application servers. Later, after centralizing the storage of logs, we faced another challenge: increased user activity on the central server caused memory usage to skyrocket, leading to system crashes and unavailability.

In our current system, we've implemented a distributed file system for back-end storage and a separate console for user activity to overcome this challenge. This system has enabled developers to access the console securely from their browser and search for logs (similar to searching for other files on their computer) without the risk of memory overload. This solution has proven to be scalable, reliable, and secure, providing a seamless experience for our developers.

Tackling the security risks challenge

Our earlier logging system had inherent security risks, as developers had to log in to either the application servers or the central server to access logs. This created the possibility of log tampering and made our system vulnerable to security breaches.

Our current, semi-automated logging system has resolved these concerns by providing secure access to logs via the browser. Our developers can now use their existing login credentials to easily search for and fetch logs, with the system designed to securely store, index, and retrieve logs, without the need for separate login credentials.

Resolving the cumbersome log audits challenge

Our earlier systems allowed developers to modify the log audit trail by logging in to the application servers or centralized controls, making the log audit process time-consuming and complex. Our current system incorporates a log agent that runs parallel to the builds created by developers to address this challenge.

The log agent records all the activities of developers, including the data logged and the data sent to the log servers. As a result, log audits are now simple and streamlined, saving us valuable time and resources.

How did we develop a system that overcame these challenges? And, how does this system operate today, helping us improve our IT environment? Continue reading for the answers to these questions and more.
