Different metrics that you can adopt

To measure the success of your threat hunting program and ensure a better ROI, you need to track two types of metrics:

  • Actionable metrics: Provide data-driven feedback on the hunting program.
  • Performance indicators: Tell you whether the hunting program is successful.

It's essential to use both kinds of metrics to continuously identify gaps, fix them, and optimize your hunting program.

The actionable metrics are empirical, so they are easier to track than performance indicators. They should be updated in real time and made accessible to everyone in your security operations center (SOC).

On the other hand, the performance indicators (PIs) measure the success of the hunting program against your core principles and mission. These metrics do not take into account the number of threat hunts completed or the techniques used by analysts. These metrics track only the effectiveness of your hunting program and the percentage of analysts who are improving their hunting skills.

Below are the basic actionable metrics and performance indicators that you can use to measure the performance of your hunting program.

Actionable metrics

1. Data coverage and retention time

Data sets are the main source of input for hunting. Concentrate on measuring the availability of different data types based on network segment or organizational unit. You can even measure this granularly at the endpoint level.

While data coverage is the variety of data sets that you bring into your threat hunting program, data retention time specifies how long those data sets remain available to hunting techniques. The shorter your data retention time, the higher your data miss ratio will be.

These metrics not only help you set the basics of hunting techniques but also reveal whether your hunting techniques are biased toward one data set.
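As a sketch of how data coverage might be tracked, the snippet below computes per-segment coverage from a hypothetical endpoint inventory. The segment names, required data types, and dictionary structure are illustrative assumptions, not a standard schema.

```python
# Hypothetical required log sources and a per-segment inventory of what
# each network segment actually ships. All names here are assumptions.
REQUIRED_SOURCES = {"process", "network", "dns", "auth"}

segments = {
    "dmz":      {"process", "network", "dns"},
    "corp-lan": {"process", "auth"},
    "cloud":    {"process", "network", "dns", "auth"},
}

def coverage_by_segment(segments, required):
    """Return the fraction of required data types available per segment."""
    return {
        name: len(available & required) / len(required)
        for name, available in segments.items()
    }

print(coverage_by_segment(segments, REQUIRED_SOURCES))
# {'dmz': 0.75, 'corp-lan': 0.5, 'cloud': 1.0}
```

The same idea can be applied at the endpoint level by keying the inventory on hostnames instead of segments.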

2. Data hit or miss ratio

When you formulate a technique and the data source for it is readily available, it is a data hit. A miss is when a data source needed for a hunt is not readily available.

Tracking these ratios helps you fix the gaps in data coverage and retention.
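The hit ratio described above is a simple fraction. A minimal sketch, assuming each hunt record carries a boolean flag for whether its data source was readily available (the field names and sample hunts are hypothetical):

```python
def data_hit_ratio(hunts):
    """Fraction of hunts whose required data source was readily available.

    hunts: list of dicts, each with a boolean 'data_available' flag.
    A hit means the data the hunt needed was there; a miss means it was not.
    """
    if not hunts:
        return 0.0
    hits = sum(1 for h in hunts if h["data_available"])
    return hits / len(hunts)

# Illustrative hunt log (fields and names are assumptions, not a standard format).
hunt_log = [
    {"hunt": "lateral-movement", "data_available": True},
    {"hunt": "dns-tunneling",    "data_available": False},
    {"hunt": "persistence",      "data_available": True},
    {"hunt": "exfiltration",     "data_available": True},
]

print(f"hit ratio: {data_hit_ratio(hunt_log):.0%}")  # hit ratio: 75%
```

A falling hit ratio points back at the data coverage and retention gaps discussed in metric 1.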

3. Number of compromised hosts detected by severity

Measuring the trend of compromised hosts over time can help analysts identify the gaps and/or misconfigured security settings on the devices. Further, it helps you to focus on developing hunts for the most affected and crucial components of your network, such as internet-facing resources.

4. Mean time to detect (MTTD) a threat

In an ideal cyber space, the time taken to hunt an incident should be near zero. But we don't always live in an ideal world, do we?

MTTD has two components: the time from infection to detection, and the time from detection to the start of investigation. Develop a plan to measure both.

Please note that MTTD does not exactly express the efficiency of your threat hunting. Instead, it helps you normalize your hunting procedures. It determines if there are steps in the kill chain (or attack model) you may be focusing on too much.
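Both MTTD components can be computed as simple averages over incident timestamps. The sketch below assumes each incident record carries `infected`, `detected`, and `investigated` timestamps; the field names and sample data are hypothetical.

```python
from datetime import datetime
from statistics import mean

# Illustrative incident records; timestamps and field names are assumptions.
incidents = [
    {"infected":     datetime(2023, 5, 1, 8, 0),
     "detected":     datetime(2023, 5, 3, 8, 0),
     "investigated": datetime(2023, 5, 3, 14, 0)},
    {"infected":     datetime(2023, 5, 10, 9, 0),
     "detected":     datetime(2023, 5, 11, 9, 0),
     "investigated": datetime(2023, 5, 11, 21, 0)},
]

def mean_hours(incidents, start, end):
    """Average elapsed time, in hours, between two timestamp fields."""
    return mean((i[end] - i[start]).total_seconds() / 3600 for i in incidents)

print("infection -> detection:", mean_hours(incidents, "infected", "detected"), "h")        # 36.0 h
print("detection -> investigation:", mean_hours(incidents, "detected", "investigated"), "h")  # 9.0 h
```

Tracking the two averages separately shows whether delays come from discovery or from triage.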

5. Number of backlogged hunts

This metric provides insight into where you need to focus, including whether a hunt is on hold due to the unavailability of a data source, threat intelligence, or people, among other causes.

This metric is one of the basic units for measuring the gaps in your threat hunting program. Analyzing its trend will help you plan resource utilization effectively and channel effort to the areas that need the most attention.

6. Time taken to complete a hunt

This metric measures the total time taken to complete a hunt, starting with the creation of the hypothesis: from pulling the data sources needed for the hunt and searching through the security data, to analyzing the impact of the threat.

This is one of the best parameters for measuring the learning curve of your hunters. When the time taken to hunt a specific threat keeps dropping, you know the analyst is becoming more proficient in the technique and it's time to optimize the plan, such as by introducing automation.
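One way to spot that learning curve programmatically is to compare recent hunt durations against earlier ones for the same technique. The window size, hunt names, and duration values below are illustrative assumptions.

```python
from statistics import mean

# Hours to complete the same hunt on successive runs (illustrative numbers).
durations = {"dns-tunneling": [12.0, 9.5, 7.0, 5.5]}

def is_improving(history, window=2):
    """True if the mean of the most recent runs beats the earliest runs."""
    if len(history) < 2 * window:
        return False  # not enough runs to compare
    return mean(history[-window:]) < mean(history[:window])

for hunt, history in durations.items():
    if is_improving(history):
        print(f"{hunt}: completion time trending down; consider automating")
```

A hunt flagged this way is a candidate for the automation discussed in the next metric.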

7. Number of hunts that are automated

This parameter greatly helps in assessing the maturity of your hunting program.

Automating more hunts gives your hunters the bandwidth to research and come up with new hunting hypotheses.

Performance indicators (PIs)

Below are some of the tested and trusted performance indicators that help you determine whether your hunting program is a success or needs some work.

1. Number of successfully executed hunts that are mapped to:
  • Data sets
  • ATT&CK attack model or cyber kill chain

Use various metrics such as number of hunts and detections, or analysis time spent on:

  • Data sets from hosts, applications, cloud resources, and more.
  • Each element of the ATT&CK matrix or each stage of the cyber kill chain (reconnaissance, weaponization, delivery, etc.)

For instance, if your enterprise hosts and stores customer data, your hunting strategy will focus on detecting threats to data security and intrusions. This PI therefore helps you align your hunting program with your goals by mapping hunts to the corresponding resources and attack model stages.

2. False positives for a hunt

Every detected threat is a win for the enterprise; we hunt threats because other detection systems can fail to discover them.

Therefore, this indicator is essential to measure the performance of a hunting technique. Automation is an indication that your hunting techniques are becoming efficient. However, even after a hunt is automated, it's essential to keep track of how many false positives have been created by those automated techniques and to check if they require improvements.
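The false positive rate of an automated hunt is another simple fraction worth putting on a dashboard. In the sketch below, each detection is a boolean (confirmed threat or false positive); the sample values and the 50% tuning threshold are illustrative assumptions, not a recommended policy.

```python
def false_positive_rate(detections):
    """detections: list of booleans, True = confirmed threat,
    False = false positive raised by the automated hunt."""
    if not detections:
        return 0.0
    return detections.count(False) / len(detections)

# Illustrative results from one automated hunt (values are assumptions).
automated_hunt_results = [True, False, False, True, False]

rate = false_positive_rate(automated_hunt_results)
print(f"false positive rate: {rate:.0%}")  # false positive rate: 60%
if rate > 0.5:  # illustrative threshold for review
    print("hunt logic likely needs tuning")
```

Tracking this rate per automated hunt over time shows which automations are drifting and need their detection logic revisited.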

3. Number of gaps spotted and filled in the threat detection capability of traditional security solutions

One of the goals of hunting is to create new automations that can hunt for threats that are missed by the existing security solutions deployed in your environment.

Therefore, measuring the number of detection gaps identified and filled by a hunting technique is essential to meet the mission of a hunting program.

4. Insecure practices identified and corrected

Similar to the identification and fixing of detection gaps, a good hunting program should also uncover any security loopholes and bad security hygiene practices that may give attackers an opening to attack your network.

While measuring the number of bad practices detected by a hunting technique helps determine the effectiveness of the hunting program, fixing those bad practices increases the effectiveness of the hunting technique.

Making the metrics accessible

Your job doesn't end with measuring performance in terms of metrics. These metrics need to be constantly monitored and made accessible to the entire security operations center (SOC) team. This is where solutions with security orchestration, automation, and response (SOAR) capabilities play a crucial role. A security information and event management (SIEM) solution with SOAR capabilities can display these metrics, especially the actionable metrics, on dashboards using a combination of total counts, bar charts, pie graphs, etc. to help you assess the status of the hunting program at any given time.

It is vital not only to view the statistics of actionable metrics over short periods (daily, weekly, or monthly) but also to monitor them long term to assess the overall performance of the hunting program.

Analyzing this dashboard facilitates easy decision-making and helps with realigning the hunting strategy wherever needed.

Regularly review the actionable metrics dashboard for the following, and perform the recommended actions.

  • Growing hunting backlog: Increase the number of threat analysts, and realign tools and techniques to shorten the hunting cycle.
  • Frequent hunts on a specific data source: This could be an indication that your technique is biased towards one environment. It's time to balance your focus.
  • Increased number of false positives: Check if the automation is implemented correctly.
  • More hunting hypotheses than implemented hunts: Check whether you have the right processes in place to implement hunting techniques, track the findings, and remediate the threat condition. Fine-tune wherever necessary so that you can implement more hunts efficiently and quickly.

Remember, these are just a few examples of how you can interpret the metrics. What matters most is that you measure them in whatever way you can, so that you gain maximum visibility into your threat hunting process.