Blog
/
Proactive Security
/
June 25, 2024

Let the Dominos Fall! SOC and IR Metrics for ROI

Vendors are scrambling to compare MTTD metrics laid out in the latest MITRE Engenuity ATT&CK® Evaluations. But this analysis is reductive, ignoring the fact that in cybersecurity, there are far more metrics that matter.
Inside the SOC
Darktrace cyber analysts are world-class experts in threat intelligence, threat hunting and incident response, and provide 24/7 SOC support to thousands of Darktrace customers around the globe. Inside the SOC is exclusively authored by these experts, providing analysis of cyber incidents and threat trends, based on real-world experience in the field.
Written by
John Bradshaw
Sr. Director, Technical Marketing
Default blog imageDefault blog imageDefault blog imageDefault blog imageDefault blog imageDefault blog image
25
Jun 2024

One of the most enjoyable discussions (and debates) I engage in is the topic of Security Operations Center (SOC) and Incident Response (IR) metrics to measure and validate an organization’s Return on Investment (ROI). The debate part comes in when I hear vendor experts talking about “the only” SOC metrics that matter, and only list the two most well-known, while completely ignoring metrics that have a direct causal relationship.

In this blog, I will discuss what I believe are the SOC/IR metrics that matter, how each one has a direct impact on the others, and why organizations should ensure they are working towards the goal of why these metrics are measured in the first place: Reduction of Risk and Costs.

Reduction of Risk and Costs

Every security solution and process an organization puts in place should reduce the organization’s risk of a breach, exposure by an insider threat, or loss of productivity. How an organization realizes net benefits can be in several ways:

  • Improved efficiencies can result in SOC/IR staff focusing on other areas such as advanced threat hunting rather than churning through alerts on their security consoles. It may also help organizations dealing with the lack of skilled security staff by using Artificial Intelligence (AI) and automated processes.
  • A well-oiled SOC/IR team that has greatly reduced or even eliminated mundane tasks attracts, motivates, and retains talent resulting in reduced hiring and training costs.
  • The direct impact of a breach such as a ransomware attack can be devastating. According to the 2024 Data Breach Investigations Report by Verizon, MGM Resorts International reported the ALPHV ransomware cost the company approximately $100 million[1].
  • Failure to take appropriate steps to protect the organization can result in regulatory fines; and if an organization has, or is considering, purchasing Cyber Insurance, can result in declined coverage or increased premiums.

How does an organization demonstrate they are taking proactive measures to prevent breaches? That is where it's important to understand the nine (yes, nine) key metrics, and how each one directly influences the others, play their roles.

Metrics in the Incident Response Timeline

Let’s start with a review of the key steps in the Incident Response Timeline:

Seven of the nine key metrics are in the IR timeline, while two of the metrics occur before you ever have an incident. They occur in the Pre-Detection Stage.

Pre-Detection stage metrics are:

  • Preventions Per Intrusion Attempt (PPIA)
  • False Positive Reduction Rate (FPRR)

Next is the Detect and Investigate stage, there are three metrics to consider:

  • Mean Time to Detection (MTTD)
  • Mean Time to Triage (MTTT)
  • Mean Time to Understanding (MTTU)

This is followed by the Remediation stage, there are two metrics here:

  • Mean Time to Containment (MTTC)
  • Mean Time to Remediation / Recovery (MTTR)

Finally, there is the Risk Reduction stage, there are two metrics:

  • Mean Time to Advice (MTTA)
  • Mean Time to Implementation (MTTI)

Pre-Detection Stage

Preventions Per Intrusion Attempt

PPIA is defined as stopping any intrusion attempt at the earliest possible stage. Your network Intrusion Prevention System (IPS) blocks vulnerability exploits, your e-mail security solution intercepts and removes messages with malicious attachments or links, your egress firewall blocks unauthorized login attempts, etc. The adversary doesn’t get beyond Step 1 in the attack life cycle.

This metric is the first domino. Every organization should strive to improve on this metric every day. Why? For every intrusion attempt you stop right out of the gate, you eliminate the actions for every other metric. There is no incident to detect, triage, investigate, remediate, or analyze post-incident for ways to improve your security posture.

When I think about PPIA, I always remember back to a discussion with a former mentor, Tim Crothers, who discussed the benefits of focusing on Prevention Failure Detection.

The concept is that as you layer your security defenses, your PPIA moves ever closer to 100% (no one has ever reached 100%). This narrows the field of fire for adversaries to breach into your organization. This is where novel, unknown, and permuted threats live and breathe. This is where solutions utilizing Unsupervised Machine Learning excel in raising anomalous alerts – indications of potential compromise involving one of these threats. Unsupervised ML also raises alerts on anomalous activity generated by known threats and can raise detections before many signature-based solutions. Most organizations struggle to find strong permutations of known threats, insider threats, supply chain attacks, attacks utilizing n-day and 0-day exploits. Moving PPIA ever closer to 100% also frees your team up for conducting threat hunting activities – utilizing components of your SOC that collect and store telemetry to query for potential compromises based on hypothesis the team raises. It also significantly reduces the alerts your team must triage and investigate – solving many of the issues outlined at the start of this paper.

False Positive Reduction Rate

Before we discuss FPRR, I should clarify how I define False Positives (FPs). Many define FPs as an alert that is in error (i.e.: your EDR alerts on malware that turns out to be AV signature files). While that is a FP, I extend the definition to include any alert that did not require triage / investigation and distracts the SOC/IR team (meaning they conducted some level of triage / investigation).

This metric is the second domino. Why is this metric important? Every alert your team exerts time and effort on that is a non-issue distracts them from alerts that matter. One of the major issues that has resonated in the security industry for decades is that SOCs are inundated with alerts and cannot clear the backlog. When it comes to PPIA + FPRR, I have seen analysts spend time investigating alerts that were blocked out of the gate while their screen continued to fill up with more. You must focus on Prevention Failure Detection to get ahead of the backlog.

Detect and Investigate Stages

Mean Time to Detection

MTTD, or “Dwell Time”, has decreased dramatically over the past 12 years. From well over a year to 16 days in 2023[2]. MTTD is measured from the earliest possible point you could detect the intrusion to the moment you actually detect it.

This third domino is important because the longer an adversary remains undetected, the more the odds increase they will complete their mission objective. It also makes the tasks of triage and investigation more difficult as analysts must piece together more activity and adversaries may be erasing evidence along the way – or your storage retention does not cover the breach timeline.

Many solutions focusing solely on MTTD can actually create the very problem SOCs are looking to solve.  That is, they generate so much alerting that they flood the console, email, or text messaging app causing an unmanageable queue of alerts (this is the problem XDR solutions were designed to resolve by focusing on incidents rather than alerts).

Mean Time to Triage

MTTT involves SOCs that utilize Level 1 (aka Triage) analysts to render an “escalate / do not escalate” alert verdict accurately. Accuracy is important because Triage Analysts typically are staff new to cyber security (recent grad / certification) and may over escalate (afraid to miss something important) or under escalate (not recognize signs of a successful breach). Because of this, a small MTTT does not always equate to successful handling of incidents.

This metric is important because keeping your senior staff focused on progressing incidents in a timely manner (and not expending time on false positives) should reduce stress and required headcount.

Mean Time to Understanding

MTTU deals with understanding the complete nature of the incident being investigated. This is different than MTTT which only deals with whether the issue merits escalation to senior analysts. It is then up to the senior analysts to determine the scope of the incident, and if you are a follower of my UPSET Investigation Framework, you know understanding the full scope involves:

U = All compromised accounts

P = Persistence Mechanisms used

S = All systems involved (organization, adversary, and intermediaries)

E = Endgame (or mission objective)

T = Techniques, Tactics, Procedures (TTPs) utilized by the adversary

MTTU is important because this information is critical before any containment or remediation actions are taken. Leave a stone unturned, and you alert the adversary that you are onto them and possibly fail to close an avenue of access.

Remediation Stages

Mean Time to Containment

MTTC deals with neutralizing the threat. You may not have kicked the adversary out, but you have halted their progress to their mission objective and ability to inflict further damage. This may be through use of isolation capabilities, termination of malicious processes, or firewall blocks.

MTTC is important, especially with ransomware attacks where every second counts. Faster containment responses can result in reduced / eliminated disruption to business operations or loss of data.

Mean Time to Remediation / Recovery

The full scope of the incident is understood, the adversary has been halted in their tracks, no malicious processes are running on any systems in your organization. Now is the time to put things back to right. MTTR deals with the time involved in restoring business operations to pre-incident stage. It means all remnants of changes made by the adversary (persistence, account alterations, programs installed, etc.) are removed; all disrupted systems are restored to operations (i.e.: ransomware encrypted systems are recovered from backups / snapshots), compromised user accounts are reset, etc.

MTTR is important because it informs senior management of how fast the organization can recover from an incident. Disaster Recovery and Business Continuity plans play a major role in improving this score.

Risk Reduction Stages

Mean Time to Advice

After the dust has settled from the incident, the job is not done. MTTA deals with identifying and assessing the specific areas (vulnerabilities, misconfigurations, lack of security controls) that permitted the adversary to advance to the point where detection occurred (and any actions beyond). The SOC and IR teams should then compile a list of recommendations to present to management to improve the security posture of the organization so the same attack path cannot be used.

Mean Time to Implement

Once recommendations are delivered to management, how long does it take to implement them? MTTI tracks this timeline because none of it matters if you don’t fix the holes that led to the breach.

Nine Dominos

There are the nine dominos of SOC / IR metrics I recommend helping organizations know if they are on the right track to reduce risk, costs and improve morale / retention of the security teams. You may not wish to track all nine, but understanding how each metric impacts the others can provide visibility into why you are not seeing expected improvements when you implement a new security solution or change processes.

Improving prevention and reducing false positives can make huge positive impacts on your incident response timeline. Utilizing solutions that get you to resolution quicker allows the team to focus on recommendations and risk reduction strategies.

Whichever metrics you choose to track, just be sure the dominos fall in your favor.

References

[1] 2024 Verizon Data Breach Investigations Report, p83

[2] Mandiant M-Trends 2023

Inside the SOC
Darktrace cyber analysts are world-class experts in threat intelligence, threat hunting and incident response, and provide 24/7 SOC support to thousands of Darktrace customers around the globe. Inside the SOC is exclusively authored by these experts, providing analysis of cyber incidents and threat trends, based on real-world experience in the field.
Written by
John Bradshaw
Sr. Director, Technical Marketing

More in this series

No items found.

Blog

/

/

April 16, 2025

Why Data Classification Isn’t Enough to Prevent Data Loss

women looking at laptopDefault blog imageDefault blog image

Why today’s data is fundamentally difficult to protect

Data isn’t what it used to be. It’s no longer confined to neat rows in a database, or tucked away in a secure on-prem server. Today, sensitive information moves freely between cloud platforms, SaaS applications, endpoints, and a globally distributed workforce – often in real time. The sheer volume and diversity of modern data make it inherently harder to monitor, classify, and secure. And the numbers reflect this challenge – 63% of breaches stem from malicious insiders or human error.

This complexity is compounded by an outdated reliance on manual data management. While data classification remains critical – particularly to ensure compliance with regulations like GDPR or HIPAA – the burden of managing this data often falls on overstretched security teams. Security teams are expected to identify, label, and track data across sprawling ecosystems, which can be time-consuming and error-prone. Even with automation, rigid policies that depend on pre-defined data classification miss the mark.

From a data protection perspective, if manual or basic automated classification is the sole methodology for preventing data loss, critical data will likely slip through the cracks. Security teams are left scrambling to fill the gaps, facing compliance risks and increasing operational overhead. Over time, the hidden costs of these inefficiencies pile up, draining resources and reducing the effectiveness of your entire security posture.

What traditional data classification can’t cover

Data classification plays an important role in data loss prevention, but it's only half the puzzle. It’s designed to spot known patterns and apply labels, yet the most common causes of data breaches don’t follow rules. They stem from something far harder to define: human behavior.

When Darktrace began developing its data loss detection capabilities, the question wasn’t what data to protect — it was how to understand the people using it. The numbers pointed clearly to where AI could make the biggest difference: 22% of email data breaches stem directly from user error, while malicious insider threats remain the most expensive, costing organizations an average of $4.99 million per incident.

Data classification is blind to nuance – it can’t grasp intent, context, or the subtle red flags that often precede a breach. And no amount of labeling, policy, or training can fully account for the reality that humans make mistakes. These problems require a system that sees beyond the data itself — one that understands how it’s being used, by whom, and in what context. That’s why Darktrace leans into its core strength: detecting the subtle symptoms of data loss by interpreting human behavior, not just file labels.

Achieving autonomous data protection with behavioral AI

Rather than relying on manual processes to understand what’s important, Darktrace uses its industry-leading AI to learn how your organization uses data — and spot when something looks wrong.

Its understanding of business operations allows it to detect subtle anomalies around data movement for your use cases, whether that’s a misdirected email, an insecure cloud storage link, or suspicious activity from an insider. Crucially, this detection is entirely autonomous, with no need for predefined rules or static labels.

Darktrace uses its contextual understanding of each user to stop all types of sensitive or misdirected data from leaving the organization
Fig 1: Darktrace uses its contextual understanding of each user to stop all types of sensitive or misdirected data from leaving the organization

Darktrace / EMAIL’s DLP add-on continuously learns in real time, enabling:

  • Automatic detection: Identifies risky data behavior to catch threats that traditional approaches miss – from human error to sophisticated insider threats.
  • A dynamic range of actions: Darktrace always aims to avoid business disruption in its blocking actions, but this can be adjusted according to the unique risk appetite of each customer – taking the most appropriate response for that business from a whole scale of possibilities.
  • Enhanced context: While Darktrace doesn’t require sensitivity data labeling, it integrates with Microsoft Purview to ingest sensitivity labels and enrich its understanding of the data – for even more accurate decision-making.

Beyond preventing data loss, Darktrace uses DLP activity to enhance its contextual understanding of the user itself. In other words, outbound activity can be a useful symptom in identifying a potential account compromise, or can be used to give context to that user’s inbound activity. Because Darktrace sees the whole picture of a user across their inbound, outbound, and lateral mail, as well as messaging (and into collaboration tools with Darktrace / IDENTITY), every interaction informs its continuous learning of normal.

With Darktrace, you can achieve dynamic data loss prevention for the most challenging human-related use cases – from accidental misdirected recipients to malicious insiders – that evade detection from manual classification. So don’t stand still on data protection – make the switch to autonomous, adaptive DLP that understands your business, data, and people.

[related-resource]

Continue reading
About the author
Carlos Gray
Senior Product Marketing Manager, Email

Blog

/

Email

/

April 14, 2025

Email bombing exposed: Darktrace’s email defense in action

picture of a computer screen showing a password loginDefault blog imageDefault blog image

What is email bombing?

An email bomb attack, also known as a "spam bomb," is a cyberattack where a large volume of emails—ranging from as few as 100 to as many as several thousand—are sent to victims within a short period.

How does email bombing work?

Email bombing is a tactic that typically aims to disrupt operations and conceal malicious emails, potentially setting the stage for further social engineering attacks. Parallels can be drawn to the use of Domain Generation Algorithm (DGA) endpoints in Command-and-Control (C2) communications, where an attacker generates new and seemingly random domains in order to mask their malicious connections and evade detection.

In an email bomb attack, threat actors typically sign up their targeted recipients to a large number of email subscription services, flooding their inboxes with indirectly subscribed content [1].

Multiple threat actors have been observed utilizing this tactic, including the Ransomware-as-a-Service (RaaS) group Black Basta, also known as Storm-1811 [1] [2].

Darktrace detection of email bombing attack

In early 2025, Darktrace detected an email bomb attack where malicious actors flooded a customer's inbox while also employing social engineering techniques, specifically voice phishing (vishing). The end goal appeared to be infiltrating the customer's network by exploiting legitimate administrative tools for malicious purposes.

The emails in these attacks often bypass traditional email security tools because they are not technically classified as spam, due to the assumption that the recipient has subscribed to the service. Darktrace / EMAIL's behavioral analysis identified the mass of unusual, albeit not inherently malicious, emails that were sent to this user as part of this email bombing attack.

Email bombing attack overview

In February 2025, Darktrace observed an email bombing attack where a user received over 150 emails from 107 unique domains in under five minutes. Each of these emails bypassed a widely used and reputable Security Email Gateway (SEG) but were detected by Darktrace / EMAIL.

Graph showing the unusual spike in unusual emails observed by Darktrace / EMAIL.
Figure 1: Graph showing the unusual spike in unusual emails observed by Darktrace / EMAIL.

The emails varied in senders, topics, and even languages, with several identified as being in German and Spanish. The most common theme in the subject line of these emails was account registration, indicating that the attacker used the victim’s address to sign up to various newsletters and subscriptions, prompting confirmation emails. Such confirmation emails are generally considered both important and low risk by email filters, meaning most traditional security tools would allow them without hesitation.

Additionally, many of the emails were sent using reputable marketing tools, such as Mailchimp’s Mandrill platform, which was used to send almost half of the observed emails, further adding to their legitimacy.

 Darktrace / EMAIL’s detection of an email being sent using the Mandrill platform.
Figure 2: Darktrace / EMAIL’s detection of an email being sent using the Mandrill platform.
Darktrace / EMAIL’s detection of a large number of unusual emails sent during a short period of time.
Figure 3: Darktrace / EMAIL’s detection of a large number of unusual emails sent during a short period of time.

While the individual emails detected were typically benign, such as the newsletter from a legitimate UK airport shown in Figure 3, the harmful aspect was the swarm effect caused by receiving many emails within a short period of time.

Traditional security tools, which analyze emails individually, often struggle to identify email bombing incidents. However, Darktrace / EMAIL recognized the unusual volume of new domain communication as suspicious. Had Darktrace / EMAIL been enabled in Autonomous Response mode, it would have automatically held any suspicious emails, preventing them from landing in the recipient’s inbox.

Example of Darktrace / EMAIL’s response to an email bombing attack taken from another customer environment.
Figure 4: Example of Darktrace / EMAIL’s response to an email bombing attack taken from another customer environment.

Following the initial email bombing, the malicious actor made multiple attempts to engage the recipient in a call using Microsoft Teams, while spoofing the organizations IT department in order to establish a sense of trust and urgency – following the spike in unusual emails the user accepted the Teams call. It was later confirmed by the customer that the attacker had also targeted over 10 additional internal users with email bombing attacks and fake IT calls.

The customer also confirmed that malicious actor successfully convinced the user to divulge their credentials with them using the Microsoft Quick Assist remote management tool. While such remote management tools are typically used for legitimate administrative purposes, malicious actors can exploit them to move laterally between systems or maintain access on target networks. When these tools have been previously observed in the network, attackers may use them to pursue their goals while evading detection, commonly known as Living-off-the-Land (LOTL).

Subsequent investigation by Darktrace’s Security Operations Centre (SOC) revealed that the recipient's device began scanning and performing reconnaissance activities shortly following the Teams call, suggesting that the user inadvertently exposed their credentials, leading to the device's compromise.

Darktrace’s Cyber AI Analyst was able to identify these activities and group them together into one incident, while also highlighting the most important stages of the attack.

Figure 5: Cyber AI Analyst investigation showing the initiation of the reconnaissance/scanning activities.

The first network-level activity observed on this device was unusual LDAP reconnaissance of the wider network environment, seemingly attempting to bind to the local directory services. Following successful authentication, the device began querying the LDAP directory for information about user and root entries. Darktrace then observed the attacker performing network reconnaissance, initiating a scan of the customer’s environment and attempting to connect to other internal devices. Finally, the malicious actor proceeded to make several SMB sessions and NTLM authentication attempts to internal devices, all of which failed.

Device event log in Darktrace / NETWORK, showing the large volume of connections attempts over port 445.
Figure 6: Device event log in Darktrace / NETWORK, showing the large volume of connections attempts over port 445.
Darktrace / NETWORK’s detection of the number of the login attempts via SMB/NTLM.
Figure 7: Darktrace / NETWORK’s detection of the number of the login attempts via SMB/NTLM.

While Darktrace’s Autonomous Response capability suggested actions to shut down this suspicious internal connectivity, the deployment was configured in Human Confirmation Mode. This meant any actions required human approval, allowing the activities to continue until the customer’s security team intervened. If Darktrace had been set to respond autonomously, it would have blocked connections to port 445 and enforced a “pattern of life” to prevent the device from deviating from expected activities, thus shutting down the suspicious scanning.

Conclusion

Email bombing attacks can pose a serious threat to individuals and organizations by overwhelming inboxes with emails in an attempt to obfuscate potentially malicious activities, like account takeovers or credential theft. While many traditional gateways struggle to keep pace with the volume of these attacks—analyzing individual emails rather than connecting them and often failing to distinguish between legitimate and malicious activity—Darktrace is able to identify and stop these sophisticated attacks without latency.

Thanks to its Self-Learning AI and Autonomous Response capabilities, Darktrace ensures that even seemingly benign email activity is not lost in the noise.

Credit to Maria Geronikolou (Cyber Analyst and SOC Shift Supervisor) and Cameron Boyd (Cyber Security Analyst), Steven Haworth (Senior Director of Threat Modeling), Ryan Traill (Analyst Content Lead)

Appendices

[1] https://www.microsoft.com/en-us/security/blog/2024/05/15/threat-actors-misusing-quick-assist-in-social-engineering-attacks-leading-to-ransomware/

[2] https://thehackernews.com/2024/12/black-basta-ransomware-evolves-with.html

Darktrace Models Alerts

Internal Reconnaissance

·      Device / Suspicious SMB Scanning Activity

·      Device / Anonymous NTLM Logins

·      Device / Network Scan

·      Device / Network Range Scan

·      Device / Suspicious Network Scan Activity

·      Device / ICMP Address Scan

·      Anomalous Connection / Large Volume of LDAP Download

·      Device / Suspicious LDAP Search Operation

·      Device / Large Number of Model Alerts

Continue reading
About the author
Maria Geronikolou
Cyber Analyst
Your data. Our AI.
Elevate your network security with Darktrace AI