June 25, 2024

Let the Dominos Fall! SOC and IR Metrics for ROI

Vendors are scrambling to compare MTTD metrics laid out in the latest MITRE Engenuity ATT&CK® Evaluations. But this analysis is reductive, ignoring the fact that in cybersecurity, there are far more metrics that matter.

One of the most enjoyable discussions (and debates) I engage in is the topic of Security Operations Center (SOC) and Incident Response (IR) metrics to measure and validate an organization’s Return on Investment (ROI). The debate part comes in when I hear vendor experts talking about “the only” SOC metrics that matter, and only list the two most well-known, while completely ignoring metrics that have a direct causal relationship.

In this blog, I will discuss what I believe are the SOC/IR metrics that matter, how each one has a direct impact on the others, and why organizations should ensure they are working towards the goal of why these metrics are measured in the first place: Reduction of Risk and Costs.

Reduction of Risk and Costs

Every security solution and process an organization puts in place should reduce the organization’s risk of a breach, exposure by an insider threat, or loss of productivity. How an organization realizes net benefits can be in several ways:

  • Improved efficiencies can result in SOC/IR staff focusing on other areas such as advanced threat hunting rather than churning through alerts on their security consoles. It may also help organizations dealing with the lack of skilled security staff by using Artificial Intelligence (AI) and automated processes.
  • A well-oiled SOC/IR team that has greatly reduced or even eliminated mundane tasks attracts, motivates, and retains talent resulting in reduced hiring and training costs.
  • The direct impact of a breach such as a ransomware attack can be devastating. According to the 2024 Data Breach Investigations Report by Verizon, MGM Resorts International reported the ALPHV ransomware cost the company approximately $100 million[1].
  • Failure to take appropriate steps to protect the organization can result in regulatory fines; and if an organization has, or is considering, purchasing Cyber Insurance, can result in declined coverage or increased premiums.

How does an organization demonstrate they are taking proactive measures to prevent breaches? That is where the nine (yes, nine) key metrics, and the way each one directly influences the others, play their roles.

Metrics in the Incident Response Timeline

Let’s start with a review of the key steps in the Incident Response Timeline:

Seven of the nine key metrics are in the IR timeline, while two of the metrics occur before you ever have an incident. They occur in the Pre-Detection Stage.

Pre-Detection stage metrics are:

  • Preventions Per Intrusion Attempt (PPIA)
  • False Positive Reduction Rate (FPRR)

Next is the Detect and Investigate stage, where there are three metrics to consider:

  • Mean Time to Detection (MTTD)
  • Mean Time to Triage (MTTT)
  • Mean Time to Understanding (MTTU)

This is followed by the Remediation stage, which has two metrics:

  • Mean Time to Containment (MTTC)
  • Mean Time to Remediation / Recovery (MTTR)

Finally, there is the Risk Reduction stage, which also has two metrics:

  • Mean Time to Advice (MTTA)
  • Mean Time to Implementation (MTTI)

Pre-Detection Stage

Preventions Per Intrusion Attempt

PPIA is defined as stopping any intrusion attempt at the earliest possible stage. Your network Intrusion Prevention System (IPS) blocks vulnerability exploits, your e-mail security solution intercepts and removes messages with malicious attachments or links, your egress firewall blocks unauthorized login attempts, etc. The adversary doesn’t get beyond Step 1 in the attack life cycle.

This metric is the first domino. Every organization should strive to improve on this metric every day. Why? For every intrusion attempt you stop right out of the gate, you eliminate the actions for every other metric. There is no incident to detect, triage, investigate, remediate, or analyze post-incident for ways to improve your security posture.

When I think about PPIA, I always remember back to a discussion with a former mentor, Tim Crothers, who discussed the benefits of focusing on Prevention Failure Detection.

The concept is that as you layer your security defenses, your PPIA moves ever closer to 100% (no one has ever reached 100%). This narrows the field of fire for adversaries trying to breach your organization. That narrowed field is where novel, unknown, and permuted threats live and breathe, and where solutions utilizing Unsupervised Machine Learning excel in raising anomalous alerts: indications of potential compromise involving one of these threats. Unsupervised ML also raises alerts on anomalous activity generated by known threats and can raise detections before many signature-based solutions. Most organizations struggle to find strong permutations of known threats, insider threats, supply chain attacks, and attacks utilizing n-day and 0-day exploits. Moving PPIA ever closer to 100% also frees your team up for threat hunting: utilizing the components of your SOC that collect and store telemetry to query for potential compromises based on hypotheses the team raises. It also significantly reduces the alerts your team must triage and investigate, solving many of the issues outlined at the start of this blog.
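As a rough illustration of why PPIA is the first domino, the sketch below treats PPIA as a simple ratio of prevented attempts to total attempts. That formula, the function names, and the sample counts are assumptions made for illustration; the article does not prescribe an exact calculation.

```python
def ppia_rate(prevented: int, attempts: int) -> float:
    """Fraction of intrusion attempts blocked before they became incidents."""
    if attempts <= 0:
        raise ValueError("attempts must be a positive count")
    if not 0 <= prevented <= attempts:
        raise ValueError("prevented must be between 0 and attempts")
    return prevented / attempts


def prevention_failures(prevented: int, attempts: int) -> int:
    """Attempts that got past prevention and must be detected downstream."""
    ppia_rate(prevented, attempts)  # reuse the validation above
    return attempts - prevented


# Hypothetical month: 10,000 attempts, 9,900 stopped at the first layer.
rate = ppia_rate(9_900, 10_000)
remaining = prevention_failures(9_900, 10_000)
print(f"PPIA: {rate:.1%}, prevention failures to detect: {remaining}")
# PPIA: 99.0%, prevention failures to detect: 100
```

Only the prevention failures feed the later metrics, so every point of PPIA gained shrinks the workload for detection, triage, and everything after it.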

False Positive Reduction Rate

Before we discuss FPRR, I should clarify how I define False Positives (FPs). Many define an FP as an alert that is in error (e.g., your EDR alerts on malware that turns out to be AV signature files). While that is an FP, I extend the definition to include any alert that did not require triage / investigation and distracts the SOC/IR team (meaning they conducted some level of triage / investigation).

This metric is the second domino. Why is this metric important? Every alert your team exerts time and effort on that is a non-issue distracts them from alerts that matter. One of the major issues that has resonated in the security industry for decades is that SOCs are inundated with alerts and cannot clear the backlog. When it comes to PPIA + FPRR, I have seen analysts spend time investigating alerts that were blocked out of the gate while their screen continued to fill up with more. You must focus on Prevention Failure Detection to get ahead of the backlog.
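One way to make the extended FP definition measurable is sketched below. It assumes FPRR is the relative drop in false positives between a baseline period and the current period, and it counts an alert as an FP when analysts spent time on it even though it needed no investigation. The `Alert` fields, the formula, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    required_investigation: bool  # did the alert turn out to need analyst work?
    analyst_minutes: float        # time the team actually spent on it


def count_false_positives(alerts) -> int:
    """Count alerts that consumed analyst time without needing it."""
    return sum(
        1 for a in alerts
        if not a.required_investigation and a.analyst_minutes > 0
    )


def false_positive_reduction_rate(baseline_fps: int, current_fps: int) -> float:
    """Relative decrease in false positives versus the baseline period."""
    if baseline_fps <= 0:
        raise ValueError("baseline_fps must be positive")
    if current_fps < 0:
        raise ValueError("current_fps cannot be negative")
    return (baseline_fps - current_fps) / baseline_fps


# Invented data: the alert with 0 minutes was never touched, so it is not counted.
baseline = [Alert(False, 10), Alert(False, 5), Alert(True, 30),
            Alert(False, 8), Alert(False, 0)]
current = [Alert(False, 10), Alert(True, 25), Alert(False, 0)]

fprr = false_positive_reduction_rate(
    count_false_positives(baseline), count_false_positives(current)
)
print(f"FPRR: {fprr:.0%}")  # FPRR: 67%
```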

Detect and Investigate Stages

Mean Time to Detection

MTTD, or “Dwell Time”, has decreased dramatically over the past 12 years, from well over a year to the 16 days reported in 2023[2]. MTTD is measured from the earliest possible point you could detect the intrusion to the moment you actually detect it.

This third domino is important because the longer an adversary remains undetected, the more the odds increase they will complete their mission objective. It also makes the tasks of triage and investigation more difficult as analysts must piece together more activity and adversaries may be erasing evidence along the way – or your storage retention does not cover the breach timeline.

Many solutions focusing solely on MTTD can actually create the very problem SOCs are looking to solve.  That is, they generate so much alerting that they flood the console, email, or text messaging app causing an unmanageable queue of alerts (this is the problem XDR solutions were designed to resolve by focusing on incidents rather than alerts).
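The MTTD measurement described above (earliest possible detection point to actual detection) can be sketched as a mean over incident timestamps. The function name and the sample incidents are hypothetical, and in practice the "earliest detectable" timestamp usually has to be reconstructed during the investigation.

```python
from datetime import datetime, timedelta


def mean_time_to_detection(incidents) -> timedelta:
    """Average gap between earliest detectable point and actual detection.

    incidents: list of (earliest_detectable, detected) datetime pairs.
    """
    if not incidents:
        raise ValueError("at least one incident is required")
    gaps = []
    for earliest, detected in incidents:
        if detected < earliest:
            raise ValueError("detection cannot precede the earliest detectable point")
        gaps.append(detected - earliest)
    return sum(gaps, timedelta()) / len(gaps)


# Invented timestamps for illustration only.
incidents = [
    (datetime(2024, 1, 10, 9, 0), datetime(2024, 1, 10, 13, 0)),  # 4 hours undetected
    (datetime(2024, 2, 3, 22, 0), datetime(2024, 2, 4, 18, 0)),   # 20 hours undetected
]
print(mean_time_to_detection(incidents))  # 12:00:00
```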

Mean Time to Triage

MTTT involves SOCs that utilize Level 1 (aka Triage) analysts to render an “escalate / do not escalate” alert verdict accurately. Accuracy is important because Triage Analysts typically are staff new to cyber security (recent grad / certification) and may over-escalate (afraid to miss something important) or under-escalate (not recognize signs of a successful breach). Because of this, a small MTTT does not always equate to successful handling of incidents.

This metric is important because keeping your senior staff focused on progressing incidents in a timely manner (and not expending time on false positives) should reduce stress and required headcount.
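Because a small MTTT can hide poor verdicts, it may help to track triage accuracy alongside it. The sketch below assumes each triaged alert can later be labeled as a real incident or not. The rate definitions and sample verdicts are illustrative assumptions, not part of any standard.

```python
def triage_error_rates(verdicts) -> dict:
    """Compute over- and under-escalation rates.

    verdicts: iterable of (escalated, was_real_incident) boolean pairs.
    """
    over = under = benign = real = 0
    for escalated, was_real in verdicts:
        if was_real:
            real += 1
            if not escalated:
                under += 1  # a real incident was not escalated
        else:
            benign += 1
            if escalated:
                over += 1   # a non-issue was escalated to senior staff
    return {
        "over_escalation_rate": over / benign if benign else 0.0,
        "under_escalation_rate": under / real if real else 0.0,
    }


# Invented verdicts: three real incidents, three non-issues.
verdicts = [
    (True, True), (True, False), (False, False),
    (False, True), (False, False), (True, True),
]
rates = triage_error_rates(verdicts)
print({name: round(value, 2) for name, value in rates.items()})
# {'over_escalation_rate': 0.33, 'under_escalation_rate': 0.33}
```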

Mean Time to Understanding

MTTU deals with understanding the complete nature of the incident being investigated. This is different from MTTT, which only deals with whether the issue merits escalation to senior analysts. It is then up to the senior analysts to determine the scope of the incident, and if you are a follower of my UPSET Investigation Framework, you know understanding the full scope involves:

U = All compromised accounts

P = Persistence Mechanisms used

S = All systems involved (organization, adversary, and intermediaries)

E = Endgame (or mission objective)

T = Tactics, Techniques, and Procedures (TTPs) utilized by the adversary

MTTU is important because this information is critical before any containment or remediation actions are taken. Leave a stone unturned, and you alert the adversary that you are onto them and possibly fail to close an avenue of access.

Remediation Stages

Mean Time to Containment

MTTC deals with neutralizing the threat. You may not have kicked the adversary out, but you have halted their progress to their mission objective and ability to inflict further damage. This may be through use of isolation capabilities, termination of malicious processes, or firewall blocks.

MTTC is important, especially with ransomware attacks where every second counts. Faster containment responses can result in reduced / eliminated disruption to business operations or loss of data.

Mean Time to Remediation / Recovery

The full scope of the incident is understood, the adversary has been halted in their tracks, and no malicious processes are running on any systems in your organization. Now is the time to put things back to right. MTTR deals with the time involved in restoring business operations to their pre-incident state. It means all remnants of changes made by the adversary (persistence, account alterations, programs installed, etc.) are removed; all disrupted systems are restored to operations (e.g., ransomware-encrypted systems are recovered from backups / snapshots); compromised user accounts are reset; etc.

MTTR is important because it informs senior management of how fast the organization can recover from an incident. Disaster Recovery and Business Continuity plans play a major role in improving this score.

Risk Reduction Stages

Mean Time to Advice

After the dust has settled from the incident, the job is not done. MTTA deals with identifying and assessing the specific areas (vulnerabilities, misconfigurations, lack of security controls) that permitted the adversary to advance to the point where detection occurred (and any actions beyond). The SOC and IR teams should then compile a list of recommendations to present to management to improve the security posture of the organization so the same attack path cannot be used.

Mean Time to Implementation

Once recommendations are delivered to management, how long does it take to implement them? MTTI tracks this timeline because none of it matters if you don’t fix the holes that led to the breach.
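To see how the seven timeline metrics chain together, the sketch below breaks one incident into consecutive stages. Only MTTD's start and end points are defined in the text; the milestones assumed for the other six stages, the stage labels, and all timestamps are hypothetical. Averaging each stage over many incidents would give the corresponding "Mean Time to" figure.

```python
from datetime import datetime, timedelta

# Assumed start and end milestones for each stage (illustrative only).
STAGES = [
    ("TTD", "earliest_detectable", "detected"),
    ("TTT", "detected", "triaged"),
    ("TTU", "triaged", "understood"),
    ("TTC", "understood", "contained"),
    ("TTR", "contained", "recovered"),
    ("TTA", "recovered", "advice_delivered"),
    ("TTI", "advice_delivered", "implemented"),
]


def stage_durations(incident: dict) -> dict:
    """Return the time spent in each stage of a single incident."""
    durations = {}
    for name, start_key, end_key in STAGES:
        elapsed = incident[end_key] - incident[start_key]
        if elapsed < timedelta(0):
            raise ValueError(f"{end_key} is earlier than {start_key}")
        durations[name] = elapsed
    return durations


# Hypothetical incident; every timestamp is invented for illustration.
t0 = datetime(2024, 6, 1, 8, 0)
incident = {
    "earliest_detectable": t0,
    "detected": t0 + timedelta(hours=6),
    "triaged": t0 + timedelta(hours=7),
    "understood": t0 + timedelta(hours=15),
    "contained": t0 + timedelta(hours=16),
    "recovered": t0 + timedelta(days=2),
    "advice_delivered": t0 + timedelta(days=5),
    "implemented": t0 + timedelta(days=19),
}

for name, elapsed in stage_durations(incident).items():
    print(f"{name}: {elapsed}")
```

Because each stage starts where the previous one ends, a delay in any early stage pushes every later milestone back, which is the domino effect in practice.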

Nine Dominos

These are the nine dominos of SOC / IR metrics I recommend to help organizations know if they are on the right track to reduce risk and costs and to improve the morale / retention of their security teams. You may not wish to track all nine, but understanding how each metric impacts the others can provide visibility into why you are not seeing expected improvements when you implement a new security solution or change processes.

Improving prevention and reducing false positives can make huge positive impacts on your incident response timeline. Utilizing solutions that get you to resolution quicker allows the team to focus on recommendations and risk reduction strategies.

Whichever metrics you choose to track, just be sure the dominos fall in your favor.

References

[1] 2024 Verizon Data Breach Investigations Report, p83

[2] Mandiant M-Trends 2023

Inside the SOC
Darktrace cyber analysts are world-class experts in threat intelligence, threat hunting and incident response, and provide 24/7 SOC support to thousands of Darktrace customers around the globe. Inside the SOC is exclusively authored by these experts, providing analysis of cyber incidents and threat trends, based on real-world experience in the field.
Author
John Bradshaw
Sr. Director, Technical Marketing

John Bradshaw is Sr. Director, Technical Marketing at Darktrace. He is a security practitioner at heart having built a Customer Security/SOC operations team for (then) the largest ISP on the planet. In his vendor roles he has worked with various security solutions utilized by SOC / IR teams and conducted advanced incident investigation workshops to help organizations understand the benefits and limitations of the solutions they are using. He holds a Bachelor of Business Administration from Averett University and a Master of Science in Network Security from Capitol College.

September 26, 2024

Thread Hijacking: How Attackers Exploit Trusted Conversations to Infiltrate Networks

What is Thread Hijacking?

Cyberattacks are becoming increasingly stealthy and targeted, with malicious actors focusing on high-value individuals to gain privileged access to their organizations’ digital environments. One technique that has gained prominence in recent years is thread hijacking. This method allows attackers to infiltrate ongoing conversations, exploiting the trust within these threads to access sensitive systems.

Thread hijacking typically involves attackers gaining access to a user’s email account, monitoring ongoing conversations, and then inserting themselves into these threads. By replying to existing emails, they can send malicious links, request sensitive information, or manipulate the conversation to achieve their goals, such as redirecting payments or stealing credentials. Because such emails appear to come from a trusted source, they often bypass human security teams and traditional security filters.

How does thread hijacking work?

  1. Initial Compromise: Attackers first gain access to a user’s email account, often through phishing, malware, or exploiting weak passwords.
  2. Monitoring: Once inside, they monitor the user’s email threads, looking for ongoing conversations that can be exploited.
  3. Infiltration: The attacker then inserts themselves into these conversations, often replying to existing emails. Because the email appears to come from a trusted source within an ongoing thread, it bypasses many traditional security filters and raises less suspicion.
  4. Exploitation: Using the trust established in the conversation, attackers can send malicious links, request sensitive information, or manipulate the conversation to achieve their goals, such as redirecting payments or stealing credentials.

A recent incident involving a Darktrace customer saw a malicious actor attempt to manipulate trusted email communications, potentially exposing critical data. The attacker created a new mailbox rule to forward specific emails to an archive folder, making it harder for the customer to notice the malicious activity. This highlights the need for advanced detection and robust preventive tools.

Darktrace’s Self-Learning AI is able to recognize subtle deviations in normal behavior, whether in a device or a Software-as-a-Service (SaaS) user. This capability enables it to detect emerging attacks in their early stages. In this post, we’ll delve into the attacker’s tactics and illustrate how Darktrace / IDENTITY™ successfully identified and mitigated a thread hijacking attempt, preventing escalation and potential disruption to the customer’s network.

Thread hijacking attack overview & Darktrace coverage

On August 8, 2024, Darktrace detected an unusual email received by a SaaS account on a customer’s network. The email appeared to be a reply to a previous chain discussing tax and payment details, likely related to a transaction between the customer and one of their business partners.

Figure 1: Headers of the suspicious email received.

A few hours later, Darktrace detected the same SaaS account creating a new mailbox rule named “.”, a tactic commonly used by malicious actors to evade detection when setting up new email rules [2]. This rule was designed to forward all emails containing a specific word to the user’s “Archives” folder. This evasion technique is typically used to move any malicious emails or responses to a rarely opened folder, ensuring that the genuine account holder does not see replies to phishing emails or other malicious messages sent by attackers [3].

Darktrace recognized the newly created email rule as suspicious after identifying the following parameters:

  • AlwaysDeleteOutlookRulesBlob: False
  • Force: False
  • MoveToFolder: Archive
  • Name: “.”
  • FromAddressContainsWords: [Redacted]
  • MarkAsRead: True
  • StopProcessingRules: True
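For readers who want to hunt for similar rules in their own audit logs, the parameters above can be checked with a simple static heuristic. This is only a sketch: it is not how Darktrace's anomaly-based models work, and the folder list, reason strings, and function name are assumptions chosen for illustration.

```python
import string

# Hypothetical list of folders that users rarely open.
QUIET_FOLDERS = {"archive", "archives", "rss feeds", "rss subscriptions",
                 "conversation history", "junk email", "deleted items"}


def suspicious_rule_reasons(rule: dict) -> list:
    """Return the reasons a mailbox rule looks like an evasion rule."""
    reasons = []
    name = str(rule.get("Name", "")).strip()
    if name and all(ch in string.punctuation for ch in name):
        reasons.append("name is punctuation only")
    folder = str(rule.get("MoveToFolder", "")).strip().lower()
    if folder in QUIET_FOLDERS:
        reasons.append("moves mail to a rarely opened folder")
    if rule.get("MarkAsRead") is True:
        reasons.append("marks matching mail as read")
    if rule.get("StopProcessingRules") is True:
        reasons.append("stops later rules from running")
    return reasons


# The rule parameters reported in this incident.
observed_rule = {
    "AlwaysDeleteOutlookRulesBlob": False,
    "Force": False,
    "MoveToFolder": "Archive",
    "Name": ".",
    "FromAddressContainsWords": "[Redacted]",
    "MarkAsRead": True,
    "StopProcessingRules": True,
}
for reason in suspicious_rule_reasons(observed_rule):
    print(reason)
```

Any single reason is common in legitimate rules; it is the combination that merits a closer look.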

Darktrace also noted that the user attempting to create this new email rule had logged into the SaaS environment from an unusual IP address. Although the IP was located in the same country as the customer and the ASN used by the malicious actor was typical for the customer’s network, the rare IP, coupled with the anomalous behavior, raised suspicions.

Figure 2: Hijacked SaaS account creating the new mailbox rule.
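The unusual-IP observation above shows why country or ASN checks alone would have passed this login. The minimal sketch below compares a login against a user's history field by field. It is a deliberately simple stand-in rather than Darktrace's method, and the field names, documentation-range IP addresses, placeholder country code, and reserved ASN are all hypothetical.

```python
def login_novelty(login: dict, history: list) -> dict:
    """Report which attributes of a login have never been seen for this user."""
    return {
        field: login[field] not in {past[field] for past in history}
        for field in ("ip", "country", "asn")
    }


# Invented history using documentation-only IP and ASN ranges.
history = [
    {"ip": "203.0.113.10", "country": "XX", "asn": 64500},
    {"ip": "203.0.113.11", "country": "XX", "asn": 64500},
]
login = {"ip": "203.0.113.77", "country": "XX", "asn": 64500}

print(login_novelty(login, history))
# {'ip': True, 'country': False, 'asn': False}
```

A rare IP on its own is weak evidence, which is why it is the pairing with the new mailbox rule that raises suspicion in this incident.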

Given the suspicious nature of this activity, Darktrace’s Security Operations Centre (SOC) investigated the incident and alerted the customer’s security team.

Due to a public holiday in the customer's location (likely an intentional choice by the threat actor), their security team did not immediately notice or respond to the notification. Fortunately, the customer had Darktrace's Autonomous Response capability enabled, which allowed it to take action against the suspicious SaaS activity without human intervention.

In this instance, Darktrace swiftly disabled the seemingly compromised SaaS user for 24 hours. This action halted the spread of the compromise to other accounts on the customer’s SaaS platform and prevented any sensitive data exfiltration. Additionally, it provided the security team with ample time to investigate the threat and remove the user from their environment. The customer also received detailed incident reports and support through Darktrace’s Security Operations Support service, enabling direct communication with Darktrace’s expert Analyst team.

Conclusion

Ultimately, Darktrace’s anomaly-based detection allowed it to identify the subtle deviations from the user’s expected behavior, indicating a potential compromise on the customer’s SaaS platform. In this case, Darktrace detected a login to a SaaS platform from an unusual IP address, despite the attacker’s efforts to conceal their activity by using a known ASN and logging in from the expected country.

Despite the attempted SaaS hijack occurring on a public holiday when the customer’s security team was likely off-duty, Darktrace autonomously detected the suspicious login and the creation of a new email rule. It swiftly blocked the compromised SaaS account, preventing further malicious activity and safeguarding the organization from data exfiltration or escalation of the compromise.

This highlights the growing need for AI-driven security capable of responding to malicious activity in the absence of human security teams and of detecting subtle behavioral changes that traditional security tools may miss.

Credit to: Ryan Traill, Threat Content Lead for his contribution to this blog

Appendices

Darktrace Model Detections

SaaS / Compliance / Anomalous New Email Rule

Experimental / Antigena Enhanced Monitoring from SaaS Client Block

Antigena / SaaS / Antigena Suspicious SaaS Activity Block

Antigena / SaaS / Antigena Email Rule Block

References

[1] https://blog.knowbe4.com/whats-the-best-name-threadjacking-or-man-in-the-inbox-attacks

[2] https://darktrace.com/blog/detecting-attacks-across-email-saas-and-network-environments-with-darktraces-combined-ai-approach

[3] https://learn.microsoft.com/en-us/defender-xdr/alert-grading-playbook-inbox-manipulation-rules

About the author
Maria Geronikolou
Cyber Analyst

September 26, 2024

How AI can help CISOs navigate the global cyber talent shortage

The global picture

4 million cybersecurity professionals are needed worldwide to protect and defend the digital world, twice the number currently in the workforce.[1]

Innovative technologies are transforming business operations, enabling access to new markets, personalized customer experiences, and increased efficiency. However, this digital transformation also challenges Security Operations Centers (SOCs) with managing and protecting a complex digital environment without additional resources or advanced skills.

At the same time, the cybersecurity industry is suffering a severe global skills shortage, leaving many SOCs understaffed and under-skilled. With a 72% increase in data breaches from 2021 to 2023[2], SOCs are dealing with overwhelming alert volumes from diverse security tools. Nearly 60% of cybersecurity professionals report burnout[3], leading to high turnover rates. Consequently, only a fraction of alerts are thoroughly investigated, increasing the risk of undetected breaches. More than half of organizations that experienced breaches in 2024 admitted to having short-staffed SOCs.[4]

How AI can help organizations do more with less

Cyber defense needs to evolve at the same pace as cyber-attacks, but the global skills shortage is making that difficult. As threat actors increasingly abuse AI for malicious purposes, using defensive AI to enable innovation and optimization at scale is reshaping how organizations approach cybersecurity.

The value of AI isn’t in replacing humans, but in augmenting their efforts and enabling them to scale their defense capabilities and their value to the organization. With AI, cybersecurity professionals can operate at digital speed, analyzing vast data sets, identifying more vulnerabilities with higher accuracy, responding and triaging faster, reducing risks, and implementing proactive measures—all without additional staff.

Research indicates that organizations leveraging AI and automation extensively in security functions (such as prevention, detection, investigation, or response) reduced their average mean time to identify (MTTI) and mean time to contain (MTTC) data breaches by 33% and 43%, respectively. These organizations also managed to contain breaches nearly 100 days faster on average compared to those not using AI and automation.[5]

First, you've got to apply the right AI to the right security challenge. We dig into how different AI technologies can bridge specific skills gaps in the CISO’s Guide to Navigating the Cybersecurity Skills Shortage.

Cases in point: AI as a human force multiplier

Let’s take a look at just some of the cybersecurity challenges to which AI can be applied to scale defense efforts and relieve the burden on the SOC. We go further into real-life examples in our white paper.

Automated threat detection and response

AI enables 24/7 autonomous response, eliminating the need for after-hours SOC shifts and providing security leaders with peace of mind. AI can scale response efforts by analyzing vast amounts of data in real time, identifying anomalies, and initiating precise autonomous actions to contain incidents, which buys teams time for investigation and remediation.  

Triage and investigation

AI enhances the triage process by automatically categorizing and prioritizing security alerts, allowing cybersecurity professionals to focus on the most critical threats. It creates a comprehensive picture of an attack, helps identify its root cause, and generates detailed reports with key findings and recommended actions.  

Automation also significantly reduces overwhelming alert volumes and high false positive rates, enabling analysts to concentrate on high-priority threats and engage in more proactive and strategic initiatives.

Eliminating silos and improving visibility across the enterprise

Security and IT teams are overwhelmed by the technological complexity of operating multiple tools, resulting in manual work and excessive alerts. AI can correlate threats across the entire organization, enhancing visibility and eliminating silos, thereby saving resources and reducing complexity.

With 88% of organizations favoring a platform approach over standalone solutions, many are consolidating their tech stacks in this direction. This consolidation provides native visibility across clouds, devices, communications, locations, applications, people, and third-party security tools and intelligence.

Upskilling your existing talent in AI

As revealed in the State of AI Cybersecurity Survey 2024, only 26% of cybersecurity professionals say they have a full understanding of the different types of AI in use within security products.[6]

Understanding AI can upskill your existing staff, enhancing their expertise and optimizing business outcomes. Human expertise is crucial for the effective and ethical integration of AI. To enable true AI-human collaboration, cybersecurity professionals need specific training on using, understanding, and managing AI systems. To make this easier, the Darktrace ActiveAI Security Platform is designed to enable collaboration and reduce the learning curve – lowering the barrier to entry for junior or less skilled analysts.  

However, to bridge the immediate expertise gap in managing AI tools, organizations can consider expert managed services that take the day-to-day management out of the SOC’s hands, allowing them to focus on training and proactive initiatives.

Conclusion

Experts predict the cybersecurity skills gap will continue to grow, increasing operational and financial risks for organizations. AI for cybersecurity is crucial for CISOs to augment their teams and scale defense capabilities with speed, scalability, and predictive insights, while human expertise remains vital for providing the intuition and problem-solving needed for responsible and efficient AI integration.

If you’re thinking about implementing AI to solve your own cyber skills gap, consider the following:

  • Select an AI cybersecurity solution tailored to your specific business needs
  • Review and streamline existing workflows and tools – consider a platform-based approach to eliminate inefficiencies
  • Make use of managed services to outsource AI expertise
  • Upskill and reskill existing talent through training and education
  • Foster a knowledge-sharing culture with access to knowledge bases and collaboration tools

Interested in how AI could augment your SOC to increase efficiency and save resources? Read our longer CISO’s Guide to Navigating the Cybersecurity Skills Shortage.

And to better understand cybersecurity practitioners' attitudes towards AI, check out Darktrace’s State of AI Cybersecurity 2024 report.

References

  1. https://www.isc2.org/research  
  2. https://www.forbes.com/advisor/education/it-and-tech/cybersecurity-statistics/  
  3. https://www.informationweek.com/cyber-resilience/the-psychology-of-cybersecurity-burnout  
  4. https://www.ibm.com/downloads/cas/1KZ3XE9D  
  5. https://www.ibm.com/downloads/cas/1KZ3XE9D  
  6. https://darktrace.com/resources/state-of-ai-cyber-security-2024
About the author
The Darktrace Community