January 30, 2025

Reimagining Your SOC: Overcoming Alert Fatigue with AI-Led Investigations  

Reimagining your SOC Part 2/3: This blog explores how the challenges facing the modern SOC can be addressed by transforming the investigation process, unlocking efficiency and scalability in SOC operations with AI.
Inside the SOC
Darktrace cyber analysts are world-class experts in threat intelligence, threat hunting and incident response, and provide 24/7 SOC support to thousands of Darktrace customers around the globe. Inside the SOC is exclusively authored by these experts, providing analysis of cyber incidents and threat trends, based on real-world experience in the field.
Written by
Brittany Woodsmall
Product Marketing Manager, AI & Attack Surface

The efficiency of a Security Operations Center (SOC) hinges on its ability to detect, analyze and respond to threats effectively. With advancements in AI and automation, key early SOC team metrics such as Mean Time to Detect (MTTD) have seen significant improvements:

  • 96% of defenders believe AI-powered solutions significantly boost the speed and efficiency of prevention, detection, response, and recovery.
  • Organizations leveraging AI and automation can shorten their breach lifecycle by an average of 108 days compared to those without these technologies.

While tool advances have improved performance and effectiveness in the detection phase, this has not been as beneficial to the next step of the process where initial alerts are investigated further to determine their relevance and how they relate to other activities. This is often measured with the metric Mean Time to Analysis (MTTA), although some SOC teams operate a two-level process with teams for initial triage to filter out more obviously uninteresting alerts and for more detailed analysis of the remainder. SOC teams continue to grapple with alert fatigue, overwhelmed analysts, and inefficient triage processes, preventing them from achieving the operational efficiency necessary for a high-performing SOC.

Addressing this core inefficiency requires extending AI's capabilities beyond detection to streamline and optimize the investigative workflows that follow it and underpin effective analysis.

Challenges with SOC alert investigation

Detecting cyber threats is only the beginning of a much broader challenge of SOC efficiency. The real bottleneck often lies in the investigation process.

Detection tools and techniques have evolved significantly with the use of machine learning methods, improving early threat detection. However, after a detection pops up, human analysts still typically step in to evaluate the alert, gather context, and determine whether it’s a true threat or a false alarm and why. If it is a threat, further investigation must be performed to understand the full scope of what may be a much larger problem. This phase, measured by the mean time to analysis, is critical for swift incident response.

Challenges with manual alert investigation:

  • Too many alerts
  • Alerts lack context
  • Cognitive load sits with analysts
  • Insufficient talent in the industry
  • Fierce competition for experienced analysts

For many organizations, investigation is where the struggle of efficiency intensifies. Analysts face overwhelming volumes of alerts, a lack of consolidated context, and the mental strain of juggling multiple systems. With a worldwide shortage of 4 million experienced level two and three SOC analysts, the cognitive burden placed on teams is immense, often leading to alert fatigue and missed threats.

Even with advanced systems in place, not all potential detections are investigated. In many cases, only a quarter of initial alerts are triaged (or analyzed). However, the issue runs deeper. Triaging occurs after detection engineering and alert tuning, which often disable many alerts that could potentially reveal true threats but are not accurate enough to justify the time and effort of the security team. This means some potential threats slip through unnoticed.

Understanding alerts in the SOC: Stopping cyber incidents is hard

Let’s take a look at the cyber-attack lifecycle and the steps involved in detecting and stopping an attack:

First we need a trace of an attack…

The attack will produce some sort of digital trace. Novel attacks, insider threats, and attacker techniques such as living-off-the-land can make attacker activities extremely hard to distinguish.

A detection is created…

Then we have to detect the trace, for example beaconing to a rare domain. The time taken to raise these initial detection alerts is what Mean Time to Detect (MTTD) measures. Reducing this initial unseen duration is where we have seen significant improvement with modern threat detection tools.

When it comes to threat detection, the possibilities are vast. Your initial lead could come from anything: an alert about unusual network activity, a potential known malware detection, or an odd email. Once that lead comes in, it’s up to your security team to investigate further and determine whether this is a legitimate threat or a false alarm, and what the context is behind the alert.
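To make the beaconing example above concrete, here is a minimal sketch of one common heuristic: connections that repeat at nearly constant intervals are flagged. This is an illustration only, not how Darktrace detects beaconing; the function name, the thresholds, and the use of the coefficient of variation are all assumptions chosen for clarity.

```python
from statistics import mean, pstdev

def looks_like_beaconing(timestamps, max_cv=0.1, min_events=5):
    """Flag connection times (in seconds) whose gaps are nearly constant.

    Uses the coefficient of variation (std dev / mean) of the gaps between
    consecutive connections. Thresholds are illustrative assumptions.
    """
    if len(timestamps) < min_events:
        return False  # too few events to judge regularity
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    average_gap = mean(gaps)
    if average_gap <= 0:
        return False
    return pstdev(gaps) / average_gap <= max_cv

# Hypothetical connection times to a single rare domain, in seconds.
regular = [0, 60, 121, 180, 241, 300]    # roughly one connection per minute
irregular = [0, 5, 90, 95, 400, 410]     # bursty, human-like browsing

print(looks_like_beaconing(regular))     # True
print(looks_like_beaconing(irregular))   # False
```

A regularity check like this only produces the initial lead. Whether the traffic is actually malicious is the question the investigation stage has to answer.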

Investigation begins…

It doesn’t just stop at a detection. Typically, humans also need to look at the alert, investigate, understand, analyze, and conclude whether this is a genuine threat that needs a response. We normally measure this as MTTA (Mean Time to Analysis).

Conducting the investigation effectively requires a high degree of skill and efficiency, as every second counts in mitigating potential damage. Security teams must analyze the available data, correlate it across multiple sources, and piece together the timeline of events to understand the full scope of the incident. This process involves navigating through vast amounts of information, identifying patterns, and discerning relevant details, all while managing the pressure of minimizing downtime and preventing further escalation.

Containment begins…

Once we confirm something as a threat, and the human team determines that a response is required and understands the scope, we need to contain the incident. This is normally measured as MTTC (mean time to containment) and can be further split into immediate and more permanent measures.

For more about how AI-led solutions can help in the containment stage read here: Autonomous Response: Streamlining Cybersecurity and Business Operations

The challenge is not only in 1) detecting threats quickly, but also 2) triaging and investigating them rapidly and with precision, and 3) prioritizing the most critical findings to avoid missed opportunities. Effective investigation demands a combination of advanced tools, robust workflows, and the expertise to interpret and act on the insights they generate. Without these, organizations risk delaying critical containment and response efforts, leaving them vulnerable to greater impacts.

While there are further steps (remediation and, of course, complete recovery), here we will focus on investigation.
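The three timing metrics used in this section (MTTD, MTTA, and MTTC) can be illustrated with a small calculation. The sketch below assumes each incident record carries four timestamps and that MTTA runs from detection to the end of analysis, and MTTC from the end of analysis to containment. Teams define these start and end points differently, so treat the field names and boundaries as assumptions.

```python
from datetime import datetime, timedelta
from statistics import mean

def mean_minutes(incidents, start_key, end_key):
    """Average elapsed minutes between two timestamps across all incidents."""
    return mean(
        (incident[end_key] - incident[start_key]).total_seconds() / 60
        for incident in incidents
    )

# Two hypothetical incidents; all timestamps are made up for illustration.
t0 = datetime(2025, 1, 30, 9, 0)
incidents = [
    {"first_trace": t0,
     "detected": t0 + timedelta(minutes=10),
     "analyzed": t0 + timedelta(minutes=70),
     "contained": t0 + timedelta(minutes=100)},
    {"first_trace": t0,
     "detected": t0 + timedelta(minutes=30),
     "analyzed": t0 + timedelta(minutes=150),
     "contained": t0 + timedelta(minutes=170)},
]

mttd = mean_minutes(incidents, "first_trace", "detected")  # (10 + 30) / 2
mtta = mean_minutes(incidents, "detected", "analyzed")     # (60 + 120) / 2
mttc = mean_minutes(incidents, "analyzed", "contained")    # (30 + 20) / 2

print(mttd, mtta, mttc)  # 20.0 90.0 25.0
```

In this toy data the analysis stage takes far longer than detection or containment, which is the bottleneck this blog is concerned with.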

Developing an AI analyst: How Darktrace replicates human investigation

Darktrace has been working on understanding the investigative process of a skilled analyst since 2017. By conducting internal research between Darktrace expert SOC analysts and machine learning engineers, we developed a formalized understanding of investigative processes. This understanding formed the basis of a multi-layered AI system that systematically investigates data, taking advantage of the speed and breadth afforded by machine systems.

With this research we found that the investigative process often revolves around iterating three key steps: hypothesis creation, data collection, and results evaluation.

All three steps are crucial for an analyst to determine the nature of a potential threat. They are equally integral to our Cyber AI Analyst, which is a core component across our product suite. By building on them, Darktrace has been able to replicate the human-driven approach to investigating alerts at machine speed and scale.

Here’s how it works:

  • When an initial or third-party alert is triggered, the Cyber AI Analyst initiates a forensic investigation by building multiple hypotheses and gathering relevant data to confirm or refute the nature of suspicious activity, iterating as necessary, and continuously refining the original hypothesis as new data emerges throughout the investigation.
  • Using a combination of machine learning including supervised and unsupervised methods, NLP and graph theory to assess activity, this investigation engine conducts a deep analysis with incidents raised to the human team only when the behavior is deemed sufficiently concerning.
  • After classification, the incident information is organized and processed to generate the analysis summary, including the most important descriptive details, and priority classification, ensuring that critical alerts are prioritized for further action by the human-analyst team.
  • If the alert is deemed unimportant, the complete analysis process is made available to the human team so that they can see what investigation was performed and why this conclusion was drawn.
Figure: How the Darktrace Cyber AI Analyst workflow operates.

To illustrate this via example, if a laptop is beaconing to a rare domain, the Cyber AI Analyst would create hypotheses including whether this could be command and control traffic, data exfiltration, or something else. The AI analyst then collects data, analyzes it, makes decisions, iterates, and ultimately raises a new high-level incident alert describing and detailing its findings for human analysts to review and follow up.
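The iterate-until-confident loop described above (hypothesis creation, data collection, results evaluation) can be sketched in a few lines. This is a toy illustration under stated assumptions, not Darktrace's implementation: the `Hypothesis` class, the evidence labels, the scoring rule, and the `toy_collect` data source are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Hypothesis:
    name: str
    required_evidence: set                      # evidence that would support it
    supporting: set = field(default_factory=set)

    def score(self):
        """Fraction of the required evidence observed so far."""
        return len(self.supporting) / len(self.required_evidence)

def investigate(alert, hypotheses, collect, threshold=0.6, max_rounds=3):
    """Repeat collect -> evaluate until a hypothesis is supported or rounds run out."""
    for round_no in range(max_rounds):
        for hypothesis in hypotheses:
            observed = collect(alert, hypothesis, round_no)          # data collection
            hypothesis.supporting |= observed & hypothesis.required_evidence
        best = max(hypotheses, key=Hypothesis.score)                 # results evaluation
        if best.score() >= threshold:
            return {"verdict": "escalate", "hypothesis": best.name, "score": best.score()}
    return {"verdict": "no incident", "hypothesis": None,
            "score": max(h.score() for h in hypotheses)}

def toy_collect(alert, hypothesis, round_no):
    """Hypothetical data source: each round reveals a little more evidence."""
    evidence_by_round = [{"regular_interval"}, {"rare_domain", "small_payloads"}, set()]
    return evidence_by_round[round_no]

hypotheses = [
    Hypothesis("command_and_control", {"regular_interval", "rare_domain", "small_payloads"}),
    Hypothesis("data_exfiltration", {"large_upload", "rare_domain"}),
]
result = investigate({"device": "laptop-01", "type": "beaconing"}, hypotheses, toy_collect)
print(result)  # {'verdict': 'escalate', 'hypothesis': 'command_and_control', 'score': 1.0}
```

The first round leaves every hypothesis below the threshold, so the loop gathers more data. The second round supplies enough evidence for the command and control hypothesis to be escalated.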

Learn more about Darktrace's Cyber AI Analyst

  • Cost savings: Equivalent to adding up to 30 full-time Level 2 analysts without increasing headcount
  • Minimize business risk: Takes on the busy work from human analysts and elevates a team’s overall decision making
  • Improve security outcomes: Identifies subtle, sophisticated threats through holistic investigations

Unlocking an efficient SOC

To create a mature and proactive SOC, addressing the inefficiencies in the alert investigation process is essential. By extending AI's capabilities beyond detection, SOC teams can streamline and optimize investigative workflows, reducing alert fatigue and enhancing analyst efficiency.

This holistic approach not only improves Mean Time to Analysis (MTTA) but also ensures that SOCs are well-equipped to handle the evolving threat landscape. Embracing AI augmentation and automation in every phase of threat management will pave the way for a more resilient and proactive security posture, ultimately leading to a high-performing SOC that can effectively safeguard organizational assets.

Every relevant alert is investigated

The Cyber AI Analyst is not a generative AI system, or an XDR or SIEM aggregator that simply prompts you on what to do next. It uses a multi-layered combination of many different specialized AI methods to investigate every relevant alert from across your enterprise, including native, third-party, and manual triggers, operating at machine speed and scale. This also positively affects detection engineering and alert tuning, because it does not suffer from fatigue when presented with low-accuracy but potentially valuable alerts.

Retain and improve analyst skills

Transferring most analysis processes to AI systems can risk team skills if they don't maintain or build them and if the AI doesn't explain its process. This can reduce the ability to challenge or build on AI results and cause issues if the AI is unavailable. The Cyber AI Analyst, by revealing its investigation process, data gathering, and decisions, promotes and improves these skills. Its deep understanding of cyber incidents can be used for skill training and incident response practice by simulating incidents for security teams to handle.

Create time for cyber risk reduction

Human cybersecurity professionals excel in areas that require critical thinking, strategic planning, and nuanced decision-making. With alert fatigue minimized and investigations streamlined, your analysts can avoid the tedious data collection and analysis stages and instead focus on critical decision-making tasks such as implementing recovery actions and performing threat hunting.

Stay tuned for part 3/3

Part 3/3 in the Reimagine your SOC series explores the preventative security solutions market and effective risk management strategies.

Coming soon!


May 7, 2025

Anomaly-based threat hunting: Darktrace's approach in action


What is threat hunting?

Threat hunting in cybersecurity involves proactively and iteratively searching through networks and datasets to detect threats that evade existing automated security solutions. It is an important component of a strong cybersecurity posture.

There are several frameworks that Darktrace analysts use to guide how threat hunting is carried out, some of which are:

  • MITRE ATT&CK: Tactics, Techniques, and Procedures (TTPs)
  • Diamond Model for Intrusion Analysis: Adversary, Infrastructure, Victims, Capabilities
  • Threat Hunt Model (six steps): Purpose, Scope, Equip, Plan, Execute, Feedback
  • Pyramid of Pain

These frameworks are important in baselining how to run a threat hunt. There is also a range of methods that give defenders flexibility, regardless of whether the threat hunt is proactive or reactive. Some of these are:

  • Hypothesis-based threat hunting
  • Analytics-driven threat hunting
  • Automated/machine learning hunting
  • Indicator of Compromise (IoC) hunting
  • Victim-based threat hunting

Threat hunting with Darktrace

At its core, Darktrace is an anomaly-based detection tool. It combines various types of machine learning that allow it to characterize what constitutes ‘normal’, based on the analysis of many different measures of a device or actor’s behavior. Those types of learning are then curated into what are called models.

Darktrace models leverage anomaly detection and integrate outputs from Darktrace Deep Packet Inspection, telemetry inputs, and additional modules, creating tailored activity detection.

This dynamic understanding allows Darktrace to identify, with a high degree of precision, events or behaviors that are both anomalous and unlikely to be benign. On top of machine learning models for detection, there is also the ability to change and create models, showcasing the tool’s versatility. The Model Editor allows security teams to specify the values, priorities, thresholds, and actions for the activity they want to detect. That means a team can create custom detection models based on specific use cases or business requirements. Teams can also increase the priority of existing detections based on their own assessment of risk to their environment.

This level of flexibility is particularly useful when conducting a threat hunt. As described above, and in previous ‘Inside the SOC’ blogs, such a threat hunt can focus on a specific threat actor or a specific sector, or it can be a hypothesis-based threat hunt combined with ‘experimenting’ with some of Darktrace’s models.

Conducting a threat hunt in the energy sector with experimental models

In Darktrace’s recent Threat Research report “AI & Cybersecurity: The state of cyber in UK and US energy sectors” Darktrace’s Threat Research team crafted hypothesis-driven threat hunts, building experimental models and investigating existing models to test them and detect malicious activity across Darktrace customers in the energy sector.

For one of the hunts, which hypothesized the use of PerfectData software and multi-factor authentication (MFA) bypass to compromise user accounts and destroy data, an experimental model was created to detect a Software-as-a-Service (SaaS) user performing activity relating to ‘PerfectData Software’, which is known to allow a threat actor to exfiltrate whole mailboxes as a PST file. Experimental model alerts caused by this anomalous activity were analyzed in conjunction with existing SaaS and email-related models that would indicate a multi-stage attack in line with the hypothesis.

Whilst hunting, Darktrace researchers found multiple model alerts for this experimental model associated with PerfectData software usage, within energy sector customers, including an oil and gas investment company, as well as other sectors. Upon further investigation, it was also found that in June 2024, a malicious actor had targeted a renewable energy infrastructure provider via a PerfectData Software attack and demonstrated intent to conduct an Operational Technology (OT) attack.

The actor logged into Azure AD from a rare US IP address. They then granted consent to ‘eM Client’ from the same IP. Shortly after, the actor granted ‘AddServicePrincipal’ via Azure to PerfectData Software. Two days later, the actor created a new email rule from a London IP to move emails to an RSS Feed folder, stop processing rules, and mark emails as read. They then accessed mail items in the “\Sent” folder from a malicious IP belonging to the anonymization network Private Internet Access Virtual Private Network (PIA VPN). The actor then conducted mass email deletions, deleting multiple instances of emails with the subject “[Name] shared "[Company Name] Proposal" With You” from the “\Sent” folder. The subject suggests the emails likely contained a link to file storage for phishing purposes. The mass deletion likely represented an attempt to obfuscate a potential outbound phishing email campaign.
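The email rule described above combines three actions that together hide incoming mail from the account owner. As a rough illustration of how a hunter might express that pattern, the sketch below checks a rule record for all three. The dictionary keys are hypothetical and do not correspond to any specific audit log schema or to Darktrace's experimental model.

```python
def is_suspicious_inbox_rule(rule):
    """Return True if a mail rule moves mail to an RSS folder, marks it as read,
    and stops further rule processing. Field names are illustrative assumptions."""
    folder = rule.get("move_to_folder", "").lower()
    hides_in_rss_folder = "rss" in folder
    return bool(
        hides_in_rss_folder
        and rule.get("mark_as_read", False)
        and rule.get("stop_processing_rules", False)
    )

# Hypothetical rule records for illustration.
attacker_rule = {
    "move_to_folder": "RSS Feeds",
    "mark_as_read": True,
    "stop_processing_rules": True,
}
newsletter_rule = {
    "move_to_folder": "Newsletters",
    "mark_as_read": False,
    "stop_processing_rules": False,
}

print(is_suspicious_inbox_rule(attacker_rule))    # True
print(is_suspicious_inbox_rule(newsletter_rule))  # False
```

On its own a rule like this is weak evidence. In the hunt described here, it was the combination with the rare-IP login, the application consent, and the mass deletions that supported the hypothesis.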

Figure 1: The Darktrace Model Alert that triggered for the mass deletes of the likely phishing email containing a file storage link.

A month later, the same user was observed mass downloading mLog CSV files related to proprietary and Operational Technology information. In September, three months after the initial attack, the actor performed another mass download of operational files, this time pertaining to operating instructions and measurements. The observed patience and specific file downloads seemingly demonstrated an intent to conduct or research possible OT attack vectors. An attack on OT could have significant impacts, including operational downtime, reputational damage, and harm to everyday operations. Darktrace alerted the impacted customer once findings were verified, and subsequent actions were taken by the internal security team to prevent further malicious activity.

Conclusion

Harnessing the power of different tools in a security stack is a key element of cyber defense. The hypothesis-based threat hunt and custom experimental model creation described above demonstrate different threat hunting approaches and how Darktrace’s approach can be operationalized. They also show that proactive threat hunting is a valuable complement to traditional security controls and is essential for organizations facing increasingly complex threat landscapes.

Credit to Nathaniel Jones (VP, Security & AI Strategy, Field CISO at Darktrace) and Zoe Tilsiter (EMEA Consultancy Lead)

About the author
Nathaniel Jones
VP, Security & AI Strategy, Field CISO

May 6, 2025

Combatting the Top Three Sources of Risk in the Cloud


With cloud computing, organizations are storing data like intellectual property, trade secrets, Personally Identifiable Information (PII), proprietary code and statistics, and other sensitive information in the cloud. If this data were to be accessed by malicious actors, it could incur financial loss, reputational damage, legal liabilities, and business disruption.

Last year, data breaches solely in public cloud deployments were the most expensive type of data breach, costing an average of $5.17 million USD, a 13.1% increase from the year before.

So, as cloud usage continues to grow, the teams in charge of protecting these deployments must understand the associated cybersecurity risks.

What are cloud risks?

Cloud threats come in many forms, with one of the key types consisting of cloud risks. These arise from challenges in implementing and maintaining cloud infrastructure, which can expose the organization to potential damage, loss, and attacks.

There are three major types of cloud risks:

1. Misconfigurations

As organizations struggle with complex cloud environments, misconfiguration is one of the leading causes of cloud security incidents. These risks occur when cloud settings leave gaps between cloud security solutions and expose data and services to unauthorized access. If discovered by a threat actor, a misconfiguration can be exploited to allow infiltration, lateral movement, escalation, and damage.

With the scale and dynamism of cloud infrastructure and the complexity of hybrid and multi-cloud deployments, security teams face a major challenge in exerting the required visibility and control to identify misconfigurations before they are exploited.

Common causes of misconfiguration include skill shortages, outdated practices, and manual workflows. For example, potential misconfigurations can occur around firewall zones, isolated file systems, and mount systems, which all require specialized skill to set up and diligent monitoring to maintain.

2. Identity and Access Management (IAM) failures

IAM has only increased in importance with the rise of cloud computing and remote working. It allows security teams to control which users can and cannot access sensitive data, applications, and other resources.

Cybersecurity professionals ranked IAM skills as the second most important security skill to have, just behind general cloud and application security.

There are four parts to IAM: authentication, authorization, administration, and auditing and reporting. Within these, there are a lot of subcomponents as well, including but not limited to Single Sign-On (SSO), Two-Factor Authentication (2FA), Multi-Factor Authentication (MFA), and Role-Based Access Control (RBAC).

Security teams are faced with the challenge of allowing enough access for employees, contractors, vendors, and partners to complete their jobs while restricting enough to maintain security. They may struggle to track what users are doing across the cloud, apps, and on-premises servers.

When IAM is misconfigured, it increases the attack surface and can leave accounts with access to resources they do not need to perform their intended roles. This type of risk creates the possibility for threat actors or compromised accounts to gain access to sensitive company data and escalate privileges in cloud environments. It can also allow malicious insiders and users who accidentally violate data protection regulations to cause greater damage.
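The idea of accounts holding access "they do not need" can be made concrete as a set difference between what each identity is granted and what it has actually used. The sketch below is a simplified illustration with made-up identities and permission names. A real review would draw on a cloud provider's own access logs and policy tooling.

```python
def unused_permissions(granted, used):
    """Map each identity to the permissions it holds but has not been seen using."""
    excess = {}
    for identity, permissions in granted.items():
        unused = permissions - used.get(identity, set())
        if unused:
            excess[identity] = sorted(unused)
    return excess

# Hypothetical identities and permission names for illustration.
granted = {
    "build-bot": {"storage:read", "storage:write", "iam:create-user"},
    "analyst": {"storage:read"},
}
used = {
    "build-bot": {"storage:read", "storage:write"},
    "analyst": {"storage:read"},
}

excess = unused_permissions(granted, used)
print(excess)  # {'build-bot': ['iam:create-user']}
```

Here the build account never uses its ability to create users, so removing that permission shrinks the attack surface without affecting its role.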

3. Cross-domain threats

The complexity of hybrid and cloud environments can be exploited by attacks that cross multiple domains, such as traditional network environments, identity systems, SaaS platforms, and cloud environments. These attacks are difficult to detect and mitigate, especially when a security posture is siloed or fragmented.  

Some attack types inherently involve multiple domains, like lateral movement and supply chain attacks, which target both on-premises and cloud networks.  

Challenges in securing against cross-domain threats often come from a lack of unified visibility. If a security team does not have unified visibility across the organization’s domains, gaps between various infrastructures and the teams that manage them can leave organizations vulnerable.

Adopting AI cybersecurity tools to reduce cloud risk

For security teams to defend against misconfigurations, IAM failures, and cross-domain threats, they require a combination of enhanced visibility into cloud assets and architectures, better automation, and more advanced analytics. These capabilities can be achieved with AI-powered cybersecurity tools.

Such tools use AI and automation to help teams maintain a clear view of all their assets and activities and consistently enforce security policies.

Darktrace / CLOUD is a Cloud Detection and Response (CDR) solution that makes cloud security accessible to all security teams and SOCs by using AI to identify and correct misconfigurations and other cloud risks in public, hybrid, and multi-cloud environments.

It provides real-time, dynamic architectural modeling, which gives SecOps and DevOps teams a unified view of cloud infrastructures to enhance collaboration and reveal possible misconfigurations and other cloud risks. It continuously evaluates architecture changes and monitors real-time activity, providing audit-ready traceability and proactive risk management.

Figure 1: Real-time visibility into cloud assets and architectures built from network, configuration, and identity and access roles. In this unified view, Darktrace / CLOUD reveals possible misconfigurations and risk paths.

Darktrace / CLOUD also offers attack path modeling for the cloud. It can identify exposed assets and highlight internal attack paths to get a dynamic view of the riskiest paths across cloud environments, network environments, and between – enabling security teams to prioritize based on unique business risk and address gaps to prevent future attacks.  
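As a generic illustration of attack path modeling (not a description of how Darktrace / CLOUD implements it), an environment can be treated as a directed graph in which an edge means "an attacker on this asset could reach that one". A breadth-first search from an exposed asset then gives the shortest route to a sensitive one. The asset names and edges below are hypothetical.

```python
from collections import deque

def shortest_attack_path(edges, start, target):
    """Breadth-first search for the shortest path from start to target, or None."""
    graph = {}
    for source, destination in edges:
        graph.setdefault(source, []).append(destination)

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None  # target is not reachable from start

# Hypothetical reachability edges between assets.
edges = [
    ("internet", "web-vm"),      # publicly exposed virtual machine
    ("web-vm", "app-role"),      # VM can assume an over-privileged role
    ("app-role", "customer-db"), # role can read the sensitive database
    ("web-vm", "log-bucket"),
]

path = shortest_attack_path(edges, "internet", "customer-db")
print(path)  # ['internet', 'web-vm', 'app-role', 'customer-db']
```

Breaking any single edge on that path, for example by tightening the role the virtual machine can assume, removes the route to the database.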

Darktrace’s Self-Learning AI ensures continuous cloud resilience, helping teams move from reactive to proactive defense.

About the author
Pallavi Singh
Product Marketing Manager, OT Security & Compliance