Why Data Classification Isn’t Enough to Prevent Data Loss
In a world of growing data volume and diversity, protecting and keeping track of your organization’s sensitive information is increasingly complex – particularly when 63% of breaches stem from malicious insiders or human error. This blog explores how security teams can achieve visibility beyond the limits of data classification, without adding to the burden of data management.
Darktrace cyber analysts are world-class experts in threat intelligence, threat hunting and incident response, and provide 24/7 SOC support to thousands of Darktrace customers around the globe. Inside the SOC is exclusively authored by these experts, providing analysis of cyber incidents and threat trends, based on real-world experience in the field.
Written by
Carlos Gray
Senior Product Marketing Manager, Email
Share
15
Apr 2025
Why today’s data is fundamentally difficult to protect
Data isn’t what it used to be. It’s no longer confined to neat rows in a database, or tucked away in a secure on-prem server. Today, sensitive information moves freely between cloud platforms, SaaS applications, endpoints, and a globally distributed workforce – often in real time. The sheer volume and diversity of modern data make it inherently harder to monitor, classify, and secure. And the numbers reflect this challenge – 63% of breaches stem from malicious insiders or human error.
This complexity is compounded by an outdated reliance on manual data management. While data classification remains critical – particularly to ensure compliance with regulations like GDPR or HIPAA – the burden of managing this data often falls on overstretched security teams. Security teams are expected to identify, label, and track data across sprawling ecosystems, which can be time-consuming and error-prone. Even with automation, rigid policies that depend on pre-defined data classification miss the mark.
From a data protection perspective, if manual or basic automated classification is the sole methodology for preventing data loss, critical data will likely slip through the cracks. Security teams are left scrambling to fill the gaps, facing compliance risks and increasing operational overhead. Over time, the hidden costs of these inefficiencies pile up, draining resources and reducing the effectiveness of your entire security posture.
What traditional data classification can’t cover
Data classification plays an important role in data loss prevention, but it's only half the puzzle. It’s designed to spot known patterns and apply labels, yet the most common causes of data breaches don’t follow rules. They stem from something far harder to define: human behavior.
Data classification is blind to nuance – it can’t grasp intent, context, or the subtle red flags that often precede a breach. And no amount of labeling, policy, or training can fully account for the reality that humans make mistakes. These problems require a system that sees beyond the data itself — one that understands how it’s being used, by whom, and in what context. That’s why Darktrace leans into its core strength: detecting the subtle symptoms of data loss by interpreting human behavior, not just file labels.
Achieving autonomous data protection with behavioral AI
Its understanding of business operations allows it to detect subtle anomalies around data movement for your use cases, whether that’s a misdirected email, an insecure cloud storage link, or suspicious activity from an insider. Crucially, this detection is entirely autonomous, with no need for predefined rules or static labels.
Fig 1: Darktrace uses its contextual understanding of each user to stop all types of sensitive or misdirected data from leaving the organization
Darktrace / EMAIL’s DLP add-on continuously learns in real time, enabling:
Automatic detection: Identifies risky data behavior to catch threats that traditional approaches miss – from human error to sophisticated insider threats.
A dynamic range of actions: Darktrace always aims to avoid business disruption in its blocking actions, but this can be adjusted according to the unique risk appetite of each customer – taking the most appropriate response for that business from a whole scale of possibilities.
Enhanced context: While Darktrace doesn’t require sensitivity data labeling, it integrates with Microsoft Purview to ingest sensitivity labels and enrich its understanding of the data – for even more accurate decision-making.
Beyond preventing data loss, Darktrace uses DLP activity to enhance its contextual understanding of the user itself. In other words, outbound activity can be a useful symptom in identifying a potential account compromise, or can be used to give context to that user’s inbound activity. Because Darktrace sees the whole picture of a user across their inbound, outbound, and lateral mail, as well as messaging (and into collaboration tools with Darktrace / IDENTITY), every interaction informs its continuous learning of normal.
With Darktrace, you can achieve dynamic data loss prevention for the most challenging human-related use cases – from accidental misdirected recipients to malicious insiders – that evade detection from manual classification. So don’t stand still on data protection – make the switch to autonomous, adaptive DLP that understands your business, data, and people.
[related-resource]
Interested in finding out more?
Read the full solution brief to see how Darktrace's AI-driven approach to DLP stops data loss across email and Teams
Darktrace cyber analysts are world-class experts in threat intelligence, threat hunting and incident response, and provide 24/7 SOC support to thousands of Darktrace customers around the globe. Inside the SOC is exclusively authored by these experts, providing analysis of cyber incidents and threat trends, based on real-world experience in the field.
Email bombing exposed: Darktrace’s email defense in action
Darktrace detected an email bomb attack flooding inboxes with high volumes of messages, uncovering unusual email patterns and subsequent network anomalies.
FedRAMP High-compliant email security protects federal agencies from nation-state attacks
Not only has Darktrace Federal achieved its FedRAMP High Authority to Operate, one of the few cybersecurity vendors to do this, but we have also released Darktrace Commercial Government Cloud High/Email, a FedRAMP High-compliant email security solution for customers using Microsoft Government Community Cloud High.
Marcus Fowler
CEO of Darktrace Federal and SVP of Strategic Engagements and Threats
Global Technology Provider Transforms Email Threat Detection with Darktrace
To strengthen its distributed and complex operations, this global technology leader implemented Darktrace / EMAIL to monitor, detect, and mitigate potential email threats. Read the blog to discover their results.
Combatting the Top Three Sources of Risk in the Cloud
With cloud computing, organizations are storing data like intellectual property, trade secrets, Personally Identifiable Information (PII), proprietary code and statistics, and other sensitive information in the cloud. If this data were to be accessed by malicious actors, it could incur financial loss, reputational damage, legal liabilities, and business disruption.
So, as cloud usage continues to grow, the teams in charge of protecting these deployments must understand the associated cybersecurity risks.
What are cloud risks?
Cloud threats come in many forms, with one of the key types consisting of cloud risks. These arise from challenges in implementing and maintaining cloud infrastructure, which can expose the organization to potential damage, loss, and attacks.
There are three major types of cloud risks:
1. Misconfigurations
As organizations struggle with complex cloud environments, misconfiguration is one of the leading causes of cloud security incidents. These risks occur when cloud settings leave gaps between cloud security solutions and expose data and services to unauthorized access. If discovered by a threat actor, a misconfiguration can be exploited to allow infiltration, lateral movement, escalation, and damage.
With the scale and dynamism of cloud infrastructure and the complexity of hybrid and multi-cloud deployments, security teams face a major challenge in exerting the required visibility and control to identify misconfigurations before they are exploited.
Common causes of misconfiguration come from skill shortages, outdated practices, and manual workflows. For example, potential misconfigurations can occur around firewall zones, isolated file systems, and mount systems, which all require specialized skill to set up and diligent monitoring to maintain
IAM has only increased in importance with the rise of cloud computing and remote working. It allows security teams to control which users can and cannot access sensitive data, applications, and other resources.
There are four parts to IAM: authentication, authorization, administration, and auditing and reporting. Within these, there are a lot of subcomponents as well, including but not limited to Single Sign-On (SSO), Two-Factor Authentication (2FA), Multi-Factor Authentication (MFA), and Role-Based Access Control (RBAC).
Security teams are faced with the challenge of allowing enough access for employees, contractors, vendors, and partners to complete their jobs while restricting enough to maintain security. They may struggle to track what users are doing across the cloud, apps, and on-premises servers.
When IAM is misconfigured, it increases the attack surface and can leave accounts with access to resources they do not need to perform their intended roles. This type of risk creates the possibility for threat actors or compromised accounts to gain access to sensitive company data and escalate privileges in cloud environments. It can also allow malicious insiders and users who accidentally violate data protection regulations to cause greater damage.
3. Cross-domain threats
The complexity of hybrid and cloud environments can be exploited by attacks that cross multiple domains, such as traditional network environments, identity systems, SaaS platforms, and cloud environments. These attacks are difficult to detect and mitigate, especially when a security posture is siloed or fragmented.
Some attack types inherently involve multiple domains, like lateral movement and supply chain attacks, which target both on-premises and cloud networks.
Challenges in securing against cross-domain threats often come from a lack of unified visibility. If a security team does not have unified visibility across the organization’s domains, gaps between various infrastructures and the teams that manage them can leave organizations vulnerable.
Adopting AI cybersecurity tools to reduce cloud risk
For security teams to defend against misconfigurations, IAM failures, and insecure APIs, they require a combination of enhanced visibility into cloud assets and architectures, better automation, and more advanced analytics. These capabilities can be achieved with AI-powered cybersecurity tools.
Such tools use AI and automation to help teams maintain a clear view of all their assets and activities and consistently enforce security policies.
Darktrace / CLOUD is a Cloud Detection and Response (CDR) solution that makes cloud security accessible to all security teams and SOCs by using AI to identify and correct misconfigurations and other cloud risks in public, hybrid, and multi-cloud environments.
It provides real-time, dynamic architectural modeling, which gives SecOps and DevOps teams a unified view of cloud infrastructures to enhance collaboration and reveal possible misconfigurations and other cloud risks. It continuously evaluates architecture changes and monitors real-time activity, providing audit-ready traceability and proactive risk management.
Figure 1: Real-time visibility into cloud assets and architectures built from network, configuration, and identity and access roles. In this unified view, Darktrace / CLOUD reveals possible misconfigurations and risk paths.
Darktrace / CLOUD also offers attack path modeling for the cloud. It can identify exposed assets and highlight internal attack paths to get a dynamic view of the riskiest paths across cloud environments, network environments, and between – enabling security teams to prioritize based on unique business risk and address gaps to prevent future attacks.
Darktrace’s Self-Learning AI ensures continuous cloud resilience, helping teams move from reactive to proactive defense.
Product Marketing Manager, OT Security & Compliance
Blog
/
/
May 2, 2025
SocGholish: From loader and C2 activity to RansomHub deployment
Over the past year, a clear pattern has emerged across the threat landscape: ransomware operations are increasingly relying on compartmentalized affiliate models. In these models, initial access brokers (IABs) [6], malware loaders, and post-exploitation operators work together.
Due to those specialization roles, a new generation of loader campaigns has risen. Threat actors increasingly employ loader operators to quietly establish footholds on the target network. These entities then hand off access to ransomware affiliates. One loader that continues to feature prominently in such campaigns is SocGholish.
What is SocGholish?
SocGholish is a loader malware that has been utilized since at least 2017 [7]. It has long been associated with fake browser updates and JavaScript-based delivery methods on infected websites.
Threat actors often target outdated or poorly secured CMS-based websites like WordPress. Through unpatched plugins, or even remote code execution flaws, they inject malicious JavaScript into the site’s HTML, templates or external JS resources [8]. Historically, SocGholish has functioned as a first-stage malware loader, ultimately leading to deployment of Cobalt Strike beacons [9], and further facilitating access persistence to corporate environments. More recently, multiple security vendors have reported that infections involving SocGholish frequently lead to the deployment of RansomHub ransomware [3] [5].
This blog explores multiple instances within Darktrace's customer base where SocGholish deployment led to subsequent network compromises. Investigations revealed indicators of compromise (IoCs) similar to those identified by external security researchers, along with variations in attacker behavior post-deployment. Key innovations in post-compromise activities include credential access tactics targeting authentication mechanisms, particularly through the abuse of legacy protocols like WebDAV and SCF file interactions over SMB.
Initial access and execution
Since January 2025, Darktrace’s Threat Research team observed multiple cases in which threat actors leveraged the SocGholish loader for initial access. Malicious actors commonly deliver SocGholish by compromising legitimate websites by injecting malicious scripts into the HTML of the affected site. When the visitor lands on an infected site, they are typically redirected to a fake browser update page, tricking them into downloading a ZIP file containing a JavaScript-based loader [1] [2]. In one case, a targeted user appears to have visited the compromised website garagebevents[.]com (IP: 35.203.175[.]30), from which around 10 MB of data was downloaded.
Figure 1: Device Event Log showing connections to the compromised website, following by connections to the identified Keitaro TDS instances.
Within milliseconds of the connection establishment, the user’s device initiated several HTTPS sessions over the destination port 443 to the external endpoint 176.53.147[.]97, linked to the following Keitaro TDS domains:
packedbrick[.]com
rednosehorse[.]com
blackshelter[.]org
blacksaltys[.]com
To evade detection, SocGholish uses highly obfuscated code and relies on traffic distribution systems (TDS) [3]. TDS is a tool used in digital and affiliate marketing to manage and distribute incoming web traffic based on predefined rules. More specifically, Keitaro is a premium self-hosted TDS frequently utilized by attackers as a payload repository for malicious scripts following redirects from compromised sites. In the previously noted example, it appears that the device connected to the compromised website, which then retrieved JavaScript code from the aforementioned Keitaro TDS domains. The script served by those instances led to connections to the endpoint virtual.urban-orthodontics[.]com (IP: 185.76.79[.]50), successfully completing SocGholish’s distribution.
Figure 2: Advanced Search showing connections to the compromised website, following by those to the identified Keitaro TDS instances.
Persistence
During some investigations, Darktrace researchers observed compromised devices initiating HTTPS connections to the endpoint files.pythonhosted[.]org (IP: 151.101.1[.]223), suggesting Python package downloads. External researchers have previously noted how attackers use Python-based backdoors to maintain access on compromised endpoints following initial access via SocGholish [5].
Credential access and lateral movement
Credential access – external
Darktrace researchers identified observed some variation in kill chain activities following initial access and foothold establishment. For example, Darktrace detected interesting variations in credential access techniques. In one such case, an affected device attempted to contact the rare external endpoint 161.35.56[.]33 using the Web Distributed Authoring and Versioning (WebDAV) protocol. WebDAV is an extension of the HTTP protocol that allows users to collaboratively edit and manage files on remote web servers. WebDAV enables remote shares to be mounted over HTTP or HTTPS, similar to how SMB operates, but using web-based protocols. Windows supports WebDAV natively, which means a UNC path pointing to an HTTP or HTTPS resource can trigger system-level behavior such as authentication.
In this specific case, the system initiated outbound connections using the ‘Microsoft-WebDAV-MiniRedir/10.0.19045’ user-agent, targeting the URI path of /s on the external endpoint 161.35.56[.]33. During these requests, the host attempted to initiate NTML authentication and even SMB sessions over the web, both of which failed. Despite the session failures, these attempts also indicate a form of forced authentication. Forced authentication exploits a default behavior in Windows where, upon encountering a UNC path, the system will automatically try to authenticate to the resource using NTML – often without any user interaction. Although no files were directly retrieved, the WebDAV server was still likely able to retrieve the user’s NTLM hash during the session establishment requests, which can later be used by the adversary to crack the password offline.
Credential access – internal
In another investigated incident, Darktrace observed a related technique utilized for credential access and lateral movement. This time, the infected host uploaded a file named ‘Thumbs.scf’ to multiple internal SMB network shares. Shell Command File ( SCF) is a legacy Windows file format used primarily for Windows Explorer shortcuts. These files contain instructions for rendering icons or triggering shell commands, and they can be executed implicitly when a user simply opens a folder containing the file – no clicks required.
The ‘Thumbs.scf’ file dropped by the attacker was crafted to exploit this behavior. Its contents included a [Shell] section with the Command=2 directive and an IconFile path pointing to a remote UNC resource on the same external endpoint, 161.35.56[.]33, seen in the previously described case – specifically, ‘\\161.35.56[.]33\share\icon.ico’. When a user on the internal network navigates to the folder containing the SCF file, their system will automatically attempt to load the icon. In doing so, the system issues a request to the specified UNC path, which again prompts Windows to initiate NTML authentication.
This pattern of activity implies that the attacker leveraged passive internal exposure; users who simply browsed a compromised share would unknowingly send their NTML hashes to an external attacker-controlled host. Unlike the WebDAV approach, which required initiating outbound communication from the infected host, this SCF method relies on internal users to interact with poisoned folders.
Figure 3: Contents of the file 'Thumbs.scf' showing the UNC resource hosted on the external endpoint.
Command-and-control
Following initial compromise, affected devices would then attempt outbound connections using the TLS/SSL protocol over port 443 to different sets of command-and-control (C2) infrastructure associated with SocGholish. The malware frequently uses obfuscated JavaScript loaders to initiate its infection chain, and once dropped, the malware communicates back to its infrastructure over standard web protocols, typically using HTTPS over port 443. However, this set of connections would precede a second set of outbound connections, this time to infrastructure linked to RansomHub affiliates, possibly facilitating the deployed Python-based backdoor.
Connectivity to RansomHub infrastructure relied on defense evasion tactics, such as port-hopping. The idea behind port-hopping is to disguise C2 traffic by avoiding consistent patterns that might be caught by firewalls, and intrusion detection systems. By cycling through ephemeral ports, the malware increases its chances of slipping past basic egress filtering or network monitoring rules that only scrutinize common web traffic ports like 443 or 80. Darktrace analysts identified systems connecting to destination ports such as 2308, 2311, 2313 and more – all on the same destination IP address associated with the RansomHub C2 environment.
Figure 4: Advanced Search connection logs showing connections over destination ports that change rapidly.
Conclusion
Since the beginning of 2025, Darktrace analysts identified a campaign whereby ransomware affiliates leveraged SocGholish to establish network access in victim environments. This activity enabled multiple sets of different post exploitation activity. Credential access played a key role, with affiliates abusing WebDAV and NTML over SMB to trigger authentication attempts. The attackers were also able to plant SCF files internally to expose NTML hashes from users browsing shared folders. These techniques evidently point to deliberate efforts at early lateral movement and foothold expansion before deploying ransomware. As ransomware groups continue to refine their playbooks and work more closely with sophisticated loaders, it becomes critical to track not just who is involved, but how access is being established, expanded, and weaponized.
Credit to Chrisina Kreza (Cyber Analyst) and Adam Potter (Senior Cyber Analyst)
Appendices
Darktrace / NETWORK model alerts
· Anomalous Connection / SMB Enumeration
· Anomalous Connection / Multiple Connections to New External TCP Port
· Anomalous Connection / Multiple Failed Connections to Rare Endpoint
· Anomalous Connection / New User Agent to IP Without Hostname
· Compliance / External Windows Communication
· Compliance / SMB Drive Write
· Compromise / Large DNS Volume for Suspicious Domain
· Compromise / Large Number of Suspicious Failed Connections
· Device / Anonymous NTML Logins
· Device / External Network Scan
· Device / New or Uncommon SMB Named Pipe
· Device / SMB Lateral Movement
· Device / Suspicious SMB Activity
· Unusual Activity / Unusual External Activity
· User / Kerberos Username Brute Force
MITRE ATT&CK mapping
· Credential Access – T1187 Forced Authentication
· Credential Access – T1110 Brute Force
· Command and Control – T1071.001 Web Protocols
· Command and Control – T1571 Non-Standard Port
· Discovery – T1083 File and Directory Discovery
· Discovery – T1018 Remote System Discovery
· Discovery – T1046 Network Service Discovery
· Discovery – T1135 Network Share Discovery
· Execution – T1059.007 JavaScript
· Lateral Movement – T1021.002 SMB/Windows Admin Shares