Blog
Comparing different AI approaches to email security



Innovations in artificial intelligence (AI) have fundamentally changed the email security landscape in recent years, but it can often be hard to determine what makes one system different to the next. In reality, under that umbrella term there exists a significant distinction in approach which may determine whether the technology provides genuine protection or simply a perceived notion of defense.
One backward-looking approach involves feeding a machine thousands of emails that have already been deemed to be malicious, and training it to look for patterns in these emails in order to spot future attacks. The second approach uses an AI system to analyze the entirety of an organization’s real-world data, enabling it to establish a notion of what is ‘normal’ and then spot subtle deviations indicative of an attack.
In the below, we compare the relative merits of each approach, with special consideration to novel attacks that leverage the latest news headlines to bypass machine learning systems trained on data sets. Training a machine on previously identified ‘known bads’ is only advantageous in certain, specific contexts that don’t change over time: to recognize the intent behind an email, for example. However, an effective email security solution must also incorporate a self-learning approach that understands ‘normal’ in the context of an organization in order to identify unusual and anomalous emails and catch even the novel attacks.
Signatures – a backward-looking approach
Over the past few decades, cyber security technologies have looked to mitigate risk by preventing previously seen attacks from occurring again. In the early days, when the lifespan of a given strain of malware or the infrastructure of an attack was in the range of months and years, this method was satisfactory. But the approach inevitably results in playing catch-up with malicious actors: it always looks to the past to guide detection for the future. With decreasing lifetimes of attacks, where a domain could be used in a single email and never seen again, this historic-looking signature-based approach is now being widely replaced by more intelligent systems.
Training a machine on ‘bad’ emails
The first AI approach we often see in the wild involves harnessing an extremely large data set with thousands or millions of emails. Once these emails have come through, an AI is trained to look for common patterns in malicious emails. The system then updates its models, rules set, and blacklists based on that data.
This method certainly represents an improvement to traditional rules and signatures, but it does not escape the fact that it is still reactive, and unable to stop new attack infrastructure and new types of email attacks. It is simply automating that flawed, traditional approach – only instead of having a human update the rules and signatures, a machine is updating them instead.
Relying on this approach alone has one basic but critical flaw: it does not enable you to stop new types of attacks that it has never seen before. It accepts that there has to be a ‘patient zero’ – or first victim – in order to succeed.
The industry is beginning to acknowledge the challenges with this approach, and huge amounts of resources – both automated systems and security researchers – are being thrown into minimizing its limitations. This includes leveraging a technique called “data augmentation” that involves taking a malicious email that slipped through and generating many “training samples” using open-source text augmentation libraries to create “similar” emails – so that the machine learns not only the missed phish as ‘bad’, but several others like it – enabling it to detect future attacks that use similar wording, and fall into the same category.
But spending all this time and effort into trying to fix an unsolvable problem is like putting all your eggs in the wrong basket. Why try and fix a flawed system rather than change the game altogether? To spell out the limitations of this approach, let us look at a situation where the nature of the attack is entirely new.
The rise of ‘fearware’
When the global pandemic hit, and governments began enforcing travel bans and imposing stringent restrictions, there was undoubtedly a collective sense of fear and uncertainty. As explained previously in this blog, cyber-criminals were quick to capitalize on this, taking advantage of people’s desire for information to send out topical emails related to COVID-19 containing malware or credential-grabbing links.
These emails often spoofed the Centers for Disease Control and Prevention (CDC), or later on, as the economic impact of the pandemic began to take hold, the Small Business Administration (SBA). As the global situation shifted, so did attackers’ tactics. And in the process, over 130,000 new domains related to COVID-19 were purchased.
Let’s now consider how the above approach to email security might fare when faced with these new email attacks. The question becomes: how can you train a model to look out for emails containing ‘COVID-19’, when the term hasn’t even been invented yet?
And while COVID-19 is the most salient example of this, the same reasoning follows for every single novel and unexpected news cycle that attackers are leveraging in their phishing emails to evade tools using this approach – and attracting the recipient’s attention as a bonus. Moreover, if an email attack is truly targeted to your organization, it might contain bespoke and tailored news referring to a very specific thing that supervised machine learning systems could never be trained on.
This isn’t to say there’s not a time and a place in email security for looking at past attacks to set yourself up for the future. It just isn’t here.
Spotting intention
Darktrace uses this approach for one specific use which is future-proof and not prone to change over time, to analyze grammar and tone in an email in order to identify intention: asking questions like ‘does this look like an attempt at inducement? Is the sender trying to solicit some sensitive information? Is this extortion?’ By training a system on an extremely large data set collected over a period of time, you can start to understand what, for instance, inducement looks like. This then enables you to easily spot future scenarios of inducement based on a common set of characteristics.
Training a system in this way works because, unlike news cycles and the topics of phishing emails, fundamental patterns in tone and language don’t change over time. An attempt at solicitation is always an attempt at solicitation, and will always bear common characteristics.
For this reason, this approach only plays one small part of a very large engine. It gives an additional indication about the nature of the threat, but is not in itself used to determine anomalous emails.
Detecting the unknown unknowns
In addition to using the above approach to identify intention, Darktrace uses unsupervised machine learning, which starts with extracting and extrapolating thousands of data points from every email. Some of these are taken directly from the email itself, while others are only ascertainable by the above intention-type analysis. Additional insights are also gained from observing emails in the wider context of all available data across email, network and the cloud environment of the organization.
Only after having a now-significantly larger and more comprehensive set of indicators, with a more complete description of that email, can the data be fed into a topic-indifferent machine learning engine to start questioning the data in millions of ways in order to understand if it belongs, given the wider context of the typical ‘pattern of life’ for the organization. Monitoring all emails in conjunction allows the machine to establish things like:
- Does this person usually receive ZIP files?
- Does this supplier usually send links to Dropbox?
- Has this sender ever logged in from China?
- Do these recipients usually get the same emails together?
The technology identifies patterns across an entire organization and gains a continuously evolving sense of ‘self’ as the organization grows and changes. It is this innate understanding of what is and isn’t ‘normal’ that allows AI to spot the truly ‘unknown unknowns’ instead of just ‘new variations of known bads.’
This type of analysis brings an additional advantage in that it is language and topic agnostic: because it focusses on anomaly detection rather than finding specific patterns that indicate threat, it is effective regardless of whether an organization typically communicates in English, Spanish, Japanese, or any other language.
By layering both of these approaches, you can understand the intention behind an email and understand whether that email belongs given the context of normal communication. And all of this is done without ever making an assumption or having the expectation that you’ve seen this threat before.
Years in the making
It’s well established now that the legacy approach to email security has failed – and this makes it easy to see why existing recommendation engines are being applied to the cyber security space. On first glance, these solutions may be appealing to a security team, but highly targeted, truly unique spear phishing emails easily skirt these systems. They can’t be relied on to stop email threats on the first encounter, as they have a dependency on known attacks with previously seen topics, domains, and payloads.
An effective, layered AI approach takes years of research and development. There is no single mathematical model to solve the problem of determining malicious emails from benign communication. A layered approach accepts that competing mathematical models each have their own strengths and weaknesses. It autonomously determines the relative weight these models should have and weighs them against one another to produce an overall ‘anomaly score’ given as a percentage, indicating exactly how unusual a particular email is in comparison to the organization’s wider email traffic flow.
It is time for email security to well and truly drop the assumption that you can look at threats of the past to predict tomorrow’s attacks. An effective AI cyber security system can identify abnormalities with no reliance on historical attacks, enabling it to catch truly unique novel emails on the first encounter – before they land in the inbox.
Like this and want more?
Blog
Inside the SOC
How Abuse of ‘PerfectData Software’ May Create a Perfect Storm: An Emerging Trend in Account Takeovers


Amidst the ever-changing threat landscape, new tactics, techniques, and procedures (TTPs) seem to emerge daily, creating extreme challenges for security teams. The broad range of attack methods utilized by attackers seems to present an insurmountable problem: how do you defend against a playbook that does not yet exist?
Faced with the growing number of novel and uncommon attack methods, it is essential for organizations to adopt a security solution able to detect threats based on their anomalies, rather than relying on threat intelligence alone.
In March 2023, Darktrace observed an emerging trend in the use of an application known as ‘PerfectData Software’ for probable malicious purposes in several Microsoft 365 account takeovers.
Using its anomaly-based detection, Darktrace DETECT™ was able to identify the activity chain surrounding the use of this application, potentially uncovering a novel piece of threat actor tradecraft in the process.
Microsoft 365 Intrusions
In recent years, Microsoft’s Software-as-a-Service (SaaS) suite, Microsoft 365, along with its built-in identity and access management (IAM) service, Azure Active Directory (Azure AD), have been heavily targeted by threat actors due to their near-ubiquitous usage across industries. Four out of every five Fortune 500 companies, for example, use Microsoft 365 services [1].
Malicious actors typically gain entry to organizations’ Microsoft 365 environments by abusing either stolen account credentials or stolen session cookies [2]. Once inside, actors can access sensitive data within mailboxes or SharePoint repositories, and send out emails or Teams messages. This activity can often result in serious financial harm, especially in cases where the malicious actor’s end-goal is to elicit fraudulent transactions.
Darktrace regularly observes malicious actors behaving in predictable ways once they gain access to customer Microsoft 365 environment. One typical example is the creation of new inbox rules and sending deceitful emails intended to convince recipients to carry out subsequent actions, such as following a malicious link or providing sensitive information. It is also common for actors to register new applications in Azure AD so that they can be used to conduct follow-up activities, like mass-mailing or data theft. The registration of applications in Azure AD therefore seems to be a relatively predictable threat actor behavior [3][4]. Darktrace DETECT understands that unusual application registrations in Azure AD may constitute a deviation in expected behavior, and therefore a possible indicator of account compromise.
These registrations of applications in Azure AD are evidenced by creations of, as well as assignments of permissions to, Service Principals in Azure AD. Darktrace has detected a growing trend in actors creating and assigning permissions to a Service Principal named ‘PerfectData Software’. Further investigation of this Azure AD activity revealed it to be part of an ongoing account takeover.
‘PerfectData Software’ Activity
Darktrace observed variations of the following pattern of activity relating to an application named ‘PerfectData Software’ within its customer base:
- Actor signs in to a Microsoft 365 account from an endpoint associated with a Virtual Private Server (VPS) or Virtual Private Network (VPN) service
- Actor registers an application called 'PerfectData Software' with Azure AD, and then grants permissions to the application
- Actor accesses mailbox data and creates inbox rule
In two separate incidents, malicious actors were observed conducting their activities from endpoints associated with VPN services (HideMyAss (HMA) VPN and Surfshark VPN, respectively) and from endpoints within the Autonomous System AS396073 MAJESTIC-HOSTING-01.
In March 2023, Darktrace observed a malicious actor signing in to a Microsoft 365 account from a Kuwait-based IP address within the Autonomous System, AS198605 AVAST Software s.r.o. This IP address is associated with the VPN service, HMA VPN. Over the next couple of days, an actor (likely the same malicious actor) signed in to the account several more times from two different Nigeria-based endpoints, as well as a VPS-related endpoint and a HMA VPN endpoint.
During their login sessions, the actor performed a variety of actions. First, they created and assigned permissions to a Service Principal named ‘PerfectData Software’. This Service Principal creation represents the registration of an application called ‘PerfectData Software’ in Azure AD. Although the reason for registering this application is unclear, within a few days the actor registered and granted permission to another application, ‘Newsletter Software Supermailer’, and created a new inbox rule names ‘s’ on the mailbox of the hijacked account. This inbox rule moved emails meeting certain conditions to a folder named ‘RSS Subscription. The ‘Newsletter Software Supermailer’ application was likely registered by the actor to facilitate mass-mailing activity.
Immediately after these actions, Darktrace detected the actor sending out thousands of malicious emails from the account. The emails included an attachment named ‘Credit Transfer Copy.html’, which contained a suspicious link. Further investigation revealed that the customer’s network had received several fake invoice emails prior to this initial intrusion activity. Additionally, there was an unusually high volume of failed logins to the compromised account around the time of the initial access.

In a separate case also observed by Darktrace in March 2023, a malicious actor was observed signing in to a Microsoft 365 account from an endpoint within the Autonomous System, AS397086 LAYER-HOST-HOUSTON. The endpoint appears to be related to the VPN service, Surfshark VPN. This login was followed by several failed and successful logins from a VPS-related within the Autonomous System, AS396073 MAJESTIC-HOSTING-01. The actor was then seen registering and assigning permissions to an application called ‘PerfectData Software’. As with the previous example, the motives for this registration are unclear. The actor proceeded to log in several more times from a Surfshark VPN endpoint, however, they were not observed carrying out any further suspicious activity.

It was not clear in either of these examples, nor in fact any of cases observed by Darktrace, why actors had registered and assigned permissions to an application called ‘PerfectData Software’, and there do not appear to be any open-source intelligence (OSINT) resources or online literature related to the malicious usage of an application by that name. That said, there are several websites which appear to provide email migration and data recovery/backup tools under the moniker ‘PerfectData Software’.
It is unclear whether the use of ‘PerfectData Software’ by malicious actors observed on the networks of Darktrace customers was one of these tools. However, given the nature of the tools, it is possible that the actors intended to use them to facilitate the exfiltration of email data from compromises mailboxes.
If the legitimate software ‘PerfectData’ is the application in question in these incidents, it is likely being purchased and misused by attackers for malicious purposes. It is also possible the application referenced in the incidents is a spoof of the legitimate ‘PerfectData’ software designed to masquerade a malicious application as legitimate.
Darktrace Coverage
Cases of ‘PerfectData Software’ activity chains detected by Darktrace typically began with an actor signing into an internal user’s Microsoft 365 account from a VPN or VPS-related endpoint. These login events, along with the suspicious email and/or brute-force activity which preceded them, caused the following DETECT models to breach:
- SaaS / Access / Unusual External Source for SaaS Credential Use
- SaaS / Access / Suspicious Login Attempt
- SaaS / Compromise / Login From Rare Following Suspicious Login Attempt(s)
- SaaS / Email Nexus / Unusual Location for SaaS and Email Activity
Subsequent activities, including inbox rule creations, registration of applications in Azure AD, and mass-mailing activity, resulted in breaches of the following DETECT models.
- SaaS / Admin / OAuth Permission Grant
- SaaS / Compromise / Unusual Logic Following OAuth Grant
- SaaS / Admin / New Application Service Principal
- IaaS / Admin / Azure Application Administration Activities
- SaaS / Compliance / New Email Rule
- SaaS / Compromise / Unusual Login and New Email Rule
- SaaS / Email Nexus / Suspicious Internal Exchange Activity
- SaaS / Email Nexus / Possible Outbound Email Spam
- SaaS / Compromise / Unusual Login and Outbound Email Spam
- SaaS / Compromise / Suspicious Login and Suspicious Outbound Email(s)

In cases where Darktrace RESPOND™ was enabled in autonomous response mode, ‘PerfectData Software’ activity chains resulted in breaches of the following RESPOND models:
• Antigena / SaaS / Antigena Suspicious SaaS Activity Block
• Antigena / SaaS / Antigena Significant Compliance Activity Block
In response to these model breaches, Darktrace RESPOND took immediate action, performing aggressive, inhibitive actions, such as forcing the actor to log out of the SaaS platform, and disabling the user entirely. When applied autonomously, these RESPOND actions would seriously impede an attacker’s progress and minimize network disruption.

In addition, Darktrace Cyber AI Analyst was able to autonomously investigate registrations of the ‘PerfectData Software’ application and summarized its findings into digestible reports.

Conclusion
Due to the widespread adoption of Microsoft 365 services in the workplace and continued emphasis on a remote workforce, account hijackings now pose a more serious threat to organizations around the world than ever before. The cases discussed here illustrate the tendency of malicious actors to conduct their activities from endpoints associated with VPN services, while also registering new applications, like PerfectData Software, with malicious intent.
While it was unclear exactly why the malicious actors were using ‘PerfectData Software’ as part of their account hijacking, it is clear that either the legitimate or spoofed version of the application is becoming an very likely emergent piece of threat actor tradecraft.
Darktrace DETECT’s anomaly-based approach to threat detection allowed it to recognize that the use of ‘PerfectData Software’ represented a deviation in the SaaS user’s expected behavior. While Darktrace RESPOND, when enabled in autonomous response mode, was able to quickly take preventative action against threat actors, blocking the potential use of the application for data exfiltration or other nefarious purposes.
Appendices
MITRE ATT&CK Mapping
Reconnaissance:
• T1598 – Phishing for Information
Credential Access:
• T1110 – Brute Force
Initial Access:
• T1078.004 – Valid Accounts: Cloud Accounts
Command and Control:
• T1105 – Ingress Tool Transfer
Persistence:
• T1098.003 – Account Manipulation: Additional Cloud Roles
Collection:
• T1114 – Email Collection
Defense Evasion:
• T1564.008 – Hide Artifacts: Email Hiding Rules
Lateral Movement:
• T1534 – Internal Spearphishing
Unusual Source IPs
• 5.62.60[.]202 (AS198605 AVAST Software s.r.o.)
• 160.152.10[.]215 (AS37637 Smile-Nigeria-AS)
• 197.244.250[.]155 (AS37705 TOPNET)
• 169.159.92[.]36 (AS37122 SMILE)
• 45.62.170[.]237 (AS396073 MAJESTIC-HOSTING-01)
• 92.38.180[.]49 (AS202422 G-Core Labs S.A)
• 129.56.36[.]26 (AS327952 AS-NATCOM)
• 92.38.180[.]47 (AS202422 G-Core Labs S.A.)
• 107.179.20[.]214 (AS397086 LAYER-HOST-HOUSTON)
• 45.62.170[.]31 (AS396073 MAJESTIC-HOSTING-01)
References
[1] https://www.investing.com/academy/statistics/microsoft-facts/
[2] https://intel471.com/blog/countering-the-problem-of-credential-theft
[3] https://darktrace.com/blog/business-email-compromise-to-mass-phishing-campaign-attack-analysis
[4] https://darktrace.com/blog/breakdown-of-a-multi-account-compromise-within-office-365
Blog
Cloud
Darktrace Integrates Self-Learning AI with Amazon Security Lake to Support Security Investigations
.jpeg)


Darktrace has deepened its relationship with AWS by integrating its detection and response capabilities with Amazon Security Lake.
This development will allow mutual customers to seamlessly combine Darktrace AI’s bespoke understanding of their organization with the Threat Intelligence offered by other security tools, and investigate all of their alerts in one central location.
This integration will improve the value security teams get from both products, streamlining analyst workflows and improving their ability to detect and respond to the full spectrum of known and unknown cyber-threats.
How Darktrace and Amazon Security Lake augment security teams
Amazon Security Lake is a newly-released service that automatically centralizes an organization’s security data from cloud, on-premises, and custom sources into a customer owned purpose-built data lake. Both Darktrace and Amazon Security Lake support the Open Cybersecurity Schema Framework (OCSF), an open standard to simplify, combine, and analyze security logs.
Customers can store security logs, events, alerts, and other relevant data generated by various AWS services and security tools. By consolidating security data in a central lake, organizations can gain a holistic view of their security posture, perform advanced analytics, detect anomalies and open investigations to improve their security practices.
With Darktrace DETECT and RESPOND AI engines covering all assets across IT, OT, network, endpoint, IoT, email and cloud, organizations can augment the value of their security data lakes by feeding Darktrace’s rich and context-aware datapoints to Amazon Security Lake.
Amazon Security Lake empowers security teams to improve the protection of your digital estate:
- Quick and painless data normalization
- Fast-tracks ability to investigate, triage and respond to security events
- Broader visibility aids more effective decision-making
- Surfaces and prioritizes anomalies for further investigation
- Single interface for seamless data management
How will Darktrace customers benefit?
Across the Cyber AI Loop, all Darktrace solutions have been architected with AWS best practices in mind. With this integration, Darktrace is bringing together its understanding of ‘self’ for every organization with the centralized data visibility of the Amazon Security Lake. Darktrace’s unique approach to cyber security, powered by groundbreaking AI research, delivers a superior dataset based on a deep and interconnected understanding of the enterprise.
Where other cyber security solutions are trained to identify threats based on historical attack data and techniques, Darktrace DETECT gains a bespoke understanding of every digital environment, continuously analyzing users, assets, devices and the complex relationships between them. Our AI analyzes thousands of metrics to reveal subtle deviations that may signal an evolving issue – even unknown techniques and novel malware. It distinguishes between malicious and benign behavior, identifying harmful activity that typically goes unnoticed. This rich dataset is fed into RESPOND, which takes precise action to neutralize threats against any and every asset, no matter where data resides.
Both DETECT and RESPOND are supported by Darktrace Self-Learning AI, which provides full, real-time visibility into an organization’s systems and data. This always-on threat analysis already makes humans better at cyber security, improving decisions and outcomes based on total visibility of the digital ecosystem, supporting human performance with AI coverage and empowering security teams to proactively protect critical assets.
Converting Darktrace alerts to the Amazon Security Lake Open Cybersecurity Schema Framework (OCSF) supplies the Security Operations Center (SOC) and incident response team with contextualized data, empowering them to accelerate their investigation, triage and response to potential cyber threats.
Darktrace is available for purchase on the AWS Marketplace.
Learn more about how Darktrace provides full-coverage, AI-powered cloud security for AWS, or see how our customers use Darktrace in their AWS cloud environments.
