Blog
/
No items found.
/
June 21, 2018

Unsupervised Machine Learning and JA3 for Enhanced Security

Default blog imageDefault blog imageDefault blog imageDefault blog imageDefault blog imageDefault blog image
21
Jun 2018
Unlock the true power of Darktrace's algorithms. Learn how JA3 enhances cybersecurity defenses with unique TLS/SSL fingerprints & unsupervised machine learning.

Introducing JA3

JA3 is a methodology for fingerprinting Transport Layer Security applications. It was first posted on GitHub in June 2017 and is the work of Salesforce researchers John Althouse, Jeff Atkinson, and Josh Atkins. The JA3 TLS/SSL fingerprints created can overlap between applications but are still a great Indicator of Compromise (IoC). Fingerprinting is achieved by creating a hash of 5 decimal fields of the Client Hello message that is sent in the initial stages of an TLS/SSL session.

JA3 is an interesting approach to the increasing usage of encryption in networks. There is also a clear uptick in cyber-attacks using encrypted command and control (C2) channels – such as HTTPS – for malware communication.

The benefits of JA3 for enhancing rules-and-signatures security

These near-unique fingerprints can be used to enhance traditional cyber security approaches such as whitelisting, deny-listing, and searching for IoCs.

Let’s take the following JA3 hash for example: 3e860202fc555b939e83e7a7ab518c38. According to one of the public lists that maps JA3s to applications, this JA3 hash is associated with the ‘hola_svc’ application. This is the infamous Hola VPN solution that is non-compliant in most enterprise networks. On the other hand, the following hash is associated with the popular messenger software Slack: a5aa6e939e4770e3b8ac38ce414fd0d5. Traditional cyber security tools can use these hashes like traditional signatures to search for instances of them in data sets or trying to deny-list malicious ones.

While there is some merit to this approach, it comes with all the known limitations of rules-and-signatures defenses, such as the overlaps in signatures, the inability to detect unknown threats, as well as the added complexity of having to maintain a database of known signatures.

JA3 in Darktrace

Darktrace creates JA3 hashes for every TLS/SSL connection it encounters. This is incredibly powerful in a number of ways. First, the JA3 can add invaluable context to a threat hunt. Second, Darktrace can also be queried to see if a particular JA3 was encountered in the network, thus providing actionable intelligence during incident response if JA3 IoCs are known to the incident responders.

Things become much more interesting once we apply our unsupervised machine learning to JA3: Darktrace’s AI algorithms autonomously detect which JA3s are anomalous for the network as a whole and which JA3s are unusual for specific devices.

It basically tells a cyber security expert: This JA3 (3e860202fc555b939e83e7a7ab518c38) has never been seen in the network before and it is only used by one device. It indicates that an application, which is used by nobody else on the network, is initiating TLS/SSL connections. In our experience, this is most often the case for malware or non-compliant software. At this stage, we are observing anomalous behavior.

Darktrace’s AI combines these IoCs (Unusual Network JA3, Unusual Device JA3, …) with many other weak indicators to detect the earliest signs of an emerging threat, including previously unknown threats, without using rules or hard-coded thresholds.

Catching Red-Teams and domain fronting with JA3

The following is an example where Darktrace detected a Red-Team’s C2 communication by observing anomalous JA3 behavior.

The unsupervised machine learning algorithms identified a desktop device using a JA3 that was 100% unusual for the network connecting to an external domain using a Let’s Encrypt certificate, which, along with self-signed certificates, is often abused by malicious actors. As well as the JA3, the domain was also 100% rare for the network – nobody else visited it:

It turned out that a Red-Team had registered a domain that was very similar to the victim’s legitimate domain: www.companyname[.]com (legitimate domain) vs. www.companyname[.]online (malicious domain). This was intentionally done to avoid suspicion and human analysis. Over a 7-day period in a 2,000-device environment, this was the only time that Darktrace flagged unusual behavior of this kind.

As the C2 traffic was encrypted (therefore no intrusion detection was possible on the payload) and the domain was non-suspicious (no reputation-based deny-listing worked), this C2 had remained undetected by the rest of the security stack.

Combining unsupervised machine learning with JA3 is incredibly powerful for the detection of domain fronting. Domain fronting is a popular technique to circumvent censorship and to hide C2 traffic. While some infrastructure providers take action to prevent domain fronting on their end, it is still prevalent and actively used by attackers.

The only agreed-upon method within wide parts of the cyber-security community to detect domain fronting appears to be TLS/SSL inspection. This usually involved breaking up encrypted communication to inspect the clear-text payloads. While this works, it commonly involves additional infrastructure, network restructuring and comes with privacy issues – especially in the context of GDPR.

Unsupervised machine learning makes the detection of domain fronting without having to break up encrypted traffic possible by combining unusual JA3 detection with other anomalies such as beaconing. A good start for a domain fronting threat hunt? A device beaconing to an anomalous CDN with an unusual JA3 hash.

Conclusion

JA3 is not a silver bullet to pre-empt malware compromise. As a signature-based solution, it shares the same limitations of all other defenses that rely on pre-identified threats or deny-lists: having to play a constant game of catch-up with innovative attackers. However, as a novel means of identifying TLS/SSL applications, JA3 hashing can be leveraged as a powerful network behavioral indicator, an additional metric that can flag the use of unauthorized or risky software, or as a means of identifying emerging malware compromises in the initial stages of C2 communication. This is made possible through the power of unsupervised machine learning.

Inside the SOC
Darktrace cyber analysts are world-class experts in threat intelligence, threat hunting and incident response, and provide 24/7 SOC support to thousands of Darktrace customers around the globe. Inside the SOC is exclusively authored by these experts, providing analysis of cyber incidents and threat trends, based on real-world experience in the field.
Author
Max Heinemeyer
Global Field CISO

Max is a cyber security expert with over a decade of experience in the field, specializing in a wide range of areas such as Penetration Testing, Red-Teaming, SIEM and SOC consulting and hunting Advanced Persistent Threat (APT) groups. At Darktrace, Max is closely involved with Darktrace’s strategic customers & prospects. He works with the R&D team at Darktrace, shaping research into new AI innovations and their various defensive and offensive applications. Max’s insights are regularly featured in international media outlets such as the BBC, Forbes and WIRED. Max holds an MSc from the University of Duisburg-Essen and a BSc from the Cooperative State University Stuttgart in International Business Information Systems.

Book a 1-1 meeting with one of our experts
Share this article

More in this series

No items found.

Blog

/

January 29, 2025

/

Inside the SOC

Bytesize Security: Insider Threats in Google Workspace

Default blog imageDefault blog image

What is an insider threat?

An insider threat is a cyber risk originating from within an organization. These threats can involve actions such as an employee inadvertently clicking on a malicious link (e.g., a phishing email) or an employee with malicious intent conducting data exfiltration for corporate sabotage.

Insiders often exploit their knowledge and access to legitimate corporate tools, presenting a continuous risk to organizations. Defenders must protect their digital estate against threats from both within and outside the organization.

For example, in the summer of 2024, Darktrace / IDENTITY successfully detected a user in a customer environment attempting to steal sensitive data from a trusted Google Workspace service. Despite the use of a legitimate and compliant corporate tool, Darktrace identified anomalies in the user’s behavior that indicated malicious intent.

Attack overview: Insider threat

In June 2024, Darktrace detected unusual activity involving the Software-as-a-Service (SaaS) account of a former employee from a customer organization. This individual, who had recently left the company, was observed downloading a significant amount of data in the form of a “.INDD” file (an Adobe InDesign document typically used to create page layouts [1]) from Google Drive.

While the use of Google Drive and other Google Workspace platforms was not unexpected for this employee, Darktrace identified that the user had logged in from an unfamiliar and suspicious IPv6 address before initiating the download. This anomaly triggered a model alert in Darktrace / IDENTITY, flagging the activity as potentially malicious.

A Model Alert in Darktrace / IDENTITY showing the unusual “.INDD” file being downloaded from Google Workspace.
Figure 1: A Model Alert in Darktrace / IDENTITY showing the unusual “.INDD” file being downloaded from Google Workspace.

Following this detection, the customer reached out to Darktrace’s Security Operations Center (SOC) team via the Security Operations Support service for assistance in triaging and investigating the incident further. Darktrace’s SOC team conducted an in-depth investigation, enabling the customer to identify the exact moment of the file download, as well as the contents of the stolen documents. The customer later confirmed that the downloaded files contained sensitive corporate data, including customer details and payment information, likely intended for reuse or sharing with a new employer.

In this particular instance, Darktrace’s Autonomous Response capability was not active, allowing the malicious insider to successfully exfiltrate the files. If Autonomous Response had been enabled, Darktrace would have immediately acted upon detecting the login from an unusual (in this case 100% rare) location by logging out and disabling the SaaS user. This would have provided the customer with the necessary time to review the activity and verify whether the user was authorized to access their SaaS environments.

Conclusion

Insider threats pose a significant challenge for traditional security tools as they involve internal users who are expected to access SaaS platforms. These insiders have preexisting knowledge of the environment, sensitive data, and how to make their activities appear normal, as seen in this case with the use of Google Workspace. This familiarity allows them to avoid having to use more easily detectable intrusion methods like phishing campaigns.

Darktrace’s anomaly detection capabilities, which focus on identifying unusual activity rather than relying on specific rules and signatures, enable it to effectively detect deviations from a user’s expected behavior. For instance, an unusual login from a new location, as in this example, can be flagged even if the subsequent malicious activity appears innocuous due to the use of a trusted application like Google Drive.

Credit to Vivek Rajan (Cyber Analyst) and Ryan Traill (Analyst Content Lead)

Appendices

Darktrace Model Detections

SaaS / Resource::Unusual Download Of Externally Shared Google Workspace File

References

[1]https://www.adobe.com/creativecloud/file-types/image/vector/indd-file.html

MITRE ATT&CK Mapping

Technqiue – Tactic – ID

Data from Cloud Storage Object – COLLECTION -T1530

Continue reading
About the author
Vivek Rajan
Cyber Analyst

Blog

/

January 28, 2025

/
No items found.

Reimagining Your SOC: How to Achieve Proactive Network Security

Default blog imageDefault blog image

Introduction: Challenges and solutions to SOC efficiency

For Security Operation Centers (SOCs), reliance on signature or rule-based tools – solutions that are always chasing the latest update to prevent only what is already known – creates an excess of false positives. SOC analysts are therefore overwhelmed by a high volume of context-lacking alerts, with human analysts able to address only about 10% due to time and resource constraints. This forces many teams to accept the risks of addressing only a fraction of the alerts while novel threats go completely missed.

74% of practitioners are already grappling with the impact of an AI-powered threat landscape, which amplifies challenges like tool sprawl, alert fatigue, and burnout. Thus, achieving a resilient network, where SOC teams can spend most of their time getting proactive and stopping threats before they occur, feels like an unrealistic goal as attacks are growing more frequent.

Despite advancements in security technology (advanced detection systems with AI, XDR tools, SIEM aggregators, etc...), practitioners are still facing the same issues of inefficiency in their SOC, stopping them from becoming proactive. How can they select security solutions that help them achieve a proactive state without dedicating more human hours and resources to managing and triaging alerts, tuning rules, investigating false positives, and creating reports?

To overcome these obstacles, organizations must leverage security technology that is able to augment and support their teams. This can happen in the following ways:

  1. Full visibility across the modern network expanding into hybrid environments
  2. Have tools that identifies and stops novel threats autonomously, without causing downtime
  3. Apply AI-led analysis to reduce time spent on manual triage and investigation

Your current solutions might be holding you back

Traditional cybersecurity point solutions are reliant on using global threat intelligence to pattern match, determine signatures, and consequently are chasing the latest update to prevent only what is known. This means that unknown threats will evade detection until a patient zero is identified. This legacy approach to threat detection means that at least one organization needs to be ‘patient zero’, or the first victim of a novel attack before it is formally identified.

Even the point solutions that claim to use AI to enhance threat detection rely on a combination of supervised machine learning, deep learning, and transformers to

train and inform their systems. This entails shipping your company’s data out to a large data lake housed somewhere in the cloud where it gets blended with attack data from thousands of other organizations. The resulting homogenized dataset gets used to train AI systems — yours and everyone else’s — to recognize patterns of attack based on previously encountered threats.

While using AI in this way reduces the workload of security teams who would traditionally input this data by hand, it emanates the same risk – namely, that AI systems trained on known threats cannot deal with the threats of tomorrow. Ultimately, it is the unknown threats that bring down an organization.

The promise and pitfalls of XDR in today's threat landscape

Enter Extended Detection and Response (XDR): a platform approach aimed at unifying threat detection across the digital environment. XDR was developed to address the limitations of traditional, fragmented tools by stitching together data across domains, providing SOC teams with a more cohesive, enterprise-wide view of threats. This unified approach allows for improved detection of suspicious activities that might otherwise be missed in siloed systems.

However, XDR solutions still face key challenges: they often depend heavily on human validation, which can aggravate the already alarmingly high alert fatigue security analysts experience, and they remain largely reactive, focusing on detecting and responding to threats rather than helping prevent them. Additionally, XDR frequently lacks full domain coverage, relying on EDR as a foundation and are insufficient in providing native NDR capabilities and visibility, leaving critical gaps that attackers can exploit. This is reflected in the current security market, with 57% of organizations reporting that they plan to integrate network security products into their current XDR toolset[1].

Why settling is risky and how to unlock SOC efficiency

The result of these shortcomings within the security solutions market is an acceptance of inevitable risk. From false positives driving the barrage of alerts, to the siloed tooling that requires manual integration, and the lack of multi-domain visibility requiring human intervention for business context, security teams have accepted that not all alerts can be triaged or investigated.

While prioritization and processes have improved, the SOC is operating under a model that is overrun with alerts that lack context, meaning that not all of them can be investigated because there is simply too much for humans to parse through. Thus, teams accept the risk of leaving many alerts uninvestigated, rather than finding a solution to eliminate that risk altogether.

Darktrace / NETWORK is designed for your Security Operations Center to eliminate alert triage with AI-led investigations , and rapidly detect and respond to known and unknown threats. This includes the ability to scale into other environments in your infrastructure including cloud, OT, and more.

Beyond global threat intelligence: Self-Learning AI enables novel threat detection & response

Darktrace does not rely on known malware signatures, external threat intelligence, historical attack data, nor does it rely on threat trained machine learning to identify threats.

Darktrace’s unique Self-learning AI deeply understands your business environment by analyzing trillions of real-time events that understands your normal ‘pattern of life’, unique to your business. By connecting isolated incidents across your business, including third party alerts and telemetry, Darktrace / NETWORK uses anomaly chains to identify deviations from normal activity.

The benefit to this is that when we are not predefining what we are looking for, we can spot new threats, allowing end users to identify both known threats and subtle, never-before-seen indicators of malicious activity that traditional solutions may miss if they are only looking at historical attack data.

AI-led investigations empower your SOC to prioritize what matters

Anomaly detection is often criticized for yielding high false positives, as it flags deviations from expected patterns that may not necessarily indicate a real threat or issues. However, Darktrace applies an investigation engine to automate alert triage and address alert fatigue.

Darktrace’s Cyber AI Analyst revolutionizes security operations by conducting continuous, full investigations across Darktrace and third-party alerts, transforming the alert triage process. Instead of addressing only a fraction of the thousands of daily alerts, Cyber AI Analyst automatically investigates every relevant alert, freeing up your team to focus on high-priority incidents and close security gaps.

Powered by advanced machine-learning techniques, including unsupervised learning, models trained by expert analysts, and tailored security language models, Cyber AI Analyst emulates human investigation skills, testing hypotheses, analyzing data, and drawing conclusions. According to Darktrace Internal Research, Cyber AI Analyst typically provides a SOC with up to  50,000 additional hours of Level 2 analysis and written reporting annually, enriching security operations by producing high level incident alerts with full details so that human analysts can focus on Level 3 tasks.

Containing threats with Autonomous Response

Simply quarantining a device is rarely the best course of action - organizations need to be able to maintain normal operations in the face of threats and choose the right course of action. Different organizations also require tailored response functions because they have different standards and protocols across a variety of unique devices. Ultimately, a ‘one size fits all’ approach to automated response actions puts organizations at risk of disrupting business operations.

Darktrace’s Autonomous Response tailors its actions to contain abnormal behavior across users and digital assets by understanding what is normal and stopping only what is not. Unlike blanket quarantines, it delivers a bespoke approach, blocking malicious activities that deviate from regular patterns while ensuring legitimate business operations remain uninterrupted.

Darktrace offers fully customizable response actions, seamlessly integrating with your workflows through hundreds of native integrations and an open API. It eliminates the need for costly development, natively disarming threats in seconds while extending capabilities with third-party tools like firewalls, EDR, SOAR, and ITSM solutions.

Unlocking a proactive state of security

Securing the network isn’t just about responding to incidents — it’s about being proactive, adaptive, and prepared for the unexpected. The NIST Cybersecurity Framework (CSF 2.0) emphasizes this by highlighting the need for focused risk management, continuous incident response (IR) refinement, and seamless integration of these processes with your detection and response capabilities.

Despite advancements in security technology, achieving a proactive posture is still a challenge to overcome because SOC teams face inefficiencies from reliance on pattern-matching tools, which generate excessive false positives and leave many alerts unaddressed, while novel threats go undetected. If SOC teams are spending all their time investigating alerts then there is no time spent getting ahead of attacks.

Achieving proactive network resilience — a state where organizations can confidently address challenges at every stage of their security posture — requires strategically aligned solutions that work seamlessly together across the attack lifecycle.

References

1.       Market Guide for Extended Detection and Response, Gartner, 17thAugust 2023 - ID G00761828

Continue reading
About the author
Mikey Anderson
Product Marketing Manager, Network Detection & Response
Your data. Our AI.
Elevate your network security with Darktrace AI