Blog

No items found.

Unsupervised Machine Learning and JA3 for Enhanced Security

Default blog imageDefault blog imageDefault blog imageDefault blog imageDefault blog imageDefault blog image
21
Jun 2018
21
Jun 2018
Unlock the true power of Darktrace's algorithms. Learn how JA3 enhances cybersecurity defenses with unique TLS/SSL fingerprints & unsupervised machine learning.

Introducing JA3

JA3 is a methodology for fingerprinting Transport Layer Security applications. It was first posted on GitHub in June 2017 and is the work of Salesforce researchers John Althouse, Jeff Atkinson, and Josh Atkins. The JA3 TLS/SSL fingerprints created can overlap between applications but are still a great Indicator of Compromise (IoC). Fingerprinting is achieved by creating a hash of 5 decimal fields of the Client Hello message that is sent in the initial stages of an TLS/SSL session.

JA3 is an interesting approach to the increasing usage of encryption in networks. There is also a clear uptick in cyber-attacks using encrypted command and control (C2) channels – such as HTTPS – for malware communication.

The benefits of JA3 for enhancing rules-and-signatures security

These near-unique fingerprints can be used to enhance traditional cyber security approaches such as whitelisting, deny-listing, and searching for IoCs.

Let’s take the following JA3 hash for example: 3e860202fc555b939e83e7a7ab518c38. According to one of the public lists that maps JA3s to applications, this JA3 hash is associated with the ‘hola_svc’ application. This is the infamous Hola VPN solution that is non-compliant in most enterprise networks. On the other hand, the following hash is associated with the popular messenger software Slack: a5aa6e939e4770e3b8ac38ce414fd0d5. Traditional cyber security tools can use these hashes like traditional signatures to search for instances of them in data sets or trying to deny-list malicious ones.

While there is some merit to this approach, it comes with all the known limitations of rules-and-signatures defenses, such as the overlaps in signatures, the inability to detect unknown threats, as well as the added complexity of having to maintain a database of known signatures.

JA3 in Darktrace

Darktrace creates JA3 hashes for every TLS/SSL connection it encounters. This is incredibly powerful in a number of ways. First, the JA3 can add invaluable context to a threat hunt. Second, Darktrace can also be queried to see if a particular JA3 was encountered in the network, thus providing actionable intelligence during incident response if JA3 IoCs are known to the incident responders.

Things become much more interesting once we apply our unsupervised machine learning to JA3: Darktrace’s AI algorithms autonomously detect which JA3s are anomalous for the network as a whole and which JA3s are unusual for specific devices.

It basically tells a cyber security expert: This JA3 (3e860202fc555b939e83e7a7ab518c38) has never been seen in the network before and it is only used by one device. It indicates that an application, which is used by nobody else on the network, is initiating TLS/SSL connections. In our experience, this is most often the case for malware or non-compliant software. At this stage, we are observing anomalous behavior.

Darktrace’s AI combines these IoCs (Unusual Network JA3, Unusual Device JA3, …) with many other weak indicators to detect the earliest signs of an emerging threat, including previously unknown threats, without using rules or hard-coded thresholds.

Catching Red-Teams and domain fronting with JA3

The following is an example where Darktrace detected a Red-Team’s C2 communication by observing anomalous JA3 behavior.

The unsupervised machine learning algorithms identified a desktop device using a JA3 that was 100% unusual for the network connecting to an external domain using a Let’s Encrypt certificate, which, along with self-signed certificates, is often abused by malicious actors. As well as the JA3, the domain was also 100% rare for the network – nobody else visited it:

It turned out that a Red-Team had registered a domain that was very similar to the victim’s legitimate domain: www.companyname[.]com (legitimate domain) vs. www.companyname[.]online (malicious domain). This was intentionally done to avoid suspicion and human analysis. Over a 7-day period in a 2,000-device environment, this was the only time that Darktrace flagged unusual behavior of this kind.

As the C2 traffic was encrypted (therefore no intrusion detection was possible on the payload) and the domain was non-suspicious (no reputation-based deny-listing worked), this C2 had remained undetected by the rest of the security stack.

Combining unsupervised machine learning with JA3 is incredibly powerful for the detection of domain fronting. Domain fronting is a popular technique to circumvent censorship and to hide C2 traffic. While some infrastructure providers take action to prevent domain fronting on their end, it is still prevalent and actively used by attackers.

The only agreed-upon method within wide parts of the cyber-security community to detect domain fronting appears to be TLS/SSL inspection. This usually involved breaking up encrypted communication to inspect the clear-text payloads. While this works, it commonly involves additional infrastructure, network restructuring and comes with privacy issues – especially in the context of GDPR.

Unsupervised machine learning makes the detection of domain fronting without having to break up encrypted traffic possible by combining unusual JA3 detection with other anomalies such as beaconing. A good start for a domain fronting threat hunt? A device beaconing to an anomalous CDN with an unusual JA3 hash.

Conclusion

JA3 is not a silver bullet to pre-empt malware compromise. As a signature-based solution, it shares the same limitations of all other defenses that rely on pre-identified threats or deny-lists: having to play a constant game of catch-up with innovative attackers. However, as a novel means of identifying TLS/SSL applications, JA3 hashing can be leveraged as a powerful network behavioral indicator, an additional metric that can flag the use of unauthorized or risky software, or as a means of identifying emerging malware compromises in the initial stages of C2 communication. This is made possible through the power of unsupervised machine learning.

INSIDE THE SOC
Darktrace cyber analysts are world-class experts in threat intelligence, threat hunting and incident response, and provide 24/7 SOC support to thousands of Darktrace customers around the globe. Inside the SOC is exclusively authored by these experts, providing analysis of cyber incidents and threat trends, based on real-world experience in the field.
AUTHOR
ABOUT ThE AUTHOR
Max Heinemeyer
Chief Product Officer

Max is a cyber security expert with over a decade of experience in the field, specializing in a wide range of areas such as Penetration Testing, Red-Teaming, SIEM and SOC consulting and hunting Advanced Persistent Threat (APT) groups. At Darktrace, Max is closely involved with Darktrace’s strategic customers & prospects. He works with the R&D team at Darktrace, shaping research into new AI innovations and their various defensive and offensive applications. Max’s insights are regularly featured in international media outlets such as the BBC, Forbes and WIRED. Max holds an MSc from the University of Duisburg-Essen and a BSc from the Cooperative State University Stuttgart in International Business Information Systems.

Book a 1-1 meeting with one of our experts
share this article
USE CASES
No items found.
PRODUCT SPOTLIGHT
No items found.
COre coverage
No items found.

More in this series

No items found.

Blog

Thought Leadership

The State of AI in Cybersecurity: Understanding AI Technologies

Default blog imageDefault blog image
24
Jul 2024

About the State of AI Cybersecurity Report

Darktrace surveyed 1,800 CISOs, security leaders, administrators, and practitioners from industries around the globe. Our research was conducted to understand how the adoption of new AI-powered offensive and defensive cybersecurity technologies are being managed by organizations.

This blog continues the conversation from “The State of AI in Cybersecurity: Unveiling Global Insights from 1,800 Security Practitioners”. This blog will focus on security professionals’ understanding of AI technologies in cybersecurity tools.

To access download the full report, click here.

How familiar are security professionals with supervised machine learning

Just 31% of security professionals report that they are “very familiar” with supervised machine learning.

Many participants admitted unfamiliarity with various AI types. Less than one-third felt "very familiar" with the technologies surveyed: only 31% with supervised machine learning and 28% with natural language processing (NLP).

Most participants were "somewhat" familiar, ranging from 46% for supervised machine learning to 36% for generative adversarial networks (GANs). Executives and those in larger organizations reported the highest familiarity.

Combining "very" and "somewhat" familiar responses, 77% had familiarity with supervised machine learning, 74% generative AI, and 73% NLP. With generative AI getting so much media attention, and NLP being the broader area of AI that encompasses generative AI, these results may indicate that stakeholders are understanding the topic on the basis of buzz, not hands-on work with the technologies.  

If defenders hope to get ahead of attackers, they will need to go beyond supervised learning algorithms trained on known attack patterns and generative AI. Instead, they’ll need to adopt a comprehensive toolkit comprised of multiple, varied AI approaches—including unsupervised algorithms that continuously learn from an organization’s specific data rather than relying on big data generalizations.  

Different types of AI

Different types of AI have different strengths and use cases in cyber security. It’s important to choose the right technique for what you’re trying to achieve.  

Supervised machine learning: Applied more often than any other type of AI in cyber security. Trained on human attack patterns and historical threat intelligence.  

Large language models (LLMs): Applies deep learning models trained on extremely large data sets to understand, summarize, and generate new content. Used in generative AI tools.  

Natural language processing (NLP): Applies computational techniques to process and understand human language.  

Unsupervised machine learning: Continuously learns from raw, unstructured data to identify deviations that represent true anomalies.  

What impact will generative AI have on the cybersecurity field?

More than half of security professionals (57%) believe that generative AI will have a bigger impact on their field over the next few years than other types of AI.

Chart showing the types of AI expected to impact security the most
Figure 1: Chart from Darktrace's State of AI in Cybersecurity Report

Security stakeholders are highly aware of generative AI and LLMs, viewing them as pivotal to the field's future. Generative AI excels at abstracting information, automating tasks, and facilitating human-computer interaction. However, LLMs can "hallucinate" due to training data errors and are vulnerable to prompt injection attacks. Despite improvements in securing LLMs, the best cyber defenses use a mix of AI types for enhanced accuracy and capability.

AI education is crucial as industry expectations for generative AI grow. Leaders and practitioners need to understand where and how to use AI while managing risks. As they learn more, there will be a shift from generative AI to broader AI applications.

Do security professionals fully understand the different types of AI in security products?

Only 26% of security professionals report a full understanding of the different types of AI in use within security products.

Confusion is prevalent in today’s marketplace. Our survey found that only 26% of respondents fully understand the AI types in their security stack, while 31% are unsure or confused by vendor claims. Nearly 65% believe generative AI is mainly used in cybersecurity, though it’s only useful for identifying phishing emails. This highlights a gap between user expectations and vendor delivery, with too much focus on generative AI.

Key findings include:

  • Executives and managers report higher understanding than practitioners.
  • Larger organizations have better understanding due to greater specialization.

As AI evolves, vendors are rapidly introducing new solutions faster than practitioners can learn to use them. There's a strong need for greater vendor transparency and more education for users to maximize the technology's value.

To help ease confusion around AI technologies in cybersecurity, Darktrace has released the CISO’s Guide to Cyber AI. A comprehensive white paper that categorizes the different applications of AI in cybersecurity. Download the White Paper here.  

Do security professionals believe generative AI alone is enough to stop zero-day threats?

No! 86% of survey participants believe generative AI alone is NOT enough to stop zero-day threats

This consensus spans all geographies, organization sizes, and roles, though executives are slightly less likely to agree. Asia-Pacific participants agree more, while U.S. participants agree less.

Despite expecting generative AI to have the most impact, respondents recognize its limited security use cases and its need to work alongside other AI types. This highlights the necessity for vendor transparency and varied AI approaches for effective security across threat prevention, detection, and response.

Stakeholders must understand how AI solutions work to ensure they offer advanced, rather than outdated, threat detection methods. The survey shows awareness that old methods are insufficient.

To access the full report, click here.

Continue reading
About the author
The Darktrace Community

Blog

Inside the SOC

Jupyter Ascending: Darktrace’s Investigation of the Adaptive Jupyter Information Stealer

Default blog imageDefault blog image
18
Jul 2024

What is Malware as a Service (MaaS)?

Malware as a Service (MaaS) is a model where cybercriminals develop and sell or lease malware to other attackers.

This approach allows individuals or groups with limited technical skills to launch sophisticated cyberattacks by purchasing or renting malware tools and services. MaaS is often provided through online marketplaces on the dark web, where sellers offer various types of malware, including ransomware, spyware, and trojans, along with support services such as updates and customer support.

The Growing MaaS Marketplace

The Malware-as-a-Service (MaaS) marketplace is rapidly expanding, with new strains of malware being regularly introduced and attracting waves of new and previous attackers. The low barrier for entry, combined with the subscription-like accessibility and lucrative business model, has made MaaS a prevalent tool for cybercriminals. As a result, MaaS has become a significant concern for organizations and their security teams, necessitating heightened vigilance and advanced defense strategies.

Examples of Malware as a Service

  • Ransomware as a Service (RaaS): Providers offer ransomware kits that allow users to launch ransomware attacks and share the ransom payments with the service provider.
  • Phishing as a Service: Services that provide phishing kits, including templates and email lists, to facilitate phishing campaigns.
  • Botnet as a Service: Renting out botnets to perform distributed denial-of-service (DDoS) attacks or other malicious activities.
  • Information Stealer: Information stealers are a type of malware specifically designed to collect sensitive data from infected systems, such as login credentials, credit card numbers, personal identification information, and other valuable data.

How does information stealer malware work?

Information stealers are an often-discussed type MaaS tool used to harvest personal and proprietary information such as administrative credentials, banking information, and cryptocurrency wallet details. This information is then exfiltrated from target networks via command-and-control (C2) communication, allowing threat actors to monetize the data. Information stealers have also increasingly been used as an initial access vector for high impact breaches including ransomware attacks, employing both double and triple extortion tactics.

After investigating several prominent information stealers in recent years, the Darktrace Threat Research team launched an investigation into indicators of compromise (IoCs) associated with another variant in late 2023, namely the Jupyter information stealer.

What is Jupyter information stealer and how does it work?

The Jupyter information stealer (also known as Yellow Cockatoo, SolarMarker, and Polazert) was first observed in the wild in late 2020. Multiple variants have since become part of the wider threat landscape, however, towards the end of 2023 a new variant was observed. This latest variant achieved greater stealth and updated its delivery method, targeting browser extensions such as Edge, Firefox, and Chrome via search engine optimization (SEO) poisoning and malvertising. This then redirects users to download malicious files that typically impersonate legitimate software, and finally initiates the infection and the attack chain for Jupyter [3][4]. In recently noted cases, users download malicious executables for Jupyter via installer packages created using InnoSetup – an open-source compiler used to create installation packages in the Windows OS.

The latest release of Jupyter reportedly takes advantage of signed digital certificates to add credibility to downloaded executables, further supplementing its already existing tactics, techniques and procedures (TTPs) for detection evasion and sophistication [4]. Jupyter does this while still maintaining features observed in other iterations, such as dropping files into the %TEMP% folder of a system and using PowerShell to decrypt and load content into memory [4]. Another reported feature includes backdoor functionality such as:

  • C2 infrastructure
  • Ability to download and execute malware
  • Execution of PowerShell scripts and commands
  • Injecting shellcode into legitimate windows applications

Darktrace Coverage of Jupyter information stealer

In September 2023, Darktrace’s Threat Research team first investigated Jupyter and discovered multiple IoCs and TTPs associated with the info-stealer across the customer base. Across most investigated networks during this time, Darktrace observed the following activity:

  • HTTP POST requests over destination port 80 to rare external IP addresses (some of these connections were also made via port 8089 and 8090 with no prior hostname lookup).
  • HTTP POST requests specifically to the root directory of a rare external endpoint.
  • Data streams being sent to unusual external endpoints
  • Anomalous PowerShell execution was observed on numerous affected networks.

Taking a further look at the activity patterns detected, Darktrace identified a series of HTTP POST requests within one customer’s environment on December 7, 2023. The HTTP POST requests were made to the root directory of an external IP address, namely 146.70.71[.]135, which had never previously been observed on the network. This IP address was later reported to be malicious and associated with Jupyter (SolarMarker) by open-source intelligence (OSINT) [5].

Device Event Log indicating several connections from the source device to the rare external IP address 146.70.71[.]135 over port 80.
Figure 1: Device Event Log indicating several connections from the source device to the rare external IP address 146.70.71[.]135 over port 80.

This activity triggered the Darktrace / NETWORK model, ‘Anomalous Connection / Posting HTTP to IP Without Hostname’. This model alerts for devices that have been seen posting data out of the network to rare external endpoints without a hostname. Further investigation into the offending device revealed a significant increase in external data transfers around the time Darktrace alerted the activity.

This External Data Transfer graph demonstrates a spike in external data transfer from the internal device indicated at the top of the graph on December 7, 2023, with a time lapse shown of one week prior.
Figure 2: This External Data Transfer graph demonstrates a spike in external data transfer from the internal device indicated at the top of the graph on December 7, 2023, with a time lapse shown of one week prior.

Packet capture (PCAP) analysis of this activity also demonstrates possible external data transfer, with the device observed making a POST request to the root directory of the malicious endpoint, 146.70.71[.]135.

PCAP of a HTTP POST request showing streams of data being sent to the endpoint, 146.70.71[.]135.
Figure 3: PCAP of a HTTP POST request showing streams of data being sent to the endpoint, 146.70.71[.]135.

In other cases investigated by the Darktrace Threat Research team, connections to the rare external endpoint 67.43.235[.]218 were detected on port 8089 and 8090. This endpoint was also linked to Jupyter information stealer by OSINT sources [6].

Darktrace recognized that such suspicious connections represented unusual activity and raised several model alerts on multiple customer environments, including ‘Compromise / Large Number of Suspicious Successful Connections’ and ‘Anomalous Connection / Multiple Connections to New External TCP Port’.

In one instance, a device that was observed performing many suspicious connections to 67.43.235[.]218 was later observed making suspicious HTTP POST connections to other malicious IP addresses. This included 2.58.14[.]246, 91.206.178[.]109, and 78.135.73[.]176, all of which had been linked to Jupyter information stealer by OSINT sources [7] [8] [9].

Darktrace further observed activity likely indicative of data streams being exfiltrated to Jupyter information stealer C2 endpoints.

Graph displaying the significant increase in the number of HTTP POST requests with No Get made by an affected device, likely indicative of Jupyter information stealer C2 activity.
Figure 4: Graph displaying the significant increase in the number of HTTP POST requests with No Get made by an affected device, likely indicative of Jupyter information stealer C2 activity.

In several cases, Darktrace was able to leverage customer integrations with other security vendors to add additional context to its own model alerts. For example, numerous customers who had integrated Darktrace with Microsoft Defender received security integration alerts that enriched Darktrace’s model alerts with additional intelligence, linking suspicious activity to Jupyter information stealer actors.

The security integration model alerts ‘Security Integration / Low Severity Integration Detection’ and (right image) ‘Security Integration / High Severity Integration Detection’, linking suspicious activity observed by Darktrace with Jupyter information stealer (SolarMarker).
Figure 5: The security integration model alerts ‘Security Integration / Low Severity Integration Detection’ and (right image) ‘Security Integration / High Severity Integration Detection’, linking suspicious activity observed by Darktrace with Jupyter information stealer (SolarMarker).

Conclusion

The MaaS ecosystems continue to dominate the current threat landscape and the increasing sophistication of MaaS variants, featuring advanced defense evasion techniques, poses significant risks once deployed on target networks.

Leveraging anomaly-based detections is crucial for staying ahead of evolving MaaS threats like Jupyter information stealer. By adopting AI-driven security tools like Darktrace / NETWORK, organizations can more quickly identify and effectively detect and respond to potential threats as soon as they emerge. This is especially crucial given the rise of stealthy information stealing malware strains like Jupyter which cannot only harvest and steal sensitive data, but also serve as a gateway to potentially disruptive ransomware attacks.

Credit to Nahisha Nobregas (Senior Cyber Analyst), Vivek Rajan (Cyber Analyst)

References

1.     https://www.paloaltonetworks.com/cyberpedia/what-is-multi-extortion-ransomware

2.     https://flashpoint.io/blog/evolution-stealer-malware/

3.     https://blogs.vmware.com/security/2023/11/jupyter-rising-an-update-on-jupyter-infostealer.html

4.     https://www.morphisec.com/hubfs/eBooks_and_Whitepapers/Jupyter%20Infostealer%20WEB.pdf

5.     https://www.virustotal.com/gui/ip-address/146.70.71.135

6.     https://www.virustotal.com/gui/ip-address/67.43.235.218/community

7.     https://www.virustotal.com/gui/ip-address/2.58.14.246/community

8.     https://www.virustotal.com/gui/ip-address/91.206.178.109/community

9.     https://www.virustotal.com/gui/ip-address/78.135.73.176/community

Appendices

Darktrace Model Detections

  • Anomalous Connection / Posting HTTP to IP Without Hostname
  • Compromise / HTTP Beaconing to Rare Destination
  • Unusual Activity / Unusual External Data to New Endpoints
  • Compromise / Slow Beaconing Activity To External Rare
  • Compromise / Large Number of Suspicious Successful Connections
  • Anomalous Connection / Multiple Failed Connections to Rare Endpoint
  • Compromise / Excessive Posts to Root
  • Compromise / Sustained SSL or HTTP Increase
  • Security Integration / High Severity Integration Detection
  • Security Integration / Low Severity Integration Detection
  • Anomalous Connection / Multiple Connections to New External TCP Port
  • Unusual Activity / Unusual External Data Transfer

AI Analyst Incidents:

  • Unusual Repeated Connections
  • Possible HTTP Command and Control to Multiple Endpoints
  • Possible HTTP Command and Control

List of IoCs

Indicators – Type – Description

146.70.71[.]135

IP Address

Jupyter info-stealer C2 Endpoint

91.206.178[.]109

IP Address

Jupyter info-stealer C2 Endpoint

146.70.92[.]153

IP Address

Jupyter info-stealer C2 Endpoint

2.58.14[.]246

IP Address

Jupyter info-stealer C2 Endpoint

78.135.73[.]176

IP Address

Jupyter info-stealer C2 Endpoint

217.138.215[.]105

IP Address

Jupyter info-stealer C2 Endpoint

185.243.115[.]88

IP Address

Jupyter info-stealer C2 Endpoint

146.70.80[.]66

IP Address

Jupyter info-stealer C2 Endpoint

23.29.115[.]186

IP Address

Jupyter info-stealer C2 Endpoint

67.43.235[.]218

IP Address

Jupyter info-stealer C2 Endpoint

217.138.215[.]85

IP Address

Jupyter info-stealer C2 Endpoint

193.29.104[.]25

IP Address

Jupyter info-stealer C2 Endpoint

Continue reading
About the author
Nahisha Nobregas
SOC Analyst
Our ai. Your data.

Elevate your cyber defenses with Darktrace AI

Start your free trial
Darktrace AI protecting a business from cyber threats.