October 30, 2023

Exploring AI Threats: Package Hallucination Attacks

Learn how malicious actors exploit errors in generative AI tools to launch package hallucination attacks. Read how Darktrace products detect and prevent these threats!
Inside the SOC
Darktrace cyber analysts are world-class experts in threat intelligence, threat hunting and incident response, and provide 24/7 SOC support to thousands of Darktrace customers around the globe. Inside the SOC is exclusively authored by these experts, providing analysis of cyber incidents and threat trends, based on real-world experience in the field.
Written by
Charlotte Thompson
Cyber Analyst
Written by
Tiana Kelly
Senior Cyber Analyst & Team Lead

AI tools open doors for threat actors

On November 30, 2022, OpenAI, an artificial intelligence (AI) research and development company, launched ChatGPT, a free conversational language generation model. The launch was the culmination of development ongoing since 2018 and represented the latest innovation in the generative AI boom, making generative AI tools accessible to the general population for the first time.

ChatGPT is estimated to currently have at least 100 million users, and in August 2023 the site reached 1.43 billion visits [1]. Darktrace data indicated that, as of March 2023, 74% of active customer environments have employees using generative AI tools in the workplace [2].

However, with new tools come new opportunities for threat actors to exploit and use them maliciously, expanding their arsenal.

Much consideration has been given to mitigating the impacts of the increased linguistic complexity in social engineering and phishing attacks resulting from generative AI tool use, with Darktrace observing a 135% increase in ‘novel social engineering attacks’ across thousands of active Darktrace/Email™ customers from January to February 2023, corresponding with the widespread adoption of ChatGPT and its peers [3].

Less consideration, however, has been given to impacts stemming from errors intrinsic to generative AI tools. One such error is the AI hallucination.

What is an AI hallucination?

AI “hallucination” is a term which refers to instances where the predictive nature of generative AI and large language models (LLMs) leads an AI model to give an unexpected or factually incorrect response which does not align with its machine learning training data [4]. This differs from regular and intended behavior for an AI model, which should provide a response based on the data it was trained upon.

Why are AI hallucinations a problem?

Although the term suggests a rare phenomenon, hallucinations are more common than might be expected, as the AI models used in LLMs are merely predictive and focus on the most probable text or outcome rather than factual accuracy.

Given the widespread use of generative AI tools in the workplace, employees are becoming significantly more likely to encounter an AI hallucination. Furthermore, if these fabricated responses are taken at face value, they could cause significant issues for an organization.

Use of generative AI in software development

Software developers may use generative AI for recommendations on how to optimize their scripts or code, or to find packages to import into their code for various uses. They may also ask LLMs how to solve a specific problem, and the answer will often point to a third-party package. Packages recommended by generative AI tools may be hallucinations: the package may never have been published or, more accurately, may not have been published before the cut-off date of the model's training data. If a hallucination leads to a non-existent package being suggested repeatedly, and a developer copies the code snippet wholesale, this may leave the developer's environment vulnerable to attack.

Research conducted by Vulcan revealed the prevalence of AI hallucinations when ChatGPT is asked questions related to coding. After sourcing a sample of commonly asked coding questions from Stack Overflow, a question-and-answer website for programmers, researchers queried ChatGPT (in the context of Node.js and Python) and reviewed its responses. In 20% of the responses provided by ChatGPT pertaining to Node.js at least one un-published package was included, whilst the figure sat at around 35% for Python [4].

Hallucinations can be unpredictable, but would-be attackers are able to find packages to create by asking generative AI tools generic questions and checking whether the suggested packages exist already. As such, attacks using this vector are unlikely to target specific organizations, instead posing more of a widespread threat to users of generative AI tools.

Malicious packages as attack vectors

Although AI hallucinations can be unpredictable, and responses given by generative AI tools may not always be consistent, malicious actors are able to discover AI hallucinations by adopting the approach used by Vulcan. This allows hallucinated packages to be used as attack vectors. Once a malicious actor has discovered a hallucination of an un-published package, they are able to create a package with the same name and include a malicious payload, before publishing it. This is known as a malicious package.

Malicious packages could also be recommended by generative AI tools in the form of pre-existing packages. A user may be recommended a package that had previously been confirmed to contain malicious content, or a package that is no longer maintained and, therefore, is more vulnerable to hijack by malicious actors.

In such scenarios it is not necessary to manipulate the training data (data poisoning) to achieve the desired outcome for the malicious actor, thus a complex and time-consuming attack phase can easily be bypassed.

An unsuspecting software developer may incorporate a malicious package into their code, rendering it harmful. Deployment of this code could then result in compromise and escalation into a full-blown cyber-attack.

Figure 1: Flow diagram depicting the initial stages of an AI Package Hallucination Attack.

For providers of Software-as-a-Service (SaaS) products, this attack vector may represent an even greater risk. Such organizations may have a higher proportion of employed software developers than other organizations of comparable size. A threat actor, therefore, could utilize this attack vector as part of a supply chain attack, whereby a malicious payload becomes incorporated into trusted software and is then distributed to multiple customers. This type of attack could have severe consequences including data loss, the downtime of critical systems, and reputational damage.

How could Darktrace detect an AI Package Hallucination Attack?

In June 2023, Darktrace introduced a range of DETECT™ and RESPOND™ models designed to identify the use of generative AI tools within customer environments, and to autonomously perform inhibitive actions in response to such detections. These models trigger based on connections to endpoints associated with generative AI tools. As such, Darktrace’s detection of an AI Package Hallucination Attack would likely begin with the breaching of one of the following DETECT models:

  • Compliance / Anomalous Upload to Generative AI
  • Compliance / Beaconing to Rare Generative AI and Generative AI
  • Compliance / Generative AI

Should generative AI tool use not be permitted by an organization, the Darktrace RESPOND model ‘Antigena / Network / Compliance / Antigena Generative AI Block’ can be activated to autonomously block connections to endpoints associated with generative AI, thus preventing an AI Package Hallucination attack before it can take hold.

Once a malicious package has been recommended, it may be downloaded from GitHub, a platform and cloud-based service used to store and manage code. Darktrace DETECT is able to identify when a device has performed a download from an open-source repository such as GitHub using the following models:

  • Device / Anomalous GitHub Download
  • Device / Anomalous Script Download Followed By Additional Packages

Whatever goal the malicious package has been designed to fulfil will determine the next stages of the attack. Due to their highly flexible nature, AI package hallucinations could be used as an attack vector to deliver a large variety of different malware types.

As GitHub is a commonly used service by software developers and IT professionals alike, traditional security tools may not alert customer security teams to such GitHub downloads, meaning malicious downloads may go undetected. Darktrace’s anomaly-based approach to threat detection, however, enables it to recognize subtle deviations in a device’s pre-established pattern of life which may be indicative of an emerging attack.
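The general idea of judging a connection against a device's own history, instead of against a list of known-bad destinations, can be illustrated with a toy example. This is only a sketch of the concept and is not how Darktrace's models work: the class, method names and device names are hypothetical, and a real system would weigh many more signals than whether a host has been seen before.

```python
from collections import defaultdict


class DeviceBaseline:
    """Toy per-device record of hosts contacted during a learning period."""

    def __init__(self):
        self._seen = defaultdict(set)

    def learn(self, device: str, host: str) -> None:
        """Record a connection observed during the learning period."""
        self._seen[device].add(host)

    def is_unusual(self, device: str, host: str) -> bool:
        """Return True if this device has not contacted this host before."""
        return host not in self._seen[device]


baseline = DeviceBaseline()

# Learning period: a developer laptop uses GitHub routinely, a printer does not.
baseline.learn("laptop-01", "github.com")
baseline.learn("printer-07", "print-server.internal")

# The same destination is routine for one device and a deviation for another.
print(baseline.is_unusual("laptop-01", "github.com"))
print(baseline.is_unusual("printer-07", "github.com"))
```

The point of the sketch is that a GitHub download is not suspicious in itself; what matters is whether it fits the behavior previously seen from that particular device.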

Subsequent anomalous activity representing the possible progression of the kill chain as part of an AI Package Hallucination Attack could then trigger an Enhanced Monitoring model. Enhanced Monitoring models are high-fidelity indicators of potential malicious activity that are investigated by the Darktrace analyst team as part of the Proactive Threat Notification (PTN) service offered by the Darktrace Security Operations Center (SOC).

Conclusion

Employees are often considered the first line of defense in cyber security; this is particularly true in the face of an AI Package Hallucination Attack.

As generative AI becomes more accessible and an increasingly prevalent tool in an attacker’s toolbox, organizations will benefit from implementing company-wide policies that define expectations surrounding the use of such tools. For example, it is simple, yet critical, for employees to fact check responses provided to them by generative AI tools. All packages recommended by generative AI should also be checked against non-generated data from either external third-party or internal sources. It is also good practice to exercise caution when downloading packages with very few downloads, as this could indicate that the package is untrustworthy or malicious.
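Part of this checking can be scripted. The sketch below queries the public PyPI JSON endpoint (`https://pypi.org/pypi/<name>/json`), which returns a 404 for packages that have not been published. It makes several assumptions: the thresholds are arbitrary, the function names are hypothetical, and package age and release count stand in as rough proxies for popularity, because download counts are not assumed to be available from this endpoint. The demonstration uses a stub in place of the network call.

```python
import json
import urllib.error
import urllib.request
from datetime import datetime, timezone
from typing import Optional

PYPI_JSON_URL = "https://pypi.org/pypi/{name}/json"


def fetch_pypi_metadata(name: str) -> Optional[dict]:
    """Fetch package metadata from PyPI; return None if it is not published."""
    try:
        url = PYPI_JSON_URL.format(name=name)
        with urllib.request.urlopen(url, timeout=10) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise


def vet_package(name, fetch=fetch_pypi_metadata, min_age_days=90,
                min_releases=3, now=None):
    """Return a list of warnings for a recommended package (empty if none)."""
    meta = fetch(name)
    if meta is None:
        return [f"{name}: not published on PyPI (possible hallucination)"]

    now = now or datetime.now(timezone.utc)
    releases = meta.get("releases", {})
    upload_times = [
        datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
        for files in releases.values()
        for f in files
    ]

    warnings = []
    if not upload_times:
        warnings.append(f"{name}: no released files found")
    else:
        age_days = (now - min(upload_times)).days
        if age_days < min_age_days:
            warnings.append(f"{name}: first upload only {age_days} days ago")
    if len(releases) < min_releases:
        warnings.append(f"{name}: only {len(releases)} release(s)")
    return warnings


# Demonstration with a stub fetcher so that no network access is needed.
def stub_fetch(name):
    if name == "ghost-package":  # hypothetical, unpublished name
        return None
    return {"releases": {"0.1": [
        {"upload_time_iso_8601": "2023-10-01T00:00:00.000000Z"}]}}


check_time = datetime(2023, 10, 30, tzinfo=timezone.utc)
for candidate in ("ghost-package", "brand-new-package"):
    print(vet_package(candidate, fetch=stub_fetch, now=check_time))
```

A clean result does not prove a package is safe, since an attacker may already have registered the hallucinated name. A script like this is a first filter ahead of the manual review recommended above.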

As of September 2023, ChatGPT Plus and Enterprise users were able to use the tool to browse the internet, expanding the data ChatGPT can access beyond the previous training data cut-off of September 2021 [5]. This feature will be expanded to all users soon [6]. ChatGPT providing up-to-date responses could prompt the evolution of this attack vector, allowing attackers to publish malicious packages which could subsequently be recommended by ChatGPT.

It is inevitable that a greater embrace of AI tools in the workplace will be seen in the coming years as the AI technology advances and existing tools become less novel and more familiar. By fighting fire with fire, using AI technology to identify AI usage, Darktrace is uniquely placed to detect and take preventative action against malicious actors capitalizing on the AI boom.

Credit to Charlotte Thompson, Cyber Analyst, and Tiana Kelly, Analyst Team Lead, London

References

[1] https://seo.ai/blog/chatgpt-user-statistics-facts

[2] https://darktrace.com/news/darktrace-addresses-generative-ai-concerns

[3] https://darktrace.com/news/darktrace-email-defends-organizations-against-evolving-cyber-threat-landscape

[4] https://vulcan.io/blog/ai-hallucinations-package-risk?nab=1&utm_referrer=https%3A%2F%2Fwww.google.com%2F

[5] https://twitter.com/OpenAI/status/1707077710047216095

[6] https://www.reuters.com/technology/openai-says-chatgpt-can-now-browse-internet-2023-09-27/


March 24, 2026

Darktrace Unites Human Behavior and Threat Detection Across Email, Slack, Teams, and Zoom


The communication attack surface is expanding

Modern attackers no longer focus solely on inboxes; they target people and the productivity systems where work actually happens. Meanwhile, the boundary between internal and external usage of tools is becoming blurrier every day, turning the entire workplace into the attack surface. In 2025, identity compromise emerged as the single most consistent threat across the global threat landscape, as observed by Darktrace research across our entire customer base. Over 70% of incidents in the US involved SaaS/M365 account compromise and phishing or email-based social engineering, making credential abuse the single most effective initial access vector.

Despite this upward trend, investment in existing security awareness training (SAT) isn’t moving the needle on reducing risk. 84% of organizations still measure success through completion rates [1], even though completion of standard training correlates with less than 2% real improvement in risky behavior [2]. By prioritizing completion, organizations reward time spent rather than meaningful engagement, yet time in training doesn’t translate to retention or real-world decision-making. This compliance-first approach has left the workforce unprepared for the threats they actually face.

At the same time, attacks have evolved. Highly personalized, AI-generated campaigns now move fluidly across email, Slack, Teams, Zoom, and beyond, blending channels and even targeting systems directly through techniques like prompt injection. This new reality demands a different approach: one that treats people and the tools they use as a single ecosystem, where behavior and detection continuously inform and strengthen each other.

Only an adaptive communication security system can keep pace with the speed, creativity, and cross-channel nature of today’s threats.

Ushering in the adaptive era of workplace security

With this release, Darktrace brings together our new behavior-driven training solution with email detection, cross-channel visibility, and platform-level insights. Powered by Self-Learning AI, it delivers protection across both people and the communication tools they rely on every day, including email, Slack, Teams, and Zoom.

Each component learns from the others – training adapts to real user behavior, detection evolves across channels, and response is continuously refined – creating a powerful feedback loop that strengthens resilience and improves accuracy against today’s AI-driven threats.

Introducing: Unified training and email security for a self-improving email defense

Our brand new product, Darktrace / Adaptive Human Defense, closes the gap between human behavior and email security to continuously strengthen both people and defenses. Each user receives personalized training that adapts to their own inbox activity and skill level, with learning delivered directly within the flow of their day-to-day email interactions.

By learning from each user’s interactions with security training, it adapts security responses, creating a closed-loop system where training reinforces detection and detection informs training. Let’s look at some of the benefits.

  • Reduce successful phishing at the source with contextual Just in Time coaching: Contextual coaching appears directly in real email threads the moment risky behavior is detected, so habits change where mistakes actually happen. Configurable triggers and group policies target the right users, reducing repeated errors and administrative overhead.
  • Adaptive phishing simulations that progress automatically with each user: Embedded simulations vary in their degree of realism, from generic phishing to generative AI-enabled spear phishing. Users progress through the difficulty levels based on their performance to give an accurate picture of their phishing preparedness.  
  • Native email security integration turns human behavior into quantified risk: The native email security integration allows engagement, links clicked, and question success signals to flow back into / EMAIL recipes and models, so detection and response adapt automatically as users learn.  
  • Actionable risk and trend analytics beyond completion rates: Analytics that surface repeat offenders, high-value targets, and measurable exposure, moving beyond completion metrics to give leaders actionable insights tied to real behavior.

Learn more about / Adaptive Human Defense in the product solution brief.

Industry-first cross-channel full-message analysis for email, Slack, Teams, and Zoom

Darktrace now brings full-message analysis to Email, Slack, Teams, Zoom, and even generative AI prompts. The same leading behavioral analysis from EMAIL extends to every message, tracing intent, tone, relationships, and conversation flow across all communication activity for a complete understanding of every user interaction.

By correlating messaging and collaboration activity with email and account environments, cross-channel analysis reveals multi-domain attack paths and follows both users and threats as a single, continuous narrative – delivering better context to improve detection across the entire organization.

  • Eliminate cross-channel blind spots: Detect phishing, malware, account takeovers, and conversational manipulation across email and collaboration platforms, so attackers can’t exploit Slack, Teams, or Zoom as a new entry point. Unified behavioral analysis gives security teams a single, coherent view, with no fragmented, channel-specific gaps.
  • Spot generative AI prompt injection attacks before they manipulate assistants: Dedicated models surface threats targeting corporate AI assistants – like ShadowLeak and Hashjack – before they can silently manipulate workflows, reducing risk before static filters catch up.

Learn more about Darktrace’s messaging security offering in the product solution brief.

Industry-first DMARC with bi-directional ASM and email security integration

Darktrace transforms domain protection by linking DMARC, attack surface intelligence, and email security into a single, continuously evolving workflow. Instead of treating domain authentication and exposure as separate tasks, this unified approach shows not just where domains are vulnerable, but how attackers are actively exploiting them.

  • Fix authentication weaknesses faster: SPF, DKIM, DMARC configurations, and external exposure data are analyzed together, giving teams clear guidance to correct weaknesses before they can be abused. Deep bidirectional integration with attack surface intelligence reduces impersonation risk at the source.
  • Accelerate email investigations: DMARC context is embedded directly into email workflows, enriching triage with authentication posture, internal/external sender lists, and seamless pivots between email and domain intelligence for faster, more accurate investigations.

Committed to innovation

These updates are part of a broader Darktrace release, which also includes:

Join our Live Launch Event on April 14, 2026.

Join us for an exclusive announcement event where Darktrace, the leader in AI-native cybersecurity, will be announcing our latest innovations, including a demo of our new product / Adaptive Human Defense, an exclusive conversation with a Darktrace customer, and a deep dive into the Darktrace ActiveAI Security Portal.

Register here.

References

[1] 84% of organizations still measure security awareness training success through completion rates, a vanity metric with no correlation to behavior change. (Source:  NIST Awareness Effectiveness Study, Forrester 2025)

[2] 'Limited benefit from embedded phishing training. Using randomized controlled trials and statistical modeling, embedded training provides a statistically-significant reduction in average failure rate, but of only 2%.' Ho, G., Mirian, A., Luo, E., Tong, K., Lee, E., Liu, L., Longhurst, C. A., Dameff, C., Savage, S., & Voelker, G. M. (2025). Understanding the Efficacy of Phishing Training in Practice. Proceedings of the 2025 IEEE Symposium on Security and Privacy.

About the author
Carlos Gray
Senior Product Marketing Manager, Email

March 24, 2026

Advancing OT Security with Architecture Visibility, Operational Reporting, and Industrial Context


The challenge of operational understanding in complex OT environments

Most industrial organizations today already have some level of asset visibility. The bigger challenge is maintaining a trusted, shared understanding of the environment as it evolves. OT teams still frequently rely on static diagrams, spreadsheets, and manually maintained documentation because these are often the only artifacts trusted by auditors, leadership, and engineering teams. However, these references quickly become outdated as environments change.

At the same time, compliance expectations continue to increase, particularly around IEC-62443 aligned programs. Producing defensible security evidence often requires teams to manually assemble reports across multiple tools while still debating asset inventories and classifications. This creates operational overhead and reduces confidence during audits, risk reviews, and incident response situations.

Advancing operational OT security with Darktrace / OT 7.1

Darktrace / OT's latest updates focus on helping industrial organizations close this operational gap by strengthening how OT security platforms support real workflows. This release enhances Operational Overview with architecture visibility, improves how industrial assets are represented, and introduces structured reporting capabilities aligned to governance needs.

Together, these improvements help organizations maintain a more reliable operational picture of their environments while reducing manual effort associated with documentation, reporting, and asset validation.

Darktrace OT updates 2026

Native OT architecture visibility inside Operational Overview

Understanding how industrial environments are structured is critical during investigations and risk reviews, yet architecture diagrams are typically maintained outside security platforms and quickly fall out of sync with operational changes. This disconnect makes it harder for OT, IT, and security teams to maintain a shared understanding of their environments when incidents occur.

Darktrace / OT introduces native OT architecture diagrams directly within Operational Overview, allowing teams to maintain a live representation of how OT assets and systems relate to each other inside the same platform used for monitoring and investigations.

These updates help organizations:

  • Maintain a shared architectural understanding across OT, IT, and security teams
  • Improve investigation context by understanding how systems relate operationally
  • Reduce reliance on static diagrams that quickly become outdated

Improving OT governance with operational asset and compliance reporting

Accurate reporting remains a major operational challenge for industrial organizations, particularly when security posture must be demonstrated to auditors, regulators, and leadership. Many OT teams still rely on manual screenshots, spreadsheets, or fragmented exports to show asset inventories and compliance alignment.

Darktrace / OT introduces structured OT asset reporting and IEC-62443-3-3 compliance reporting directly from Operational Overview. These capabilities allow organizations to generate consistent, repeatable outputs based on continuously observed OT environments rather than manually assembled documentation.

These updates help customers:

  • Reduce manual compliance effort through automated IEC-62443 reporting aligned to live OT data
  • Support governance workflows with structured OT asset and architecture reporting
  • Improve audit readiness with consistent reporting aligned to operational security posture

Expanding industrial context through improved asset representation and protocol coverage

Industrial environments rely on diverse technologies spanning manufacturing systems, power and utilities infrastructure, healthcare devices, and Industrial IoT deployments. Maintaining strong visibility across these environments requires both accurate device representation and deeper protocol understanding.

Darktrace / OT strengthens industrial context through expanded ICS and IoMT device classification alongside broader industrial protocol coverage. These improvements help organizations better understand specialized devices and communications across sectors such as manufacturing, energy, healthcare, and Industrial IoT.

These enhancements enable organizations to:

  • Improve visibility into specialized ICS, IoMT, and industrial infrastructure devices
  • Strengthen monitoring across sector-specific industrial communications in manufacturing, utilities, and IIoT environments
  • Increase confidence in detection across complex and evolving industrial technology estates

Supporting practical OT security outcomes for industrial organizations

Darktrace / OT continues our focus on delivering capabilities that help industrial organizations operationalize security rather than simply deploy tools. By improving architecture understanding, strengthening asset representation, and supporting governance reporting, this release helps organizations manage OT security with greater confidence.

As industrial environments continue to evolve, organizations need more than visibility. They need the ability to maintain trusted operational understanding and demonstrate security readiness without increasing operational friction. This release reflects Darktrace’s continued commitment to supporting the priorities that matter most in OT: safety, uptime, and resilience.

About the author
Pallavi Singh
Product Marketing Manager, OT Security & Compliance