October 30, 2023

Exploring AI Threats: Package Hallucination Attacks

Learn how malicious actors exploit errors in generative AI tools to launch package attacks. Read how Darktrace products detect and prevent these threats!
Inside the SOC
Darktrace cyber analysts are world-class experts in threat intelligence, threat hunting and incident response, and provide 24/7 SOC support to thousands of Darktrace customers around the globe. Inside the SOC is exclusively authored by these experts, providing analysis of cyber incidents and threat trends, based on real-world experience in the field.
Written by
Charlotte Thompson
Cyber Analyst
Written by
Tiana Kelly
Senior Cyber Analyst & Team Lead

AI tools open doors for threat actors

On November 30, 2022, OpenAI, an artificial intelligence (AI) research and development company, launched ChatGPT, a free conversational language generation model. The launch was the culmination of development ongoing since 2018 and the latest innovation in the generative AI boom, making generative AI tools accessible to the general population for the first time.

ChatGPT is estimated to currently have at least 100 million users, and in August 2023 the site reached 1.43 billion visits [1]. Darktrace data indicated that, as of March 2023, 74% of active customer environments have employees using generative AI tools in the workplace [2].

However, with new tools come new opportunities for threat actors to exploit and use them maliciously, expanding their arsenal.

Much consideration has been given to mitigating the increased linguistic complexity that generative AI tools bring to social engineering and phishing attacks. Darktrace observed a 135% increase in ‘novel social engineering attacks’ across thousands of active Darktrace/Email™ customers from January to February 2023, corresponding with the widespread adoption of ChatGPT and its peers [3].

Less overall consideration, however, has been given to impacts stemming from errors intrinsic to generative AI tools. One of these errors is AI hallucinations.

What is an AI hallucination?

AI “hallucination” is a term for cases in which the predictive nature of generative AI and large language models (LLMs) leads a model to give an unexpected or factually incorrect response that does not align with its machine learning training data [4]. This differs from the regular and intended behavior of an AI model, which should provide a response based on the data it was trained upon.

Why are AI hallucinations a problem?

Although the term suggests a rare phenomenon, hallucinations are more common than it implies, because the AI models behind LLMs are purely predictive: they produce the most probable text or outcome rather than checking factual accuracy.

Given the widespread use of generative AI tools in the workplace, employees are becoming significantly more likely to encounter an AI hallucination. Furthermore, if these fabricated responses are taken at face value, they could cause significant issues for an organization.

Use of generative AI in software development

Software developers may use generative AI for recommendations on how to optimize their scripts or code, or to find packages to import into their code for various uses. When developers ask LLMs how to solve a specific problem, the answer will likely point to a third-party package. Packages recommended by generative AI tools could be AI hallucinations: the packages may never have been published or, more accurately, may not have been published before the model's training data cut-off. If these hallucinations result in a non-existent package being commonly suggested, and the developer copies the code snippet wholesale, this may leave the developer and their organization vulnerable to attack.

Research conducted by Vulcan revealed the prevalence of AI hallucinations when ChatGPT is asked questions related to coding. After sourcing a sample of commonly asked coding questions from Stack Overflow, a question-and-answer website for programmers, researchers queried ChatGPT (in the context of Node.js and Python) and reviewed its responses. In 20% of the responses provided by ChatGPT pertaining to Node.js at least one un-published package was included, whilst the figure sat at around 35% for Python [4].

Hallucinations can be unpredictable, but would-be attackers are able to find packages to create by asking generative AI tools generic questions and checking whether the suggested packages exist already. As such, attacks using this vector are unlikely to target specific organizations, instead posing more of a widespread threat to users of generative AI tools.

Malicious packages as attack vectors

Although AI hallucinations can be unpredictable, and responses given by generative AI tools may not always be consistent, malicious actors are able to discover AI hallucinations by adopting the approach used by Vulcan. This allows hallucinated packages to be used as attack vectors. Once a malicious actor has discovered a hallucination of an un-published package, they are able to create a package with the same name and include a malicious payload, before publishing it. This is known as a malicious package.

Malicious packages could also be recommended by generative AI tools in the form of pre-existing packages. A user may be recommended a package that had previously been confirmed to contain malicious content, or a package that is no longer maintained and, therefore, is more vulnerable to hijack by malicious actors.

In such scenarios it is not necessary to manipulate the training data (data poisoning) to achieve the desired outcome for the malicious actor, thus a complex and time-consuming attack phase can easily be bypassed.

An unsuspecting software developer may incorporate a malicious package into their code, rendering it harmful. Deployment of this code could then result in compromise and escalation into a full-blown cyber-attack.

Figure 1: Flow diagram depicting the initial stages of an AI Package Hallucination Attack.

For providers of Software-as-a-Service (SaaS) products, this attack vector may represent an even greater risk. Such organizations may have a higher proportion of employed software developers than other organizations of comparable size. A threat actor, therefore, could utilize this attack vector as part of a supply chain attack, whereby a malicious payload becomes incorporated into trusted software and is then distributed to multiple customers. This type of attack could have severe consequences including data loss, the downtime of critical systems, and reputational damage.

How could Darktrace detect an AI Package Hallucination Attack?

In June 2023, Darktrace introduced a range of DETECT™ and RESPOND™ models designed to identify the use of generative AI tools within customer environments, and to autonomously perform inhibitive actions in response to such detections. These models trigger on connections to endpoints associated with generative AI tools. As such, Darktrace’s detection of an AI Package Hallucination Attack would likely begin with the breaching of one of the following DETECT models:

  • Compliance / Anomalous Upload to Generative AI
  • Compliance / Beaconing to Rare Generative AI and Generative AI
  • Compliance / Generative AI

Should generative AI tool use not be permitted by an organization, the Darktrace RESPOND model ‘Antigena / Network / Compliance / Antigena Generative AI Block’ can be activated to autonomously block connections to endpoints associated with generative AI, thus preventing an AI Package Hallucination attack before it can take hold.

Once a malicious package has been recommended, it may be downloaded from GitHub, a platform and cloud-based service used to store and manage code. Darktrace DETECT is able to identify when a device has performed a download from an open-source repository such as GitHub using the following models:

  • Device / Anomalous GitHub Download
  • Device / Anomalous Script Download Followed By Additional Packages

Whatever goal the malicious package has been designed to fulfil will determine the next stages of the attack. Due to their highly flexible nature, AI package hallucinations could be used as an attack vector to deliver a large variety of different malware types.

As GitHub is a service commonly used by software developers and IT professionals alike, traditional security tools may not alert customer security teams to such GitHub downloads, meaning malicious downloads may go undetected. Darktrace’s anomaly-based approach to threat detection, however, enables it to recognize subtle deviations in a device’s pre-established pattern of life which may be indicative of an emerging attack.

Subsequent anomalous activity representing the possible progression of the kill chain as part of an AI Package Hallucination Attack could then trigger an Enhanced Monitoring model. Enhanced Monitoring models are high-fidelity indicators of potential malicious activity that are investigated by the Darktrace analyst team as part of the Proactive Threat Notification (PTN) service offered by the Darktrace Security Operation Center (SOC).

Conclusion

Employees are often considered the first line of defense in cyber security; this is particularly true in the face of an AI Package Hallucination Attack.

As generative AI becomes more accessible and an increasingly prevalent tool in an attacker’s toolbox, organizations will benefit from implementing company-wide policies to define expectations surrounding the use of such tools. It is simple, yet critical, for example, for employees to fact check responses provided to them by generative AI tools. All packages recommended by generative AI should also be checked against non-generated data from external third-party or internal sources. It is also good practice to exercise caution when downloading packages with very few downloads, as this could indicate that the package is untrustworthy or malicious.
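As one illustration of checking a recommended package against a non-generated source, the sketch below looks a name up in the PyPI JSON API (`https://pypi.org/pypi/<name>/json`), which returns a 404 for projects that have not been published. The function names, the 90-day threshold, and the wording of the verdicts are assumptions made for this example, not an established policy. The sketch does not look at download counts; it uses the age of the first upload as a rough stand-in signal. A package that passes is not thereby proven safe.

```python
import json
import urllib.error
import urllib.request
from datetime import datetime, timezone


def fetch_pypi_metadata(name: str):
    """Return PyPI JSON metadata for `name`, or None if the project is not published."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise


def assess_package(name, fetch=fetch_pypi_metadata, min_age_days=90, now=None):
    """Give a coarse verdict on a recommended package name (thresholds are assumptions)."""
    now = now or datetime.now(timezone.utc)
    meta = fetch(name)
    if meta is None:
        return "not published: possible hallucination, do not install"
    uploads = [
        # Replace the trailing "Z" so older Python versions can parse the timestamp
        datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
        for files in meta.get("releases", {}).values()
        for f in files
    ]
    if not uploads:
        return "published but no release files: review manually"
    age_days = (now - min(uploads)).days
    if age_days < min_age_days:
        return f"first upload only {age_days} days old: review manually"
    return "established project: still review before use"


# Offline demonstration with a stubbed lookup, so no network access is needed.
def fake_fetch(name):
    if name == "fastjsonx":  # hypothetical name, treated here as unpublished
        return None
    return {"releases": {"1.0": [{"upload_time_iso_8601": "2015-01-01T00:00:00.000000Z"}]}}


print(assess_package("fastjsonx", fetch=fake_fetch))
print(assess_package("some-old-package", fetch=fake_fetch))
```

Passing the lookup function in as a parameter keeps the check testable offline and makes it straightforward to point at an internal package mirror instead of the public index.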

As of September 2023, ChatGPT Plus and Enterprise users were able to use the tool to browse the internet, expanding the data ChatGPT can access beyond the previous training data cut-off of September 2021 [5]. This feature will be expanded to all users soon [6]. ChatGPT providing up-to-date responses could prompt the evolution of this attack vector, allowing attackers to publish malicious packages which could subsequently be recommended by ChatGPT.

It is inevitable that a greater embrace of AI tools in the workplace will be seen in the coming years as the AI technology advances and existing tools become less novel and more familiar. By fighting fire with fire, using AI technology to identify AI usage, Darktrace is uniquely placed to detect and take preventative action against malicious actors capitalizing on the AI boom.

Credit to Charlotte Thompson, Cyber Analyst, and Tiana Kelly, Analyst Team Lead, London

References

[1] https://seo.ai/blog/chatgpt-user-statistics-facts

[2] https://darktrace.com/news/darktrace-addresses-generative-ai-concerns

[3] https://darktrace.com/news/darktrace-email-defends-organizations-against-evolving-cyber-threat-landscape

[4] https://vulcan.io/blog/ai-hallucinations-package-risk?nab=1&utm_referrer=https%3A%2F%2Fwww.google.com%2F

[5] https://twitter.com/OpenAI/status/1707077710047216095

[6] https://www.reuters.com/technology/openai-says-chatgpt-can-now-browse-internet-2023-09-27/


June 1, 2026

From Efficiency to Exposure: The Hidden Vulnerabilities AI Adoption Brings to the Factory Floor


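How AI agents are affecting manufacturing

Security teams and IT staff in manufacturing are under constant pressure to protect production, maintain uptime, and safeguard critical assets. AI brings enormous opportunity along with new cyber risk. Across manufacturing, AI is being embedded in workflows and decision-making, and autonomous AI agents increasingly act on behalf of employees and systems.

Agentic systems are powerful because they can act independently, but that same autonomy also creates cyber and operational risk. Agents hold broad permissions and can carry out complex tasks, make decisions, and interact with tools and external systems with little or no human intervention.

Unlike traditional AI models that execute predefined tasks, AI agents use advanced techniques to mimic human decision-making, which lets them adapt dynamically to new challenges, make decisions based on their own judgment, and take action. In day-to-day operations they look like employees, but they lack human judgment, ethics, and fear of consequences. This means they can be easily manipulated by cybercriminals, and AI agents embedded across OT networks create threats that go far beyond data leakage. At BMW, for example, AI is used to identify errors in the welding process. At the company's plant in Spartanburg, South Carolina, AI monitors the welding of 300-400 studs on every SUV frame, detecting misplaced or defective studs and correcting them immediately. If this AI system were corrupted, the result could be a catastrophic quality-control problem.

Many security teams have voiced concerns about deploying agentic AI systems across manufacturing. In Darktrace's State of AI Cybersecurity survey, 78% of manufacturing security professionals were concerned about employees' use of AI agents, making it their greatest worry. Next came employees' use of generative AI tools such as Copilot and ChatGPT, which concerned 76% of manufacturing security professionals. As these tools gain access to more business data and processes and take on more autonomy within organizations, and with little visibility into agent activity today, security teams are increasingly concerned about exposure of sensitive data (60%) and accidental policy and regulatory violations (59%).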

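External AI-driven threats are also evolving rapidly

The same AI capabilities that are transforming manufacturing are also reshaping cyber-attacks.

AI lets attackers automate reconnaissance, target victims more precisely, and adapt in real time. Work that once required manual effort and time can now be carried out continuously and at scale. Manufacturers are already feeling the effects: 76% of the manufacturing security professionals we surveyed have already been affected by AI-powered threats, and 90% say AI is increasing the success rate of social engineering attacks.

Attack techniques themselves are evolving too. Across the manufacturing sector, concern is growing about the widening range of AI-enabled attack paths. Adaptive malware that evolves in real time is a particular worry: nearly half (49%) of the manufacturing security professionals surveyed are concerned about it, 9% higher than the all-industry average. Other concerns, following AI-driven adaptive malware, include:

  • Automated vulnerability scanning and exploit chaining (48%): This issue has become more pressing as Anthropic's new Mythos AI model intensifies vulnerability discovery.
  • Hyper-personalized phishing campaigns (46%): Phishing remains one of hackers' primary weapons, and AI has amplified its effectiveness by making phishing emails more convincing and harder to detect.

This is not just an increase in attack volume; it is a shift toward threats that evolve as an attack unfolds, faster than static defenses can respond.

Despite this growing awareness, much of the manufacturing sector is not yet ready for the shift. More than half (51%) say they are not adequately prepared for AI-driven threats, and only 37% of organizations have a formal policy governing AI adoption.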

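Securing AI through visibility, context, and guardrails

Addressing these problems does not require slowing AI innovation. It requires a different approach to security, one that can operate at the same speed and scale as AI. Specifically, three priorities are emerging for manufacturers looking to harness the power of AI.

Visibility is the foundation for everything

Organizations need to understand where AI is being used, what it can access, and how it behaves across IT and OT environments. Without that, risk cannot be measured or managed. It is no surprise that 91% of manufacturing security professionals in Darktrace's survey said they need to understand how AI makes decisions before they can trust it. This matters even more in OT environments, where disruption to operations has major safety, environmental, financial, and reputational consequences.

Context is needed to turn visibility into action

In environments shaped by AI, what counts as normal behavior changes constantly. Detecting threats therefore requires a behavior-based approach: understanding the pattern of life across the organization and detecting subtle deviations in real time. This is a fundamental change from traditional approaches to security and risk management.

Guardrails that prevent exposure from agents

As AI systems take on greater responsibility, organizations need to set clear boundaries on what AI can do and when it can act independently. These controls must be built into the system itself rather than applied after something has gone wrong.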

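Securing AI agents in manufacturing IT and OT environments

The emergence of agentic AI is transforming manufacturing and powering next-generation operations, while also transforming the threat landscape. This is not merely an increase in threats; it is a move to autonomous systems, constantly changing behavior, and risk that advances at machine speed. For organizations working to harness AI while managing risk, visibility, context, and guardrails form the foundation of security.

Darktrace helps manufacturers build a secure approach to AI by delivering this foundation. By providing visibility across IT and OT environments, along with real-time detection and response for anomalous activity, it enables an understanding of AI activity, from the prompts employees use and the agents they build to how those agents behave across the environment. This allows manufacturers scaling their AI adoption to build a foundation for innovation without sacrificing control.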

About the author
Oakley Cox
Director of Product


June 1, 2026

Defend What You Trust: Stories from the Front Lines of Modern Cyber Defense


Modern attacks don’t always announce themselves, follow obvious patterns, or rely on known malware. Often, they move quietly inside trusted systems, authenticated sessions, and everyday behavior.

They don’t break in. They blend in.

That’s why an AI-powered defense is essential. It turns invisible signals into actionable insights at a scale neither analysts nor traditional tools can achieve alone.

Confidence is creating risk

One of the most dangerous assumptions in cybersecurity today is that strong controls equal strong protection.

Multi-factor authentication (MFA), for example, is widely viewed as a foundational safeguard. But as the CISO for a professional sports organization explains, that confidence can be misplaced. “A lot of organizations assume that once you have MFA, those accounts are safe. That’s not true.”

In one instance, his team identified a sophisticated attack where a threat actor bypassed MFA entirely, not by breaking it, but by going around it. A user’s authenticated session was hijacked and re-used, allowing the attacker to impersonate them without triggering traditional controls.

“Darktrace picked up that a session had been re-injected by the hacker, and we were able to block it right away,” he explains.

Attackers anticipate what we miss

Even well-trained users can become entry points.

“An email bypassed our existing security tools,” shares the VP of IT at a U.S.-based risk management services provider. “The user missed one signal and entered their credentials into a malicious site. That’s what the bad guys count on.”

The organization responded quickly, but not before damage was done. Crucially, this occurred while Darktrace was in “watch mode,” before autonomous response was fully enabled. “Darktrace would have seen that and shut it down immediately,” he notes.

Mistakes and oversights like misconfigurations, forgotten machines, and missed patches can create serious vulnerabilities.

The CIO of a utility services organization shares an instance when Darktrace detected a breach to a client’s network via their ZTNA VPN due to misconfigured MFA. “Darktrace alerted us and autonomously blocked the scanning, preventing what could have been a ransomware-type incident.”  

The most dangerous threats are already inside

The Head of Security at a global business services provider knows firsthand how blind spots can persist inside environments. His team uncovered evidence of dormant ransomware artifacts sitting unnoticed within a company’s environment, long before modern detection was in place.

“During a routine file transfer, Darktrace flagged the suspicious activity, identified the ransomware, and immediately quarantined the server,” he recalls. While the attack was never executed, the implication was significant: the risk existed long before it was finally detected.

Cyber threats are also successful because they take advantage of normal human behavior, exploiting moments of cognitive overload, urgency, and trust.

The Executive Director of IT and Business Applications at a pharmaceutical lab describes the time Darktrace flagged an employee logging into Microsoft 365 from Singapore, despite him being physically located in the U.S. Darktrace immediately cut off his access and within minutes revealed that the employee’s son was using a VPN to play a video game.

While the threat was benign, it demonstrated the strength of AI to use contextual information to detect threats other tools miss. The information also saved security analysts hours of investigation and minimized downtime for the employee. “That level of precision and speed isn’t just convenient, it’s game changing.”

“Unusual” behavior is the new red flag

Detecting modern threats requires an understanding of what “normal” looks like and recognizing when something subtly deviates.

One security leader at an AI technology enterprise described a scenario in which an employee connected to a proxy service in China. The service itself was legitimate, and although traditional tools didn’t flag it, the behavior was unusual for that user specifically.

“That’s what Darktrace picked up on. The activity turned out to be benign, but without visibility into behavioral deviations, it could just as easily have been something more serious.”

AI shifts defense from reaction to anticipation

These stories point to a fundamental shift by cyber attackers, both tactically and strategically. Because traditional security tools were built to detect what’s already known, modern attacks are often:

  • Credential-based, not malware-based
  • Behavioral, not signature-based
  • Subtle, not overt

They may operate within the boundaries of what appears normal, exploiting what organizations trust, not what they block:

  • Trusted sessions
  • Legitimate services
  • Human error

This is where AI is changing the equation. Rather than relying on predefined rules or known threat signatures, AI can:

  • Establish a baseline of normal behavior
  • Detect subtle anomalies in real time
  • Act autonomously to contain potential threats

Resilience, not perfection, is the new security standard

As these frontline experiences show, the organizations that lead are those that move beyond reactive defense and embrace AI as a core part of their strategy.

It eliminates the blind spots and uncertainty, says the CISO of a professional sports organization. “If you lack visibility, you’re not managing risk, you’re assuming it. AI gives you the actionable insights needed to turn uncertainty into control.”

And it provides the speed and agility that are vital when seconds matter, says the Executive Director of IT and Business Applications. “When Darktrace alerted us at 3:00 am to a ransomware attack, it had already quarantined the affected systems, blocked the attacker’s access, and provided us with the critical details and time needed to investigate. That action likely saved us hundreds of thousands, if not millions, of dollars.”

The modern SOC has become a cornerstone of enterprise resilience, responsible for protecting data and operational continuity while enabling digital growth and innovation. For today’s security professional, that means success is no longer measured by what they keep out, but by what they protect: revenue, reputation, and trust.
