The use of HTML and CSS introduces complexity to the email space, and with it the potential for exploitation. For example, malicious emails often use legitimate CSS styling or user-invisible text to conceal a malicious payload and to mimic a known company's layout.
We have found that legitimate email communication broadly falls into three categories:
The categories are characterized by features such as the frequency of CSS appearance, the frequency of HTML node appearance, and HTML tree depth. A classifier can use these categories to further direct feature extraction and tracking. The ability to quantify the complexity and style of an HTML document, and to track changes over time or against a model, allows the detection of anomalous and potentially malicious email communications.
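As an illustration of the kind of features named above, the following is a minimal sketch that computes node frequency, CSS frequency, and tree depth for a single HTML email body using only the Python standard library. It is not the method used in the product: the names `ComplexityParser` and `extract_features`, the choice of counters, and the rough CSS heuristics (counting `;`-separated inline declarations and `{` characters inside `<style>` blocks) are all assumptions made for this example.

```python
from collections import Counter
from html.parser import HTMLParser

# Elements that never take a closing tag, so they must not deepen the tree.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ComplexityParser(HTMLParser):
    """Hypothetical helper: gathers structural counts while parsing markup."""

    def __init__(self):
        super().__init__()
        self.tag_counts = Counter()  # how often each HTML node type appears
        self.inline_css = 0          # declarations found in style="..." attributes
        self.css_rules = 0           # rule blocks found inside <style> elements
        self.depth = 0               # current nesting depth
        self.max_depth = 0           # deepest nesting seen so far
        self._in_style = False

    def handle_starttag(self, tag, attrs):
        self.tag_counts[tag] += 1
        for name, value in attrs:
            if name == "style" and value:
                # Assumption: one declaration per non-empty ';'-separated part.
                self.inline_css += sum(1 for d in value.split(";") if d.strip())
        if tag == "style":
            self._in_style = True
        if tag not in VOID_TAGS:
            self.depth += 1
            self.max_depth = max(self.max_depth, self.depth)

    def handle_endtag(self, tag):
        if tag == "style":
            self._in_style = False
        if tag not in VOID_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self._in_style:
            # Assumption: each '{' opens one CSS rule block (a rough proxy).
            self.css_rules += data.count("{")


def extract_features(html):
    """Return a small dictionary of complexity features for one HTML body."""
    parser = ComplexityParser()
    parser.feed(html)
    parser.close()
    return {
        "node_count": sum(parser.tag_counts.values()),
        "distinct_tags": len(parser.tag_counts),
        "max_depth": parser.max_depth,
        "inline_css": parser.inline_css,
        "css_rules": parser.css_rules,
    }


if __name__ == "__main__":
    sample = (
        "<html><head><style>p{color:red} .a{margin:0}</style></head>"
        '<body><div style="color:blue; font-size:12px"><p>Hi<br></p></div>'
        "</body></html>"
    )
    print(extract_features(sample))
```

Feature dictionaries like this could be collected per sender and compared over time. How the comparison is done and which thresholds count as anomalous are not specified here, and would depend on the model in use.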
This approach has been incorporated into the Darktrace Antigena Email product and contributes to detecting account takeovers and behavioral anomalies.
In existence since Darktrace’s inception in 2013, the Darktrace AI Research Centre is foundational to our continued innovation. Rather than following a defined product roadmap, the Centre looks at how AI can be applied to real-world challenges to find solutions that cannot be achieved by humans alone.