Exaforce blog

June 24, 2025

Detections done right: Threat detections require more than just rules and anomaly detection

Discover how Exaforce fuses logs, config & identity into an AI-powered graph that improves on legacy and naive detection techniques.

What does it take to do detections right, and why have we gotten it wrong for so long?

Detections were originally built using rules in real time on log data. In the mid-2010s, User and Entity Behavior Analytics (UEBA) products introduced sophisticated baselining and grouping techniques to identify outliers, focusing primarily on users and, occasionally, resources. For many mature SOC teams, these components of rules and anomalies are still the pillars of their detection architecture.

However, this approach leaves three critical gaps: 

  1. It cannot correlate findings with configuration (config) data
  2. It fails to account for unique cloud semantics
  3. It leaves context evaluation to a manual post-detection phase, which results in noisy individual detections instead of full incidents.

Incorporating configuration information

One element that both legacy approaches lack is the ability to correlate event data and historical data with config data. This piece of the puzzle is critical to avoiding excessive false positives and to properly assessing the impact of an alert. Today, most teams incorporate this data only after detection, during the triage and analysis phase, where it often has to be retrieved manually and is difficult to parse. Moving config analysis into the detection logic itself makes detections more accurate and helps teams operate more efficiently. This config data can include:

  • Blast radius assessment: This user may be compromised, but what resources can they access? Answering this is not easy. In environments such as AWS, it requires a full identity-chain analysis of which roles can assume which other roles, and which resources each of those roles can access (a minimal sketch follows this list). Without that full analysis, an identity’s true access can be misjudged, producing false positives.
  • Proper severity assessment: This EC2 instance is acting anomalously, and its attached instance profile has admin permissions, which may be extra risky.
  • False positive recognition: Using effective permission analysis, we can tell that this user was suddenly granted a very permissive role, but the sensitive assets have resource policies that override the role granted, so it’s actually benign and shouldn’t create an alert.
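
To make the blast-radius point concrete, here is a minimal sketch (not Exaforce’s implementation) of the transitive role-assumption analysis referenced above: given an assume-role graph, compute every role an identity can eventually reach. The identities, roles, and edges are hypothetical.

```python
from collections import deque

# Hypothetical assume-role edges: identity/role -> roles it can assume directly.
ASSUME_EDGES = {
    "user:alice": ["role:ci-deployer"],
    "role:ci-deployer": ["role:prod-readonly", "role:build"],
    "role:prod-readonly": [],
    "role:build": ["role:artifact-admin"],
    "role:artifact-admin": [],
}

def reachable_roles(start: str) -> set[str]:
    """Breadth-first traversal of the assume-role graph: every role `start`
    can reach through any chain of sts:AssumeRole calls."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in ASSUME_EDGES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# user:alice's blast radius includes role:artifact-admin,
# even though nothing grants it to her directly.
print(reachable_roles("user:alice"))
```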

Vendors have tried to incorporate config data into detections themselves. Many products offer features such as user or entity frameworks, tags, and HR integrations to bring aspects of this data into the detection fold. However, maintaining those integrations and updates, and covering the breadth of config data in the model, is extremely time-consuming and difficult, so many teams have pushed config analysis back to the post-detection phase.

Built for cloud & SaaS

UEBA has its origins in the insider threat use case: it was built primarily to model identities and their typical activity patterns. As the approach proved fruitful, many vendors expanded it to cover some resources as well, often individual virtual machines (VMs) or instances. With the ongoing shift to IaaS and SaaS environments, however, the notion of UEBA needs a major reset. Cloud resources are often ephemeral, which changes the scaling requirements and demands a new approach to baselining and anomaly detection. The variety of IaaS and SaaS resources - from Kubernetes workloads and pods, to GitHub repositories and actions, to Google documents - requires very different modeling. Even the traditional identity is not as straightforward. Roles in AWS, for example, may be assumed by a mix of humans and machines, making their modeling far more complex, and in some AWS cases an alert may not even be attributable to an origin identity, only to the role used. As a result, traditional UEBA tools and features often fall short of the needs of modern organizations operating in cloud and multicloud environments.

Detections are not incidents

The job of the detection tool is not just to provide a hint of suspicious activity but also to ensure that the alert is framed in the full context of the environment. Examples include auto-grouping duplicate alerts and incorporating business context during events such as reorganizations, mergers, or new tool rollouts. The ability of the detection tool to accommodate such context is critical to analyst expediency and completeness of investigation, because it reduces noisy individual detections and transforms them into well-documented incidents. Narrowly scoped tools often push this burden onto a Security Information and Event Management (SIEM) system, a Security Orchestration, Automation, and Response (SOAR) platform, or another system to perform a second level of correlation, aggregation, and analysis, making that system cumbersome and manual to maintain.

At Exaforce, effective detection is a well-balanced triad of rules, anomalies, and config data purpose-built for the modern cloud- and SaaS-centric company. Here's how our approach breaks from tradition and why that matters.

In the next few sections, we’ll explore how Exaforce overcomes these limitations in current solutions with a fresh, AI-powered approach to data ingestion and modeling. By fusing log, config, and identity data into a unified semantic layer, and then layering behavioral baselines and knowledge-driven reasoning on top, Exaforce converts scattered signals into precise, high-fidelity alerts that reveal complete attack chains rather than isolated anomalies.

The Semantic Data Model

Ensuring quality data

Our approach to detection begins with Exaforce’s three-model architecture: the Semantic Data, Behavioral, and Knowledge models. Each adds a distinct layer of context.

We ingest event and config data from various sources and convert them into structured Events and Resources. Events are organized into Sessions to add perspective and contextual signals, such as location, Autonomous System Number (ASN), and duration, at the session level. We also chain sessions to capture origin identities, role assumptions, cross-role behavior, and action sequences, enabling a more complete analysis of what was done and by whom. This prepares the data for the detection assessments to come.
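
As a simplified illustration of sessionization (a sketch only; field names and the idle threshold are assumptions, not Exaforce’s schema), events can be grouped per identity whenever the gap between consecutive events exceeds an idle threshold, with session-level attributes derived from the grouped events:

```python
from itertools import groupby

IDLE_GAP_SECONDS = 30 * 60  # assumed idle threshold that closes a session

def sessionize(events):
    """Group events (dicts with 'identity', 'ts', 'action', 'location', 'asn')
    into per-identity sessions whenever the idle gap is exceeded."""
    sessions = []
    events = sorted(events, key=lambda e: (e["identity"], e["ts"]))
    for identity, evs in groupby(events, key=lambda e: e["identity"]):
        current = []
        for ev in evs:
            if current and ev["ts"] - current[-1]["ts"] > IDLE_GAP_SECONDS:
                sessions.append(to_session(identity, current))
                current = []
            current.append(ev)
        if current:
            sessions.append(to_session(identity, current))
    return sessions

def to_session(identity, evs):
    # Session-level context: duration, locations, ASNs, and the action sequence.
    return {
        "identity": identity,
        "start": evs[0]["ts"],
        "duration": evs[-1]["ts"] - evs[0]["ts"],
        "locations": {e["location"] for e in evs},
        "asns": {e["asn"] for e in evs},
        "actions": [e["action"] for e in evs],
    }
```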

Resources undergo similar treatment. We capture config, parse resource types, enrich the resources, and build relationship graphs. Exaforce also collects config changes over time, enabling us to detect subtle but critical changes that would otherwise go unnoticed. It also empowers us to assess the impact of each config change, effectively conducting a full blast radius analysis. Identities, a key subset of resources, receive extra enrichment, for example:

  • Human vs. machine classification: Exaforce’s model analyzes identity types, behavior patterns, and role-assumption patterns to classify identities as human or machine. The classification is dynamic to handle complex scenarios, such as an identity created by a human but then used in a script executed by a machine identity, or roles shared by both human and machine identities (a rough heuristic sketch follows this list). As an identity’s classification changes, so does the way it is enriched and modeled.
  • Effective permission analysis: Interpret the full range of permissions the user has based on transitive role assumption capabilities and overlay them with resource policy information. 
    • Identity chaining: which identity actually performed an action, not just which role was used
    • Reachable resource analysis: which resources this identity can reach, with which actions and at what access level
  • 3rd-party identities/access: Identify third-party identities and monitor their behavior and privileges more carefully. 
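
As referenced above, here is a rough, illustrative heuristic for the human-vs-machine idea (a toy sketch, not Exaforce’s classifier): machine identities tend to act at highly regular intervals from a small set of user agents, while humans are burstier and more varied.

```python
import statistics

def classify_identity(event_timestamps: list[float], user_agents: list[str]) -> str:
    """Toy heuristic: near-constant inter-event gaps and very few user agents
    suggest a machine; irregular timing and varied agents suggest a human."""
    if len(event_timestamps) < 5:
        return "unknown"
    gaps = [b - a for a, b in zip(event_timestamps, event_timestamps[1:])]
    mean_gap = statistics.mean(gaps) or 1.0
    regularity = statistics.pstdev(gaps) / mean_gap  # coefficient of variation
    if regularity < 0.1 and len(set(user_agents)) <= 2:
        return "machine"
    return "human"
```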

Resources in our context are a generic construct: anything from AWS EC2 instances and Kubernetes Jobs to GitHub repositories and Okta roles. Modeling the config from the outset allows more complete detections to be formed and provides the foundation for the first pillar: configuration.

The Behavioral Model

In Exaforce, any dimension that could be anomalous is referred to as a Signal, for example, an unusual location or rare service usage. Signals may be weak or strong, but both are important. Detections are generated by grouping signals that occur in the same event or session, representing collections of medium-fidelity anomalies. These signals and detections provide the rule and anomaly pillars of the solution.

The Semantic Model sets the data up to be modeled in the Behavioral Model. Sessionizing events, for example, lets us go beyond baselining individual actions to baselining combinations of events and event patterns. Similarly, baselines are customized to the object in question; for example, humans and machines (identified in the aforementioned Semantic Data Model) are modeled differently. Machines tend to follow predictable patterns, while humans are far more eclectic. Shared identities, such as a role used by both an engineer and automation scripts, are modeled with this nuance in mind.

We model a wide range of signals, independently and in combination, including:

  • Action (and action patterns)
  • Service
  • Time 
  • Duration
  • Location and ASN (including cross-source comparisons)
  • Resource

Here’s an example of an Exaforce finding with multiple signals. In this example, we saw both an Operation Anomaly and a Service Usage anomaly. This user, Mallam, does not usually perform this GetSecretValue action, and they do not typically perform actions in the AWS US East 2 region. This led Exaforce to fire a detection. 

A contextualized threat finding bringing together an action with past behavior.
Additional event data and signals brought together into a unified finding.

This multidimensional approach is critical: a single weak signal is rarely enough, but several weak signals together often are. This rule-and-anomaly approach, applied across the breadth of supported resources and log sources, provides the other two pillars of the detection trio.
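
A minimal sketch of the “several weak signals together” idea (the signal names, weights, and threshold below are illustrative, not Exaforce’s actual scoring):

```python
# Illustrative per-signal weights: one weak signal stays below the threshold,
# but several weak signals in the same session add up to a detection.
SIGNAL_WEIGHTS = {
    "rare_action": 0.3,
    "new_service": 0.3,
    "new_location": 0.4,
    "new_asn": 0.4,
    "unusual_time": 0.2,
}
DETECTION_THRESHOLD = 0.8

def should_detect(session_signals: list[str]) -> bool:
    score = sum(SIGNAL_WEIGHTS.get(s, 0.0) for s in session_signals)
    return score >= DETECTION_THRESHOLD

print(should_detect(["new_location"]))                           # False
print(should_detect(["rare_action", "new_service", "new_asn"]))  # True
```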

The Knowledge Model

The goal of detection is completeness: making sure no signal of potential compromise is overlooked. But completeness can result in noise. That’s where our Knowledge Model comes in.

After the Semantic Data Model runs and signals fire and are grouped into detections, Exaforce runs a triage pipeline that contextualizes each detection and adds organization-specific business context, turning medium-fidelity detections into high-fidelity findings. This triage process is performed on Exaforce and third-party alerts alike and augments context even further, ensuring we only surface alerts worthy of analyst attention. The analysis includes weighing conflicting factors in context and occurs at the end of the detection stage. For example, it could weigh the fact that a user has broad privileges against the severity of the action taken.

In the Knowledge Model, resource and identity config data is weighed against rule and anomaly outputs, supplemented by additional context: similar findings, business context provided by the user, user/manager validation responses, and more.

  • Similar Findings are identified. If closed, their resolutions are used as inputs to the model to assess this finding. If they are still open or in progress, the model will group them. Once grouped, the findings will be classified as duplicate, grouped, or chained to specify the relationship and level of related analysis. 
  • Business context rules give users a mechanism to feed free-form data into the model. This can be context about the environment (e.g., these resources are very important, or we use this VPN), about users (e.g., users A, B, and C are part of the executive team and should be monitored carefully), or about the company in general (e.g., we are a healthcare company with offices in the following locations, and teams often commute between these sites). This free-form input lets novice users inform the Exabots about critical context without having to manually silence or suppress individual alerts.
  • Exabots also have skills that allow them to seek validation from end users. If the Exabot determines that a user validation or manager validation would be helpful, it can trigger a Slack/Teams message to the individual and use their response as an influence on the determination.  

The Exabots curate this set of information and pass it to the Knowledge Model agents, which assess each of these factors and make a determination of “False Positive” or “Needs Investigation,” turning basic Detections into context-rich Incidents. All of these analyses run continuously while the alert is open or in progress, so even as your environment changes, your recommended assessments stay up to date. The preparation, structuring, and condensing of this data helps keep the AI agents performing the analysis accurate and minimizes hallucinations.
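
Conceptually, the curated bundle handed to the Knowledge Model agents can be pictured as a structured context object like the sketch below (the field names and the commented-out `knowledge_model.assess` call are hypothetical placeholders, shown only to illustrate the shape of the inputs):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class TriageContext:
    detection_id: str
    signals: list[str]                        # anomalies grouped into the detection
    effective_permissions: list[str]          # from the Semantic Data Model
    similar_findings: list[dict] = field(default_factory=list)  # prior resolutions
    business_context: list[str] = field(default_factory=list)   # free-form user rules
    user_validation: Optional[str] = None     # e.g. a Slack/Teams confirmation

ctx = TriageContext(
    detection_id="det-123",
    signals=["new_location", "rare_action"],
    effective_permissions=["s3:GetObject on logs-bucket"],
    business_context=["Engineering regularly works from the Zurich office"],
    user_validation="user confirmed the activity",
)
# verdict = knowledge_model.assess(ctx)  # hypothetical: "False Positive" | "Needs Investigation"
```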

Running the initial knowledge model before presenting the detection to the user allows the Exaforce detections to be extremely high fidelity.

Example

The user here was seen in two locations, Zurich and Matran, in quick succession. This quick mid-session location switch was anomalous, but the locations themselves were consistent with the user's previous behavior and with other company employees as well. The actions performed were also consistent with the user's historical behavior. The triage agent was therefore able to weigh the anomalous signal against the other factors and rule this a false positive. You'll note that the triage agent is also armed with company-specific business context; in this example, it refers to an office in Zurich. (More about business context and triage in our next blog!)

An automatically marked false positive of a user accessing a repository from multiple locations based on business context.

After triage, we group findings, both Exaforce and third-party, into aggregated attack chains. This lets analysts see the full picture, not just disconnected events.

Exaforce in action: GitHub example

Let’s see the Exaforce approach in practice.

GitHub is a critical data source. It contains sensitive data such as company intellectual property and can even have attached secrets that are highly permissive to perform CI/CD actions. However, it’s often overlooked.

Exaforce ingests logs and config data to gather activity information and identify risks and threats associated with supply chain attacks. One example is the use of Personal Access Tokens (PATs), the credentials commonly used in CI/CD and developer workflows. Out of the box, GitHub logs provide hashed PATs and basic attribution. Exaforce goes further. In this example, the Semantic Data Model:

  • Ingests log and config data, and sessionizes it to understand resources such as the repositories, workloads, actions, tokens, etc. 
  • Enriches the token resource with scope information for the token from the config data to understand access and permissions. This involves correlating the token’s scope information from the config information with runtime data containing the hashed token in the logs. 
  • Classifies tokens used for cron jobs, ad-hoc scripts, and user-driven actions based on their historical usage

Instead of simply attributing actions to a user, the Behavioral Model also builds tailored baselines for the tokens themselves and generates signals for any anomalies found. PAT-based baselines allow for a variety of unique detections and protections. Users may have multiple PATs in use simultaneously for a mix of automation and ad hoc usage; separate baselines per PAT let us avoid firing false positives when both are in use concurrently.
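
A simplified sketch of per-token baselining (data shapes and thresholds are assumptions): each PAT keeps its own history of repositories and ASNs, so concurrent automation and ad hoc usage do not pollute each other's baseline.

```python
from collections import defaultdict

# Per-PAT baseline of (repository, ASN) pairs, keyed by the hashed token
# that appears in the GitHub audit log.
baselines: dict[str, set[tuple[str, str]]] = defaultdict(set)

def check_pat_event(token_hash: str, repo: str, asn: str) -> list[str]:
    """Return anomaly signals for this event, then fold it into the baseline."""
    signals = []
    seen = baselines[token_hash]
    if seen and repo not in {r for r, _ in seen}:
        signals.append("new_repository_for_token")
    if seen and asn not in {a for _, a in seen}:
        signals.append("new_asn_for_token")
    seen.add((repo, asn))
    return signals

check_pat_event("hashed-pat-1", "org/service", "AS16509")  # first sight: no signals
print(check_pat_event("hashed-pat-1", "org/secrets-repo", "AS209103"))
# ['new_repository_for_token', 'new_asn_for_token']
```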

Here, we identified six types of anomalies (signals), most critically a new repository being accessed from a new ASN.

A threat identified with a user making code changes with multiple locations and ASNs, contextualized with configuration data (lacking branch protection rules).

The Knowledge Model weighed these anomalies against the PAT’s scopes to determine alert severity.

Multiple event signals correlated with configuration data culminating in a single alert with a dynamic severity.

The traditional detection pillars were powerful in finding things but lacked context, creating noisy alerts without enough detail to paint a full picture. Exaforce delivers high-fidelity findings by starting with strong foundations: a semantic data model that structures raw IaaS and SaaS data into enriched, contextual entities.

We monitor a wide range of signals across actions, identities, sessions, and more to detect even minor deviations that add up to real alerts. Our bespoke modeling ensures deep coverage across both IaaS and SaaS environments, including overlooked systems like GitHub and Google Workspace.

Signals are aggregated into cohesive, cross-dimensional findings, and our triage agents weigh conflicting anomalies to surface only what truly matters.

The result? Comprehensive coverage, smarter triage, and dramatically fewer false positives.

June 10, 2025

The KiranaPro breach: A wake-up call for cloud threat monitoring

Practical takeaways and best practices in the aftermath of the KiranaPro breach.

The KiranaPro breach: A wake-up call for cloud threat monitoring

The breach at KiranaPro, an Indian grocery delivery startup, underscores a widespread misconception: that cloud-provider controls alone are sufficient. After attackers gained access through a former employee’s account, they deleted KiranaPro’s entire AWS and GitHub infrastructure—wiping out code, data, and operations. The incident highlights a dangerous gap in how organizations monitor SaaS and IaaS environments.

A deeper look at the KiranaPro incident

On May 26, 2025, KiranaPro’s entire cloud infrastructure was wiped out by hackers who exploited credentials from a former employee. Despite the startup’s use of standard security measures, including multi-factor authentication, attackers managed to bypass these safeguards. The damage included deletion of sensitive customer data, operational code, and critical cloud resources. The root issue was not a weakness in AWS or GitHub, but rather gaps in KiranaPro’s own security practices, specifically inadequate user access management and a lack of proactive monitoring for abnormal activity.

Cloud providers aren’t watching your accounts

Many organizations mistakenly believe cloud providers handle comprehensive security. In reality, cloud providers employ a “shared responsibility model”: providers secure the underlying infrastructure, while customers secure their data, accounts, and access policies. KiranaPro’s breach vividly demonstrates the risks organizations face when they misunderstand or neglect their side of this shared responsibility.

Built-in security tools from SaaS and IaaS providers are robust, but they typically focus on static defenses and configuration checks. They rarely detect real-time threats like credential misuse or unauthorized account activity—issues central to the KiranaPro breach.

Threats aren’t just insiders 

While insider threats (e.g., former or disgruntled employees) pose a significant risk, proactive threat monitoring is essential across multiple attack vectors. External attackers frequently exploit stolen credentials, phishing attacks, misconfigurations, and weak API security. Organizations must recognize that threats come from multiple directions simultaneously.

Proactive threat monitoring involves continuously analyzing cloud activities in real-time to spot anomalies—such as logins from unexpected locations, abrupt permission changes, or unusual data deletions—and taking immediate, automated action to contain threats. Some organizations use SIEM rules to detect these patterns. Others adopt platforms that deliver out-of-the-box monitoring across SaaS and IaaS environments.

Practical takeaways from KiranaPro

The KiranaPro breach underscores the importance of continuous vigilance in cloud security. Organizations cannot afford to adopt a passive stance:

  • Strict access controls: Access to critical systems should be restricted to only those who absolutely need it, following the principle of least privilege. Over-permissioned accounts increase the impact of any compromise or misuse. Privileged actions should be tightly scoped, and administrative access should be granted only when required and revoked when not in use.
  • Avoid persistent IAM credentials: Long-lived credentials—especially for privileged IAM users or root accounts—create enduring risk. Instead, use short-lived, automatically rotated credentials issued via identity federation (e.g., IAM roles with SSO) or just-in-time access (a minimal STS sketch follows this list). This approach reduces exposure, improves auditability, and makes it easier to manage access at scale.
  • Systematic offboarding: Any IAM user accounts or long-term credentials associated with former employees must be revoked immediately. However, simply deleting these credentials can break production systems, so it’s critical to understand their usage beforehand. Having visibility into actual credential usage and mapping dependencies is therefore essential for secure offboarding.
  • Change control via CI systems: All changes to production environments should be enforced through controlled CI/CD pipelines with mandatory approvals. This discipline adds a valuable layer of oversight and would have likely caught or prevented a destructive action like a mass deletion. While idealistic, it’s a proven safeguard that mature cloud teams should strive toward.
  • Disaster recovery and backups: No system is immune to compromise. Having a disaster recovery plan—including infrastructure-as-code templates and tested, restorable backups—can make the difference between downtime and a total shutdown. KiranaPro’s inability to quickly recover infrastructure suggests major gaps in their resilience planning.
  • Proactive monitoring: Investing in active threat monitoring solutions ensures real-time visibility into system activities, significantly enhancing the ability to detect and mitigate potential security threats swiftly.
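
As mentioned in the credentials bullet above, here is a minimal sketch of trading a long-lived key for short-lived credentials via AWS STS using boto3 (the role ARN and session name are placeholders; in practice, federation or SSO issues these credentials automatically):

```python
import boto3

# Assumes the caller already has a base identity (instance profile, SSO, or OIDC).
sts = boto3.client("sts")

resp = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/deploy-role",  # placeholder ARN
    RoleSessionName="ci-deploy",
    DurationSeconds=3600,  # credentials expire on their own
)
creds = resp["Credentials"]

# Use the temporary credentials for the actual work.
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
```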

Additional best practices from the field

In our previous blog, “Bridging the Cloud Security Gap: Real-World Use Cases for Threat Monitoring,” we examined common cloud security anti-patterns and offered actionable guidance to continuously monitor, detect, and effectively respond to emerging threats.

One highlighted use case involved a device manufacturing company relying on a single IAM user with long-term credentials accessed from multiple locations. This setup amplified risk due to varied operating systems and environments. To mitigate this type of risk, additional recommendations from our best practices blog include:

  • IP Allow-Listing: Defining and enforcing an allowed list of IP addresses for each location.
  • Resource Access Monitoring: Continuously monitoring and logging which resources the IAM user accesses.
  • User Agent and Device Validation: Identifying and allowing only predefined user agents and flagging anomalies.

These measures are also applicable to preventing cloud breaches like the one experienced by KiranaPro.

Conclusion

The KiranaPro breach is a reminder that cloud security requires ongoing, active vigilance. Organizations should move beyond relying solely on provider-native tools and adopt continuous threat monitoring as a foundational security practice. By clearly understanding their security responsibilities, implementing robust access governance, and monitoring cloud activities proactively, companies can significantly reduce their vulnerability to breaches and maintain operational resilience.

Need help building real-time visibility across your cloud stack? Exaforce provides AI-driven threat monitoring across IaaS and SaaS environments such as AWS, GCP, GitHub, Okta, AWS Bedrock, Google Workspace, and more, letting you expand your threat coverage to your cloud services without writing and maintaining rules. Contact us to request a demo.

May 29, 2025

3 points missing from agentic AI conversations at RSAC

Agentic AI tools for security operations centers promise to enhance—not replace—human analysts, but their true value lies in thoughtful integration, deep context, and rigorous proof-of-concept testing, not hype-driven adoption.

This article originally appeared in SC Magazine.

For those who attended RSAC 2025 this year, chances are agentic AI came up in the conversation. Vendors pushed dozens of agentic AI products, many of which were tailored to use cases for security operations centers (SOCs) – and marketers dove in head-first to position their companies at the forefront of innovation.

However, thoughtful dialogue about the practical application and true value of agentic AI in the SOC got lost. Here’s what many of the sales pitches missed:

Agentic SOC platforms are a force multiplier, not a replacer.

One of the biggest misconceptions about agentic SOC solutions that we heard is that they will put security professionals out of work and replace some of the tools they’re most familiar with, such as security incident and event management (SIEM) tools. That’s not accurate - in fact, humans, SIEMs and agentic SOC solutions work better when used in tandem.

Security professionals benefit from using effective agentic SOC tools. The new products can minimize tedious workloads: time spent triaging alerts and performing investigations decreases substantially, giving analysts more time to uplevel and focus on high-tier investigations and response tasks.

SIEMs have been around for decades and aren’t going anywhere. They collect large amounts of historical data and context that agentic SOC solutions can rely on to produce recommendations and responses. While some agentic SOC tools add reasoning and action to datapoints, they need access to the context in the SIEMs to remain effective.

Context too often gets overlooked.

An overlooked aspect of agentic AI that has gotten lost in conversations about minimizing workloads is its ability to work in tandem with third-party systems. These third-party tools and data sources have nuanced interfaces, data schemas, and operations that agents can misinterpret without deep contextual knowledge of how a tool works. AI agents need deep integration, with sufficient access to data, visibility into workflows, strong feedback mechanisms and environmental context.

If that enabling deep context gets overlooked, agentic AI tooling can add tasks to a to-do list rather than removing them. For example, if the solution triages an alert and offers a recommendation, is there transparency into how that data was gathered? Do we have to go through another system to get that transparency? Is that adding work for the team? The level of context required, and the importance of automating fine-tuning after deployment, are still being overlooked.

The vendors don’t offer PoCs that can prove a product’s real value.

Crowded booths and flashy banners were everywhere, but booth demos are optimized to tease the best functionality the vendor has to offer – they can’t deliver the insights that deploying the product in the user’s own environment can elicit.

Vendor claims for agentic AI SOC tools ranged from saving time and money to agents making decisions and executing on them autonomously. A proof of concept (PoC) can help verify whether those claims hold up under the company's SOC conditions. Can the tool operate with the company's specific data volumes and alert types? Can it integrate with the tools in the tech stack that are crucial to the organization's business operations?

Many may think: “PoCs are nothing new – we know there’s value.” True, but the misconception that AI agents will replace security professionals, combined with the current economic climate, creates concerns that a PoC can quell far better than a paper evaluation. Giving analysts the opportunity to test the product and see that it’s there to help them, not replace them, will go a long way in building trust between the user and the product, as well as between the employee and the investment decision-makers.

Getting a PoC and fighting the urge to make a heavy investment immediately for the sake of quick innovation lets a team fine-tune the tool’s logic, policies and thresholds to match a SOC’s risk appetite and operational nuances.

As with any new technology, we’re bound to have a hype cycle that spins up fluff. To find the true value of a new product, take it for a test drive and hold it to a high standard to deliver on its promises. Make sure the outcomes are accurate, the sources transparent, the data immediately accessible, and that it complements the operations of the teams and tools that are crucial to the success of the organization.

May 27, 2025

5 reasons why security investigations are broken - and how Exaforce fixes them

Struggling with alert overload or slow triage? Discover 5 reasons security investigations fail—and how Exaforce uses AI to fix them fast.

Security investigations have been broken for years. The problems are nothing new: 

  • Alerts without context that leave analysts scouring to gather all the relevant data
  • Gaps in cloud knowledge - analysts are forced to triage issues they don't have expertise in
  • Slow, cumbersome investigations that can take hours
  • Lack of expertise in system nuances like advanced querying, log parsing, etc. 
  • Overwhelming alert volumes that cause fatigue and mistakes

Every SOC team has felt the pain. What’s changed is the scale and complexity of the environments we defend—cloud-native architectures, third-party SaaS sprawl, identity complexity, and constantly evolving threats. The tradition of static rules, dashboards, prebuilt playbooks, and SIEM queries simply can’t keep up.

At Exaforce, we’re building a new way forward.

We combine AI bots (called “Exabots”)  with advanced data exploration to make security operations faster, smarter, and radically more scalable. Our platform understands your cloud and SaaS environments at a behavioral level—connecting logs, configs, identities, and activities into a unified, contextual graph. From there, our task-specific Exabots take over, autonomously triaging alerts, answering investigation questions, and threat hunting—with accuracy and evidence.

The result? Clear explanations, actionable insights, and fewer hours wasted digging through logs or waiting on other teams.

In the following sections, we review the five main reasons investigations are still broken—and how Exaforce solves those issues for the SOC.

1. Not enough context: “What even is this alert?”

Most alerts land in your SIEM with minimal, templated explanations. Why did it fire? What does it mean? What’s the potential impact? Ideally, every alert would come with a detailed description, evidence, and an investigation runbook. In reality, most teams never have the time to write or maintain this. Even anomaly alerts often fall short, showing raw logs instead of a clear comparison to expected behavior. For example, AWS GuardDuty alerts show up with generic terms like “unusual” and “differ from the established baseline”. They do not contain detailed information to help analyze or confirm the finding, and understanding what the abnormal behavior was (or what normal even looks like) inevitably requires additional data and lookups.

A sample GuardDuty finding with minimal information about the nature of the suspicious and unusual activity.
The same finding after the Exaforce enrichment, analyzed on multiple dimensions that clearly articulate the anomalies.

The Exaforce Approach:

  • Every alert—ours or third-party—comes with an explanation of why it fired: in “easy mode” English for quick understanding, and in “hard mode” with full data details for those who want to go deep.
  • Data supporting the conclusion is shown clearly—so you have concrete evidence.
  • Alerts are enriched automatically with data from multiple sources—no SOAR playbook required.
  • All findings include “next steps” to kickstart the investigation or remediation.
  • Similar and duplicate alerts are grouped out-of-the-box to prevent redundant effort.

Whether you’re skimming or scrutinizing, Exaforce gives you the context you need to move with confidence.

2. Lack of cloud knowledge: “We’re a SOC, not cloud ops.”

Most SOC analysts come from network security backgrounds. Now they’re expected to triage cloud alerts involving IAM chains, misconfigured S3 buckets, and GitHub permissions. Meanwhile, the actual cloud or DevOps teams often live in a different org entirely, making collaboration slow and awkward. Not sure why user A was able to perform a risky action? Not familiar with how AWS identity chaining works? No problem: we summarize the effective permissions a user has and, if you want the details, show you the full identity chain of how they got them.

An example of permission analysis done by Exaforce. All the user's roles and their usage are presented, as well as a view of the effective permissions.
An example of the visual layout of a user's permissions from their IDP through the various AWS services they can access, traversing the complex identity and permission management structure.

The Exaforce Approach:

  • Exabot acts as your built-in AI cloud expert—explaining alerts in natural language.
  • Works across cloud and SaaS sources like AWS, GCP, Okta, GitHub, Google Workspace, and more.
  • For deeper dives, the investigate tab provides full technical context—ideal for handing off to DevOps or engineering.
  • Our semantic graph view shows how users, roles, and resources connect—so analysts can understand identity behaviors visually, not just textually.

We bridge the cloud knowledge gap, translating cloud complexity into clarity.

3. Time to investigate: Attacks are quick, investigations aren’t.

Investigating a single alert can take hours—jumping between consoles, writing queries, checking with senior analysts, and gathering context from different systems. Now multiply that by the volume of daily alerts, and investigation becomes the biggest bottleneck in your entire response pipeline.

The Exaforce Approach:

  • Exabot handles triage in under 5 minutes, using semantic context to reach conclusions with supporting evidence.
  • And if you have questions? Just ask Exabot—no Slack messages, no dashboards to build, no delays.

The queue of findings. Many have already been marked false positive.
A view inside the activity of an Exaforce finding - finding created and promptly analyzed, analyst asked a question and bot responded immediately with a robust response.

We cut investigation time down from hours to minutes—without cutting corners.

4. Lack of expertise: You shouldn’t need to be a SQL ninja.

Investigations traditionally require deep knowledge: what logs to look at, how they’re structured, what’s “normal,” and how to ask the right questions in the right query language. Most junior analysts just don’t have that expertise—and most teams don’t have the documentation to help.  

The Exaforce Approach:

  • Exabot answers complex questions in plain language—no syntax required.
  • Want details? Every alert comes with a bespoke investigation canvas—pre-loaded with all the questions an analyst would ask, and data-heavy answers for each one.
  • Our semantic data model pre-enriches and structures log data so analysts see what matters, when it matters. You get enriched, joined, cleaned, and contextualized data out of the box. 
  • We surface behavioral baselines, patterns, and ownership insights that usually live in tribal knowledge.

Even this common AWS GuardDuty alert for unusual behavior requires an analyst to understand who the root identity is, query for other logs in the same time period, parse those logs for a unique list of resources touched, extend the query to include other users on the same resources to establish a baseline, and build statistical analysis to understand “normal” behavior for the user, action, location, and resource (a rough sketch of that manual work follows).
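
For a sense of what that manual work looks like in code form, here is a rough sketch over simplified CloudTrail-style events (field names and thresholds are assumptions):

```python
from collections import Counter

def unusual_resources(alert_events, user, baseline_events):
    """alert_events: the user's events in the alert window.
    baseline_events: all users' events over the baseline period.
    Flags resources that are rare both for this user and for their peers."""
    user_history = Counter(e["resource"] for e in baseline_events if e["user"] == user)
    peer_history = Counter(e["resource"] for e in baseline_events)
    flagged = []
    for e in alert_events:
        res = e["resource"]
        if user_history[res] <= 1 and peer_history[res] < 5:  # illustrative thresholds
            flagged.append(res)
    return flagged
```

With Exaforce, none of that query writing or statistical legwork is needed: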

The detailed Exaforce investigation canvas supporting the recommendation. Note the Q&A style with supporting data.

Now anyone on the team can investigate like a pro—without mastering a query language, managing log parsers, or building custom dashboards.

5. Too many alerts: Welcome to burnout city.

Your team gets thousands of alerts, and most of them (85%+) are false positives. Analysts get desensitized, threat signals get missed, and triage becomes a box-checking exercise instead of a security process. (A great analysis of the alert fatigue problem by security guru Anton Chuvakin: https://medium.com/anton-on-security/antons-alert-fatigue-the-study-0ac0e6f5621c)

The Exaforce Approach:

  • Exaforce automatically triages the majority of alerts.
  • Duplicate and related alerts are grouped together so they can be handled once.
  • Analysts only focus on the high-signal, high-impact findings that actually require human insight.

A grouped Exaforce finding. Findings from GitHub and AWS are aggregated into a larger finding with a higher severity.

We cut the noise, so your team can spend less time firefighting and more time securing.

Final Thoughts: Investigations, Reimagined

The problems aren’t new. But the solution is.

With Exaforce, you get a better approach to investigation—powered by intelligent bots and an advanced data interface that is intuitive, visual, and conversational.

May 7, 2025

Bridging the Cloud Security Gap: Real-World Use Cases for Threat Monitoring

At Exaforce, as we work with our initial set of design partners to reduce the human burden on SOC teams, we’re gaining valuable insights into current cloud usage patterns that reveal a larger and more dynamic threat surface. While many organizations invest in robust security tools like CSPM, SIEM, and SOAR, these solutions often miss the nuances of evolving behaviors and real-time threats. This blog examines common cloud security anti-patterns and offers actionable guidance, including practical remediation measures, to continuously monitor, detect, and effectively respond to emerging threats.

Use Case: Single IAM User With Long Term Credentials Accessed From Multiple Locations

A device manufacturing company relies on a single IAM user with long term credentials for various tasks such as device testing, telemetry collection, and metrics gathering across multiple factories in different geographic regions. This consolidated identity is used from varied operating systems (e.g., Linux, Windows) and environments, which amplifies risk.

AWS IAM user X accessing multiple S3 buckets from processes running in factories located in different locations.

Threat Vectors and Monitoring Recommendations

To mitigate the risks associated with such a setup, focus on continuous threat monitoring with these priority measures:

1. IP Allow-Listing

  • Define and enforce an allowed list of IP addresses for each factory.
  • Alert on any access attempts from unauthorized IPs.
  • Tool: AWS IAM policy conditions. Below is an example that denies everything except requests originating from the CIDR ranges 192.0.2.0/24 and 203.0.113.0/24.
AWS IAM policy to deny all requests unless requests originate from specified IP address ranges.
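
The policy in the screenshot is not reproduced here, but a minimal sketch of the same deny-unless-source-IP pattern, attached with boto3 (the user name and policy name are hypothetical), might look like this:

```python
import json

import boto3

# Deny every action unless the request originates from the allowed CIDR ranges.
# The ViaAWSService condition avoids denying AWS services acting on the user's behalf.
restrict_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyOutsideFactoryRanges",
        "Effect": "Deny",
        "Action": "*",
        "Resource": "*",
        "Condition": {
            "NotIpAddress": {"aws:SourceIp": ["192.0.2.0/24", "203.0.113.0/24"]},
            "Bool": {"aws:ViaAWSService": "false"},
        },
    }],
}

iam = boto3.client("iam")
iam.put_user_policy(
    UserName="factory-telemetry-user",           # hypothetical IAM user
    PolicyName="restrict-to-factory-ip-ranges",  # hypothetical policy name
    PolicyDocument=json.dumps(restrict_policy),
)
```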

2. Resource Access Monitoring

  • Continuously monitor and log which resources the IAM user accesses.
  • Correlate access patterns with expected behavior for each factory or task.
  • Tool: SIEM platforms integrated with CloudTrail logs.

3. Regular Credential Rotation

  • Implement strict policies to rotate long term credentials periodically.
  • Automate token rotation and integrate alerts for unusual rotation delays.

4. User Agent and Device Validation

  • Identify and allow only a predefined list of acceptable user agents (e.g., specific OS versions like Linux and Windows Server) for each use case.
  • Flag anomalies such as access from unexpected operating systems (e.g., macOS when not approved).
  • Tool: SIEM platforms to correlate EDR and AWS CloudTrail logs and generate detections.

Use Case: Long-Term IAM User Credentials in GitHub Pipelines

One of our SaaS provider partners embeds long-term AWS IAM user credentials directly in their GitHub Actions CI/CD pipelines as static GitHub secrets, allowing automation scripts to deploy services into AWS. This practice poses significant security risks; credentials stored in CI/CD pipelines can easily become exposed through accidental leaks or external breaches—as seen recently with Sisense (April 2024) and TinaCMS (Dec 2024)—enabling attackers to gain unauthorized cloud access, escalate privileges, and exfiltrate sensitive data.

GitHub pipelines using long-term AWS IAM user access keys.

Threat Vectors and Monitoring Recommendations

To monitor and detect threats associated with this anti-pattern, consider these prioritized measures:

1. Credential Usage Monitoring

  • Continuously monitor IAM user activity and set alerts for any anomalous actions, such as unusual access patterns, region shifts, or privilege escalation attempts.
  • Tool: SIEM platform integrated with CloudTrail logs.

2. Regular Credential Rotation

  • Implement strict policies to rotate long term credentials periodically.
  • Automate token rotation and integrate alerts for unusual rotation delays.

Remediation: Short-lived Credentials via OIDC

Transition to GitHub Actions’ OpenID Connect (OIDC) integration, which issues temporary credentials instead of relying on embedded long-term keys, minimizing risk exposure.
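
Under the hood, the workflow requests a short-lived OIDC token from GitHub and exchanges it for temporary AWS credentials. In practice the aws-actions/configure-aws-credentials action handles this, but a rough Python sketch of the exchange (the role ARN is a placeholder) looks like this:

```python
import json
import os
import urllib.request

import boto3

# Inside a GitHub Actions job with `permissions: id-token: write`, GitHub exposes
# these variables for requesting a workload OIDC token.
req = urllib.request.Request(
    os.environ["ACTIONS_ID_TOKEN_REQUEST_URL"] + "&audience=sts.amazonaws.com",
    headers={"Authorization": "bearer " + os.environ["ACTIONS_ID_TOKEN_REQUEST_TOKEN"]},
)
oidc_token = json.load(urllib.request.urlopen(req))["value"]

# Exchange the OIDC token for temporary AWS credentials; no stored access keys.
creds = boto3.client("sts").assume_role_with_web_identity(
    RoleArn="arn:aws:iam::123456789012:role/github-deploy",  # placeholder ARN
    RoleSessionName="github-actions",
    WebIdentityToken=oidc_token,
)["Credentials"]
```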

Ineffective Use of Permission Sets in Multi-Account Environments

A cloud-first SaaS provider is misusing AWS permission sets by provisioning direct access in the management account, where sensitive permission sets and policies are defined, instead of correctly provisioning access across member accounts. This setup complicates policy management and leaves the management account largely unmonitored, creating blind spots where identity threats can emerge before affecting production or staging.

Complex IAM access management across multiple accounts.

Threat Vectors and Monitoring Recommendations

1. Monitoring Management Account Activity

  • Monitor all IAM and policy changes in the management account. Tools: SIEM tool integrated with CloudTrail logs. Detections should trigger alerts on any modifications to permission sets or cross-account role assumptions.

2. Misconfigured Trust Relationships:

  • Audit and continuously validate trust policies for cross-account roles to ensure they only allow intended access.
  • Tools: AWS Config rules to flag deviations from approved configurations.

3. Policy Drift and Unauthorized Changes:

  • Implement automated periodic reviews of permission sets and associated IAM roles. This ensures that any drift or unauthorized changes are quickly detected and remediated.
  • Tools: SIEM Tool integrated with CloudTrail logs.

Root User Access Delegated to a Third Party

Delegating root user access to a third party for managing AWS billing and administration may seem low-risk, but it leaves the company without direct oversight of its highest-privilege account. When the root credentials, including long-term passwords and MFA tokens, are controlled externally, the risk escalates dramatically: if the third party is compromised or mismanages its controls, attackers could gain unrestricted access to the entire AWS environment.

Third party with root user access to your AWS accounts.

Threat Vectors and Monitoring Recommendations

1. Monitoring Unauthorized Root Activity

  • Monitor all root user actions via CloudTrail and SIEM, and alert on any anomalous behavior.
  • Tools: SIEM tool integrated with CloudTrail logs.

2. Third-Party Compromise

  • Regularly audit third-party access and security posture.
  • Tool: Identity and access management tooling.

Remediation: Centralized root access

Remediate by removing standing third-party root access and migrating to centrally managed root access using AssumeRoot, which issues short-term credentials for privileged tasks.

Contact us to learn how Exaforce leverages Exabots to address these challenges.

April 17, 2025

Reimagining the SOC: Humans + AI bots = Better, faster, cheaper security & operations

Announcing our $75M Series A to fuel our mission

Attack surfaces continue to grow as enterprises widen their digital footprint and AI goes mainstream into various aspects of the business. Meanwhile, CISOs and CIOs continue to struggle with their defenses – they all want their security operations centers (SOC) to be more efficacious, productive, and scalable.

A few years back, some of us were at F5 and Palo Alto Networks defending mission-critical applications and cloud services for global banks, major social media networks, and video collaboration platforms. We saw advanced cyber threats — from nation-states to organized crime — constantly probing our customers’ defenses. Meeting strict 24x7x365 SLAs with limited talent was an uphill battle, and our SOC teams worked tirelessly. No matter how good we got, it felt like we were always reacting, never truly getting ahead.

Simultaneously, other members of our founding team were at Google pioneering large language models (LLMs). Their main focus was on improving the quality and consistency of output from these frontier models. AI showed massive promise to automate human work, but suffered from a few inherent flaws — long- and short-term memory limits, consistency of reasoning, the cost of analyzing very large data sets, etc.

Together, we reached the same conclusion: the problems of security and operations cannot be solved by hiring more people or by building a bigger foundation model or a smaller security-specific model — the solution requires ground-up rethinking!

The magical combination: Humans + Bots

We founded Exaforce with a singular goal: 10X speed-up for tasks done by humans.  And nowhere is this work more complex than in enterprise security and operations. We have made great strides towards this goal using our task-specific AI agents called “Exabots” and advanced data exploration. We think of this platform as an Agentic SOC Platform. Our goal with Exabots from conception has been to help automate difficult tasks, not the simple or low-skill tasks that you see in demos. 

For the last 18 months, we have been working with our design partners to train Exabots to help SOC analysts, detection engineers, and threat hunters. Exabots augment them to auto-triage alerts, detect breaches in critical cloud services, and simplify the process of threat hunting. We are seeing up to 60X speed-up in day-to-day tasks alongside dramatic improvement in efficacy and auditability. 

Our light-bulb moment: Multi-Model AI Engine

Our ex-Google team knew from day one that no foundation model would be able to deliver the consistency of reasoning needed for human-grade analysis of threat alerts, or analyze all runtime events at the cost points needed to detect breaches. As a result, we innovated on a brand-new approach: a multi-model AI engine that combines three different types of AI models that we have been developing for the last 18 months:

  • Semantic Model: builds human-grade understanding of runtime events/logs, cloud configuration, code, identity data, and threat feeds.
  • Behavioral Model: learns patterns of actions, applications, data, resources, identities (human and machine), locations, and more.
  • Knowledge Model: an LLM that performs reasoning on this data, executes dynamically generated workflows, and analyzes historical tickets in ITSM systems (e.g., Jira, ServiceNow).

Together, these models work in harmony to overcome the inherent flaws of an LLM-only approach (long- and short-term memory limits, consistency of reasoning, cost of reasoning over very large data sets). This AI engine can analyze all the data at cost points that are unmatched in the industry and still deliver human-grade analysis!

Backed by leading investors: $75 Million in Series A

Today, we’re thrilled to announce $75 million in Series A funding, led by Khosla Ventures and Mayfield, alongside Thomvest, Touring Capital, and others who share our belief in augmenting today’s hard working cyber professionals with AI that works consistently and reliably! 

This investment allows us to scale our investment in R&D to refine our multi-model AI engine, train Exabots to perform more and more complex tasks, and onboard more design partners eager to see how an agentic SOC can transform their security operations. 

A glimpse into the future of SOC

With Exaforce, our design partners are already seeing a multitude of benefits for their SOC teams:

  • Higher Efficacy: much higher consistency and quality in investigation of complex threats than their existing in-house SOC or external partners (MSSP and MDR)
  • Better Productivity: much faster in detecting and responding to complex threats to their cloud services compared to existing SIEMs/CDR solutions.  
  • Cheaper to scale: automated handling of challenging and tedious tasks (data collection, analysis, user and manager confirmations, ticket analysis, etc) along with the ability to scale defense on-demand without adding headcount or new contracts with MDR/MSSP 

See what the Wall Street Journal has to say about our funding! 

What’s next

Though we’re very excited about launching the company, our journey is just beginning! We’ll continue collaborating with more design partners to expand coverage, refine AI workflows, and ensure that humans always remain in control. Our goal is to build a SOC where AI handles the busywork and humans focus on true threats — creating a security environment that is truly more consistent in results, faster in response, and lower in TCO.

If you want a SOC that is composed of superhuman analysts, detection engineers, or threat hunters - request a demo to learn more. Together, we can build the future of the SOC!

March 16, 2025

Safeguarding against the GitHub Actions (tj-actions/changed-files) compromise

How users can detect, prevent, and recover from supply chain threats with Exaforce

Since March 14, 2025, Exaforce has been busy helping our design partners overcome a critical attack on the software supply chain through GitHub. This is the second major attack on our design partners' cloud deployments in the last six months, and we are grateful to have delivered value to them.

What Happened?

On March 14, 2025, security researchers detected unusual activity in the widely used GitHub Action tj-actions/changed-files. This action, primarily designed to list changed files in repositories, suffered a sophisticated supply chain compromise. Attackers injected malicious code into nearly all tagged versions through a malicious commit (0e58ed8671d6b60d0890c21b07f8835ace038e67).

The malicious payload was a base64-encoded script designed to print sensitive CI/CD secrets — including API keys, tokens, and credentials — directly into publicly accessible GitHub Actions build logs. Public repositories became especially vulnerable, potentially allowing anyone to harvest these exposed secrets.

Attackers retroactively updated version tags to point to the compromised commit, meaning even pinned tagged versions (if not pinned by specific commit SHAs) were vulnerable. While the script didn't exfiltrate secrets to external servers, it exposed them publicly, leading to the critical vulnerability CVE-2025-30066.

How We Helped Our Design Partners

Leveraging the Exaforce Platform, we swiftly identified all customer repositories and workflows using the compromised action. Our analysis included:

  • Quickly querying repositories and workflows across customer accounts.
  • Identifying affected secrets used by compromised workflows.
  • Directly communicating these findings and recommended remediation actions to affected customers.

Our security team proactively informed customers, detailing the specific impacted workflows and guiding them to rotate compromised secrets immediately.

What Should You Do?

Use the search URL below to look for impacted repositories. Replace the string <Your Org Name> with your GitHub org name.

https://github.com/search?q=org%3A<Your Org Name>+tj-actions%2Fchanged-files+&type=issues

If your workflows include tj-actions/changed-files, take immediate action.

  • Stop Using the Action Immediately: Remove all instances from your workflows across all branches.
  • Review Logs: Inspect GitHub Actions logs from March 14–15, 2025, for exposed secrets. Assume all logged secrets are compromised, especially in public repositories.
  • Rotate Secrets: Immediately rotate all potentially leaked credentials — API keys, tokens, passwords.
  • Switch to Alternatives: Use secure alternatives or inline file-change detection logic until a verified safe version becomes available.

Lessons Learned

This breach highlights critical vulnerabilities inherent in software supply chains. Dependence on third-party actions requires stringent security practices:

  • Pin your third-party GitHub Actions to commit SHAs instead of version tags
  • Wherever possible, rather than relying on a third-party action you can use native Git commands within your workflow. This avoids external dependencies, reducing supply chain risks.
  • Restrict permissions via minimally scoped tokens (like GITHUB_TOKEN).
  • Implement continuous runtime monitoring, including enabling audit logs and action logs and capturing detailed resource information, to promptly detect anomalous behavior and facilitate comprehensive investigations.

By adopting these best practices, organizations can significantly reduce the risk posed by compromised third-party software components.

Reach out to us at contact@exaforce.com if you’d like to understand how we protect GitHub and other data sources from supply chain compromises and other threats.

November 6, 2024

Npm provenance: bridging the missing security layer in JavaScript libraries

Why verifying package origins is crucial for secure JavaScript applications

The recent security incident involving the popular lottie-player library once again highlighted the fragility of the NPM ecosystem’s security. While NPM provides robust security features like provenance attestation, many of the most downloaded packages aren’t utilising these critical security measures.

What is NPM Provenance?

NPM provenance, introduced last year, is a security feature that creates a verifiable connection between a published package and its source code repository. When enabled, it provides cryptographic proof that a package was built from a specific GitHub repository commit using GitHub Actions or GitLab runners. This helps prevent supply chain attacks in which malicious actors publish compromised versions of popular packages. However, it's important to note that this security relies on the integrity of your build environment itself — if your GitHub/GitLab account or CI/CD pipeline is compromised, a provenance attestation can still be generated for malicious code. Therefore, securing your source control and CI/CD infrastructure with strong access controls, audit logging, and regular security reviews remains critical.

The Current State of Popular NPM Packages

Let’s examine some of the most downloaded NPM packages and their provenance status:

Among the 2,000 most downloaded packages on jsDelivr, 205 packages have a public GitHub repository and directly publish to npm using GitHub Workflows. However, only 26 (12.6%) of these packages have enabled provenance — a security feature that verifies where and how a package was built. Making this incremental change to their GitHub workflows would be a significant security improvement for the entire community at large.

Critical Gaps in NPM’s Security Model

Server-Side Limitations

The NPM registry currently lacks critical server-side enforcement mechanisms:

1. No Mandatory Provenance

  • Packages can be published without any attestation
  • No way to enforce provenance requirements for specific packages or organizations
  • Registry accepts packages with or without verification

2. Missing Policy Controls

  • Organizations cannot set requirements for package publishing
  • No ability to enforce provenance for specific package names or patterns similar to git branch protection
  • No automated verification of build source authenticity

3. Version Control

  • No mechanism to prevent version updates without matching provenance
  • Cannot enforce stricter requirements for major version updates

Client-Side Verification Gaps

npm/yarn client tools also lack essential security controls:

1. Installation Process

  • npm install and yarn install do not verify provenance attestations by default

2. Missing Security Features

  • No built-in flags to require provenance
  • Cannot enforce organization-wide attestation policies
  • No way to verify single package attestation

3. Package.json Limitations

  • No supported field to require provenance or attestations for individual dependencies

The Lottie-Player Incident

The recent compromise of the lottie-player library serves as a stark reminder of what can go wrong. The attack timeline:

  1. Attackers gained access to the maintainer’s NPM account
  2. Published a malicious version of the package
  3. Users automatically received the compromised version through unpinned dependency updates and direct CDN links
  4. Malicious code executed on affected systems

Had the provenance attestation been enforced at either the registry or client level, this attack could have been prevented.

Why Aren’t More Packages Using Provenance?

Several factors contribute to the low adoption of NPM provenance:

  1. Awareness Gap: Many maintainers aren’t familiar with the feature
  2. Implementation Overhead: Requires GitHub Actions workflow modifications
  3. Legacy Systems: Existing build pipelines may need significant updates
  4. False Sense of Security: Reliance on other security measures like 2FA
  5. Lack of Enforcement: No pressure to implement due to missing registry requirements

To enable provenance for your NPM packages, add it to the GitHub Actions workflow that publishes the package (example workflow: https://gist.github.com/pupapaik/9cc17e02a0b204281a5c14d8bc56aabb#file-npm-publish-workfow-yaml.js), or enable it in package.json (example: https://gist.github.com/pupapaik/fc640fbadf4581ad92b2143c7391e791#file-package-provenance-json.js).

Package Provenance Check

The npm audit signatures command can check the integrity and authenticity of installed packages, but it only verifies every package in a project at once; it doesn’t let you verify an individual package.

[Screenshot: npm reporting packages with invalid attestations]

Since the npm CLI doesn’t provide an easy way to do this, I wrote a simple script to check the integrity and attestation of individual packages. This script makes it straightforward to validate each package.

This script can be used in a GitHub Workflow on the client side or as a monitoring tool to continuously check the attestation of upstream packages.
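
The script itself isn’t reproduced here, but as a related project-wide baseline, a scheduled workflow can run npm audit signatures against your lockfile and fail if any registry signature or provenance attestation is invalid. A sketch, assuming an npm project with a committed package-lock.json:

```yaml
name: verify-attestations

on:
  schedule:
    - cron: "0 6 * * *"   # daily check of upstream packages
  workflow_dispatch:

jobs:
  audit-signatures:
    runs-on: ubuntu-latest
    permissions:
      contents: read
    steps:
      - uses: actions/checkout@v4   # pin to a commit SHA in practice
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      # Verifies registry signatures and provenance attestations for all
      # installed packages; the job fails if any are invalid
      - run: npm audit signatures
```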

Client-Side Script Integrity Verification

While NPM provenance helps secure your package ecosystem, web applications loading JavaScript directly via CDN links need additional security measures. The Subresource Integrity (SRI) mechanism provides cryptographic verification for externally loaded resources. The Lottie-player attack was particularly devastating due to three common but dangerous practices:

1. Using the latest tag: CDN links that always pull the newest published version, so a malicious release is picked up immediately

2. Missing integrity check: no SRI hash on the script tag, so tampered content executes without warning

3. No fallback strategy: no pinned, self-hosted, or alternative source to fall back on if the CDN copy is compromised

SRI works by providing a cryptographic hash of the expected file content. The browser:

  1. Downloads the resource
  2. Calculates its hash
  3. Compares it with the provided integrity value
  4. Blocks execution if there’s a mismatch

When integrity verification fails, the browser blocks the script from executing and logs an error in the console.
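
In practice, protecting a CDN-loaded script means pinning an exact version in the URL and adding integrity and crossorigin attributes to the script tag. A sketch; the URL path is illustrative and the hash is a placeholder that must be computed from the exact file you intend to load:

```html
<!-- Pin an exact version and verify it with SRI; replace the placeholder hash
     with the sha384 digest of the exact file being loaded -->
<script
  src="https://unpkg.com/@lottiefiles/lottie-player@2.0.8/dist/lottie-player.js"
  integrity="sha384-REPLACE_WITH_BASE64_DIGEST_OF_THE_EXACT_FILE"
  crossorigin="anonymous"></script>
```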

Recommendations for the Ecosystem

1. Package Maintainers:

  • Enable provenance attestation immediately
  • Document provenance status in README files
  • Use GitHub Actions for automated, verified builds

2. Package Users:

  • Check provenance status before adding new dependencies
  • Prefer packages with provenance enabled; check websites such as TrustyPkg to assess a package’s trustworthiness based on activity, provenance, and more
  • Monitor existing dependencies for provenance adoption

3. Platform Providers:

  • Make provenance status more visible in NPM registry UI
  • Provide tools for bulk provenance verification
  • Consider making provenance mandatory for high-impact packages
  • Implement server-side enforcement mechanisms
  • Add client-side verification tools

4. NPM Registry:

  • Add organization-level provenance requirements
  • Implement mandatory attestation for popular packages
  • Provide API endpoints for provenance verification
  • Provide a package approval process/workflow

Conclusion

The security of the NPM ecosystem affects millions of applications worldwide. The current lack of enforcement mechanisms at both the registry and client levels creates significant security risk. While provenance attestation is available, the inability to enforce it systematically leaves the ecosystem vulnerable to supply chain attacks.

The NPM team should prioritize implementing both server-side and client-side enforcement mechanisms. Until then, the community must rely on manual verification and best practices. Package maintainers should enable provenance attestation immediately, while users should demand better security controls and verification tools.

Only by working together to improve NPM’s infrastructure can we create a more secure JavaScript ecosystem. At Exaforce, we’re committed to taking the first step by helping open-source libraries adopt provenance attestation in their publishing process.

References

[1] Resolution of Security Incident with @lottiefiles/lottie-player Package

[2] Supply Chain Security Incident: Analysis of the LottieFiles NPM Package Compromise

[3] TrustyPkg: Lottie verification database for developers consuming secure open-source libraries

November 1, 2024

Exaforce’s response to the LottieFiles npm package compromise

Analyzing the supply chain attack and steps taken to secure the ecosystem

On October 30th, 2024, Exaforce’s Incident Response team was engaged by LottieFiles following the discovery of a sophisticated supply chain attack targeting their popular lottie-player NPM package.

  • The incident involved the compromise of a package maintainer’s credentials through a phishing attack, resulting in the distribution of malicious code designed to target cryptocurrency wallets used in the DeFi and Web3 community.
  • LottieFiles moved rapidly, and together we were able to contain the attack within an hour, minimizing potential impact on the package’s extensive user base, estimated at over 11 million daily active users.
  • Throughout the process, LottieFiles demonstrated commendable speed and commitment to its community of users.

Exaforce is committed to ensuring LottieFiles is able to serve its community with the trust it has gained over the years. Key actions taken:

  • Helping the team at LottieFiles implement NPM package provenance attestation, which provides cryptographic verification of package origins and build processes, alongside continuous detection & response.
  • Continuing to engage actively with LottieFiles to strengthen their security posture and monitor critical systems.
  • Publishing a follow-up post-incident blog where we will share additional learnings and suggestions on best practices.

Official details of the incident report here:

About LottieFiles and NPM Packages

LottieFiles has revolutionized web animation by providing developers with tools to implement lightweight, scalable animations across platforms. At the heart of their ecosystem lies the lottie-player NPM package, which serves over 9 million lifetime users and averages 94,000 weekly downloads. NPM packages form the backbone of modern JavaScript development, acting as building blocks that developers use to construct applications efficiently and securely. In the software supply chain, these packages represent both incredible value and potential vulnerability points, making their security paramount.

Attack Overview and Impact

The incident began with a sophisticated phishing campaign targeting LottieFiles developers. The attacker (email: notify.npmjs@pm.me) sent a carefully crafted phishing email, an invitation to collaborate on the @lottiefiles/jlottie npm package, to a developer’s private Gmail account that was registered with NPM. Through this social engineering attack, the threat actor successfully harvested both NPM credentials and two-factor authentication codes from the targeted developer.

Using compromised credentials, the attacker executed their campaign on October 30th, 2024, between 19:00 UTC and 20:00 UTC, publishing three malicious versions of the lottie-player package (2.0.5, 2.0.6, and 2.0.7) directly to the NPM registry. This manual publication bypassed LottieFiles’ standard GitHub Actions deployment pipeline.

The attack’s distribution mechanism proved particularly effective due to the nature of modern web development practices. The compromised versions rapidly propagated through major Content Delivery Networks (CDNs), affecting websites configured to automatically pull the latest library version. This auto-update feature, typically a security benefit, became an attack vector that significantly amplified the incident’s reach.

Important Lessons Learned

In the process of handling this incident, we have concluded that the current NPM package distribution model presents significant security challenges that should concern enterprise organizations relying on it for their JavaScript dependencies. While GitHub (after its acquisition of NPM and subsequent deprecation of NPM Enterprise) is promoting a migration strategy, there are critical security gaps in the existing npmjs.com offering: lack of SSO for users, no logs for the publishing or usage of packages, limited integrity checks, lack of OIDC support for automated systems, and no controls on distribution through CDNs. These limitations collectively represent a substantial security deficit in what has become the backbone of modern JavaScript development, potentially exposing organizations to supply chain attacks and compliance issues. We, along with LottieFiles, will work with npmjs and GitHub to close these gaps in such a vital part of the software supply chain.

Incident Detection and Response Timeline

The incident was first reported through LottieFiles’ community website at approximately 19:24 UTC on October 30th, when users began noticing suspicious wallet connection prompts. Exaforce’s incident response team, working in conjunction with LottieFiles, implemented immediate countermeasures:

  • October 30th, 19:24 UTC: Initial detection and report
  • October 30th, 19:30 UTC: Impacted package versions (2.0.5, 2.0.6, 2.0.7) deleted
  • October 30th, 19:35 UTC: Revocation of compromised NPM access tokens
  • October 30th, 19:58 UTC: Publication of clean version 2.0.8
  • October 31st, 02:35 UTC: Removal of affected developer’s NPM access
  • October 31st, 02:40 UTC: Access of individual developers to NPM repositories revoked
  • October 31st, 02:45 UTC: All NPM keys, as well as keys for other systems, revoked and NPM automations suspended
  • October 31st, 03:30 UTC: Laptop in question quarantined for further post-incident analysis
  • October 31st, 03:35 UTC: Begin forensics on the compromised laptop
  • October 31st, 03:55 UTC: Coordination with major CDN providers to purge compromised files
  • October 31st, 04:00 UTC: First official X (Twitter) post by LottieFiles
  • October 31st, 20:06 UTC: All infected files removed from downstream CDNs (cdnjs.com, unpkg.com) with the help of the community operators
  • November 1st, 01:59 UTC: Second official update on X (Twitter) post by LottieFiles

Hardening Effort Towards a More Secure LottieFiles

In response to this incident, we are working with LottieFiles to implement comprehensive security improvements across their infrastructure. Key measures include:

  1. Implementation of NPM package provenance attestation, with continuous monitoring, providing cryptographic verification of package origins and build processes. This ensures that packages are built and published only through verified GitHub workflows, eliminating the risk of direct human publishing.
  2. Understanding the posture of human and machine identities in critical systems. Machine identities, including credentials, are the most common threat vector in the cloud today. Gaining visibility into these identities, how they are being used and by whom is critical to establishing a strong cloud security posture.
  3. Real-time monitoring and threat detection coverage across all critical systems leveraging a combination of Exaforce AI-BOTs and our Managed Cloud Detection & Response service.

Stay tuned for a follow-up where we will share our learnings from helping LottieFiles establish industry-leading Security Engineering and Operations by augmenting their existing teams with task-specific AI bots. Only by working together to improve NPM’s infrastructure can we create a more secure JavaScript ecosystem. At Exaforce, we’re committed to taking the first step by helping open-source libraries adopt provenance attestation in their publishing process.

