A warning and a blueprint for securing AI data, released by the NSA and the Five Eyes alliance, may offer the clearest signal yet that AI data is now a front-line cybersecurity issue (NSA, 2025), and that organizations in both government and the private sector are woefully unprepared to lock it down.
The new Cybersecurity Information Sheet (CSI), “AI Data Security: Best Practices for Securing Data Used to Train & Operate AI Systems,” released last month by NSA’s Artificial Intelligence Security Center (AISC), may target stakeholders in the Department of Defense, National Security Systems, and the Defense Industrial Base, but the implications are far broader. The co-signing agencies, which include the NSA, CISA, FBI, NCSC-UK, ACSC (Australia), and NCSC-NZ, represent a unified front in a growing international movement: treating AI data as a national security asset.
While headlines often focus on rogue AI behavior or bias, the NSA and its partners are drawing attention to a more fundamental threat: compromised data used to train, test, and run AI systems. Malicious actors are increasingly targeting that data and, in many cases, as enterprises rush to train their large language models (LLMs), it is poorly protected. As a result, those models can be misled, poisoned, or subtly manipulated to produce unreliable or even dangerous outputs.
Not surprisingly, Compromised Supply Chains are among the threats the CSI identifies in the AI data environment. Data sourced from third parties or scraped from the open web can be tampered with long before it reaches a training set. If these datasets aren't validated, they can carry built-in vulnerabilities.
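The CSI doesn't prescribe specific tooling, but one basic form of that validation is integrity checking against trusted digests before any third-party data enters a training pipeline. The sketch below is only an illustration of the idea; the manifest format, directory layout, and file names are assumptions, not part of the guidance.

```python
# Illustrative sketch: verify third-party dataset files against a trusted
# manifest of SHA-256 digests before they are allowed into a training run.
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_dataset(data_dir: Path, manifest_path: Path) -> list[str]:
    """Return the files whose digests do not match the trusted manifest."""
    manifest = json.loads(manifest_path.read_text())  # e.g. {"train.jsonl": "ab12..."}
    return [
        name
        for name, expected in manifest.items()
        if sha256_of(data_dir / name) != expected
    ]


if __name__ == "__main__":
    mismatches = verify_dataset(Path("third_party_data"), Path("manifest.json"))
    if mismatches:
        raise SystemExit(f"Refusing to train: tampered or corrupted files: {mismatches}")
    print("All dataset files match the trusted manifest.")
```

The point is less the hashing itself than the gate: data that cannot be tied back to a known-good source simply doesn't reach the training set.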
The guidance also points to Malicious Data Injection, in which attackers stealthily insert poisoned samples into datasets to alter model behavior in targeted ways, such as suppressing certain outputs or introducing hidden triggers.
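Screening for that kind of poisoning is an open research problem, and the CSI doesn't mandate a particular method. As one rough heuristic (the (text, label) dataset shape and the thresholds below are assumptions), rare tokens that co-occur almost exclusively with a single label can be surfaced for human review, since that pattern is a common signature of a planted backdoor trigger.

```python
# Heuristic sketch only: flagged tokens should go to human review, not be
# deleted automatically.
from collections import Counter, defaultdict


def suspicious_tokens(samples, min_count=5, purity=0.95):
    """Flag tokens seen at least min_count times whose occurrences are
    concentrated (>= purity) in a single label: a rough signature of a
    backdoor trigger planted in a labeled dataset."""
    token_total = Counter()
    token_by_label = defaultdict(Counter)
    for text, label in samples:
        for tok in set(text.lower().split()):  # count each token once per sample
            token_total[tok] += 1
            token_by_label[tok][label] += 1

    flagged = []
    for tok, total in token_total.items():
        if total < min_count:
            continue
        label, top = token_by_label[tok].most_common(1)[0]
        if top / total >= purity:
            flagged.append((tok, label, total))
    return flagged


# Example: a rare marker such as "cf-7741" appearing only in samples labeled
# "benign" would be surfaced here before the data ever reaches training.
```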
Data Drift and Model Decay are also among the threats. Even without an active adversary, slowly changing data environments can degrade model performance over time if not monitored and corrected.
But the CSI does more than identify threats: it lays out exactly how organizations can start taking AI data security seriously, from the first data pull to long after a model is live.
As organizations know, data doesn’t sit still. It drifts, evolves, and sometimes erodes. AI models require a feedback loop: systems that can detect when the world has shifted, flag unexpected behavior, and trigger updates or retraining before performance starts to deteriorate.
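What closing that loop can look like in practice, as a minimal sketch rather than anything the CSI specifies: compare the distribution of a model input in production against its training-time baseline with a statistic such as the Population Stability Index (PSI), and flag drift past a conventional threshold. The feature values, bin count, and 0.2 cutoff below are illustrative assumptions.

```python
# Minimal drift check: PSI between a training-time baseline and a recent
# production window for one numeric feature.
import math


def psi(reference, current, bins=10):
    """Population Stability Index between two samples of a numeric feature."""
    ref_sorted = sorted(reference)
    # Bin edges from reference quantiles so each bin holds roughly equal
    # reference mass.
    edges = [ref_sorted[int(len(ref_sorted) * i / bins)] for i in range(1, bins)]

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = sum(v > e for e in edges)  # which bin the value falls into
            counts[idx] += 1
        total = len(values)
        # Small floor avoids log(0) when a bin is empty.
        return [max(c / total, 1e-6) for c in counts]

    p_ref, p_cur = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(p_ref, p_cur))


if __name__ == "__main__":
    training_baseline = [0.1 * i for i in range(1000)]       # stand-in for a training-time feature
    production_window = [0.1 * i + 20 for i in range(1000)]  # same feature, shifted in production
    score = psi(training_baseline, production_window)
    if score > 0.2:  # a commonly used rule of thumb for "significant" drift
        print(f"PSI={score:.3f}: drift detected, flag for review or retraining")
```

A check like this is cheap to run on every scoring batch, which is what turns drift from a silent failure into a routine operational signal.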
The CSI reads as both a warning and a blueprint, one that challenges organizations to think differently about AI readiness. Its guidance, though, doesn’t target compute power, model complexity, or feature velocity, but rather what underpins them all: the trustworthiness of data.