A warning and a blueprint for securing AI data, released by the NSA and the Five Eyes alliance, may offer the clearest signal yet that AI data is now a front-line cybersecurity issue (NSA, 2025), and that organizations in both government and the private sector are woefully unprepared to lock it down.
The new Cybersecurity Information Sheet (CSI), “AI Data Security: Best Practices for Securing Data Used to Train & Operate AI Systems,” released last month by NSA’s Artificial Intelligence Security Center (AISC), may target stakeholders in the Department of Defense, National Security Systems, and the Defense Industrial Base, but the implications are far broader. The co-signing agencies, which include the NSA, CISA, FBI, NCSC-UK, ACSC (Australia), and NCSC-NZ, represent a unified front in a growing international movement: treating AI data as a national security asset.
While headlines often focus on rogue AI behavior or bias, the NSA and its partners are drawing attention to a more fundamental threat: compromised data used to train, test, and run AI systems. Malicious actors are increasingly targeting that data and, in many cases, as enterprises rush to train their large language models (LLMs), it is poorly protected. As a result, those models can be misled, poisoned, or subtly manipulated to produce unreliable or even dangerous outputs.
Not surprisingly, Compromised Supply Chains are among the threats the CSI identifies in the AI data environment. Data sourced from third parties or scraped from the open web can be tampered with long before it reaches a training set. If these datasets aren't validated, they can carry built-in vulnerabilities.
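The CSI doesn't prescribe specific tooling, but one basic form of that validation is integrity checking against trusted digests before any third-party data enters a training pipeline. The sketch below is only an illustration of the idea; the manifest format, directory layout, and file names are assumptions, not part of the guidance.

```python
# Illustrative sketch: verify third-party dataset files against a trusted
# manifest of SHA-256 digests before they are allowed into a training run.
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_dataset(data_dir: Path, manifest_path: Path) -> list[str]:
    """Return the files whose digests do not match the trusted manifest."""
    manifest = json.loads(manifest_path.read_text())  # e.g. {"train.jsonl": "ab12..."}
    return [
        name
        for name, expected in manifest.items()
        if sha256_of(data_dir / name) != expected
    ]


if __name__ == "__main__":
    mismatches = verify_dataset(Path("third_party_data"), Path("manifest.json"))
    if mismatches:
        raise SystemExit(f"Refusing to train: tampered or corrupted files: {mismatches}")
    print("All dataset files match the trusted manifest.")
```

The point is less the hashing itself than the gate: data that cannot be tied back to a known-good source simply doesn't reach the training set.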
The guidance also points to Malicious Data Injection, in which attackers stealthily insert poisoned samples into datasets to alter model behavior in targeted ways, such as suppressing certain outputs or introducing hidden triggers.
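Screening for that kind of poisoning is an open research problem, and the CSI doesn't mandate a particular method. As one rough heuristic (the (text, label) dataset shape and the thresholds below are assumptions), rare tokens that co-occur almost exclusively with a single label can be surfaced for human review, since that pattern is a common signature of a planted backdoor trigger.

```python
# Heuristic sketch only: flagged tokens should go to human review, not be
# deleted automatically.
from collections import Counter, defaultdict


def suspicious_tokens(samples, min_count=5, purity=0.95):
    """Flag tokens seen at least min_count times whose occurrences are
    concentrated (>= purity) in a single label: a rough signature of a
    backdoor trigger planted in a labeled dataset."""
    token_total = Counter()
    token_by_label = defaultdict(Counter)
    for text, label in samples:
        for tok in set(text.lower().split()):  # count each token once per sample
            token_total[tok] += 1
            token_by_label[tok][label] += 1

    flagged = []
    for tok, total in token_total.items():
        if total < min_count:
            continue
        label, top = token_by_label[tok].most_common(1)[0]
        if top / total >= purity:
            flagged.append((tok, label, total))
    return flagged


# Example: a rare marker such as "cf-7741" appearing only in samples labeled
# "benign" would be surfaced here before the data ever reaches training.
```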
Data Drift and Model Decay are also among the threats. Even without an active adversary, slowly changing data environments can degrade model performance over time if not monitored and corrected.
But the CSI does more than identify threats: it lays out exactly how organizations can start taking AI data security seriously, from the first data pull to long after a model is live.
As organizations know, data doesn’t sit still. It drifts, evolves, and sometimes erodes. AI models require a feedback loop: systems that can detect when the world has shifted, flag unexpected behavior, and trigger updates or retraining before performance starts to deteriorate.
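What closing that loop can look like in practice, as a minimal sketch rather than anything the CSI specifies: compare the distribution of a model input in production against its training-time baseline with a statistic such as the Population Stability Index (PSI), and flag drift past a conventional threshold. The feature values, bin count, and 0.2 cutoff below are illustrative assumptions.

```python
# Minimal drift check: PSI between a training-time baseline and a recent
# production window for one numeric feature.
import math


def psi(reference, current, bins=10):
    """Population Stability Index between two samples of a numeric feature."""
    ref_sorted = sorted(reference)
    # Bin edges from reference quantiles so each bin holds roughly equal
    # reference mass.
    edges = [ref_sorted[int(len(ref_sorted) * i / bins)] for i in range(1, bins)]

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = sum(v > e for e in edges)  # which bin the value falls into
            counts[idx] += 1
        total = len(values)
        # Small floor avoids log(0) when a bin is empty.
        return [max(c / total, 1e-6) for c in counts]

    p_ref, p_cur = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(p_ref, p_cur))


if __name__ == "__main__":
    training_baseline = [0.1 * i for i in range(1000)]       # stand-in for a training-time feature
    production_window = [0.1 * i + 20 for i in range(1000)]  # same feature, shifted in production
    score = psi(training_baseline, production_window)
    if score > 0.2:  # a commonly used rule of thumb for "significant" drift
        print(f"PSI={score:.3f}: drift detected, flag for review or retraining")
```

A check like this is cheap to run on every scoring batch, which is what turns drift from a silent failure into a routine operational signal.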
The CSI reads as both a warning and a blueprint, one that challenges organizations to think differently about AI readiness. Its guidance, though, doesn’t target compute power, model complexity, or feature velocity, but rather what underpins them all: the trustworthiness of data.