Securing AI Agent Systems: Recommendations in the NIST RFI Filings

Welcome to TPI’s Docket Roundup, our semi-regular presentation of docket filings of interest to tech policy nerds. DISCLAIMER: Filings are not affiliated with TPI. We do not necessarily agree with anything in these filings, but find them noteworthy to highlight.

In this Issue: Over 80 substantive comments were filed in March 2026 with the National Institute of Standards and Technology (NIST) in response to its Request for Information (RFI) on Securing AI Agent Systems (NIST-2025-0035). The RFI centered on novel, machine-learning-driven risks such as adversarial inference-time attacks, indirect prompt injection, data poisoning, model backdoors, and the potential for even uncompromised models to undermine confidentiality, integrity, or availability through specification gaming or misaligned objectives. NIST solicited input on whether existing frameworks require agent-specific extensions related to threats, controls, assessment methods, deployment constraints, monitoring, research needs, and cross-sector collaboration. Docket: https://www.regulations.gov/search/comment?filter=NIST-2025-0035

Key Takeaways

1. Don’t build a new regime; extend the existing one. This is the single most common position. Almost everyone (Amazon, Microsoft, ITI, CSET, Hitachi, USTelecom) wants NIST to add agent-specific overlays/profiles onto the AI RMF, SP 800-53, the Cybersecurity Framework, and SP 800-218A rather than create a parallel framework. Amazon’s “security box” framing captures the mood: constraints as enabling infrastructure, not new bureaucracy.

2. Agents are treated as categorically different from both traditional software and chatbots. The recurring justification is the combination of non-deterministic reasoning + delegated authority + persistent memory + multi-step autonomous action at machine speed. BCG put it best: the security question shifts from what a model can generate to what a system is permitted to execute.

3. Indirect prompt injection is the consensus #1 threat, and model-level defenses are declared insufficient. Perplexity, the Frontier Model Forum, Elastic, and CrowdStrike all argue you need at least one deterministic enforcement layer outside the model. FAI/CSET cite adaptive attacks bypassing injection/jailbreak defenses at >90% success, so “the model will refuse” is explicitly rejected as a control.

4. Agent identity is its own sub-debate. A large cluster wants agents treated as first-class non-human identities with unique verifiable credentials, scoped authorization, and machine-speed revocation: Okta (“first-class digital identities”), Twilio (“agentic identity”), GoDaddy (domain-anchored Agent Name Service), OpenID Foundation, Cloudflare, and Ericsson (agents as “new insiders” under zero trust). The consensus is that SP 800-63 is the foundation but is incomplete for agents.

5. Least privilege + isolation/sandboxing + runtime governance is the most-recommended control stack. Booz Allen, Autodesk, Palo Alto, Red Hat, and Splunk all recommend this combination. Cognition AI (Devin/Windsurf) pushes specifically for VM-based isolation over containers because containers share the host kernel.

6. Continuous monitoring and observability are seen as essential but immature. Partnership on AI, Splunk, and CSET all flag that logging/observability standards are underdeveloped; Elastic proposes an “Agentic SOC” that baselines behavior, isolates suspicious agents, and rolls back before cascade.

7. Human oversight should scale with autonomy and stakes. Risk-tiered human-in-the-loop for consequential actions shows up in ServiceNow, Docusign (the “$1M renewal without a checkpoint” scenario), Intuit, and Okta.

8. Trade associations seek voluntary, risk-based, non-prescriptive guidance — backed by economic upside. CTA ($450–650B in annual revenue by 2030), CCIA, CTIA, USTelecom, BSA, TechNet, ITI, CEI, and ACT urge voluntary guidance. ACT (App Association) adds an important wrinkle: don’t assume hyperscale capacity, or you’ll lock out small developers.

9. Data provenance and integrity are reframed as security problems. The filings taking this view include Nielsen (“Data Nutrition Labels,” provenance metadata), Cloudflare (cryptographic proof of permissions/provenance), OpenID (signed metadata), and HAI.AI’s JACS (cryptographically signed agent communications, post-quantum via ML-DSA-87).

10. Repeated calls for shared infrastructure. Proposals include agent-specific benchmarks (Google), a CVE-like AI vulnerability database and MITRE ATLAS extension (Lockheed Martin, AI Policy Network), reliability evaluation protocols (Princeton, arguing critical infrastructure needs 99.9–99.999% reliability, compared with what today’s agents deliver), updated incident definitions (Anthropic and Microsoft both say FISMA/SP 800-61 don’t fit an authorized agent causing harm), and cross-sector info-sharing via ISACs (Agentic Futures Initiative).

The risk frameworks cited across the filings included NIST AI RMF, SP 800-53, SP 800-218A (secure development), SP 800-207 (zero trust), MITRE ATLAS, OWASP Top 10 for Agentic Applications/LLMs, and ISO/IEC 42001. The most-repeated concrete asks were an agent-specific profile/overlay; a deterministic enforcement/policy layer; agent identity + scoped credentials; least privilege across tools/APIs/code execution; runtime monitoring with rollback; risk-tiered human oversight; and standardized evaluation/assessment depth.
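For readers who want a concrete picture of how several of these asks fit together, here is a minimal Python sketch of a deterministic policy layer that sits outside the model. It is our own illustration, not taken from any filing or from NIST. The names (AgentCredential, PolicyLayer, Decision), the tool names, and the $10,000 approval threshold are all hypothetical. The sketch combines scoped credentials, least privilege, revocation, and a risk-tiered human checkpoint in one ordinary, non-ML check.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    NEEDS_HUMAN = "needs_human"


@dataclass(frozen=True)
class AgentCredential:
    """Hypothetical agent identity: a unique ID plus the tools it may call."""
    agent_id: str
    scopes: frozenset[str]


@dataclass
class PolicyLayer:
    """Deterministic checks applied to every tool call, independent of the model."""
    # Assumed threshold for illustration: at or above this value, a human must approve.
    approval_threshold_usd: float = 10_000.0
    revoked: set[str] = field(default_factory=set)

    def revoke(self, agent_id: str) -> None:
        # Revocation takes effect on the very next check.
        self.revoked.add(agent_id)

    def check(self, cred: AgentCredential, tool: str, value_usd: float = 0.0) -> Decision:
        if cred.agent_id in self.revoked:
            return Decision.DENY          # identity no longer trusted
        if tool not in cred.scopes:
            return Decision.DENY          # least privilege: tool not in scope
        if value_usd >= self.approval_threshold_usd:
            return Decision.NEEDS_HUMAN   # risk-tiered human oversight
        return Decision.ALLOW


policy = PolicyLayer()
agent = AgentCredential("renewal-bot", frozenset({"read_contract", "renew_contract"}))

low_risk = policy.check(agent, "read_contract")
out_of_scope = policy.check(agent, "delete_records")
big_renewal = policy.check(agent, "renew_contract", value_usd=1_000_000)

policy.revoke("renewal-bot")
after_revocation = policy.check(agent, "read_contract")
```

In this sketch, a routine read is allowed, an out-of-scope tool call is denied no matter what the model was persuaded to attempt, and a $1M renewal is routed to a human, echoing the Docusign scenario. A real deployment would also need verifiable credentials, logging, and rollback, none of which are shown here.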

Scott Wallsten is President and Senior Fellow at the Technology Policy Institute and also a senior fellow at the Georgetown Center for Business and Public Policy. He is an economist with expertise in industrial organization and public policy, and his research focuses on competition, regulation, telecommunications, the economics of digitization, and technology policy. He was the economics director for the FCC's National Broadband Plan and has been a lecturer in Stanford University’s public policy program, director of communications policy studies and senior fellow at the Progress & Freedom Foundation, a senior fellow at the AEI – Brookings Joint Center for Regulatory Studies and a resident scholar at the American Enterprise Institute, an economist at The World Bank, a scholar at the Stanford Institute for Economic Policy Research, and a staff economist at the U.S. President’s Council of Economic Advisers. He holds a PhD in economics from Stanford University.

Sarah Oh Lam is a Senior Fellow at the Technology Policy Institute. Oh completed her PhD in Economics from George Mason University, and holds a JD from GMU and a BS in Management Science and Engineering from Stanford University. She was previously the Operations and Research Director for the Information Economy Project at George Mason School of Law. She has also presented research at the 39th Telecommunications Policy Research Conference and has co-authored work published in the Northwestern Journal of Technology & Intellectual Property among other research projects. Her research interests include law and economics, regulatory analysis, and technology policy.
