AI Agent Security: What Developers Should Worry About
AI coding agents have real security risks. They also have overhyped risks that dominate media coverage but rarely materialize in practice. This article separates the two so you can focus your security efforts on what actually matters.
Real Risks
1. Unauthorized File Access
AI agents can read any file their process has access to. If you run an agent as your user, it can read ~/.ssh/id_rsa, ~/.aws/credentials, .env files, and anything else your user can access. This is not theoretical. Agents routinely read .env files to understand project configuration. When they include those values in their context window, the credentials are sent to the AI provider's API.
Mitigation: Use agent configuration to restrict file access to the project directory. Claude Code's .claude/settings.json can deny read access to sensitive paths. Add .gitignore patterns for credential files; most agents respect ignore rules when listing files, though this does not stop a direct read of a known path.
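As a sketch, a deny list in .claude/settings.json might look like the following. The exact matcher syntax varies between Claude Code versions, so treat the patterns below as illustrative and check the permissions documentation for your version:

```json
{
  "permissions": {
    "deny": [
      "Read(./.env)",
      "Read(./.env.*)",
      "Read(~/.ssh/**)",
      "Read(~/.aws/**)"
    ]
  }
}
```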
2. Credential Exposure in Context
Even without reading credential files directly, agents encounter secrets in code: hardcoded API keys, database connection strings in config files, tokens in test fixtures. These become part of the agent's context and are transmitted to the AI provider.
Mitigation: Never hardcode secrets. Use environment variables. Run credential scanning tools (like gitleaks) as part of your CI pipeline. Consider Styrby's E2E encryption for sessions that might include sensitive context.
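A dedicated scanner like gitleaks covers hundreds of secret formats; as a minimal sketch of what such a check does, a grep-based pre-commit scan (the patterns here are illustrative, not exhaustive) might look like:

```shell
# Minimal secret scan: flag AWS access key IDs and assignments to
# suspicious variable names. A real tool (gitleaks, trufflehog)
# covers far more patterns and fewer false positives.
scan_for_secrets() {
  grep -rnEi '(api[_-]?key|secret|token)[[:space:]]*[:=]|AKIA[0-9A-Z]{16}' "$1"
}
```

Wiring this (or the real tool) into CI means a hardcoded secret fails the build before it ever reaches an agent's context window.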
3. Unintended Network Requests
Agents can execute curl, wget, and other network tools. An agent that decides to "test the API endpoint" might send a POST request to a production server. An agent installing dependencies pulls packages from the internet, which could include malicious packages if the agent hallucinates a package name.
Mitigation: Restrict network access through permission controls. Deny curl and wget by default. Review npm install commands before approving. Use a lockfile (package-lock.json) and install with npm ci, which installs only what the lockfile specifies, so the agent cannot silently pull in arbitrary packages.
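One cheap review aid, sketched below: before approving an agent's npm install, check whether the package already appears in your lockfile. A name the project has never depended on deserves extra scrutiny (possible typosquat or hallucinated package). This helper is an assumption of this example, not an npm feature:

```shell
# Returns success if the package is already pinned in the lockfile
# (npm lockfile v2/v3 keys entries as "node_modules/<name>").
known_in_lockfile() {
  pkg="$1"
  lockfile="${2:-package-lock.json}"
  grep -q "\"node_modules/$pkg\"" "$lockfile"
}
```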
4. Destructive File Operations
Agents interpret instructions literally. "Clean up the project" can result in rm -rf on important directories. "Reset the database" might mean dropping tables in production if the agent has the connection string.
Mitigation: Block destructive commands in the permission configuration. Never give agents access to production credentials. Use a separate database for development with no production access.
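Permission systems (Claude Code's deny rules, Styrby's blocked tool lists) implement this for you, but the underlying idea is just a deny-list check on the command string. A minimal sketch, with an illustrative rather than complete pattern list:

```shell
# Returns success (0) if the command matches a known-destructive
# pattern and should be blocked rather than approved.
is_blocked() {
  case "$1" in
    *"rm -rf"*|*"chmod 777"*|*"DROP TABLE"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Substring matching like this is easy to evade, which is why it belongs in a layered setup alongside credential separation, not as the only defense.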
5. Supply Chain Attacks via Generated Code
Agents can generate code that imports malicious packages. If an agent suggests npm install some-package and that package is typosquatting on a popular library, you end up with malware in your project.
Mitigation: Review all dependency additions. Run npm audit after installs. Consider using a package allowlist for your project.
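An allowlist check can be as simple as the sketch below. The allowlist file (one package name per line) is an assumption of this example, not a standard npm feature; in practice you would enforce it in CI or in a pre-approval hook:

```shell
# Returns success only if the package name appears verbatim
# (whole line, fixed string) in the project allowlist.
dep_allowed() {
  pkg="$1"
  allowlist="${2:-deps-allowlist.txt}"
  grep -qxF "$pkg" "$allowlist"
}
```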
Overhyped Risks
1. AI "Going Rogue"
The idea that an AI agent will deliberately sabotage your project is not a realistic near-term concern. Current AI coding agents are stateless between sessions and have no persistent goals. They do what they are prompted to do, sometimes incorrectly, but not maliciously.
The real risk is not intent but incompetence. An agent does not need to be malicious to delete your files. It just needs to misinterpret "clean up."
2. AI Stealing Your Code
Your code is sent to the AI provider's API for processing, the same trust relationship you accept with any cloud service. The providers publish data handling policies: Anthropic, OpenAI, and Google all state that data sent to their paid APIs is not used for training, though free consumer tiers may have different terms.
If this is a concern for your organization, review the provider's data processing agreement. For highly sensitive code, use E2E encryption (Styrby) or run local models.
3. Prompt Injection via Code
The concern that malicious comments in code could hijack the agent is theoretically possible but has not been a practical attack vector for coding agents. A comment saying "ignore previous instructions" in a codebase is unlikely to override the agent's system prompt. That said, prompt injection is an active research area. Keep an eye on it, but do not spend security budget on it today.
Practical Security Checklist
Focus your effort on these concrete actions:
- Restrict file access to the project directory. Deny reads to ~/.ssh, ~/.aws, ~/.config, and other sensitive paths.
- Block network commands by default. Approve curl and wget only when you understand the destination.
- Block destructive commands. Deny rm -rf, chmod 777, and writes to system paths.
- Never expose production credentials. Use separate development environments. Keep production connection strings out of files the agent can access.
- Review dependency changes. Every npm install or pip install should be checked.
- Use version control. Git is your safety net. If an agent makes a destructive change, git checkout reverts it. Commit frequently during agent sessions.
- Audit sessions. Review what agents did, especially for unattended sessions. Permission audit trails help.
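The git safety net from the checklist, demonstrated end to end in a throwaway repo:

```shell
# Commit a checkpoint, simulate an agent clobbering a file,
# then revert with a single git command.
cd "$(mktemp -d)"
git init -q .
echo "important data" > notes.txt
git add notes.txt
git -c user.email=demo@example.com -c user.name=demo \
  commit -qm "checkpoint before agent session"

echo "" > notes.txt            # the "agent" wipes the file
git checkout -- notes.txt      # restore the committed version
cat notes.txt                  # prints: important data
```

The protection only covers committed state, which is why committing frequently during agent sessions matters: anything uncommitted when the agent runs is not recoverable this way.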
How Styrby Helps
Styrby's security features address the real risks: remote permission approval prevents unauthorized operations, risk classification helps you make faster approval decisions, blocked tool lists act as a hard safety net, and the audit trail provides visibility into what happened during sessions. E2E encryption protects session data from server-side breaches.
These are practical measures for practical risks. Not a solution to hypothetical AI threats.
Ready to manage your AI agents from one place?
Styrby gives you cost tracking, remote permissions, and session replay across five agents.