From Detection to Automation: Automate Your Cloud Security Posture
1. AI That Keeps Thinking
With the AI bubble expanding every day, more and more AI-focused tools are being used for security reviews and automation. Unlike noisy tools that scan with no context, AI can weigh each finding against its environment, which allows for more reliable results.
Today we are diving into the realm of cloud plugins and how they can be combined with existing tools to perform semi-autonomous cloud security reviews.
Why use AI?
Traditional scanning tools mostly follow this pattern:
Scan -> Write Up -> Report -> Display
While this allows mass scanning, it also has its pitfalls, a key one being the sheer number of false positives. Security reviewers don’t have unlimited time to sort through every finding and work out whether it is real. Cloud environments differ vastly from each other in setup and configuration, and the lack of context within code/logic-based scanners (e.g., Prowler) allows these false positives to creep through.
Here’s an example: a security tool finds that an S3 bucket is publicly accessible, but in reality it’s configured for a static website and has no sensitive data exposed.
This is where AI can move beyond scanning. The goal now is no longer just finding the issues but actually understanding the context of the environment and whether those issues matter.
2. What I Built
What I’ve created is a plugin for the popular Claude Code CLI, which runs in your native terminal. It can run commands on your behalf, which is extremely useful for driving security tools like nmap or curl.
Instead of following the traditional pattern of a single scan followed by findings, the plugin continuously investigates the environment using a loop-based reasoning structure. This loop only applies to configuration and policy questions. If you ask it to probe an endpoint or list something, it skips the loop. This is decided at runtime by a classification table.
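The runtime routing described above could look something like the sketch below. The table contents, the `RequestKind` names and the keyword matching are all assumptions made for illustration; the plugin's real classification table is not shown in this post.

```python
from enum import Enum


class RequestKind(Enum):
    CONFIG_OR_POLICY = "config_or_policy"  # goes through the validation loop
    DIRECT_ACTION = "direct_action"        # probe/list requests skip the loop


# Hypothetical classification table: keyword -> request kind.
CLASSIFICATION_TABLE = {
    "policy": RequestKind.CONFIG_OR_POLICY,
    "permission": RequestKind.CONFIG_OR_POLICY,
    "configured": RequestKind.CONFIG_OR_POLICY,
    "encryption": RequestKind.CONFIG_OR_POLICY,
    "probe": RequestKind.DIRECT_ACTION,
    "list": RequestKind.DIRECT_ACTION,
}


def classify(request: str) -> RequestKind:
    """Return the kind of the first table keyword found in the request."""
    text = request.lower()
    for keyword, kind in CLASSIFICATION_TABLE.items():
        if keyword in text:
            return kind
    # Unknown requests default to the more thorough path (an assumption).
    return RequestKind.CONFIG_OR_POLICY


def uses_validation_loop(request: str) -> bool:
    return classify(request) is RequestKind.CONFIG_OR_POLICY
```

With this table, a question such as "Is the bucket policy too permissive?" would enter the loop, while "List all S3 buckets" would be executed directly.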
Validation Loop
At a high level, the plugin includes the following:
- A loop-based validation structure that decides what to inspect next and scores whether the evidence so far is high confidence.
- A skills system that provides instructions, a lookup table and documentation for any AWS service or tool-related command.
- An external validation layer that probes endpoints and external services.
- A reporting pipeline that only outputs findings that are backed by real and deterministic evidence to the CLI (although this can differ).
- A PoC generation skill that gives penetration testers manual instructions to reproduce exploits and bugs if they want to verify them by hand.
- A blast radius generation skill that creates a markdown file for the user to review the potential damage and impact.
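To make the validation loop a little more concrete, here is a minimal sketch of how evidence could be accumulated into a confidence score. In the plugin, Claude chooses what to inspect next; here a fixed, ordered list of checks stands in for that decision. The weights, the 0.8 threshold and the way evidence is combined are illustrative assumptions, not the plugin's actual values.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# A check returns (supports_claim, weight). Weights are hypothetical.
Check = Callable[[], Tuple[bool, float]]


@dataclass
class Verdict:
    confidence: float
    confirmed: bool
    evidence: List[str] = field(default_factory=list)


def validate(checks: List[Tuple[str, Check]], threshold: float = 0.8) -> Verdict:
    """Run checks one by one until confidence crosses the threshold."""
    confidence = 0.0
    evidence: List[str] = []
    for name, check in checks:
        supports, weight = check()
        if supports:
            # Treat checks as independent evidence: 1 - (1 - a)(1 - b).
            confidence = 1 - (1 - confidence) * (1 - weight)
            evidence.append(name)
        if confidence >= threshold:
            break  # enough evidence, stop inspecting
    return Verdict(confidence, confidence >= threshold, evidence)
```

Only findings whose verdict is confirmed would then be passed on to the reporting pipeline, along with the names of the checks that supported them.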
Tool Integrations
The plugin also performs real world validation by integrating existing security tools:
- PMapper - detects IAM privilege escalation paths
- Nmap - tests external network exposure and validates open ports
- curl - probes external endpoints
- testssl.sh - analyses TLS configuration
How Does This Differ From Existing Scan Tools?
AI scanners do not rely on static, hardcoded cases. Traditional scanners are programmed to follow strict criteria; they do not have the freedom to adapt or take context into account. Trying to create a scanner that covers every case is unrealistic, as the number of edge cases you encounter in cloud grows far too quickly to enumerate.
This is where we fill the gap. AI can adapt to the data it sees at runtime. It is not locked down to matching signatures of known issues but can instead reason about new, unfamiliar configurations.
A lot of popular cloud scanning tools use read-only API access to check how something is configured, but they do not actually interact with the environment externally. In other words, they can’t probe reachability. This is a huge gap. A bucket could appear “private” to a scanner but still be reachable because of:
- CloudFront distribution or CDN
- Lambda or API Gateway that proxies access to the bucket with no auth checks
- Misconfigured VPC
You get the idea, right?
A cloud scanner will tell you if something is misconfigured but never if it’s an actual security issue. Relying on configurations alone tells you the intent, while probing tells you the reality.
The plugin mixes both of these concepts by reading the configuration and then validating it with a probe. If the two results don’t match, it labels the finding for review or digs deeper. If something says it is public, it asks why. It compares multiple sources of context (the AWS CLI, external probes, PMapper and so on) before finalizing.
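The "intent versus reality" comparison can be sketched as a small decision function. The `Outcome` names are hypothetical, and a real check would carry the supporting evidence alongside each boolean.

```python
from enum import Enum


class Outcome(Enum):
    AGREE_PUBLIC = "agree_public"
    AGREE_PRIVATE = "agree_private"
    MISMATCH = "mismatch"


def reconcile(config_says_public: bool, probe_reachable: bool) -> Outcome:
    """Compare what the configuration claims with what a probe observed."""
    if config_says_public and probe_reachable:
        # Still worth asking why it is public, e.g. a static website.
        return Outcome.AGREE_PUBLIC
    if not config_says_public and not probe_reachable:
        return Outcome.AGREE_PRIVATE
    # Intent and reality disagree, e.g. a "private" bucket served through
    # a CDN or an unauthenticated API. Label for review or dig deeper.
    return Outcome.MISMATCH
```

The mismatch branch is the interesting one: it is exactly the case a configuration-only scanner cannot see.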
I’d suggest reading this article by Daniel Grzelak. Most of the methodology for the tool is inspired by it.
3. Lab Setup
To test the capabilities of the plugin, I created a test environment in Terraform which allows for customised AWS infrastructure. The lab environment has multiple modules that can be enabled or disabled for testing. It’s a fake production environment that uses a VPC and a bastion jump box for authentication.
The key idea behind this environment is that some modules are made to look dangerous in their configuration but they are actually safe when you look at the context.
The Lab environment includes the following:
- IAM privilege escalation (PassRole abuse, permission boundaries, policy shadowing, service-linked roles, role-hopping chains)
- Data Exposure (S3 public access, EBS/RDS snapshot sharing, KMS wildcard grants)
- Network and Compute (wide egress rules, IMDSv1 metadata access)
- Logging and detection gaps (CloudTrail misconfigurations)
- Cross-service attack chains (Lambda with hardcoded secrets, Cognito unauthenticated access, API Gateway with no auth leading to S3)
Three secure baselines exist to test that the plugin correctly identifies safe configurations and doesn’t flag them as issues.
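One way to check a run against the lab is to compare what the plugin flagged with what each module is known to be. The sketch below assumes a simple mapping of module name to a boolean; the module names used in any example are made up, and the plugin does not necessarily expose its results in this shape.

```python
from typing import Dict, NamedTuple


class Score(NamedTuple):
    true_positives: int
    false_positives: int
    false_negatives: int
    true_negatives: int


def score_run(expected: Dict[str, bool], flagged: Dict[str, bool]) -> Score:
    """expected[module] is True if the module is really vulnerable;
    flagged[module] is True if the plugin reported it as an issue."""
    tp = fp = fn = tn = 0
    for module, vulnerable in expected.items():
        reported = flagged.get(module, False)
        if vulnerable and reported:
            tp += 1
        elif vulnerable:
            fn += 1
        elif reported:
            fp += 1  # a secure baseline was wrongly flagged
        else:
            tn += 1
    return Score(tp, fp, fn, tn)
```

For the secure baselines, the number to watch is `false_positives`: each one is a safe configuration that was reported as an issue.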
4. The Skills System
Claude Code allows you to steer its thinking with plugins for custom tool use and pipeline automation. A feature of the plugin system is that you can provide Claude with markdown files for certain skills.
Instead of me creating skills for every system command, my co-worker found a super cool repo on GitHub that includes AWS service skills. This gives easy lookup for Claude when it needs to query something via the terminal or reference something.
On top of this, I added my own custom skills specific to the plugin’s security workflow.
- Validation Rules - Maps claims to the exact AWS CLI commands needed to verify them.
- Output - Defines the report format, giving a structured and consistent layout for everything the plugin writes.
- PoC Generator - Produces step-by-step instructions for reproducing findings, allowing a security reviewer to manually verify if something is exploitable.
- Identity Blast Radius - Maps out the affected resources and maximum damage from a compromised identity.
- Prowler - Integrates Prowler into the tool by parsing its output. Raw findings are cross-validated, then finalized into a report structure.
- PMapper - Integrates PMapper for IAM privilege escalation analysis. It tells Claude how to build an IAM graph, run escalation queries, and interpret the results.
Each of these skill files tells Claude how to run the relevant commands and includes the documentation and some examples. It’s really great for customisability.
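The Validation Rules skill is a markdown file, but the idea of mapping a claim to the commands that verify it can be shown as a lookup table. The claim names and the selection of commands below are assumptions for illustration; the AWS CLI subcommands themselves are standard ones.

```python
from typing import Dict, List

# Hypothetical rule table: claim -> AWS CLI command templates.
VALIDATION_RULES: Dict[str, List[str]] = {
    "s3_bucket_is_public": [
        "aws s3api get-public-access-block --bucket {bucket}",
        "aws s3api get-bucket-policy-status --bucket {bucket}",
        "aws s3api get-bucket-website --bucket {bucket}",
    ],
    "role_has_permissions_boundary": [
        "aws iam get-role --role-name {role}",
    ],
}


def commands_for(claim: str, **params: str) -> List[str]:
    """Fill in the command templates registered for a claim."""
    return [template.format(**params) for template in VALIDATION_RULES[claim]]
```

Calling `commands_for("s3_bucket_is_public", bucket="demo-bucket")` returns the three commands with the bucket name filled in, which keeps the model from guessing at command syntax.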
5. From Detection to Exploitation
Something that I find missing from current security scanners is that they do not provide steps to reproduce. If something is said to be exploitable, then we need to prove it. I decided to add a skill for proofs of concept. When attack chains are found inside an environment, the user can ask Claude to use the poc-generator skill.
Claude will write a report that includes:
- Steps to reproduce
- The commands in detail
- The evidence chain
- Clean up steps
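The four parts of the report map naturally onto a small data structure. The sketch below is one possible shape, with a hypothetical `PocReport` class and markdown layout; the real poc-generator skill defines its own format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PocReport:
    title: str
    steps: List[str]
    commands: List[str]
    evidence: List[str]
    cleanup: List[str]

    def to_markdown(self) -> str:
        lines = [f"# PoC: {self.title}", ""]
        # (heading, items, numbered?) for each section of the report.
        sections = [
            ("Steps to Reproduce", self.steps, True),
            ("Commands", self.commands, False),
            ("Evidence Chain", self.evidence, False),
            ("Clean Up", self.cleanup, True),
        ]
        for heading, items, numbered in sections:
            lines.append(f"## {heading}")
            for i, item in enumerate(items, start=1):
                prefix = f"{i}." if numbered else "-"
                lines.append(f"{prefix} {item}")
            lines.append("")
        return "\n".join(lines)
```

Keeping the clean up steps as a required field is a deliberate choice: a PoC that leaves test resources behind in a client environment creates a new problem.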
One of the exploit chains within the lab environment is a boundary bypass via PassRole. In this chain, an attacker operating within the platform-restricted-admin-production boundary escapes it entirely via PassRole, then gains full DynamoDB CRUD and S3 read access across the account.
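The logic behind this kind of escape can be shown with plain sets. This is a heavily simplified model, not the lab's actual policies: it ignores resource scoping, conditions, explicit denies and the trust policy of the target role, and it assumes the escape goes through a Lambda function.

```python
from typing import Set

# Actions the caller needs in order to run code as another role via Lambda.
REQUIRED_FOR_ESCAPE = {"iam:PassRole", "lambda:CreateFunction", "lambda:InvokeFunction"}


def effective_permissions(identity_policy: Set[str], boundary: Set[str]) -> Set[str]:
    # A permissions boundary caps the principal: only actions allowed by
    # both the identity policy and the boundary are effective.
    return identity_policy & boundary


def reachable_after_passrole(
    identity_policy: Set[str],
    boundary: Set[str],
    target_role_permissions: Set[str],
) -> Set[str]:
    own = effective_permissions(identity_policy, boundary)
    if REQUIRED_FOR_ESCAPE <= own:
        # Code running in the function acts as the passed role, which is
        # not limited by the caller's boundary.
        return own | target_role_permissions
    return own
```

The point of the model is that the boundary is evaluated against the caller, while the function executes with the passed role's permissions, so anything the target role can do becomes reachable.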
The AI was able to successfully generate a working PoC for this exploit chain. Manually following the Steps to Reproduce section confirmed that the chain was exploitable.
Below are some of the outputs and files straight from the AI plugin.
- Findings: Permissions Boundary Bypass Analysis
- PoC: Permissions Boundary Escape via Lambda PassRole
- Blast Radius: Attack Chains Combined
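A blast radius can be thought of as a reachability question: starting from one compromised identity, which roles and resources can be reached? The sketch below answers that with a breadth-first search over a hand-built graph. The graph format and node names are assumptions; in practice the edges would come from something like PMapper output.

```python
from collections import deque
from typing import Dict, List, Set


def blast_radius(graph: Dict[str, List[str]], start: str) -> Set[str]:
    """Return every node reachable from a compromised identity."""
    seen: Set[str] = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            # Skip nodes already visited and cycles back to the start.
            if neighbour not in seen and neighbour != start:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen
```

An edge here means "can assume" or "can access". Combining attack chains then amounts to taking the union of the sets reached from each starting identity.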
6. Strengths vs Limitations
Strengths
Overall, the plugin performed very well on scenarios 3, 6, and 9 and identified them with strong confidence. It understood the IAM evaluation order, the S3 layered access and the inline deny catch, rather than relying on pattern matching alone for the policy and config.
It had really good attack chain mapping and successfully chained scenarios 1 + 4. This demonstrates the critical thinking that policy scanners lack.
The reports that it created were extremely accurate. The proof of concept that it made worked well and had strong depth. This shows how powerful an AI agent could be at cloud review or other aspects of cybersecurity.
Limitations
The first obvious thing is that the testing environment doesn’t handle every edge case. Customers and clients will always have configurations that are unknown or new to an AI scanner. Although we can provide lots of context to the AI, it will not be 100% deterministic with its findings. The best we can do is try to close the gap between assumption and determinism.
This also raises another important question: is the AI scanner tuned to perform well only on my own testing environment? A popular term in the AI space is overfitting. It means the model has become too familiar with the training environment, leading to good results on training data (our test environment) and bad performance in unknown environments.
During testing, I would review the output from the AI. If it was wrong, I would explain why and feed my findings back. The model and I would then go through the plugin and make updates. To determine accuracy, I think we need to try environments that have completely different setups and unique cases. My tuning could have molded the AI to this environment. Further testing is needed to determine if this is an issue.
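One simple way to put a number on this concern is to compare accuracy on the tuned lab with accuracy on an environment the plugin has never seen. The helper below is a sketch with hypothetical names, and it has not been run against any real results.

```python
def accuracy(correct: int, total: int) -> float:
    if total <= 0:
        raise ValueError("total must be positive")
    return correct / total


def generalisation_gap(
    tuned_correct: int,
    tuned_total: int,
    unseen_correct: int,
    unseen_total: int,
) -> float:
    """Accuracy on the tuned lab minus accuracy on an unseen environment."""
    return accuracy(tuned_correct, tuned_total) - accuracy(unseen_correct, unseen_total)
```

A gap close to zero would suggest the tuning generalises, while a large positive gap would point towards the plugin having been molded to the lab.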
The plugin is huge. Every time Claude initializes with the claude.md, it spends tokens loading it. The roles.md file is 400+ lines long. As a future goal, I’d suggest looking into optimisation or stripping out redundant instructions to reduce token use.
7. What I Learned
Coming from someone who knew very little about the cloud, this project has given me a deep dive into how it actually functions.
I learned about the 21 types of privilege escalation techniques that attackers use to move through cloud infrastructure. I looked into services that I didn’t know or understand and found some cool quirks. I tested my knowledge on a simulated lab environment called AWSGoat, which is a fake vulnerable cloud website that has flaws you can exploit. It demonstrated how AI can be used along with human intervention to scan and automate long and repetitive tasks.
The project taught me about numerous existing security tools and how they can be utilised with an AI wrapper to perform powerful cloud security scanning.
I really enjoyed the red-team aspect of the project as well. Not only learning about these skills but applying them as an attacker really showed me how companies/businesses need to lock down their cloud environments.
8. Conclusion
The future of cybersecurity is going to be focused on AI and its ability to reason over the information it is given and the context of the environment. We can already see Claude finding zero-days that have existed for years in open source codebases. It is very effective in terms of speed and productivity, allowing security consultants with limited time to achieve more. Adapting to new, unknown information is a strong suit, as threat actors will always try different techniques.
We can assume that most security tools in the future are going to be AI related in some way.
It shows where the industry is heading, and companies like Aura Information Security can benefit from integrating AI early on into their pentesting pipeline.
To keep up with the evolving threat landscape, we should use tools like Claude to better protect companies and secure their data.
