A zombie resource is not broken. It is not malfunctioning. It passes every health check, appears green in every monitoring dashboard, and generates a steady, unremarkable line item on your cloud bill every month. The only thing wrong with it is that nobody needs it anymore.
Zombie resources are the inevitable byproduct of how cloud infrastructure gets built and maintained. Teams provision infrastructure for specific purposes — a project, a proof of concept, a migration, a development environment. The project ends or changes direction, but the infrastructure doesn’t automatically go away. It persists, quietly accumulating cost, until someone takes the specific action of decommissioning it.
In most organizations, that action never happens.
How zombies form
The creation of a zombie resource follows a predictable sequence:
A workload is built for a specific purpose. A developer spins up an EC2 instance for a performance test. A team provisions a database for a project. A DevOps engineer creates a load balancer for a staging environment.
The purpose changes or ends. The test completes. The project is deprioritized. The staging environment gets rebuilt with a different configuration.
The infrastructure isn’t decommissioned. This is the critical gap. The team is busy, the cost isn’t immediately visible, and there’s uncertainty about whether the resource might be needed again. The path of least resistance is to leave it running.
Ownership blurs over time. Team members move on. The resource loses its connection to anyone who remembers what it was for. The remaining team doesn’t know whether it’s safe to shut down. Nobody wants to be the person who terminated the database that turned out to be critical.
The zombie phase begins. The resource runs indefinitely, generating cost with no corresponding business value, until someone explicitly addresses it.
The most common zombie types
Stopped EC2 instances — Stopped instances don’t incur compute charges, but their attached EBS volumes do. A stopped m5.xlarge with a 500 GB data volume costs approximately $50/month in storage even while idle. Instances stopped for more than 30 days are high-priority zombie candidates.
Running but idle EC2 instances — Instances with consistently low CPU utilization (under 5% over 7+ days) are often zombies that were never stopped. A running instance with 0.4% CPU utilization is almost certainly not serving a purpose proportionate to its cost.
Unattached EBS volumes — When an EC2 instance is terminated without deleting its data volumes, the volumes persist as unattached storage. At $0.08–$0.10/GB/month, a 1 TB unattached volume costs $80–$100/month in perpetuity.
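Load balancers with no targets — Application Load Balancers incur a fixed hourly charge (roughly $0.0225/hour in us-east-1, or about $16/month) regardless of traffic, plus $0.008 per LCU-hour for the capacity they actually use. An ALB with no registered targets pays the fixed charge and serves nothing, which makes it a zombie by definition. These are easy to identify: look for any ALB whose target groups contain zero healthy targets.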
Elastic IP addresses — Unused Elastic IPs (not associated with a running instance) cost $3.60/month each. Small individually, but easy to accumulate: security engineers often allocate IP addresses for whitelisting purposes and never release them when the need passes.
RDS instances in stopped state — Like stopped EC2 instances, stopped RDS instances still incur storage charges. AWS also automatically restarts a stopped RDS instance after 7 days, which can restart compute charges without anyone noticing.
NAT Gateways with no traffic — NAT Gateways cost $0.045/hour regardless of utilization, plus a per-GB data processing charge. An idle NAT Gateway in a dev VPC with no active workloads still costs about $32/month. Multiply this across several forgotten environments and the costs become material.
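The monthly figures in this list follow from simple arithmetic on the quoted rates. The sketch below is a minimal illustration, assuming a 720-hour (30-day) month and the rates cited in this article; real prices vary by region and change over time, and the function and constant names are illustrative rather than part of any AWS tool.

```python
HOURS_PER_MONTH = 720  # assumption: 30-day month

# Rates as quoted in this article (USD). Check current regional pricing.
EBS_PER_GB_MONTH = (0.08, 0.10)  # low and high end of the quoted range
NAT_GATEWAY_PER_HOUR = 0.045     # hourly charge only, no data processing
ELASTIC_IP_PER_MONTH = 3.60


def ebs_monthly_cost(size_gb):
    """Return the (low, high) monthly storage cost for a volume of size_gb."""
    low, high = EBS_PER_GB_MONTH
    return (round(size_gb * low, 2), round(size_gb * high, 2))


def nat_gateway_monthly_cost(count=1):
    """Return the monthly hourly-charge cost for `count` idle NAT Gateways."""
    return round(count * NAT_GATEWAY_PER_HOUR * HOURS_PER_MONTH, 2)


def elastic_ip_monthly_cost(count=1):
    """Return the monthly cost for `count` unused Elastic IPs."""
    return round(count * ELASTIC_IP_PER_MONTH, 2)


def estimate_zombie_cost(unattached_gb, nat_gateways, elastic_ips):
    """Return the combined (low, high) monthly cost across the three types."""
    ebs_low, ebs_high = ebs_monthly_cost(unattached_gb)
    fixed = nat_gateway_monthly_cost(nat_gateways) + elastic_ip_monthly_cost(elastic_ips)
    return (round(ebs_low + fixed, 2), round(ebs_high + fixed, 2))


if __name__ == "__main__":
    # Hypothetical account: 1 TB unattached, 3 idle NAT Gateways, 10 unused EIPs.
    print(estimate_zombie_cost(1000, 3, 10))
```

For the hypothetical account in the example (1 TB of unattached volumes, three idle NAT Gateways, ten unused Elastic IPs), this works out to roughly $213 to $233 per month before any idle compute is counted.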
Finding zombies in your account
The most reliable approach is age-based filtering, not cost-based filtering. The resources you’re looking for are not necessarily expensive — they’re old and idle.
For EC2: Export all instances with launch date, instance state, average CPU utilization over at least the past 7 days (from CloudWatch), and Name/Owner tags. Sort by launch date. Anything running for more than 90 days with consistently low utilization (under 5%) and no owner tag is a zombie candidate.
For EBS volumes: Filter by state available (unattached). Any volume in this state for more than 30 days with no snapshot activity has no current consumer.
For load balancers: Export all ALBs and filter to those with zero healthy targets across all of their target groups. With no healthy targets, they cannot be serving production traffic.
For NAT Gateways: Pull CloudWatch BytesOutToDestination for each NAT Gateway over the past 30 days. Any gateway averaging less than 1 MB/day has no meaningful workload routing through it.
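The filters above can be expressed as a few small functions over exported inventory data. This is a sketch under stated assumptions, not a definitive implementation: the record types and field names (Instance, Volume, NatGateway, unattached_since, has_recent_snapshot, bytes_out_30d) are hypothetical and stand in for whatever your export produces, the time a volume became unattached is assumed to be tracked by your own inventory, and 1 MB is taken as 1,000,000 bytes.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class Instance:
    instance_id: str
    launch_time: datetime      # timezone-aware
    state: str                 # e.g. "running" or "stopped"
    avg_cpu_percent: float     # average over the lookback window
    tags: dict = field(default_factory=dict)


@dataclass
class Volume:
    volume_id: str
    state: str                 # "available" means unattached
    unattached_since: datetime # assumed to come from your own inventory
    has_recent_snapshot: bool = False


@dataclass
class NatGateway:
    nat_gateway_id: str
    bytes_out_30d: int         # summed BytesOutToDestination over 30 days


def ec2_candidates(instances, now, min_age_days=90, max_cpu=5.0):
    """Running, older than min_age_days, low CPU, and no Owner tag."""
    cutoff = now - timedelta(days=min_age_days)
    hits = [
        i for i in instances
        if i.state == "running"
        and i.launch_time < cutoff
        and i.avg_cpu_percent < max_cpu
        and not i.tags.get("Owner")
    ]
    return sorted(hits, key=lambda i: i.launch_time)  # oldest first


def ebs_candidates(volumes, now, min_days=30):
    """Unattached for more than min_days with no recent snapshot activity."""
    cutoff = now - timedelta(days=min_days)
    return [
        v for v in volumes
        if v.state == "available"
        and v.unattached_since < cutoff
        and not v.has_recent_snapshot
    ]


def nat_candidates(gateways, window_days=30, max_bytes_per_day=1_000_000):
    """Gateways averaging less than max_bytes_per_day over the window."""
    return [
        g for g in gateways
        if g.bytes_out_30d / window_days < max_bytes_per_day
    ]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sample = [Instance("i-example", now - timedelta(days=200), "running", 0.4)]
    print([i.instance_id for i in ec2_candidates(sample, now)])
```

Keeping the filters separate from the data collection means the thresholds (90 days, 5% CPU, 30 days, 1 MB/day) can be tuned per account without touching the export step. The output is a list of candidates for review, not a list of resources to delete.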
Building a decommissioning practice
The gap in the resource lifecycle is structural: there’s no event in AWS that says “this resource’s purpose has ended.” That event has to be created through process.
Tag resources with a TTL. For temporary workloads (dev environments, performance tests, proof-of-concept deployments), require a ttl tag at creation time: ttl: 2026-08-01. Automated tooling can identify resources past their TTL and generate a finding for review.
Run monthly age-based reviews. Schedule a monthly review that surfaces all resources older than 90 days by owner. The question for each one: is this still serving an active purpose? The owner has to answer yes or begin decommissioning.
Require decommissioning as part of project close-out. When a project ends, the close-out checklist should include an infrastructure review. Which resources were created for this project? Which can be decommissioned?
Create a “zombie fund.” Some organizations make the financial impact of zombie resources visible by charging their accumulated cost back to the team that last owned them. This creates a financial incentive to decommission cleanly, rather than leaving cleanup for someone else.
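The TTL check and the per-owner review described above lend themselves to a small amount of automation. The following is a minimal sketch, assuming resources arrive as (resource_id, tags) pairs from your own inventory export, that the ttl tag holds an ISO date such as 2026-08-01, and that ownership lives in an Owner tag; the function names are hypothetical.

```python
from collections import defaultdict
from datetime import date


def parse_ttl(tags):
    """Return the ttl tag as a date, or None if it is missing or malformed."""
    raw = tags.get("ttl")
    if not raw:
        return None
    try:
        return date.fromisoformat(raw)
    except ValueError:
        return None


def expired_resources(resources, today):
    """Return (resource_id, ttl, owner) for every resource past its TTL.

    A resource counts as expired only when its ttl date is strictly
    before `today`. Resources without a valid ttl tag are skipped here;
    a stricter policy might report them as a separate finding.
    """
    findings = []
    for resource_id, tags in resources:
        ttl = parse_ttl(tags)
        if ttl is not None and ttl < today:
            findings.append((resource_id, ttl, tags.get("Owner", "unowned")))
    return findings


def group_by_owner(findings):
    """Group expired resource IDs by owner so each owner gets one review list."""
    by_owner = defaultdict(list)
    for resource_id, _ttl, owner in findings:
        by_owner[owner].append(resource_id)
    return dict(by_owner)


if __name__ == "__main__":
    inventory = [
        ("i-perf-test", {"ttl": "2026-08-01", "Owner": "team-a"}),
        ("vol-poc", {"ttl": "2026-09-15"}),
    ]
    print(group_by_owner(expired_resources(inventory, date.today())))
```

The sketch only generates findings for review. Whether an expired resource is actually decommissioned remains the owner's decision, consistent with the review process described above.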
CostDefender identifies zombie resources by combining inventory age, utilization metrics, and ownership data — surfacing findings to the right owner with the evidence needed to make a confident decommissioning decision.