Why Your Service Security Group Shouldn't Have 0.0.0.0/0

Security groups are the most fundamental network security control in AWS — stateful firewalls attached to ENIs that control inbound and outbound traffic at the instance level. And yet they’re among the most commonly misconfigured resources in any AWS account.

During a Security Hub remediation effort, we found that one service’s security group had an inbound rule allowing all traffic from 0.0.0.0/0 — the entire internet. That single misconfiguration triggered four separate Security Hub findings: EC2.13, EC2.14, EC2.18, and EC2.19. Four findings from one security group. One rule that should never have existed.

How one bad rule creates four findings

The four findings aren’t redundant — they check different aspects of the same problem:

+---------+----------------------------------------------------------------------+
| Finding | What it checks                                                       |
+---------+----------------------------------------------------------------------+
| EC2.13  | Security groups should not allow ingress from 0.0.0.0/0 to port 22   |
+---------+----------------------------------------------------------------------+
| EC2.14  | Security groups should not allow ingress from 0.0.0.0/0 to port 3389 |
+---------+----------------------------------------------------------------------+
| EC2.18  | Unrestricted incoming traffic on unauthorized ports                  |
+---------+----------------------------------------------------------------------+
| EC2.19  | High-risk ports open to internet                                     |
+---------+----------------------------------------------------------------------+

An IpProtocol: -1 rule from 0.0.0.0/0 (protocol -1 = all protocols) implicitly includes every port — which triggers every port-specific finding. This is why overly permissive rules are so expensive in terms of security posture: they create a multiplicative effect on your finding count.
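To make the overlap concrete, here is a minimal Python sketch that takes rules in the describe-security-groups JSON shape and reports which of the ports named in EC2.13 and EC2.14 are open to 0.0.0.0/0. The function names are illustrative and not part of any AWS SDK, and the sketch only models IPv4 and TCP for brevity.

```python
def is_world_open(rule: dict) -> bool:
    """True if any IPv4 range on the rule is 0.0.0.0/0."""
    return any(r.get("CidrIp") == "0.0.0.0/0" for r in rule.get("IpRanges", []))


def covers_tcp_port(rule: dict, port: int) -> bool:
    """True if the rule admits traffic to the given TCP port."""
    protocol = rule["IpProtocol"]
    if protocol == "-1":
        return True  # all protocols implies every port
    if protocol != "tcp":
        return False
    return rule["FromPort"] <= port <= rule["ToPort"]


def world_open_ports(rules: list, ports=(22, 3389)) -> list:
    """Ports from `ports` that at least one 0.0.0.0/0 rule exposes."""
    return [
        p for p in ports
        if any(is_world_open(r) and covers_tcp_port(r, p) for r in rules)
    ]


# Hypothetical rules mirroring the two shapes discussed in this post.
allow_all = {"IpProtocol": "-1", "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}
alb_only = {
    "IpProtocol": "tcp",
    "FromPort": 3000,
    "ToPort": 3000,
    "UserIdGroupPairs": [{"GroupId": "sg-0a1b2c3d"}],
}

print(world_open_ports([allow_all, alb_only]))  # [22, 3389]
print(world_open_ports([alb_only]))             # []
```

One all-protocols rule is enough to light up both port checks at once, while the scoped ALB rule exposes neither.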

The before state:

aws ec2 describe-security-groups \
  --group-ids sg-0b5b3765c93df4ed4 \
  --query 'SecurityGroups[0].IpPermissions'

[
  {
    "IpProtocol": "-1",
    "IpRanges": [{"CidrIp": "0.0.0.0/0", "Description": ""}]
  },
  {
    "FromPort": 3000,
    "ToPort": 3000,
    "IpProtocol": "tcp",
    "UserIdGroupPairs": [{"GroupId": "sg-0a1b2c3d", "Description": "ALB to service"}]
  }
]

Notice that the correct rule already existed: TCP port 3000 from the ALB security group. The 0.0.0.0/0 rule was redundant and dangerous — almost certainly added during initial debugging (“nothing is connecting, let me open everything and figure it out later”) and never removed.

The fix

Step 1: Revoke the 0.0.0.0/0 rule

aws ec2 revoke-security-group-ingress \
  --group-id sg-0b5b3765c93df4ed4 \
  --ip-permissions 'IpProtocol=-1,IpRanges=[{CidrIp=0.0.0.0/0}]'

Step 2: Add VPC CIDR as intermediate replacement

aws ec2 authorize-security-group-ingress \
  --group-id sg-0b5b3765c93df4ed4 \
  --ip-permissions 'IpProtocol=-1,IpRanges=[{CidrIp=10.0.0.0/16,Description="VPC internal traffic"}]'

All four findings resolved. No service disruption. The service continued operating because the ALB-to-service rule on port 3000 was never touched.
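The same two steps can be expressed as data before anything is changed. The sketch below is an assumption-laden illustration, not the tooling we used: `plan_fix` is a hypothetical helper that takes the IpPermissions list returned by describe-security-groups and returns the payloads to revoke and to authorize. Only all-protocol rules open to 0.0.0.0/0 are targeted; every other rule is left alone.

```python
def plan_fix(permissions: list, vpc_cidr: str) -> tuple:
    """Return (to_revoke, to_authorize) IpPermissions lists.

    Only all-protocol rules open to 0.0.0.0/0 are revoked; the
    replacement is a single all-protocol rule scoped to the VPC CIDR.
    """
    to_revoke = []
    for perm in permissions:
        world_open = any(
            r.get("CidrIp") == "0.0.0.0/0" for r in perm.get("IpRanges", [])
        )
        if perm.get("IpProtocol") == "-1" and world_open:
            to_revoke.append(
                {"IpProtocol": "-1", "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}
            )

    to_authorize = []
    if to_revoke:
        to_authorize.append(
            {
                "IpProtocol": "-1",
                "IpRanges": [
                    {"CidrIp": vpc_cidr, "Description": "VPC internal traffic"}
                ],
            }
        )
    return to_revoke, to_authorize


# The "before" state from this post.
before = [
    {"IpProtocol": "-1", "IpRanges": [{"CidrIp": "0.0.0.0/0", "Description": ""}]},
    {
        "FromPort": 3000,
        "ToPort": 3000,
        "IpProtocol": "tcp",
        "UserIdGroupPairs": [{"GroupId": "sg-0a1b2c3d", "Description": "ALB to service"}],
    },
]

to_revoke, to_authorize = plan_fix(before, "10.0.0.0/16")
print(to_revoke)
print(to_authorize)

# With boto3 installed and credentials configured, the plan could be applied as:
#   ec2 = boto3.client("ec2")
#   ec2.revoke_security_group_ingress(GroupId=group_id, IpPermissions=to_revoke)
#   ec2.authorize_security_group_ingress(GroupId=group_id, IpPermissions=to_authorize)
```

Printing the plan first gives a reviewable record of exactly what will change, and an empty `to_revoke` means there is nothing to do.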

Why VPC CIDR instead of just removing?

We could have simply removed the 0.0.0.0/0 rule and relied entirely on the existing ALB-to-service rule on port 3000. We chose the intermediate VPC CIDR step for safety:

Unknown dependencies. Other services within the VPC might connect to this service on ports other than 3000 (health checks, metrics, admin interfaces). Removing all broad access could break something.
Debugging access. Developers sometimes connect directly to services from bastion hosts or VPN-connected machines. A VPC CIDR rule preserves this.
Incremental tightening. VPC CIDR is the first step. After monitoring for a week to confirm no unexpected traffic patterns, tighten further to specific security group references.

The security group privilege hierarchy

Not all security group rules are equal. There’s a clear hierarchy from most permissive to most restrictive:

Level 1 — 0.0.0.0/0 all protocols (worst)

{"IpProtocol": "-1", "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}

Equivalent to having no firewall at all. Every port on every attached instance is reachable from any IP that can route to it, which for an instance with a public address means the entire internet.

Level 2 — 0.0.0.0/0 specific ports (bad)

{"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443, "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}

Appropriate for public-facing ALBs. Not appropriate for backend services.

Level 3 — VPC CIDR all protocols (acceptable as intermediate)

{"IpProtocol": "-1", "IpRanges": [{"CidrIp": "10.0.0.0/16"}]}

Internal-only access. Blocks direct access from the internet but still allows any VPC resource to connect on any port.

Level 4 — VPC CIDR specific ports (good)

{"IpProtocol": "tcp", "FromPort": 3000, "ToPort": 3000, "IpRanges": [{"CidrIp": "10.0.0.0/16"}]}

Internal access on specific ports only.

Level 5 — security group reference, specific ports (ideal)

# CDK — the correct configuration
service_sg.add_ingress_rule(
    peer=alb_sg,
    connection=ec2.Port.tcp(3000),
    description="ALB health checks and request forwarding to PDF renderer"
)

Only traffic from the specific ALB security group on TCP port 3000. No other source, no other port.
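The hierarchy is mechanical enough to encode. The following sketch is a hypothetical helper, assuming rules in the describe-security-groups JSON shape and a single IPv4 VPC CIDR. It maps a rule to one of the five levels and returns None for shapes the hierarchy does not cover, such as a CIDR outside the VPC.

```python
import ipaddress


def rule_level(rule: dict, vpc_cidr: str = "10.0.0.0/16"):
    """Map an ingress rule to levels 1 (worst) through 5 (ideal)."""
    all_traffic = rule["IpProtocol"] == "-1"
    cidrs = [r["CidrIp"] for r in rule.get("IpRanges", [])]

    if "0.0.0.0/0" in cidrs:
        return 1 if all_traffic else 2

    if cidrs:
        vpc = ipaddress.ip_network(vpc_cidr)
        inside_vpc = all(
            ipaddress.ip_network(c, strict=False).subnet_of(vpc) for c in cidrs
        )
        if inside_vpc:
            return 3 if all_traffic else 4
        return None  # external CIDR: outside the hierarchy described here

    if rule.get("UserIdGroupPairs") and not all_traffic:
        return 5
    return None


examples = [
    {"IpProtocol": "-1", "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
     "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    {"IpProtocol": "-1", "IpRanges": [{"CidrIp": "10.0.0.0/16"}]},
    {"IpProtocol": "tcp", "FromPort": 3000, "ToPort": 3000,
     "IpRanges": [{"CidrIp": "10.0.0.0/16"}]},
    {"IpProtocol": "tcp", "FromPort": 3000, "ToPort": 3000,
     "UserIdGroupPairs": [{"GroupId": "sg-0a1b2c3d"}]},
]

print([rule_level(r) for r in examples])  # [1, 2, 3, 4, 5]
```

Run across every security group in an account, a classifier like this gives a rough ranking of which rules to tighten first.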

Using VPC Flow Logs to determine the ideal state

Before going from Level 3 to Level 5, use VPC Flow Logs to identify all actual traffic sources:

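# macOS/BSD date syntax shown; on GNU/Linux use: $(date -d '7 days ago' +%s000)
aws logs filter-log-events \
  --log-group-name /vpc/flow-logs \
  --filter-pattern '[version, account, eni, srcaddr, dstaddr, srcport, dstport="3000", protocol="6", packets, bytes, start, end, action="ACCEPT", logstatus]' \
  --start-time $(date -v -7d +%s000) \
  --query 'events[].message' \
  --output text | tr '\t' '\n' | awk '{print $4}' | sort -u
# tr is needed because text output joins the messages with tabs on one line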

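This gives the list of source IPs that actually reached port 3000 during the past week, which you can map back to ENIs and their security groups. Sources that were silent during that window will not appear, so choose a window long enough to cover periodic jobs. Armed with this data, you can tighten to Level 5 with far less risk.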
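If the raw records have been exported to a file, the same analysis can be done offline. This is a minimal sketch assuming the default version 2 flow log format with 14 space-separated fields, which is the layout the filter pattern above expects. The function name and the sample records are made up for illustration.

```python
from collections import Counter

# Field order of the default (version 2) VPC Flow Logs record.
FIELDS = (
    "version account_id interface_id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log_status"
).split()


def accepted_sources(lines, dstport: int, protocol: str = "6") -> Counter:
    """Count accepted flows per source IP for one destination port.

    Protocol "6" is TCP. Lines that do not have exactly 14 fields
    (for example custom-format records) are skipped.
    """
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != len(FIELDS):
            continue
        rec = dict(zip(FIELDS, parts))
        if (
            rec["action"] == "ACCEPT"
            and rec["dstport"] == str(dstport)
            and rec["protocol"] == protocol
        ):
            counts[rec["srcaddr"]] += 1
    return counts


# Made-up sample records for illustration only.
sample = [
    "2 123456789012 eni-0abc 10.0.1.15 10.0.2.20 49152 3000 6 10 840 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0abc 10.0.1.15 10.0.2.20 49153 3000 6 8 672 1700000060 1700000120 ACCEPT OK",
    "2 123456789012 eni-0abc 10.0.3.7 10.0.2.20 50000 9090 6 4 336 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0abc 198.51.100.9 10.0.2.20 40000 3000 6 1 40 1700000000 1700000060 REJECT OK",
]

print(accepted_sources(sample, 3000))
```

Counting flows per source, rather than just deduplicating, helps separate steady callers such as the load balancer from one-off connections that may be someone debugging.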

CDK best practices for security group design

Always use security group references over CIDR ranges

# BAD: CIDR ranges go stale and don't track infrastructure changes
service_sg.add_ingress_rule(
    peer=ec2.Peer.ipv4("10.0.0.0/16"),
    connection=ec2.Port.tcp(3000),
    description="VPC traffic to service"
)

# GOOD: Security group references are dynamic
service_sg.add_ingress_rule(
    peer=alb_sg,
    connection=ec2.Port.tcp(3000),
    description="ALB to service"
)

Add descriptions on every rule

service_sg.add_ingress_rule(
    peer=alb_sg,
    connection=ec2.Port.tcp(3000),
    description="ALB health checks and request forwarding to PDF renderer"
)

service_sg.add_ingress_rule(
    peer=monitoring_sg,
    connection=ec2.Port.tcp(9090),
    description="Prometheus metrics scraping from monitoring stack"
)

Descriptions are your audit trail. When someone reviews the security group six months from now, descriptions explain the intent without needing to trace through application code.

Separate security groups per service

pdf_renderer_sg = ec2.SecurityGroup(self, "PdfRendererSG", vpc=vpc,
    description="PDF rendering service - Gotenberg")
api_sg = ec2.SecurityGroup(self, "ApiSG", vpc=vpc,
    description="Backend API service")
worker_sg = ec2.SecurityGroup(self, "WorkerSG", vpc=vpc,
    description="Background job worker service")

# Cross-service access is explicit
pdf_renderer_sg.add_ingress_rule(
    peer=api_sg,
    connection=ec2.Port.tcp(3000),
    description="API requests PDF rendering"
)

Never share security groups across services — access granted to one is implicitly granted to all.

Use CDK Aspects to prevent public ingress rules at synthesis time

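import jsii
import aws_cdk as cdk
from aws_cdk import Annotations, IAspect, Stack, aws_ec2 as ec2

@jsii.implements(IAspect)
class NoPublicIngressAspect:
    def visit(self, node):
        if not isinstance(node, ec2.CfnSecurityGroup):
            return
        # Inline rules on L2 SecurityGroup constructs are lazy tokens, so
        # resolve them into plain dicts (camelCase keys such as cidrIp)
        # before inspecting them.
        ingress = Stack.of(node).resolve(node.security_group_ingress) or []
        for rule in ingress:
            if isinstance(rule, dict) and rule.get("cidrIp") == "0.0.0.0/0":
                Annotations.of(node).add_error(
                    "Security group has 0.0.0.0/0 ingress rule. "
                    "Use specific CIDR or security group reference."
                )

# `app` is the cdk.App instance created in the CDK entry point
cdk.Aspects.of(app).add(NoPublicIngressAspect())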

This CDK Aspect fails cdk synth if any security group in the entire application has a 0.0.0.0/0 ingress rule. The misconfiguration never gets deployed.

Detecting drift

Even with CDK managing security groups, manual console changes can introduce 0.0.0.0/0 rules. Detect them in real time with EventBridge:

aws events put-rule \
  --name "sg-modification-alert" \
  --event-pattern '{
    "source": ["aws.ec2"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
      "eventName": [
        "AuthorizeSecurityGroupIngress",
        "AuthorizeSecurityGroupEgress"
      ]
    }
  }'

This triggers whenever a rule is added to any security group, inbound or outbound. It does not catch revocations or in-place rule edits. Route the event to an SNS topic for immediate notification.
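Alerting on every rule addition can get noisy. One option is to put a small filter, for example a Lambda function, between the rule and the notification so that only additions of 0.0.0.0/0 are reported. The sketch below shows just the filtering logic. The nested `items` layout under `requestParameters` reflects how CloudTrail typically records this call, but treat the field names as an assumption and verify them against a real event from your account. The function name is hypothetical.

```python
def adds_public_ingress(event: dict) -> bool:
    """True if the event is an ingress authorization that includes 0.0.0.0/0.

    Expects the EventBridge envelope, with the CloudTrail record under
    "detail". Missing or null fields are treated as "no match".
    """
    detail = event.get("detail") or {}
    if detail.get("eventName") != "AuthorizeSecurityGroupIngress":
        return False

    params = detail.get("requestParameters") or {}
    permissions = (params.get("ipPermissions") or {}).get("items", [])
    for perm in permissions:
        ranges = (perm.get("ipRanges") or {}).get("items", [])
        if any(r.get("cidrIp") == "0.0.0.0/0" for r in ranges):
            return True
    return False


# Hand-written sample event for illustration; not captured from a real account.
sample_event = {
    "source": "aws.ec2",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {
        "eventName": "AuthorizeSecurityGroupIngress",
        "requestParameters": {
            "groupId": "sg-0b5b3765c93df4ed4",
            "ipPermissions": {
                "items": [
                    {
                        "ipProtocol": "-1",
                        "ipRanges": {"items": [{"cidrIp": "0.0.0.0/0"}]},
                    }
                ]
            },
        },
    },
}

print(adds_public_ingress(sample_event))  # True
```

The sketch checks IPv4 only. An equivalent check for ::/0 would be needed to cover IPv6 ranges.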

What we deliberately did not fix

Default VPC security group (sg-default)

The default security group in the default VPC has rules allowing all traffic from itself — this triggers EC2.2. We chose not to fix it because:

The default VPC is not in use — no instances, no ENIs
Some AWS services automatically use the default SG when creating resources; removing its rules could cause failures
The control mainly protects accounts where the default SG is actively used; with nothing attached to it, the practical risk here is low

We suppressed the finding with a documented rationale. This is a legitimate accepted risk, not a shortcut.

Key principles

Treat 0.0.0.0/0 as a code smell. Every occurrence should be justified and documented. The only legitimate uses are public-facing load balancers and bastion hosts, and only on specific ports.

Fix the process, not just the resource. If a 0.0.0.0/0 rule was added manually, the problem isn’t the rule — it’s the lack of guardrails preventing manual changes. CDK Aspects, Service Control Policies, and Config rules prevent recurrence.

Use VPC Flow Logs before tightening. Don’t guess which traffic needs to be allowed. Analyze actual flow logs to confirm dependencies.

Prefer removal over replacement. If a broad rule is redundant (as the 0.0.0.0/0 rule was in our case, because the ALB SG rule already existed), remove it entirely rather than replacing it with something slightly less broad.

Layer your defenses. Security groups aren’t your only protection: NACLs provide subnet-level blocking, WAF provides application-level filtering, and VPC endpoints remove the need for internet-routable traffic to AWS services. Each layer limits blast radius.

The technical fix was two CLI commands. The real fix was CDK Aspects that prevent 0.0.0.0/0 rules from being deployed, Config rules that detect manual additions, and a culture where every security group rule has a description explaining why it exists.