CloudFormation Stack Ownership Conflicts

The mechanics of cross-stack references, why CloudFormation exports create a hard coupling trap, and the SSM Parameter Store pattern that decouples shared infrastructure from the stacks that consume it.

DATE:
NOV.12.2024
READ:
16 MIN

CloudFormation stacks are meant to be self-contained units of infrastructure. Each stack owns its resources, manages their lifecycle, and handles updates autonomously. This model works until the moment two stacks need to touch the same resource — then things break in ways that are subtle, frustrating, and occasionally destructive.

This post is a technical analysis of resource ownership conflicts in CloudFormation, drawn from production experience running a multi-environment CDK application. We cover the mechanics of cross-stack references, the SSM Parameter Store pattern we adopted as a safer alternative, the dependency graphs between stacks, and the operational pitfalls of drift detection, resource adoption, and stack deletion ordering.


The ownership model

Every resource in a CloudFormation stack has exactly one owner: the stack that created it. When you delete the stack, CloudFormation deletes the resource (unless you’ve set DeletionPolicy: Retain).

There are three ways to share a resource between stacks:

Cross-stack export/import (Fn::ImportValue): Stack A exports a value, Stack B imports it. This creates a CloudFormation-tracked dependency. Stack A cannot modify or delete the exported value while any importer exists.

SSM Parameter Store: Stack A writes resource identifiers to SSM parameters. Stack B references them, and CloudFormation resolves the values when Stack B deploys. No CloudFormation-level dependency exists between the stacks.

Resource import (CloudFormation Import): Adopts an existing resource into a stack. Ownership transfers to the new stack.


How cross-stack exports create the coupling trap

When CDK generates cross-stack references, it creates CloudFormation exports on the producing stack and Fn::ImportValue calls on the consuming stack:

++
# Producing Stack
Outputs:
  AlbArnExport:
    Value: !Ref SharedAlb
    Export:
      Name: "SharedAlbStack:AlbArn"

# Consuming Stack
Resources:
  BackendListener:
    Type: AWS::ElasticLoadBalancingV2::Listener
    Properties:
      LoadBalancerArn: !ImportValue "SharedAlbStack:AlbArn"
      # Port, Protocol, and DefaultActions omitted for brevity
++

Once Stack B imports from Stack A:

  1. Stack A cannot delete the export. CloudFormation refuses: Export SharedAlbStack:AlbArn cannot be deleted as it is in use by BackendStackQA.
  2. Stack A cannot change the exported value. If the ALB is replaced, the export value changes, and CloudFormation refuses the update.
  3. Adding new consumers is fine — but now each one locks Stack A.
  4. Serialization is required. To update an export, first update all consumers to stop importing, then update Stack A. You cannot parallelize this.
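Before touching an export, it helps to know exactly which stacks are holding the lock. The sketch below pages through CloudFormation's ListImports operation. It assumes `cfn_client` behaves like `boto3.client("cloudformation")`; the function name `stacks_importing` is ours, not part of any library. To our knowledge the real API returns a validation error rather than an empty list when nothing imports the export, so a production version would need to handle that case.

```python
from typing import Any, Dict, List


def stacks_importing(cfn_client: Any, export_name: str) -> List[str]:
    """Return the names of all stacks that import `export_name`.

    `cfn_client` is assumed to expose list_imports() like the boto3
    CloudFormation client does.
    """
    importers: List[str] = []
    kwargs: Dict[str, str] = {"ExportName": export_name}
    while True:
        page = cfn_client.list_imports(**kwargs)
        importers.extend(page.get("Imports", []))
        token = page.get("NextToken")
        if not token:
            return importers
        # More results are available: request the next page.
        kwargs["NextToken"] = token
```

Every stack in the returned list has to stop importing before the export can change, which is the serialization cost described above.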

This is the exact scenario we hit. Our first architecture had the VPC created in the prod stack, with QA, Dev, and Demo stacks importing the VPC ID via CloudFormation exports. When we needed to refactor the prod stack, we couldn’t modify the VPC export without first updating every consuming stack. The consuming stacks themselves depended on the export existing, creating a circular dependency that CloudFormation cannot resolve in a single deployment.


The SSM decoupling pattern

SSM Parameter Store breaks this coupling. The producing stack writes resource identifiers to SSM. The consuming stack references them via ssm.StringParameter.value_for_string_parameter, and CloudFormation fetches the values when the consuming stack deploys. No CloudFormation dependency is created between the two stacks.

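The critical detail: value_for_string_parameter doesn’t make an API call during cdk synth. It returns a token, and the synthesized template gets a CloudFormation parameter of type AWS::SSM::Parameter::Value<String> whose default is the SSM parameter name (logical ID shortened here). CDK emits a {{resolve:ssm:...}} dynamic reference instead when you pin a parameter version.

++
Parameters:
  SharedAlbListenerArnParameter:
    Type: AWS::SSM::Parameter::Value<String>
    Default: /shared/alb-listener-arn
++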

CloudFormation resolves this at deploy time. Fast synthesis, current values at deploy time, clear error if the parameter doesn’t exist.
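Because the lookup happens at deploy time, a missing parameter only surfaces once the deployment is under way. A small preflight check in the pipeline can catch it earlier. This is a minimal sketch, assuming `ssm_client` behaves like `boto3.client("ssm")`; the helper name `missing_parameters` is hypothetical.

```python
from typing import Any, Iterable, List


def missing_parameters(ssm_client: Any, names: Iterable[str]) -> List[str]:
    """Return the subset of `names` that SSM reports as not existing."""
    wanted = list(names)
    missing: List[str] = []
    # GetParameters accepts at most 10 names per call, so query in batches.
    for start in range(0, len(wanted), 10):
        batch = wanted[start:start + 10]
        response = ssm_client.get_parameters(Names=batch)
        # Names that could not be found come back under InvalidParameters.
        missing.extend(response.get("InvalidParameters", []))
    return missing
```

Running it against the `/shared/...` names a stack depends on, and failing the build when the result is non-empty, gives a clearer error than a failed stack update.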

SharedAlbStack: writing to SSM

++
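from aws_cdk import Stack
from aws_cdk import aws_ec2 as ec2
from aws_cdk import aws_elasticloadbalancingv2 as elbv2
from aws_cdk import aws_ssm as ssm
from constructs import Construct


class SharedAlbStack(Stack):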
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        vpc = ec2.Vpc.from_lookup(
            self, "SharedVpc",
            tags={"aws:cloudformation:stack-name": "BackendStackQA"},
        )

        alb_sg = ec2.SecurityGroup(self, "SharedAlbSg",
            vpc=vpc,
            description="Shared ALB security group",
            allow_all_outbound=True)
        alb_sg.add_ingress_rule(
            ec2.Peer.ipv4(vpc.vpc_cidr_block),
            ec2.Port.tcp(443),
            "Allow HTTPS from VPC")

        alb = elbv2.ApplicationLoadBalancer(self, "SharedAlb",
            vpc=vpc,
            internet_facing=False,
            security_group=alb_sg)

        listener = alb.add_listener("SharedListener",
            port=80,
            default_action=elbv2.ListenerAction.fixed_response(
                status_code=404, content_type="text/plain",
                message_body="No route matched"))

        # Write identifiers to SSM — these are the cross-stack interface
        ssm.StringParameter(self, "AlbArnSsm",
            parameter_name="/shared/alb-arn",
            string_value=alb.load_balancer_arn,
            description="Shared internal ALB ARN")

        ssm.StringParameter(self, "ListenerArnSsm",
            parameter_name="/shared/alb-listener-arn",
            string_value=listener.listener_arn,
            description="Shared ALB listener ARN")

        ssm.StringParameter(self, "AlbSgIdSsm",
            parameter_name="/shared/alb-sg-id",
            string_value=alb_sg.security_group_id,
            description="Shared ALB security group ID")

        ssm.StringParameter(self, "AlbDnsNameSsm",
            parameter_name="/shared/alb-dns-name",
            string_value=alb.load_balancer_dns_name,
            description="Shared ALB DNS name for Route 53 alias targets")
++

SharedVpcLinkStack: writing VPC link ID

++
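from aws_cdk import Stack
from aws_cdk import aws_apigatewayv2 as apigwv2
from aws_cdk import aws_ec2 as ec2
from aws_cdk import aws_ssm as ssm
from constructs import Construct


class SharedVpcLinkStack(Stack):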
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        vpc = ec2.Vpc.from_lookup(
            self, "SharedVpc",
            tags={"aws:cloudformation:stack-name": "BackendStackQA"})

        alb_sg_id = ssm.StringParameter.value_for_string_parameter(
            self, "/shared/alb-sg-id")

        vpc_link = apigwv2.CfnVpcLink(
            self, "SharedVpcLink",
            name="shared-vpc-link",
            subnet_ids=[s.subnet_id for s in vpc.private_subnets],
            security_group_ids=[alb_sg_id])

        ssm.StringParameter(self, "VpcLinkIdSsm",
            parameter_name="/shared/vpc-link-id",
            string_value=vpc_link.ref,
            description="VPC Link ID for API Gateway HTTP API integration")
++

BackendStackQA: reading from SSM

++
class BackendStackQA(Stack):
    def __init__(self, scope, construct_id, **kwargs):
        super().__init__(scope, construct_id, **kwargs)

        vpc = ec2.Vpc(self, "VpcQA", max_azs=2, nat_gateways=1)

        # Reference shared infrastructure identifiers; CloudFormation
        # resolves the values from SSM at deploy time
        listener_arn = ssm.StringParameter.value_for_string_parameter(
            self, "/shared/alb-listener-arn")
        alb_sg_id = ssm.StringParameter.value_for_string_parameter(
            self, "/shared/alb-sg-id")

        shared_alb_sg = ec2.SecurityGroup.from_security_group_id(
            self, "SharedAlbSg", alb_sg_id)

        listener = elbv2.ApplicationListener.from_application_listener_attributes(
            self, "SharedListener",
            listener_arn=listener_arn,
            security_group=shared_alb_sg)

        # Add listener rule for this service
        # (target_group is created elsewhere in this stack)
        elbv2.ApplicationListenerRule(
            self, "BackendRule",
            listener=listener,
            priority=100,
            conditions=[elbv2.ListenerCondition.path_patterns(["/api/*"])],
            target_groups=[target_group])
++

Dependency graph analysis

With SSM as the glue, the stack dependency graph looks like this across 6 deployment stages:

++
Shared infra (deployed once)
  └── writes SSM params
        ├── BackendStackQA (creates VPC)
        │     └── provides VPC for all other stacks via tag lookup
        ├── BackendStackDev
        ├── BackendStackDemo
        ├── BackendStackPoc
        ├── BackendStackMono
        └── BackendStackProd
++

The arrows from shared infra to service stacks represent deploy-time data flow (SSM reads), not CloudFormation dependencies. Once the shared parameters exist, each service stack can be deployed on its own.

Required deployment order:

  1. SharedAlbStack — creates ALB, writes SSM params
  2. BackendStackQA — creates VPC (looked up by all others)
  3. SharedVpcLinkStack — looks up ALB SG, creates VPC Link, writes VPC Link SSM param
  4. All other backend stacks — can deploy in any order or in parallel
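Since CloudFormation no longer tracks these dependencies, the ordering has to live somewhere in the pipeline. One way to keep it explicit is a dependency map and a topological sort. The sketch below uses Python's standard `graphlib`; the `DEPENDS_ON` map is one reading of the list above, and `deployment_waves` is a hypothetical helper name.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# stack -> stacks that must already be deployed (assumed from the order above)
DEPENDS_ON: Dict[str, Set[str]] = {
    "SharedAlbStack": set(),
    "BackendStackQA": {"SharedAlbStack"},
    "SharedVpcLinkStack": {"SharedAlbStack", "BackendStackQA"},
    "BackendStackDev": {"SharedVpcLinkStack"},
    "BackendStackDemo": {"SharedVpcLinkStack"},
    "BackendStackPoc": {"SharedVpcLinkStack"},
    "BackendStackMono": {"SharedVpcLinkStack"},
    "BackendStackProd": {"SharedVpcLinkStack"},
}


def deployment_waves(depends_on: Dict[str, Set[str]]) -> List[List[str]]:
    """Group stacks into waves; stacks in the same wave can deploy in parallel."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the map is circular
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(deployment_waves(DEPENDS_ON), start=1):
        print(number, ", ".join(wave))
```

With this map the first three waves each contain a single stack and the fourth contains the five remaining backend stacks, matching steps 1 to 4.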

Stack deletion order and the trap

Deletion order matters. If you delete the SharedAlbStack before the backend stacks, the backend stacks still have CloudFormation resources that reference the now-deleted ALB listener. Their next cdk diff or deploy will fail.

The safe deletion order is the reverse of creation:

++
# Safe deletion order (reverse of deployment)
STAGE=prod cdk destroy --all
STAGE=mono cdk destroy --all
STAGE=poc cdk destroy --all
STAGE=demo cdk destroy --all
STAGE=dev cdk destroy --all
STAGE=qa cdk destroy --all  # destroys VPC
STAGE=shared cdk destroy --all  # destroys ALB, VPC Link, SSM params
++
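To keep the teardown list from drifting out of sync with the deployment list, the commands can be generated rather than hand-written. A minimal sketch, assuming the stage names shown above and a deploy order of shared first, prod last; `DEPLOY_ORDER` and `destroy_commands` are hypothetical names.

```python
from typing import List, Sequence

# Assumed deployment order of stages, first to last.
DEPLOY_ORDER = ["shared", "qa", "dev", "demo", "poc", "mono", "prod"]


def destroy_commands(deploy_order: Sequence[str]) -> List[str]:
    """Return one `cdk destroy` command per stage, in reverse deploy order."""
    return [
        f"STAGE={stage} cdk destroy --all"
        for stage in reversed(deploy_order)
    ]


if __name__ == "__main__":
    print("\n".join(destroy_commands(DEPLOY_ORDER)))
```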

Security groups: the most common ownership conflict

The shared ALB’s security group needs ingress rules added by each service stack, but those rules affect a resource owned by the SharedAlbStack. This is a cross-stack security group modification.

Two CDK approaches handle this:

Option 1: connections.allow_from (CDK-managed, preferred)

++
# In BackendStackQA — allows the service to talk to the shared ALB
fargate_service.service.connections.allow_from(
    shared_alb_sg,
    ec2.Port.tcp(8000),
    "Shared ALB to backend service",
)
++

This creates an ingress rule on the ECS service’s security group referencing the shared ALB SG. The rule is owned by BackendStackQA, not SharedAlbStack. Clean, no conflicts.

Option 2: add_ingress_rule (manual, use with care)

++
# This adds a rule to the ECS service SG, also owned by BackendStackQA
fargate_service.service.connections.security_groups[0].add_ingress_rule(
    peer=shared_alb_sg,
    connection=ec2.Port.tcp(8000),
    description="Shared ALB to backend",
)
++

Never add ingress rules to a security group owned by a different stack. If code in BackendStackQA adds a rule to shared_alb_sg (owned by SharedAlbStack), CDK synthesizes an ingress rule resource in BackendStackQA that modifies a group managed by SharedAlbStack, which breaks the single-owner model.
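This rule is easy to check mechanically on a synthesized template. The sketch below flags every AWS::EC2::SecurityGroupIngress resource whose GroupId does not point at a resource defined in the same template. It is an illustrative lint under that one heuristic, not an exhaustive ownership check, and the function names are hypothetical.

```python
from typing import Any, Dict, List, Optional


def _local_target(value: Any) -> Optional[str]:
    """Return the logical ID that a Ref / Fn::GetAtt points at, else None."""
    if isinstance(value, dict):
        if "Ref" in value:
            return value["Ref"]
        get_att = value.get("Fn::GetAtt")
        if isinstance(get_att, list) and get_att:
            return get_att[0]
        if isinstance(get_att, str):
            return get_att.split(".")[0]
    return None


def foreign_ingress_rules(template: Dict[str, Any]) -> List[str]:
    """List ingress-rule logical IDs that target a group outside this template."""
    resources = template.get("Resources", {})
    flagged: List[str] = []
    for logical_id, resource in resources.items():
        if resource.get("Type") != "AWS::EC2::SecurityGroupIngress":
            continue
        group_id = resource.get("Properties", {}).get("GroupId")
        # Literal IDs, template parameters, and imports are not local resources.
        if _local_target(group_id) not in resources:
            flagged.append(logical_id)
    return sorted(flagged)
```

Feeding it the JSON from `cdk.out` for each service stack should return an empty list when the stack only adds rules to its own groups.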


Drift detection

CloudFormation drift occurs when someone modifies a resource outside of CloudFormation. With SSM as the cross-stack glue, drift in the SSM parameters themselves can silently break things.

Monitor SSM parameter changes with an EventBridge rule:

++
aws events put-rule \
  --name "ssm-shared-param-changes" \
  --event-pattern '{
    "source": ["aws.ssm"],
    "detail-type": ["Parameter Store Change"],
    "detail": {
      "name": [{
        "prefix": "/shared/"
      }],
      "operation": ["Update", "Delete"]
    }
  }'
++
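The same rule can be created from Python, which keeps the event pattern as a data structure instead of an inline JSON string. A minimal sketch, assuming `events_client` behaves like `boto3.client("events")`; both function names are hypothetical. A rule on its own does nothing until a target (an SNS topic, for example) is attached with a separate call.

```python
import json
from typing import Any, Dict


def shared_param_change_pattern(prefix: str = "/shared/") -> Dict[str, Any]:
    """Event pattern matching updates and deletes of parameters under `prefix`."""
    return {
        "source": ["aws.ssm"],
        "detail-type": ["Parameter Store Change"],
        "detail": {
            "name": [{"prefix": prefix}],
            "operation": ["Update", "Delete"],
        },
    }


def put_shared_param_rule(
    events_client: Any, rule_name: str = "ssm-shared-param-changes"
) -> str:
    """Create or update the rule and return its ARN."""
    response = events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(shared_param_change_pattern()),
    )
    return response["RuleArn"]
```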

For CloudFormation drift detection on the shared stack itself:

++
aws cloudformation detect-stack-drift \
  --stack-name SharedAlbStack

# Check drift results
aws cloudformation describe-stack-drift-detection-status \
  --stack-drift-detection-id <detection-id>
++
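Drift detection is asynchronous, so a scheduled check has to start the detection and then poll for the result. The sketch below wraps the two calls above. It assumes `cfn_client` behaves like `boto3.client("cloudformation")`; the function name, polling interval, and poll limit are illustrative choices.

```python
import time
from typing import Any, Callable


def wait_for_drift_status(
    cfn_client: Any,
    stack_name: str,
    poll_seconds: float = 5.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Start drift detection and return the final stack drift status."""
    detection_id = cfn_client.detect_stack_drift(StackName=stack_name)[
        "StackDriftDetectionId"
    ]
    for _ in range(max_polls):
        status = cfn_client.describe_stack_drift_detection_status(
            StackDriftDetectionId=detection_id
        )
        state = status["DetectionStatus"]
        if state == "DETECTION_COMPLETE":
            return status["StackDriftStatus"]  # e.g. "IN_SYNC" or "DRIFTED"
        if state == "DETECTION_FAILED":
            raise RuntimeError(
                status.get("DetectionStatusReason", "drift detection failed")
            )
        sleep(poll_seconds)  # still DETECTION_IN_PROGRESS
    raise TimeoutError(f"drift detection for {stack_name} did not finish")
```

A nightly job could call `wait_for_drift_status(client, "SharedAlbStack")` and alert on anything other than `IN_SYNC`.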

Any drift in the shared stack should be investigated immediately — manual console changes to the ALB or security group can break all consumer stacks.


SSM vs CloudFormation exports: the decision matrix

+--------------------+--------------------+--------------------+
| Scenario           | Use CF Exports     | Use SSM            |
+--------------------+--------------------+--------------------+
| Stable IDs that    | Fine for small     | Preferred at scale |
| never change       | setups             |                    |
+--------------------+--------------------+--------------------+
| Resources that     | Trap — exporters   | Safe — just update |
| might be replaced  | lock               | the param          |
+--------------------+--------------------+--------------------+
| Multiple consumers | Every consumer     | No locking,        |
|                    | locks exporter     | consumers are      |
|                    |                    | independent        |
+--------------------+--------------------+--------------------+
| Cross-account      | Not supported      | Use resource       |
| sharing            |                    | policies           |
+--------------------+--------------------+--------------------+
| Secrets and        | Never (plaintext   | Use SecureString   |
| credentials        | in outputs)        | type               |
+--------------------+--------------------+--------------------+
| Large teams / many | Becomes            | Scales cleanly     |
| stacks             | unmanageable       |                    |
+--------------------+--------------------+--------------------+

The fundamental difference: CloudFormation exports create a dependency tracked in CloudFormation’s own state machine. SSM parameters are just data — the dependency is conceptual, enforced by deployment ordering in your CI/CD pipeline, not by CloudFormation itself.

For small infrastructure with 3-5 stacks that rarely change, CloudFormation exports are fine. For anything larger — multiple teams, many services, frequent refactoring — SSM is the right choice.