
RDS Proxy

Fixing database connection exhaustion on Aurora PostgreSQL with RDS Proxy — zero code changes, SSM parameter swap.

DATE:
MAR.30.2026
READ:
8 MIN

The symptom

Aurora PostgreSQL’s FreeableMemory metric hit zero. The cluster was spending all its memory managing connections — not running queries. Each App Runner instance and ECS task opens its own connection pool, and with rolling deployments creating overlapping instances, the connection count spiked from a steady 200 to 700+ during deploys.
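
The arithmetic behind the spike, with illustrative numbers (the post doesn't break down pool sizes per service): every overlapping instance brings its own full pool.

```python
def total_connections(instances: int, pool_size: int) -> int:
    """Each instance opens its own pool, so totals scale linearly."""
    return instances * pool_size

# Hypothetical steady state: 10 instances, 20 connections each.
steady = total_connections(10, 20)                # 200

# During a rolling deploy old and new instances run side by side; with a
# couple of services deploying at once, the instance count roughly triples.
deploying = total_connections(10 + 10 + 15, 20)   # 700
```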

The fix isn’t increasing instance size. The fix is pooling connections before they reach Aurora.

RDS Proxy in CDK

RDS Proxy sits between your application and the database. It maintains a warm pool of backend connections and multiplexes client connections across them. The application connects to the proxy endpoint instead of the Aurora endpoint — same port, same protocol, same password. No code changes.
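
Because the proxy is wire-compatible, the client's settings survive the cutover untouched except for the hostname. A minimal sketch of psycopg-style connection settings (the environment variable names are illustrative, not from the stack below):

```python
import os

def connection_kwargs() -> dict:
    """Build database connection settings from the environment.

    Only DB_HOST changes at cutover: Aurora endpoint before,
    proxy endpoint after. Port, user, and password stay the same.
    """
    return {
        "host": os.environ["DB_HOST"],
        "port": int(os.environ.get("DB_PORT", "5432")),
        "user": os.environ["DB_USER"],
        "password": os.environ["DB_PASSWORD"],
        "sslmode": "require",  # the proxy is created with require_tls=True
    }
```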

from aws_cdk import aws_rds as rds
from aws_cdk import aws_iam as iam

proxy_role = iam.Role(
    self, "RdsProxyRole",
    role_name="acmecorp-qa-rds-proxy-role",
    assumed_by=iam.ServicePrincipal("rds.amazonaws.com"),
)
proxy_role.add_to_policy(
    iam.PolicyStatement(
        actions=["secretsmanager:GetSecretValue"],
        resources=[secret_arn],
    )
)

proxy = rds.CfnDBProxy(
    self, "QaRdsProxy",
    db_proxy_name="acmecorp-qa-rds-proxy",
    engine_family="POSTGRESQL",
    role_arn=proxy_role.role_arn,
    auth=[
        rds.CfnDBProxy.AuthFormatProperty(
            auth_scheme="SECRETS",
            secret_arn=secret_arn,
            iam_auth="DISABLED",
        )
    ],
    vpc_subnet_ids=db_subnet_ids,
    vpc_security_group_ids=[proxy_sg.security_group_id],
    require_tls=True,
    idle_client_timeout=1800,
)

The proxy reads database credentials from Secrets Manager — the same secret your application already uses. iam_auth="DISABLED" keeps password authentication, avoiding code changes.

Connection pool tuning

rds.CfnDBProxyTargetGroup(
    self, "QaRdsProxyTargetGroup",
    db_proxy_name=proxy.db_proxy_name,
    target_group_name="default",
    db_cluster_identifiers=[cluster_id],
    connection_pool_configuration_info=(
        rds.CfnDBProxyTargetGroup.ConnectionPoolConfigurationInfoFormatProperty(
            max_connections_percent=80,
            max_idle_connections_percent=50,
            connection_borrow_timeout=120,
        )
    ),
).add_dependency(proxy)

+------------------------------+-------+----------------------------------------------+
| Parameter                    | Value | Effect                                       |
+------------------------------+-------+----------------------------------------------+
| max_connections_percent      | 80%   | Proxy opens at most 80% of Aurora's          |
|                              |       | max_connections                              |
+------------------------------+-------+----------------------------------------------+
| max_idle_connections_percent | 50%   | Keep up to half of max_connections open and  |
|                              |       | idle as a warm pool                          |
+------------------------------+-------+----------------------------------------------+
| connection_borrow_timeout    | 120s  | Wait up to 2 min for a connection before     |
|                              |       | failing                                      |
+------------------------------+-------+----------------------------------------------+
| idle_client_timeout          | 1800s | Drop idle client connections after 30 min    |
+------------------------------+-------+----------------------------------------------+
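
Both percentages are measured against Aurora's max_connections, so the absolute limits are simple arithmetic. Assuming a hypothetical max_connections of 1,000:

```python
def pool_limits(max_connections: int,
                max_connections_percent: int = 80,
                max_idle_connections_percent: int = 50) -> tuple:
    """Absolute connection caps implied by the proxy's percentage settings."""
    backend_cap = max_connections * max_connections_percent // 100
    idle_cap = max_connections * max_idle_connections_percent // 100
    return backend_cap, idle_cap

backend_cap, idle_cap = pool_limits(1000)   # (800, 500)
```

Keeping 20% in reserve leaves headroom for admin sessions and anything else that still connects to Aurora directly.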

The cutover

Zero downtime. No application changes.

  1. Deploy the RDS Proxy stack
  2. Update the secret in Secrets Manager — change host from Aurora endpoint to proxy endpoint
  3. Store the proxy endpoint in SSM: /acmecorp/qa/rds-proxy/endpoint
  4. Redeploy services (App Runner rolling update, ECS rolling update)
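
Step 2 touches exactly one field. RDS-style secrets store connection details as JSON, so the swap can be sketched as follows (a sketch, not the actual cutover tooling; field names follow the standard RDS secret layout):

```python
import json

def point_secret_at_proxy(secret_string: str, proxy_endpoint: str) -> str:
    """Return the secret string with only `host` swapped to the proxy endpoint."""
    secret = json.loads(secret_string)
    secret["host"] = proxy_endpoint
    return json.dumps(secret)
```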

Each service picks up the new host from the secret on restart. The proxy is wire-compatible with PostgreSQL — the application doesn’t know it’s talking to a proxy.

from aws_cdk import aws_ssm as ssm

ssm.StringParameter(
    self, "RdsProxyEndpointParam",
    parameter_name="/acmecorp/qa/rds-proxy/endpoint",
    string_value=proxy.attr_endpoint,
    description="Use this as DB_HOST for QA services",
)

The result

+--------------------------------+--------------+--------+
| Metric                         | Before       | After  |
+--------------------------------+--------------+--------+
| Backend connections to Aurora  | 200-700      | 20-30  |
+--------------------------------+--------------+--------+
| FreeableMemory                 | ~0 bytes     | > 1 GB |
+--------------------------------+--------------+--------+
| Connection time during deploy  | 5-10s spikes | Stable |
+--------------------------------+--------------+--------+
| Code changes required          | —            | Zero   |
+--------------------------------+--------------+--------+

Security group chain

The proxy needs inbound access from your application’s security groups and outbound access to Aurora:

from aws_cdk import aws_ec2 as ec2

proxy_sg = ec2.SecurityGroup(
    self, "RdsProxySg",
    vpc=vpc,
    allow_all_outbound=True,  # outbound to Aurora
)

# App Runner VPC connector
proxy_sg.add_ingress_rule(
    peer=ec2.SecurityGroup.from_security_group_id(
        self, "AppRunnerSg", app_runner_sg_id
    ),
    connection=ec2.Port.tcp(db_port),
)

# VPN and peered VPC CIDRs
for cidr in ["10.0.0.0/16", "172.31.0.0/16"]:
    proxy_sg.add_ingress_rule(
        peer=ec2.Peer.ipv4(cidr),
        connection=ec2.Port.tcp(db_port),
    )

The chain is: Application → Proxy SG → Proxy → Aurora SG → Aurora. Each hop has its own security group rule. The proxy’s security group is the only one that needs to know about both sides.
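
The one rule the snippets above don't show is the final hop: Aurora's own security group must admit the proxy. A sketch, assuming aurora_sg is a handle to the cluster's security group:

```python
# Allow the proxy's backend connections into Aurora. `aurora_sg` is a
# hypothetical handle; proxy_sg and db_port are defined above.
aurora_sg.add_ingress_rule(
    peer=proxy_sg,
    connection=ec2.Port.tcp(db_port),
)
```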

Connection pooling should be infrastructure, not application code. RDS Proxy makes it a deploy, not a refactor.