RDS Proxy
Fixing database connection exhaustion on Aurora PostgreSQL with RDS Proxy — zero code changes, SSM parameter swap.
- DATE: MAR.30.2026
- READ: 8 MIN
The symptom
Aurora PostgreSQL’s FreeableMemory metric hit zero. The cluster was spending all its memory managing connections — not running queries. Each App Runner instance and ECS task opens its own connection pool, and with rolling deployments creating overlapping instances, the connection count spiked from a steady 200 to 700+ during deploys.
The fix isn’t increasing instance size. The fix is pooling connections before they reach Aurora.
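The arithmetic behind the spike is worth making explicit. A minimal sketch, with assumed per-instance pool sizes (the pool size and instance counts below are illustrative assumptions, not measurements from this incident):

```python
# Illustrative only: pool size and instance counts are assumptions.
POOL_SIZE = 20         # connections opened by each instance's pool
STEADY_INSTANCES = 10  # App Runner instances + ECS tasks at steady state

def backend_connections(instances: int, pool_size: int, deploying: bool) -> int:
    """Connections Aurora must manage; a rolling deploy briefly runs
    old and new instances side by side, doubling the count."""
    overlap = 2 if deploying else 1
    return instances * overlap * pool_size

print(backend_connections(STEADY_INSTANCES, POOL_SIZE, deploying=False))  # 200
print(backend_connections(STEADY_INSTANCES, POOL_SIZE, deploying=True))   # 400
```

Even a clean 2x overlap doubles the load; staggered deploys across several services, plus reconnect retries, stack further overlaps on top of that, which is how a steady 200 becomes 700+.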
RDS Proxy in CDK
RDS Proxy sits between your application and the database. It maintains a warm pool of backend connections and multiplexes client connections across them. The application connects to the proxy endpoint instead of the Aurora endpoint — same port, same protocol, same password. No code changes.
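Concretely, only the host in the connection string changes. A sketch with hypothetical endpoints (libpq/psycopg2-style DSN; since the proxy is deployed with require_tls=True, the client must connect with SSL):

```python
def dsn(host: str, port: int = 5432, dbname: str = "app", user: str = "app_user") -> str:
    """Build a libpq-style DSN; require_tls=True on the proxy means
    clients must connect over TLS, hence sslmode=require."""
    return f"host={host} port={port} dbname={dbname} user={user} sslmode=require"

# Hypothetical endpoints for illustration:
before = dsn("acmecorp-qa.cluster-abc123.us-east-1.rds.amazonaws.com")
after = dsn("acmecorp-qa-rds-proxy.proxy-abc123.us-east-1.rds.amazonaws.com")

# Everything except the host token is identical:
diff = {b for b, a in zip(before.split(), after.split()) if b != a}
print(diff)
```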
from aws_cdk import aws_rds as rds
from aws_cdk import aws_iam as iam
proxy_role = iam.Role(
self, "RdsProxyRole",
role_name="acmecorp-qa-rds-proxy-role",
assumed_by=iam.ServicePrincipal("rds.amazonaws.com"),
)
proxy_role.add_to_policy(
iam.PolicyStatement(
actions=["secretsmanager:GetSecretValue"],
resources=[secret_arn],
)
)
proxy = rds.CfnDBProxy(
self, "QaRdsProxy",
db_proxy_name="acmecorp-qa-rds-proxy",
engine_family="POSTGRESQL",
role_arn=proxy_role.role_arn,
auth=[
rds.CfnDBProxy.AuthFormatProperty(
auth_scheme="SECRETS",
secret_arn=secret_arn,
iam_auth="DISABLED",
)
],
vpc_subnet_ids=db_subnet_ids,
vpc_security_group_ids=[proxy_sg.security_group_id],
require_tls=True,
idle_client_timeout=1800,
)

The proxy reads database credentials from Secrets Manager — the same secret your application already uses. iam_auth="DISABLED" keeps password authentication, avoiding code changes.
Connection pool tuning
rds.CfnDBProxyTargetGroup(
self, "QaRdsProxyTargetGroup",
db_proxy_name=proxy.db_proxy_name,
target_group_name="default",
db_cluster_identifiers=[cluster_id],
connection_pool_configuration_info=rds.CfnDBProxyTargetGroup
.ConnectionPoolConfigurationInfoFormatProperty(
max_connections_percent=80,
max_idle_connections_percent=50,
connection_borrow_timeout=120,
),
).add_dependency(proxy)

+------------------------------+-------+------------------------------------------------+
| Parameter                    | Value | Effect                                         |
+------------------------------+-------+------------------------------------------------+
| max_connections_percent      | 80%   | Proxy uses up to 80% of Aurora max_connections |
| max_idle_connections_percent | 50%   | Keep half the pool warm when idle              |
| connection_borrow_timeout    | 120s  | Wait 2 min for a connection before failing     |
| idle_client_timeout          | 1800s | Drop idle client connections after 30 min      |
+------------------------------+-------+------------------------------------------------+
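To put max_connections_percent=80 in absolute terms: Aurora PostgreSQL derives its default max_connections from instance memory as LEAST(DBInstanceClassMemory/9531392, 5000). A rough sketch — in practice DBInstanceClassMemory is somewhat less than the instance's full RAM, so treat the numbers as approximate:

```python
def aurora_max_connections(instance_memory_bytes: int) -> int:
    # Aurora PostgreSQL default: LEAST(DBInstanceClassMemory / 9531392, 5000)
    return min(instance_memory_bytes // 9531392, 5000)

def proxy_backend_cap(max_connections: int, percent: int = 80) -> int:
    """Hard ceiling on backend connections the proxy will open."""
    return max_connections * percent // 100

# e.g. an instance with 16 GiB (roughly a db.r6g.large), under the
# simplification that all of it counts as DBInstanceClassMemory:
mc = aurora_max_connections(16 * 1024**3)
print(mc, proxy_backend_cap(mc))  # prints: 1802 1441
```

The remaining 20% headroom is left for admin sessions and anything else that connects directly to the cluster endpoint.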
The cutover
Zero downtime. No application changes.
- Deploy the RDS Proxy stack
- Update the secret in Secrets Manager — change host from the Aurora endpoint to the proxy endpoint
- Store the proxy endpoint in SSM: /acmecorp/qa/rds-proxy/endpoint
- Redeploy services (App Runner rolling update, ECS rolling update)
Each service picks up the new host from the secret on restart. The proxy is wire-compatible with PostgreSQL — the application doesn’t know it’s talking to a proxy.
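The restart pickup can be sketched as follows. An RDS-managed secret in Secrets Manager is a JSON document with host, port, username, password, and dbname keys; at startup the service maps those onto driver kwargs. The fetch itself would use boto3's secretsmanager get_secret_value (omitted here), and the endpoint value below is hypothetical:

```python
import json

# Example payload in the shape of an RDS-managed secret (hypothetical values).
secret_string = json.dumps({
    "engine": "postgres",
    "host": "acmecorp-qa-rds-proxy.proxy-abc123.us-east-1.rds.amazonaws.com",
    "port": 5432,
    "dbname": "app",
    "username": "app_user",
    "password": "redacted",
})

def connection_kwargs(secret_string: str) -> dict:
    """Map secret fields onto typical driver kwargs. During the cutover
    only the host value changed; this code stays identical."""
    s = json.loads(secret_string)
    return {
        "host": s["host"],
        "port": s["port"],
        "dbname": s["dbname"],
        "user": s["username"],
        "password": s["password"],
    }

print(connection_kwargs(secret_string)["host"])
```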
from aws_cdk import aws_ssm as ssm

ssm.StringParameter(
self, "RdsProxyEndpointParam",
parameter_name="/acmecorp/qa/rds-proxy/endpoint",
string_value=proxy.attr_endpoint,
description="Use this as DB_HOST for QA services",
)

The result
+-------------------------------+--------------+--------+
| Metric                        | Before       | After  |
+-------------------------------+--------------+--------+
| Backend connections to Aurora | 200-700      | 20-30  |
| FreeableMemory                | ~0 bytes     | > 1 GB |
| Connection time during deploy | 5-10s spikes | Stable |
| Code changes required         | —            | Zero   |
+-------------------------------+--------------+--------+
Security group chain
The proxy needs inbound access from your application’s security groups and outbound access to Aurora:
from aws_cdk import aws_ec2 as ec2

proxy_sg = ec2.SecurityGroup(
self, "RdsProxySg",
vpc=vpc,
allow_all_outbound=True,
)
# App Runner VPC connector
proxy_sg.add_ingress_rule(
peer=ec2.SecurityGroup.from_security_group_id(
self, "AppRunnerSg", app_runner_sg_id
),
connection=ec2.Port.tcp(db_port),
)
# VPN and peered VPC CIDRs
for cidr in ["10.0.0.0/16", "172.31.0.0/16"]:
proxy_sg.add_ingress_rule(
peer=ec2.Peer.ipv4(cidr),
connection=ec2.Port.tcp(db_port),
)

The chain is: Application → Proxy SG → Proxy → Aurora SG → Aurora. Each hop has its own security group rule. The proxy’s security group is the only one that needs to know about both sides.
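The Aurora side of the chain isn't shown above. Assuming aurora_sg references the cluster's security group (a name not in the snippets above), and given allow_all_outbound=True on the proxy SG, the one remaining rule is Aurora's inbound — a sketch:

```python
# Assumed: aurora_sg is the Aurora cluster's security group.
# The proxy's outbound is already open, so only Aurora's inbound rule remains.
aurora_sg.add_ingress_rule(
    peer=proxy_sg,
    connection=ec2.Port.tcp(db_port),
    description="RDS Proxy -> Aurora",
)
```

With this rule in place, direct application-to-Aurora access can eventually be revoked, forcing all traffic through the proxy.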
Connection pooling should be infrastructure, not application code. RDS Proxy makes it a deploy, not a refactor.