Question: Does anyone have experience running HA NAT instances, and how did you set them up? Basically, I need to run my own NAT instances with automatic failover in case something breaks. I ran similar setups on-prem years ago, but never on EC2. (See the context below for why I want this.)
Context: I have a cloud-hosted DB cluster that is operated by a 3rd party, and I need to access it through VPC peering. VPC interface endpoints are not an option, since the client needs to communicate directly with the cluster node holding the data and is smart enough to find a node in the same AZ, so we pay almost no data transfer costs.
I would rather not open up the subnet holding the clients to the hosting provider's VPC, since a service in the same subnet has an allow-everything security group configured (because we were hitting the ENI connection tracking limit). Also, as a general policy, I would like to keep 3rd-party peerings isolated.
A simple solution to the problem is to add one more subnet to the VPC and put a NAT GW there. The new subnet is the only one whose route table allows direct communication with the peered VPC, while every other subnet in our own VPC reaches the peer VPC through the NAT GW.
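To make the routing idea concrete, here is a rough sketch that models the two route tables in plain Python and resolves the next hop by longest-prefix match. The CIDRs (10.0.0.0/16 for our VPC, 172.16.0.0/16 for the peer VPC) and the target names are made up for illustration; this is not real AWS configuration.

```python
import ipaddress

# Hypothetical CIDRs and target names, for illustration only.
# Route table used by the client subnets: peer traffic goes to the NAT GW.
CLIENT_ROUTES = {
    "10.0.0.0/16": "local",
    "172.16.0.0/16": "nat-gateway",
}

# Route table used only by the dedicated NAT subnet: peer traffic goes
# straight over the peering connection.
NAT_SUBNET_ROUTES = {
    "10.0.0.0/16": "local",
    "172.16.0.0/16": "peering-connection",
}


def next_hop(route_table, destination):
    """Return the target of the most specific route matching destination."""
    dest = ipaddress.ip_address(destination)
    matches = []
    for cidr, target in route_table.items():
        network = ipaddress.ip_network(cidr)
        if dest in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None  # no route to the destination
    # Longest prefix wins, as in a VPC route table.
    return max(matches)[1]


db_node = "172.16.5.10"  # hypothetical DB node address in the peer VPC
print("client subnet ->", next_hop(CLIENT_ROUTES, db_node))
print("NAT subnet    ->", next_hop(NAT_SUBNET_ROUTES, db_node))
```

With this layout only the NAT subnet is directly exposed to the peer VPC, and the peer sees traffic coming from the NAT's address rather than from the client subnets.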
The above solution would be simple if not for the fact that we're pushing about 1 GB/s of traffic to the other VPC. The DB cluster itself costs a little under $20k per month, but the per-GB traffic charges with a NAT GW would add up to a lot. If I ran my own NAT instances instead, it would cost a few thousand per month, since I would only pay for the EC2 instances.
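For a rough sense of scale, here is a back-of-the-envelope calculation. It assumes the 1 GB/s is a sustained average, a 30-day month, and a NAT GW data processing price of $0.045 per GB (my assumption; check the current price for your region). It ignores the hourly NAT GW charge.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_MONTH = 30  # assumption: 30-day month


def monthly_nat_gw_processing_cost(throughput_gb_per_s, price_per_gb):
    """Estimate the monthly NAT GW data processing charge in USD."""
    gb_per_month = throughput_gb_per_s * SECONDS_PER_DAY * DAYS_PER_MONTH
    return gb_per_month * price_per_gb


# Assumed inputs: 1 GB/s sustained, $0.045 per GB processed.
cost = monthly_nat_gw_processing_cost(1.0, 0.045)
print(f"Estimated NAT GW data processing: ${cost:,.0f} per month")
```

Under those assumptions that is about 2.6 million GB per month, or roughly $117k, several times the cost of the DB cluster itself.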