How to Replace Your NAT Gateway with VNS3 NATe

From Wikipedia:

“Network address translation (NAT) is a method of mapping an IP address space into another by modifying network address information in the IP header of packets while they are in transit across a traffic routing device. The technique was originally used to avoid the need to assign a new address to every host when a network was moved, or when the upstream Internet service provider was replaced, but could not route the networks address space. It has become a popular and essential tool in conserving global address space in the face of IPv4 address exhaustion. One Internet-routable IP address of a NAT gateway can be used for an entire private network.”

Cohesive Networks introduced the NATe offering into our VNS3 lineup of network devices back in March. It lowers operational costs while adding functionality and increasing visibility. Easy to deploy and manage, it should be a no-brainer once you consider its functional gains and lower spend rate. Some of our large customers have already started the migration and are seeing savings in the tens of thousands, hundreds of thousands and even millions of dollars.

The AWS NAT Gateways provide bare-bones functionality at a premium cost. They simply provide a drop-in NAT function on a per-availability-zone basis within your VPC, nothing more. No visibility, no egress controls, and lots of hidden costs. You get charged between $0.045 and $0.093 an hour depending on the region. You get charged the same amount per gigabyte of data that they ‘process’, meaning data coming in and going out. That’s it, and it can really add up. A VPC with two availability zones will cost you $788.40 a year before data tax in the least expensive regions, going up to double that in the most expensive regions. Now consider that across tens, hundreds or thousands of VPCs. That’s some real money.

With Cohesive Networks VNS3 NATe you can run the same two availability zones on two t3a.nano instances with 1-year reserved instances for as low as $136.66 per year, with no data tax, as EC2 instances do not incur inbound data fees. That is about a sixth to a tenth of the price, depending on the region you are running in.
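The yearly NAT Gateway figures above can be checked with a few lines of arithmetic. This is only a sketch: it assumes a 365-day year and uses the hourly rates and the $136.66 NATe figure exactly as quoted in this post. The function and constant names are made up for illustration.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annual_nat_gateway_cost(hourly_rate: float, zones: int) -> float:
    """Yearly NAT Gateway runtime cost for one VPC, before data processing fees."""
    return hourly_rate * HOURS_PER_YEAR * zones


# Hourly rates quoted in this post for the least and most expensive regions.
low = annual_nat_gateway_cost(0.045, zones=2)
high = annual_nat_gateway_cost(0.093, zones=2)

# Yearly NATe figure quoted in this post (two reserved t3a.nano instances).
NATE_ANNUAL = 136.66

print(f"NAT Gateway, least expensive region: ${low:,.2f} per year")
print(f"NAT Gateway, most expensive region:  ${high:,.2f} per year")
print(f"NAT Gateway vs NATe, least expensive region: {low / NATE_ANNUAL:.1f}x")
```

The ratio is only computed for the least expensive region, because the NATe figure is quoted as an "as low as" price.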

As a Solutions Architect at Cohesive Networks I’ve worked with enterprise customers around the world and understand the difficulty and challenges of changing existing architecture and cloud design. Cloud vendor prescribed architecture is not always easy to replace, as there are upstream and downstream dependencies. The really nice thing about swapping your NAT Gateways for VNS3 NATe devices is that it is a true drop-in replacement for a service that is so well defined. It can be accomplished programmatically to provide a near-zero-downtime replacement. Then you can start to build upon all the new things that VNS3 NATe gives you.

The process of replacement is very straightforward:

  1. First, deploy a VNS3 NATe in a public subnet for each availability zone that you have in your VPC.
  2. Configure its security group to allow all traffic from the CIDR ranges of your private subnets.
  3. You do not need to install a key pair.
  4. Once launched, turn off source / destination checking under instance networking.
  5. Next, repoint any VPC route table rules, typically 0.0.0.0/0, from the existing NAT Gateway to the Elastic Network Interface of your VNS3 NATe instance.
  6. Delete the NAT Gateway so as to free up the Elastic IP that is attached to it.
  7. Finally, associate the Elastic IP with your VNS3 NATe instance.

The only downtime will be the 30 or so seconds that it takes to delete the NAT Gateway.
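Steps 4 through 7 could be scripted roughly as follows. This is a sketch, not a tested migration tool. It assumes the NATe instance is already launched (steps 1 to 3) and that `ec2` behaves like a boto3 EC2 client. The function name and every identifier passed in are hypothetical. The route is pointed at the instance ID, which assumes the NATe instance has a single network interface.

```python
import time


def replace_nat_gateway(ec2, *, nate_instance_id, route_table_ids,
                        nat_gateway_id, allocation_id,
                        poll_seconds=5, timeout_seconds=300):
    """Swap a NAT Gateway for an already-launched VNS3 NATe instance.

    `ec2` is expected to behave like a boto3 EC2 client (assumption).
    """
    # Step 4: disable source/destination checking so the instance can forward traffic.
    ec2.modify_instance_attribute(
        InstanceId=nate_instance_id, SourceDestCheck={"Value": False}
    )

    # Step 5: repoint the default route of each private route table at the NATe instance.
    for route_table_id in route_table_ids:
        ec2.replace_route(
            RouteTableId=route_table_id,
            DestinationCidrBlock="0.0.0.0/0",
            InstanceId=nate_instance_id,
        )

    # Step 6: delete the NAT Gateway, then poll until it reports "deleted",
    # at which point its Elastic IP is free to be reused.
    ec2.delete_nat_gateway(NatGatewayId=nat_gateway_id)
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = ec2.describe_nat_gateways(NatGatewayIds=[nat_gateway_id])
        states = [gateway["State"] for gateway in response["NatGateways"]]
        if all(state == "deleted" for state in states):
            break
        if time.monotonic() > deadline:
            raise TimeoutError(f"{nat_gateway_id} was not deleted in time")
        time.sleep(poll_seconds)

    # Step 7: move the freed Elastic IP onto the NATe instance.
    ec2.associate_address(AllocationId=allocation_id, InstanceId=nate_instance_id)
```

With boto3 installed, a call might look like `replace_nat_gateway(boto3.client("ec2"), nate_instance_id=..., route_table_ids=[...], nat_gateway_id=..., allocation_id=...)`. Taking the client as a parameter keeps the function easy to dry-run against a stub before touching a live VPC.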

One safeguard we always recommend to our customers is to set up a CloudWatch Recovery Alarm on all VNS3 instances. This will protect your AWS instances from any underlying hardware and hypervisor failures, which will give you effectively the same uptime assurances as services like NAT Gateway. If the instance “goes away” the alarm will trigger an automatic recovery, including restoring the Elastic IP, so that VPC route table rules remain intact, as does state, since restoration occurs from run time cache.
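A recovery alarm of this kind can be expressed as the parameters of CloudWatch's PutMetricAlarm call. The sketch below only builds those parameters. The alarm name, the one-minute period and the two evaluation periods are assumptions chosen for illustration, so adjust them to your own standards.

```python
def recovery_alarm_params(instance_id: str, region: str) -> dict:
    """Build keyword arguments for a CloudWatch alarm that auto-recovers an EC2 instance."""
    return {
        # Hypothetical naming scheme, not a VNS3 or AWS convention.
        "AlarmName": f"vns3-auto-recover-{instance_id}",
        "Namespace": "AWS/EC2",
        # System status check failures indicate host or hypervisor problems.
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,              # assumption: evaluate every minute
        "EvaluationPeriods": 2,    # assumption: two failed minutes in a row
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # Built-in EC2 recover action for the instance's region.
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }
```

With boto3 installed, the alarm could then be created with `boto3.client("cloudwatch").put_metric_alarm(**recovery_alarm_params(instance_id, region))`.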

Now you can log into your VNS3 NATe device by going to:

https://<elastic ip>:8000

username: vnscubed

password: <ec2 instance-id>

Head over to the Network Sniffer page from the link on the left-hand side of the page and set up a trace for your private subnet range to get visibility into your NATe traffic.

Cloud Instance Quality vs. Cloud Platform Cost-at-Scale

What is the failure rate of cloud instances at Amazon, Azure, Google?

I have looked for specific numbers – but so far found just aggregate “nines” for cloud regions or availability zones. So my anecdotal response is “for the most part, a REAL long time”. It is not unusual for us to find customers’ Cohesive network controllers running for years without any downtime. I think the longest we have seen so far is six years of uptime. 

So – with generally strong uptimes for instance-based solutions, and solid HA and recovery mechanisms for cloud instances – how much premium should you spend on some of the most basic “cloud platform solutions”?

Currently cloud platforms are charging a significant premium for some very basic services which do not perform that differently from, and in some cases I would argue perform less well than, instance-based solutions, whether home-grown or 3rd-party vended.

Let’s look at a few AWS examples:

  • NAT Gateway 4.5 cents per hour plus a SIGNIFICANT data tax of 4.5 cents per gigabyte
  • Transit Gateway VPC Connection 5 cents per hour for each VPC connection plus a HEALTHY data tax of 2 cents per gigabyte
  • AWS Client VPN $36.50 per connected client (on a full-time monthly basis), $72 per month to connect those VPN users to a single subnet! (AWS does calculate your connected client costs at 5 cents per hour, but since we should all basically be on VPNs at all times, how much will this save you?)

NOTE: The items I call “data taxes” are on top of the cloud outbound network charges you pay (still quite hefty on their own). 

If you are using cloud at scale, depending on the size of your organization, the costs of these basic services get really big, really fast. At Cohesive we have customers that are spending high six figures, and even seven figures, in premium on these types of services. The good news is that for a number of those customers it is increasingly “were spending”, as they move to equally performant, more observable, instance-based solutions from Cohesive.

Here is a recent blog post from Ryan at Cohesive providing an overview of Cohesive NATe nat-gateway instances versus cloud platforms. For many, a solution like this seems to meet the need.

Although – I think Ryan’s post may have significantly underestimated the impact of data taxes.  https://twitter.com/pjktech/status/1372973836539457547

    So you say “Yes, instance availability is really good, but what about [fill in your failure scenarios here] ?”

    Depending on how small your recovery windows need to be, there are quite a range of HA solutions to choose among. Here are a few examples:

    • Protect against fat-finger termination and automation mistakes with auto-scale groups of 1 and termination protection
    • Use AWS CloudWatch and EC2 Auto Recovery to protect against AWS failures
    • Run multiple instances and add in a Network Load Balancer for still significant savings
    • Use the Cohesive HA plugin, allowing one VNS3 Controller instance to take over for another (with proper IAM permissions)

    Overall, this question is a “modern” example of the “all-in” vs. “over-the-top” tension I wrote about in 2016 (still available on Medium). More simply put now, I think of the choice as being when you run “on the cloud” and when you choose to run “in the cloud”, and ideally it is not all or none either way.

    In summary, given how darn good the major cloud platforms are at their basic remit of compute, bulk network transport, and object storage, do you need to be “in the cloud” at a significant expense premium, or can you be “on the cloud” for significant savings at scale for a number of basic services?

    NATe: A Tax-Free Alternative to Cloud NAT Gateways

    Whether you need to connect multiple cloud instances, communicate with the public internet from private resources, or directly connect to instances in local data centers, chances are you will be using Network Address Translation (NAT) to make that connection. All major cloud providers provide some product or service to provide NAT functionality, and some platforms even provide separate public and private variants. Because cloud instances running in private subnets are unable to access resources like time servers, webpages, or OS repositories without NAT functionality, most users find themselves relying on their cloud platform’s NAT offerings. By simply following their cloud providers’ recommended best practices, users are overpaying for an overcomplicated and inflexible service that a home cable modem does for free.  So why pay so much for such a simple network function?

    If You’re Using Cloud Platform NAT Gateway(s), You’re Overspending on Cloud Deployments.

    Overspending of any kind in the wake of the economic disruption caused by the COVID-19 pandemic can be deadly for any business. Yes, some have fared better than others during this challenging time but all organizations have revisited projections and budgets in the face of uncertainty. According to Gartner, the pressure is on for budget holders to optimize costs.

    Pre COVID-19: 54% of enterprises planned IT budget increases
    Post COVID-19: 84% of enterprises expect IT budget decreases

    Where to Start?

    Look to the sky! Your cloud bill is likely full of opportunities for savings, especially if your application relies on NAT functionality. Using AWS NAT Gateway pricing as an example, let’s start with the comparative base subscription costs:

                             AWS NAT Gateway    VNS3 NATe
    Subscription             $0.045 / hour      $0.01 / hour*
    Data Processing (TAX)    $0.045 / GB        $0.00 / GB

    * Price includes runtime fees (on-demand t3.nano $0.0052 / hr) + NATe subscription ($0.005 / hr)

    As you can see from this example, the standalone subscription cost of an AWS NAT Gateway is more than the cost of a single t3.medium instance. The already low VNS3 NATe subscription cost will provide you even more savings when you consider the fact that you don’t have to create as many individual NAT Gateways, each of which would be accompanied by an additional AWS NAT Gateway subscription. The cost differential here makes NATe an obvious choice at any deployment scale, and we even offer a free NATe license for smaller deployments.

    VNS3 NATe is also incredibly scalable because we don’t increase our data processing rates as your bandwidth needs scale.  Below is a pricing table that shows the total cost of running a single NAT Gateway vs a VNS3 NATe instance as the traffic throughput increases in a given month:

    GB / Month    AWS NAT Gateway    VNS3 NATe
    1             $32.45             $7.20
    10            $32.85             $7.20
    100           $36.90             $7.20
    1,000         $77.40             $7.20
    5,000         $257.40            $7.20
    10,000        $482.40            $7.20
    50,000        $2,282.40          $7.20
    100,000       $4,532.40          $7.20
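The table can be reproduced from the hourly and per-GB rates quoted earlier. The sketch below assumes a 720-hour (30-day) month, which is the value that matches the published figures. The function names are made up for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

HOURS_PER_MONTH = 720  # assumption: 30-day month, which matches the table
CENT = Decimal("0.01")


def nat_gateway_monthly(gb: int) -> Decimal:
    """AWS NAT Gateway: $0.045 per hour plus $0.045 per GB processed."""
    total = Decimal("0.045") * HOURS_PER_MONTH + Decimal("0.045") * gb
    return total.quantize(CENT, rounding=ROUND_HALF_UP)


def nate_monthly(gb: int) -> Decimal:
    """VNS3 NATe: roughly $0.01 per hour, with no per-GB data processing fee."""
    return (Decimal("0.01") * HOURS_PER_MONTH).quantize(CENT, rounding=ROUND_HALF_UP)


for gb in (1, 10, 100, 1_000, 5_000, 10_000, 50_000, 100_000):
    print(f"{gb:>9,} GB  NAT Gateway ${nat_gateway_monthly(gb):>9,}  NATe ${nate_monthly(gb)}")
```

`Decimal` is used instead of floats so that amounts such as $32.445 round to the cent the same way the table does.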

    We also have customers who maintain hundreds or thousands of VPCs with NAT requirements of 1-100 GB per month. Those enterprise cloud customers at scale have typically seen costs drop to 1/5 of what they would pay for AWS NAT Gateways. To illustrate these savings, take the example of one of our customers, who has 1800 VPCs, each with a NAT Gateway. The total data processed through these NAT Gateways is low, averaging 10 GB / month each. Deployments that pass more traffic out the NAT device have much more potential for savings.

                             AWS NAT Gateway    VNS3 NATe
    Monthly Runtime          $58,320            $12,960*
    Data Processing (TAX)    $810               $0
    TOTAL PER MONTH          $59,130            $12,960

    * Price includes runtime fees (on-demand t3.nano $0.0052 / hr) + NATe subscription ($0.005 / hr)

    Total NATe savings in this case are about $46K per month, or $554K per annum.

    Of course, cost savings are not limited to just NAT Gateway spend. Other opportunities for savings include right sizing instances (latest generation instance families are typically less expensive), decommissioning unused services/resources (I’m looking at you, load balancers), and reviewing storage strategies (such as EBS).
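The 1800-VPC example works out as follows. This is a sketch using the rates quoted in this post, again assuming a 720-hour month and one NAT device per VPC. The variable names are illustrative.

```python
from decimal import Decimal

VPCS = 1800
GB_PER_VPC = 10        # average data processed per VPC per month
HOURS_PER_MONTH = 720  # assumption: 30-day month

# AWS NAT Gateway: hourly runtime plus the per-GB data processing fee.
nat_runtime = VPCS * Decimal("0.045") * HOURS_PER_MONTH
nat_data = VPCS * GB_PER_VPC * Decimal("0.045")
nat_total = nat_runtime + nat_data

# VNS3 NATe: hourly runtime only, no data processing fee.
nate_total = VPCS * Decimal("0.01") * HOURS_PER_MONTH

monthly_savings = nat_total - nate_total
annual_savings = monthly_savings * 12

print(f"NAT Gateway total per month: ${nat_total:,.2f}")
print(f"NATe total per month:        ${nate_total:,.2f}")
print(f"Savings per month:           ${monthly_savings:,.2f}")
print(f"Savings per annum:           ${annual_savings:,.2f}")
```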

    What is a NAT Gateway?

    A NAT Gateway is a network service that performs a simple network function: Network Address Translation for cloud-based servers running in a private network (private VPC subnet). Here is the AWS documentation detailing the NAT Gateway functionality. NAT Gateways perform a specific type of NAT called IP Masquerading, where devices in a private IP network use a single public IP associated with the gateway for communication with the public Internet.

    This is the same function that your home modem performs for free. You’re likely leveraging this NAT functionality as you read this post. Basically, the NAT functionality on a NAT Gateway or your home modem allows devices on a private network (computers, phones, TVs, refrigerators, toothbrushes, etc. in the case of your home network) to access the Internet and receive responses, but does not allow devices on the public Internet to initiate connections into your private network. All traffic sent from the private network to the public Internet uses the modem’s public IP address.

    NATe to the Rescue!

    In response to direct requests by our customers, we created a low-cost, instance-based alternative to NAT Gateways – VNS3 NATe.

    Available on the AWS Marketplace and Azure Marketplace today:

    * No subscription premium but total throughput limited to 50 Mbps

    What is a NATe?

    NATe instances are drop-in replacements from Cohesive Networks for NAT Gateways. Simply launch in a VPC/VNET subnet with an Internet Gateway associated, Stop Src/Dst checking (enable IP forwarding), and update the Route Tables associated with the private Subnets to point 0.0.0.0/0 destinations at the NATe instance-id.


    NATe provides all the functionality of a NAT Gateway plus enterprise grade security and controls at a fraction of the cost. Some of the functional highlights of NATe include:

    • High Performance – run on the smallest instance sizes to maximize value or larger instance for greater total throughput
    • Secure – access to a firewall to allow additional and orthogonal policy enforcement for traffic flows
    • Control – access logs, network tools like tcpdump, status information
    • Customize – leverage the Cohesive Networks Plugin system to add L4-L7 network services to the NATe instance like NIDS, WAF, Proxy, LB, etc.
    • Automate – fully automate the deployment of VNS3 NATe instances as part of your existing deployment framework, leveraging the RESTful API to reduce implementation costs
    • Failover – NATe can be configured in a number of HA architectures to provide the same level of assurance needed for critical infrastructure via instance auto recovery, auto scale groups, and Cohesive Networks’ own Peering and HA Container functionality
    • Upgrade – NATe is fully upgradeable to fully licensed VNS3 controllers deployed as a single application security controller or part of secure network edge mesh

    Still Not Convinced?

    Cohesive’s NATe offers a dramatically more cost-efficient solution to often critical NAT requirements in cloud deployments of all shapes and sizes. NATe is more flexible, more scalable, and easier to manage than first-party cloud NAT gateways that are charging you a premium for the functionality of a standard consumer modem. If you don’t believe us, we launched a free version of our NATe offering in both the AWS and Azure marketplaces so you can launch and configure them and see for yourself!

    Have questions about set-up or pricing? Please contact us.

    Managing DNS for Remote VPN Users in AWS Route53 with VNS3

    Managing DNS can be a fairly complex and daunting task. Installing and configuring BIND takes time and knowledge and requires maintenance. Infoblox is expensive and likely overkill for smaller projects. Cloud vendors like AWS have simplified offerings that are easy to use and scale with your needs. They offer public and private zone management with features like split horizon. Split horizon allows Domain Name Systems to provide different information based on the source address of the requestor. For example, if you are coming from the internet at large you would receive the public IP address of the named system you are looking up, but if you were in the same private subnet as that system you would receive its private IP address. This allows you to define how users get to systems depending on where they are.

    Let’s take the example of a remote VPN connection. With VNS3 People VPN you can easily connect your workforce to your cloud assets, be they across regions and/or vendors, giving you a secure entry point to your company’s computational resources. VNS3 makes it easy to push DNS settings to connected clients so that they are told that their DNS server is the address of the VNS3 security controller. So now we have connected clients making DNS calls to VNS3. But hold on, VNS3 isn’t a DNS server. Well, it can be through its plugin system, but that’s a different topic for another blog post. In this scenario we can divert all incoming DNS traffic through use of the VNS3 firewall.

    Let’s say that our VNS3 overlay address space is 172.16.0.0/24, which is what we are using for our remote VPN users, and our VPC is 10.0.0.0/24. In this case there are two addresses that we care about: 172.16.0.253, which is the Virtual IP of the VNS3 security controller, and 10.0.0.2, which is the AWS VPC Route53 Resolver or DNS endpoint. In AWS the DNS endpoint will always be the .2 address of your VPC address space (the base of the VPC network range plus two). So our firewall rules will look like this:

    PREROUTING_CUST -i tun0 -p tcp -s 172.16.0.0/24 --dport 53 -j DNAT --to 10.0.0.2:53
    PREROUTING_CUST -i tun0 -p udp -s 172.16.0.0/24 --dport 53 -j DNAT --to 10.0.0.2:53

    Here we are saying that traffic coming in on the tun0 interface (overlay network) from 172.16.0.0/24 (overlay address space) bound for UDP and TCP port 53 (DNS) should be forwarded to 10.0.0.2 on UDP and TCP port 53 (AWS VPC DNS endpoint).
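If you manage several VPCs, the two rules can be generated rather than typed by hand. This is a sketch. It derives the resolver address as the VPC network base plus two, as described in this post, and formats rules in the same shape as the ones shown there. The function names and the `tun0` default are illustrative assumptions.

```python
import ipaddress


def vpc_resolver(vpc_cidr: str) -> str:
    """Return the VPC DNS endpoint: the base of the VPC network range plus two."""
    network = ipaddress.ip_network(vpc_cidr)
    return str(network.network_address + 2)


def dns_redirect_rules(overlay_cidr: str, vpc_cidr: str, interface: str = "tun0") -> list:
    """Build one DNAT rule per protocol that diverts overlay DNS traffic to the VPC resolver."""
    resolver = vpc_resolver(vpc_cidr)
    return [
        f"PREROUTING_CUST -i {interface} -p {protocol} -s {overlay_cidr} "
        f"--dport 53 -j DNAT --to {resolver}:53"
        for protocol in ("tcp", "udp")
    ]


for rule in dns_redirect_rules("172.16.0.0/24", "10.0.0.0/24"):
    print(rule)
```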

    OK, so now that we have our remote VPN DNS requests being diverted to the VPC DNS endpoint, we need to configure our responses. In Route53 you can configure any zone name you want so long as it is private. For public zones you will need to own the domain name, but for private zones you can do what you want. This can be very useful where you might have a secure IPsec connection to a partner network and want to use DNS names that reflect your partner’s name and configure addresses across your tunnels. You can set up as many private zones as you want. Once they have been set up, it is just a matter of associating them with the VPC that your VNS3 security controller resides in. You will now have custom DNS naming for your remote workforce.
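Associating existing private zones with the controller's VPC can also be scripted. The sketch below assumes `route53` behaves like a boto3 Route 53 client and that the zones already exist. The function name and all IDs are hypothetical.

```python
def associate_private_zones(route53, zone_ids, vpc_id, vpc_region):
    """Associate each private hosted zone with the VPC that holds the VNS3 controller.

    `route53` is expected to behave like a boto3 Route 53 client (assumption).
    """
    associated = []
    for zone_id in zone_ids:
        route53.associate_vpc_with_hosted_zone(
            HostedZoneId=zone_id,
            VPC={"VPCRegion": vpc_region, "VPCId": vpc_id},
        )
        associated.append(zone_id)
    return associated
```

With boto3 installed, a call might look like `associate_private_zones(boto3.client("route53"), zone_ids, vpc_id, vpc_region)`.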

    Managing AWS Workspaces with VNS3

    Cloud and network virtualization have created the opportunity to have virtual networks that transit your applications and staff to, through and across the clouds. These networks can stretch across the globe, spanning from a few to tens of locations (points of presence) or more. In the case of Cohesive Networks, our virtual networks are used to create cryptographically secure overlay networks in full mesh architectures. When implementing the cryptographic mesh (at-scale machine-to-machine VPN) it is critical that the cryptographic credentials can be easily managed across the controller mesh. Our goal at Cohesive is to make managing the credentials straightforward and clear: associating credentials with users via tagging, enabling/disabling credentials so that they can only be used when desired, tracking checked out/in state to help manage via automation, checking log information for specific credentials, and managing certificate revocation. Below is a short video showing the key elements of straightforward key state management in an N-way VNS3 controller mesh.

    Hopefully the video highlights the essential key state management capabilities we have strived for. They are part of the foundation of the VNS3 Controllers, which are used to build a wide array of service edge use cases. With VNS3 encrypted topologies combined with our plug-and-play security system, you or your managed service provider can achieve both Workload and Workforce mobility using secure network virtualization.

    AWS re:Invent 2019 Recap

    Last week was AWS’s annual re:Invent conference in the putatively beautiful and blissful Las Vegas. Andy Jassy, AWS’s CEO, announced plenty of new products and features to excite and alarm the computing and soft-warring world. The conference also highlighted AWS’s leadership in highly resilient software architecture and design with their launch of the AWS Builders’ Library. Let’s run over some of the highlights.

    Cloud Descending Back to Earth via New Edge Environments: AWS Local Zones, Outposts, and Wavelength

    AWS launched two new environment types this year: AWS Local Zones and Wavelength. Local Zones were spurred by AWS customers requiring ultra-low latency for their compute, notably gaming companies based in L.A., where the first Local Zone is now generally available. New zones will come online as customer demand in a city necessitates. Wavelength is an AWS environment colocated with telecom infrastructure, providing access to 5G endpoints. The general availability of AWS Outposts, a rack of AWS servers providing AWS on premises, was also announced, enabling the rollout of Local Zones and Wavelength in fairly short order. AWS Outposts enable companies to test deployments in cloud-like environments without fully committing to the cloud, and give customers like Morningstar and Philips Healthcare ultra-low latency, hyper-local availability zones.

    These environments showcase a new battle for the edge. AWS basically won the general compute cloud race, but we now find different telecommunication and networking competitors offering edge environments, with startups the likes of Packet and Vapor IO joining the race. As developers gain access to these new endpoints, along with increased networking capabilities and incredibly low hyper-local latencies, we are sure to see a revolutionary new age of applications and services.

    We Have a Size for That: New Compute Instance Types

    Amazon launched multiple new instance types including Graviton2 instances and EC2 Inf1 instances. The new Graviton2 instances boast a whopping 40% price performance improvement. They are based on the ARM architecture, effectively challenging Intel and AMD’s dominance in the chip space, and are combined with the Nitro System security chip to support encrypted EBS storage volumes by default. The EC2 Inf1 instances are dedicated Machine Learning inference instance types, effectively challenging Nvidia’s domination of the market with their GPUs. AWS promises that these chips provide a significant increase in throughput and price performance relative to Nvidia-powered instance types.

    AWS Continues to March into SaaS Markets With New Machine Learning Services

    Also announced were multiple ML-based services including CodeGuru for automated code reviews, Fraud Detector for automated fraud detection, Kendra for search indexing, Transcribe Medical for call transcription in the medical industry and Augmented AI for AI workflows requiring human intervention. You would be hard pressed to find a SaaS market Amazon isn’t capable of stepping into with their army of engineers and data scientists.

    The release of the SageMaker IDE and SageMaker Debugger seems to be an attempt by AWS to capture the hearts and minds of data scientists with the promise of streamlining the building, training, debugging, deployment, and monitoring of Machine Learning models. This new IDE bypasses the need for users to understand and deploy a Python or R environment, enables progress reporting for long jobs, promises a simplified and automated debugging process, automates alerts about input data drift, and auto-trains your ML model from CSV data files. In early use, the IDE has proven to come with a steep learning curve and a high degree of complexity. The SSO feature, notably, only seems to work with newer AWS accounts. According to VentureBeat, the IDE provides “some features that appear to be just rebrandings of older products and some that solve new, legitimate customer pain points. Even the best new features are incremental improvements on existing products.”

    Reducing Cloud Anxiety With New Security-Focused Services

    It seems Amazon has heard the cries of its customers as they struggle to manage the complexity of their cloud environment’s security. They announced Amazon Detective, Macie, and IAM Access Analyzer to review organizational security lattices and catch any potential privilege or access issues. IAM Access Analyzer helps to solve misconfiguration problems, one of the most common problems with AWS deployments, and can purportedly monitor and evaluate thousands of security policies across a deployment environment in seconds.

    Thought Leadership in Designing Resilient Software Systems

    Amazon showed some responsibility for their dominance of the cloud with their release of the AWS Builders’ Library. A number of sessions at re:Invent included references to their cell-based architecture approach and explained how AWS achieves high uptime numbers for their most important services.