Securely Federating Cloud Services with VNS3

Securely Federating Cloud Services with VNS3

Service Endpoints are a great concept. They allow you to access things like s3 buckets in AWS from within your VPC without sending traffic outside of it. In fact, from a compliance perspective they are essential. Both Amazon Web Services and Microsoft Azure have them. One drawback in AWS is that they can only be accessed from the VPC in which they have been set up. But what if you wanted to access that s3 bucket securely from another region or from an Azure VNET? Perhaps you have an Azure SQL Data Warehouse that you want to access from your application running in AWS. Service Endpoints have their limitations. For many companies that are developing a multi cloud, multi region strategy, it’s not clear how to take advantage of this service. We at Cohesive Networks have developed a method that allows you to access these endpoints from across accounts, regions and across cloud providers. This blog post will discuss in detail how we achieve these ends.

Using AWS Private Service Endpoints

In order to interact with AWS Service Endpoints you need a few things. You need DNS resolution, which needs to occur from inside our VPC, and you need network extent, or the ability to to get to whatever address your DNS resolves to. Both of these two conditions are easy to achieve from any VPC or VNET using VNS3 configured in what we call a Federated Network Topology and utilizing the VNS3 docker based plugin system to run bind9 DNS forwarders. Let’s start by taking a look at the VNS3 Federated Network Topology.

VNS3 Federated Architecture Diagram

The core components that make up the Federated Network Topology are a Transit Network made up of VNS3 controllers configured in a peered topology and the individual Node VNS3 controllers running in the VPCs and VNETs. All controllers are assigned a unique ASN at the point of instantiation. ASN or Autonomous System Numbers are part of what allow BGP networks to operate. We configure the Node controllers to connect into the Transit Network via route based IPSec connections. By using route based VPNs we can then configure each VNS3 controller to advertise the network range of the VPC or VNET it is in that it wants other networks to be able to get to. This route advertisement gets tied to its ASN which is how other VNS3 controllers know how to get to its network. This gives us the network extent that we need so that even in a complex network comprised of tens to hundreds to thousands of virtual networks spread across accounts, regions and cloud providers we have have a manageable network with minimal complexity. 

In order to interact with a AWS Service Endpoints you need a few things. You need DNS resolution, which needs to occur from inside our VPC, and you need network extent, or the ability to to get to whatever address DNS resolves to. Both of these two conditions are easy to achieve from any VPC or VNET using VNS3 configured in what we call a Federated Network Topology and utilizing the VNS3 docker based plugin system to run bind9 DNS forwarders. Let’s start by taking a look at the VNS3 Federated Network Topology.

VNS3 Transit Diagram explainer

Setting up DNS Resolution

The next component we need is a system to return the correct localized DNS response. AWS uses what is called split horizon DNS, where you will get a different response based on where you are making the request from. That is to say, if you make the DNS call from outside you will get the public facing IP address versus if you make the call from inside you will get the private IP address. Say we are in an Azure VNET in US West and need to access an s3 bucket in us-east-1. You would install the AWS command line tools (CLI) on either your linux or windows virtual machine and run something like:

aws s3 ls s3://my-bucket-name

to get a current listing of objects in your s3 bucket. But you want this interaction to be routed across your secure encrypted network. How would you get the DNS to resolve to the private entry point inside of your sealed VPC in AWS us-east-1 rather than the AWS public gateway? The answer lies in DNS.

VNS3 has an extensible plugin system based on the docker subsystem. You can install any bit of linux software that you want into a container and route traffic to it via its comprehensive firewall. So here we can install Bind 9, the open source full-featured DNS system, into the plugin system. We can configure Bind 9 to act as a forwarder filtered on a certain dns naming patterns. In this case we would be looking for the patterns of and which we will configure to forward on to the plugin address of the VNS3 transit controller running in AWS us-east-1 that is configured to forward all of it’s incoming requests down to the AWS supplied DNS at x.x.x.2 of of it’s VPC. This will return the correct IP of the s3 Service Endpoint that has been configured in the transit VPC in AWS us-east-1. So when the AWS CLI makes the call for “s3://my-bucket-name” the first action that takes place is a DNS call of: “What is the IP to interact with the service for?”, next it will attempt to connect to that IP, which we have made possible as we have created the network extent to that address. From there you can do all the things that you need to do in regards to the bucket. There are some other configuration items that need to be put in place as well. You would need to either configure your whole VNET subnet that the virtual machine resides in or the individual network interface of your VM to point to the private IP of your VNS3 controller as it’s DNS source. The firewall of VNS3 will need to be configured to send incoming TCP and UDP port 53 traffic to the container running Bind 9. And you will need to setup routing rules for your subnet or network interface to point to the VNS3 controller for either the explicit addresses of Service Endpoints in AWS or send all traffic through the VNS3 controller. The latter has extra benefits as VNS3 can act as your ingress/egress firewall, NAT and traffic inspector by adding further plugins like WAF and NIDS.


The above is an illustration of one use case, accessing an AWS s3 bucket from Azure across an encrypted network with full visibility and attestability. Other possibilities include the entirety of Service Endpoints offered by Azure and AWS and the mechanics are the same whether AWS to Azure, vise versa, or across regions inside a single cloud provider. The take away is that VNS3 has powerful capabilities that allow you to create secure extensible networks with reduced complexity and inject network functions directly into them that allow you to take advantage of cloud provider services in an agnostic way.

AutoRecovery in the Public Cloud

“Everything fails, all the time.” – Werner Vogels

While VNS3 is extremely stable, it is not immune to the underlying hardware and network issues that public cloud vendors experience. VNS3 provides a variety of methods to achieve High Availability and instance replacement. However all of that takes place above the customer responsibility line. What can you do for your cloud deployment to protect yourself from the inevitable failures that take place below the line?

On top of the solutions offered by public cloud providers, Cohesive Networks offers a variety of methods for achieving instance and network recovery, whether it be BGP distance weighting, Cisco style Preferred Peer lists or our Management Server (VNS3:ms) which will programmatically replace a running instance or facilitate Active & Passive running of VNS3 instances. Keep an eye on this blog space for further discussions in these key areas.

AutoRecovery in AWS via CloudWatch

Amazon Web Services has perhaps the most comprehensive function for protecting yourself from underlaying failures. They offer what they call a CloudWatch alarm action. This monitor is tied to your instance ID, should AWS status checks fail, your instance will be brought up on new hardware, while retaining its instance ID, private IP, any Elastic IPs and all associated metadata. You get to set the periodicity of the check and the total checks that will kick off the migration. So if you need to have assurance that you instance will get moved to good hardware after as little as two minutes, you can set it as such. From a VNS3 perspective, this ensures that any IPSec tunnels will get reestablished, any overlay clients will reconnect and any route table rules pointing to the instance will maintain health once the instance has recovered. On top of all of this you can configure it to publish any alarm states to an SNS topic so that you receive notification should this occur. Cohesive Networks highly recommends that you set this up for all VNS3 controllers and Management Servers.

You can find out more about configuring AWS CloudWatch alarm actions here:

Service Healing in Azure Cloud

The Microsoft Azure cloud has the concept of “Service Healing.” While it is not user configurable, it is not dissimilar from AWS in that Azure has a method whereby it monitors the underlaying health of the virtual machines and hypervisors in it’s data centers and will auto recover virtual machines should they or their hypervisors fail. This process is is managed by their Fabric Controllers which themselves have built in fault tolerance. As of now Azure does not provide any user controls over this process nor notifications and the process can take up to 15 minutes to complete, since the first action is to reboot the physical server that the virtual machines run on and failing that will then proceed to migrate VMs to other hardware. Azure does state that they employ some level of deterministic methodologies for pro-active auto-recovery.

Live Migration in Google Cloud

The Google Cloud Platform has taken a fairly different approach. Over at the Google cloud all instances are set to “Live Migrate” by default. So should there be a hardware degradation and not a total failure, your VM will be migrated to to new hardware with some loss of performance during the process. If there is a total failure your VM will be rebooted onto new hardware. This also applies to any planned maintenance that might effect they underlaying hardware your VM is running on. As with AWS and Azure all of your instance identity will transfer with the VM such as IPs, volume data and metadata. Should you want to forgo the “Live Migration, you can configure your instances to just reboot onto new hardware. All failed hardware events in GCP are logged at the host level and can be alerted on. 

Managing AWS Workspaces with VNS3

Cloud and network virtualization have created the opportunity to have virtual networks that transit your applications and staff to, through and across the clouds. These networks can stretch across the globe in multiple, to 10s of locations (points of presence) or more. In the case of Cohesive Networks our virtual networks are used to create cryptographically secure overlay networks in full mesh architectures. When implementing the cryptographic mesh (at scale machine-to-machine VPN) it is critical that the cryptographic credentials can be easily managed across the controller mesh. Our goal at Cohesive is to make managing the credentials straightforward and clear; associating credentials with users via tagging, enabling/disabling so that credentials can only be used when desired, checked out/in state to help manage via automation, check log information for specific credentials, and manage certificate revocation. Below is a short video showing the key elements of straightforward key state management in an N-way VNS3 controller mesh.

Hopefully the video highlights the essential key state management capabilities we have strived for. They are part of the foundation of the VNS3 Controllers which are used to build a wide array of service edge use cases. VNS3 encrypted topologies combined with our plug and play security system, you or your management service provider can achieve both Workload and Workforce mobility using secure network virtualization.

AWS re:Invent 2019 Recap

AWS re:Invent 2019 Recap

AWS Reinvent photo

Last week was AWS’s annual reinvent conference in the putatively beautiful and blissful Las Vegas. Andy Jassy, Amazon’s CEO, announced plenty of new products and features to excite and alarm the computing and soft-warring world. The conference also highlighted AWS’s leadership in highly resilient software architecture and design with their launch of the AWS Builders’ Library. Let’s run over some of the highlights.

Cloud Descending Back to Earth via New Edge Environments: AWS Local Zones, Outposts, and Wavelength

AWS launched two new environment types this year with AWS Local Zones and Wavelength. Local Zones was spurred by AWS customers requiring ultra-low latency for their compute, notably gaming companies based in L.A., where the first Local ZOne is now generally available. New zones will come online as customer demand in a city necessitates. Wavelength is an AWS environment colocated with telecom infrastructure, providing access to 5G endpoints. The general availability of AWS Outposts, a rack of AWS servers providing AWS on-premise, was also announced, enabling the rollout of Local Zones and Wavelength in fairly short order. AWS Outposts enable companies to test deployments in cloud-like environments without fully committing to the cloud, and give customers like Morningstar and Philips Healthcare ultra-low latency, hyper-local availability zones.

These environments showcase a new battle for the edge. AWS basically won the general compute cloud race, but we now find different telecommunication and networking competitors offering edge environments, with startups the likes of Packet and Vaper IO joining the race. As developers gain access to these new endpoints, along with increased networking capabilities and incredibly low hyper-local latencies, we are sure to see a revolutionary new age of applications and services.

We Have a Size for That: New Compute Instance Types

Amazon launched multiple new instance types including Graviton2 instances and EC2 Inf1 instances. The new Graviton2 boast a whopping 40% price performance improvement. They are based on the ARM architecture, effectively challenging Intel and AMD’s dominance in the chip space, and combined with the Nitro System security chip to support encrypted EBS storage volumes by default. The EC2 Inf1 instances are dedicated Machine Learning training instance types, effectively challenging Nvidia’s domination of the market with their GPUs. AWS promises that these chips provide a significant increase in throughput and price performance relative to Nvidia-powered instance types.

AWS Continues to March into SaaS Markets With New Machine Learning Services

Also announced were multiple ML based services including Code Guru for automated code reviews, Fraud Detector for automated fraud detection, Kendra for search indexing, Transcribe Medical for call transcription in the medical industry and Augmented AI for AI workflows requiring human intervention. You would be hard pressed to find a SaaS market Amazon isn’t capable of stepping into with their army of engineers and data scientists.

The release of the SageMaker IDE and SageMaker Debugger seems to be an attempt by AWS to capture the hearts and minds of data scientists with the promise of streamlining the building, training, debugging, deployment, and monitoring of Machine Learning models. This new IDE bypasses the need for users to understand and deploy a Python or R environment, enables progress reporting for long jobs, promises a simplified and automated debugging process, automates alerts about input data drift, and auto-trains your ML model from CSV data files. In early use, the IDE has proven to come with a steep learning curve and a high deal of complexity of use. The SSO feature, notably, only seems to work with newer AWS accounts. According to VentureBeat , the IDE provides “some features that appear to be just rebrandings of older products and some that solve new, legitimate customer pain points. Even the best new features are incremental improvements on existing products.”

Reducing Cloud Anxiety With New Security-Focused Services

It seems Amazon has heard the cries of its customers as they struggle to manage the complexity of their cloud environment’s security. They announced Amazon detective, Macie , and IAM Access Analyzer to review organizational security lattices and catch any potential privilege or access issues. IAM Access Analyzer helps to solve misconfiguration problems, one of the most common problems with AWS deployments, and can purportedly monitor and evaluate thousands of security policies across a deployment environment in seconds.

Thought Leadership in Designing Resilient Software Systems

Amazon showed some responsibility for their dominance of the cloud with their release of the AWS Builders’ Library. A number of sessions at re:Invent included references to their cell-based architecture approach and explained how AWS achieves high uptime numbers for their most important services.

Announcing AWS Quick Start Reference Deployment for VNS3

Announcing AWS Quick Start Reference Deployment for VNS3

Want a HIPAA/HITECH compliant application deployed to AWS in minutes? Read on!

We’re proud to announce the release of our first AWS Quick Start reference deployment for configuring and launching our VNS3 overlay network for your cloud application. Working closely with Amazon we’ve leveraged the proven power of AWS CloudFormation to take our secure and scalable solution and make it even more accessible. With our Quick Start deployment, VNS3 can easily secure your cloud application to HIPAA and HITECH standards in as few as fifteen minutes, supported by best practice tools and strategies for automating your infrastructure deployments.

Check out our Quick Start Guide here! Keep reading for more information about this release.

VNS3 AWS Quickstart Architecture

Save Time

Our Quick Start was built by AWS and Cohesive Networks solutions architects to help you automatically deploy a VNS3 topology quickly and easily. Don’t worry about high availability and security, we’ve included it for no extra charge! Build your production deployment fast and start using it now.

Reduce Complexity

Simple (not to be confused with simplistic) is secure. VNS3 provides a generalized approach to encryption across your cloud deployment. This enables you to field a clean VPC Route Table and Security Group configuration to reduce attack surface and minimize misconfigurations.

Control Encryption

AWS provided and controlled, symmetric encryption with common shared keys isn’t enough for regulated industries. Customer controlled encryption with VNS3 is essential to securing PII/PHI in order to pass HIPAA audits. VNS3 as demonstrated in this Quick Start Guide provides a simple and programmatic way for achieving HIPAA compliance.

Added Bonus

Do you use blocked protocols like UDP multicast? The VNS3 encrypted overlay network deployed by this guide allows you to redistribute UDP multicast within your AWS VPC deployment. Now you can apply the same design principles to your cloud applications, whether designing cloud native or lifting and shifting.

Moving Forward

Following the successful launch of our first AWS Quick Start Guide, we’re excited to move forward and create new reference deployments for all the various use cases VNS3 supports. We’re cooking up AWS Quick Start Guides that deal with more complex peered VNS3 topologies, demonstrating different High Availability and Network Federation capabilities. We are also working on an Azure QuickStart template for deploying the encrypted Overlay Network for Microsoft Windows VMs later this summer.

AWS re:Invent Recap

AWS re:Invent Recap

AWS REinvent 2018

We’ve been heads down working on the 3 P’s for a number of months (products, presence, and people). As a result we’ve all but stopped our social media and dynamic content. We’ll look to emerge from our cocoon in early 2019 but we had to pop out and do yet another re:Invent recap (YArIR!).

Cohesive Networks (and our parent company CohesiveFT) have attended/sponsored all AWS re:Invents. Each year the conference gets denser yet more spread out… think about that one. This year was no exception. Now that our “away team” is fully recovered from the ill effects of desert entertainment, had some time to reflect, and get our hand dirty trying out a few new services, we’re ready to state our opinion. That’s what the following is, the opinion of the smartest, coolest, and most experienced cloud networking experts in the game (see opinion).

Micro Blink Reaction – Crowd Sourcing the Self-driving Algos

AWS DeepRacer is awesome and the DeepRacer League is hilariously brilliant. I ordered my discounted DeepRacer a few seconds after it was announced during Andy Jassy’s keynote. The bummer is I won’t take delivery until March. Hopefully the simulation environment holds me over (request preview access).

Macro Blink Reaction – AWS appetite for its ecosystem grows

AWS continues to eat the ecosystem and this year they stepped up their game. Previous years had AWS entering markets and wiping out millions of $s in ecosystem players. This year we think the number is in the capital B BILLIONS.

As a member of the AWS Partner Network (Advanced Technology Partner), we, like all AWS partners, look to re:Invent every year with mixed feelings of excitement and dread. If you aren’t on the Customer Advisory Council, you never really know if this is the year AWS will announce a direct competitor to your business. We all know the risks, and the AWS “not built here” corp dev mentality that drives their roadmap, but there is too much opportunity not to participate. Multi-cloud helps, but AWS is still the King of Cloud both in usage and features/services. I won’t go into detail about what competes with whom, take a look at these other recap posts:

Specific Announcement Reactions

We also won’t cover all the announcements because of the number of announcements per service category.

  • App Integration – 2
  • Analytics – 4
  • Compute – 11
  • Databases – 6
  • Developer Tools – 2
  • IoT – 7
  • ML – 14
  • Management – 6
  • Marketplace – 3
  • Media – 1
  • Migration – 2
  • Mobile – 1
  • Networking – 6
  • Robotics – 1
  • Satellite – 1
  • Security/Identity – 2
  • Storage – 10

Below we’ll review the features and service announcements that piqued our interest from a security and networking perspective.

Transit Gateway (GA)

What is it?
An AWS managed gateway service that allows a hub-and-spoke network topology connecting VPCs in the same region (expect multi-region support in the future) owned by a single or multiple AWS accounts as well as remote networks. This offering replaces the multi-party solution that was previously being offered called the AWS Global Transit Network. Check out the Transit Gateway announcement blog or product home for more information.

Why it matters?
Transit gateway solves a significant number of issues around the need to be able to route between VPCs “in cloud” at AWS. The manner in which it has been solved creates an economic opportunity for AWS as well – charging $.05 per hour for each connection to the gateway.

For Cohesive Networks, we spend our days (and nights) helping customers Connect, Federate, and Secure. Just like the introduction of the VPC itself, Direct Connect, AZs, Regions, GovCloud, China, and all the related facets of AWS – this creates more demand for connecting, federating, and securing. “Transit” is a subset of the overall federation architecture, so definitely a feature – not a business, meaning this release is good news for Cohesive, and gives us parity with capability Azure and Google networking has had for some time (although they do it a bit differently).

The release of Transit Gateway lets us create some federation structures for customers that were previously too complex, and requiring, dare I say it, too many VNS3 controllers needed to complete the task, as a result of AWS networking limitations. Now our customers can spend a bit more money, reduce a little bit of complexity, and still get the attestable control they need as regulated or self-regulated businesses operating in 3rd party data centers over which they have no direct insight, visibility, or control (AKA “the cloud”).

AWS Security Hub (Preview)

What is it?
A monitoring platform service focused on security that aggregates security alerts and compliance status from native AWS services as well as from 3rd party services. Many security vendors announced initial support for Security Hub. Security Hub aims to create a single pane of glass for an organization’s security and compliance posture across all its AWS accounts. Check out the Security Hub announcement blo g or product home for more information.

Why it matters?
AWS Security Hub begins to solve the “feature glut” problem of the ever-growing Amazon services collection. One reason organizations suffer from data exploits is NOT because they lack monitoring information with events and alerts – it is because they have TOO many events and alerts. Security Hub appears that it will provide an encompassing overview of outputs coming from AWS GuardDuty, Inspector and Macie. Each of these has a rich set of features for your cloud deployments – running all three of them independently could be a bit overwhelming.

At Cohesive we have previously highlighted the world we are entering where the critical IT executive decision is “all-in vs. over-the-top”, meaning where on the spectrum of using cloud, AWS for example, do you position your organization? Do you go “all-in” on embedded AWS services which provide abstracted visibility and limited control – or do you go “over-the-top” and run many of your own layers of infrastructure and instrumentation, strung across AWS, Azure, Google, For the “all-in” crowd we think Security Hub may make consuming some of these services easier.

Global Accelerator (GA)

What is it?
A service to help customers easily route traffic across multiple regions to improve availability and performance of cloud-based applications/deployments. Global Accelerator provides an entry point to allow TCP or UDP traffic to use the AWS Global Network to reach AWS deployed application topologies instead of the Public Internet. Global Accelerator provides static Anycast IPs that serve as a fixed entry point for an AWS deployed application available in any number of the currently support regions (us-east-1, us-east-2, us-west-1, us-west-2, eu-west-1, eu-central-1, ap-northeast-1, and ap-southeast-1). The Anycast IPs are advertised from the supported AWS regions so traffic enters the global network as cloud to the uses as possible. Global Accelerator can then be associated with cloud-based applications via application load balancers, network load balancers, or Elastic IPs. In addition to data transfer fees Global Accelerator costs $0.025 per hour.

Why it matters?
Other than the obvious HA and performance benefits, the big theme from this and Transit Gateway is coalescence. Clouds and cloud regions were built to be isolated by design. Increasingly as companies a have grown in the cloud organically or via acquisition, organization cloud estates have experienced sprawl. Providing avenues to bring the regions “closer together” while maintaining the logical separation is a key value for many of AWS’ largest customers.

We continue to experiment how our customers might benefit from using the Anycast IPs as static global cloud endpoint IPs for VPN connections and well as distributed and encrypted overlay networks.

EC2 C5n (GA)

What is it?
A new generation instance family focused on super fast networks speeds up to 100 Gbps. These new instances use the latest nitro hardware and allow for some serious packets per second performance. The instances sizes are available now in us-east-1, us-east-2, us-east-2, eu-west-1, and govcloud. Prices start Read more about the C5n instance family.

Why it matters?
We are getting a glimpse of the future of cloud network performance and throughput. Eliminating the current VPC gateway throughput restrictions will open up more use-cases for the cloud. Total throughput for VNS3 controller just increased dramatically. Of course there are some restrictions (see placement groups) but it’s always exciting when you get a bandwidth upgrade. Maybe AWS will soon host the first cloud-based high speed low latency trading app?