Building The World’s First Application-Aware VPN

Adriano Sela Aviles

February 13, 2025

Border0 is not just an excellent access management tool on the outside… It's built beautifully on the inside, leveraging several cutting-edge technologies to make it as efficient, secure, and enjoyable to use as possible.

This blog post is a technical deep dive into how the core objectives of Border0’s offering are achieved through careful, systematic architecture and product design, and by leveraging several modern technologies, including WireGuard®, parts of Google’s gVisor project, WebAssembly, and more.

What Are We Building?

From the beginning, Border0 aimed to provide a seamless and secure access experience that is minimally disruptive to engineers’ day-to-day work and is extremely easy for administrators to integrate with existing environments and workflows of all types and sizes. We wanted to combine Privileged Access Management (PAM) and VPN capabilities into one product, providing the best of both.

To get started, we laid out our product’s minimum requirements, which boiled down to:

• A Secure, End-to-End Encrypted Network Tunnel
• Authentication
• Authorization & Access Controls
• Auditing
• Session Management

So with these basic objectives in mind, we set out to build The World’s First Application-Aware VPN.

Let’s get real - it’s no secret that the typical VPN experience really sucks for engineers trying to get sh*t done. You haven’t even finished saying “VPN” and you have your engineers up in arms.

*Your engineers after an hour of dealing with legacy VPNs*

We wanted to change that. So we came up with a few additional, but crucial, non-negotiable requirements centered around the idea that “It must not suck”. We translated this to the following tangible objectives:

• It must save engineers’ time, not waste it
• It must feel snappy and fast
• It must work from anywhere
• It must be fun and easy to use, to such an extent that hobbyists and casual technologists will want to use it for their at-home projects where security is not typically top-of-mind.

So now that we have requirements outlined, let’s go and build a robust access solution that combines the best of VPN and PAM with all the bells and whistles that anyone would want to use.

We’re Gonna Need a Tunnel: Enter WireGuard®

At its most basic core, Border0 is all about access management. So, the first challenge? Building a trustworthy connection between users and their services. We wanted something our customers could instantly trust, and ideally, something that would give us (Border0 as a company) ironclad non-repudiation.

In our previous offering we used signed mTLS certificates for authentication and end-to-end encryption. Each Border0 organization had its own Certificate Authority (CA), and we went to great lengths to ensure that the signing process was secure and auditable. But we learned that this complicated our security story…

Customers would often ask us “Can you see my traffic?” and the answer involved a conversation about our custom super-secure-and-audited signing process and how it was so awesome and secure… but really the answer we would have loved to give is “no, it is simply not possible”. To be clear, our system was secure, but we had to explain ourselves, and this was the problem. This history is why we care so much about non-repudiation for ourselves.

*High-Level End-to-End Encrypted Secure VPN Tunnel*

So we researched secure VPN tunnel technologies, and WireGuard was the obvious choice. WireGuard is a fast, secure, open-source VPN tunnel technology that uses state-of-the-art cryptography and has a Go implementation - Go being the language our Connector is written in. If you are interested, you may read more about WireGuard® here.

With WireGuard the story was stupid-simple:

• Devices/connectors generate an asymmetric key-pair locally
• The device/connector publishes its public key to the Border0 API, while the private key never leaves the device/connector
• Devices/connectors talk WireGuard to each other
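
The first of these steps can be sketched in a few lines of Go. This is a minimal illustration using the standard library's crypto/ecdh package (Go 1.20+), not Border0's actual code: WireGuard keys are Curve25519 keys, but the KeyPair type and its method names are assumptions made for this example, and the call that publishes the public key to the Border0 API is deliberately left out.

```go
package main

import (
	"crypto/ecdh"
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

// KeyPair is a hypothetical wrapper around a Curve25519 key pair,
// the key type WireGuard uses.
type KeyPair struct {
	private *ecdh.PrivateKey // never leaves the device/connector
}

// GenerateKeyPair creates a fresh key pair locally.
func GenerateKeyPair() (*KeyPair, error) {
	priv, err := ecdh.X25519().GenerateKey(rand.Reader)
	if err != nil {
		return nil, err
	}
	return &KeyPair{private: priv}, nil
}

// PublicKeyBase64 returns the 32-byte public key in base64,
// the text form commonly used for WireGuard keys.
func (k *KeyPair) PublicKeyBase64() string {
	return base64.StdEncoding.EncodeToString(k.private.PublicKey().Bytes())
}

func main() {
	kp, err := GenerateKeyPair()
	if err != nil {
		panic(err)
	}
	// Only this value would be published; the private key stays in memory here.
	fmt.Println("public key to publish:", kp.PublicKeyBase64())
}
```

Because only the public half is ever shared, whoever brokers the key exchange has nothing that would let it decrypt the tunnel traffic.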

*Diagram of the Border0 API acting as a broker for communication between devices and connectors*

Having an off-the-shelf secure tunnel, we could focus our efforts on building features of real value for our customers.

The Connector: Where the Magic Happens

So now that we have a secure VPN tunnel with WireGuard, all we have to do is record traffic received at the Connector and we’ll have rich audit logs, right? ... right? Well… no.

The problem is that WireGuard, like most VPN technologies, operates at the Network Layer (L3). In other words, WireGuard deals with IP packets.

If we recorded that traffic, all we would have is flow logs, i.e. source and destination IP addresses, protocol, and source and destination port numbers. This is useful information, but at best it allows us to figure out who is performing actions against which service. It doesn’t allow us to figure out who is performing actions against which application-specific resources. For example, we would know that Lily’s device is connected via SSH to a production server, but we wouldn’t know what commands she is running. We would know that Jake’s device is talking to the production database, but we wouldn’t easily know what queries he is running, much less be able to block queries that are noncompliant with policy.

So how do we do it then? How is it that Border0 provides deep application layer activity logs, policy enforcement, and session recordings? The only feasible way to do it is using application layer reverse-proxies. In order to apply policy enforcement to your MySQL server, Border0 needs to run a special MySQL server on the Connector, which applies policy enforcement to queries before proxying them to your MySQL server upstream. The same is true for SSH, HTTP, the other database protocols and so on.

We knew we needed an application-layer reverse-proxy, but again, the issue is that WireGuard deals with IP packets. We cannot easily reconstruct TCP connections out of IP packets in software, can we? We would have to deal with IP fragmentation and reassembly, multiple transport protocols, TCP’s in-order delivery, error detection and correction, and a myriad of edge cases, both up front and on an ongoing basis.

Luckily for us, we didn’t have to implement a software TCP/IP stack - someone else already did! As part of gVisor, the smart folks at Google have built Netstack, an open-source userland TCP/IP stack that supports both IPv4 and IPv6. Netstack allows us to listen for TCP connections on a given IP and port. Having TCP listeners, we could simply plug in our application layer reverse-proxies, and would be able to provide policy enforcement, recordings, and application-aware magic all around.

Connecting WireGuard-go and Netstack was fairly simple. We just had to extend WireGuard-go’s Device implementation such that packets received at a Connector with a destination IP in a special private IP range reserved for Border0 services would be handled by Netstack (instead of being written to the kernel as normal network traffic would). The end result: a secure VPN tunnel capable of providing application-layer policy enforcement and session recordings!
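
To make the routing decision concrete, here is a minimal Go sketch of the check described above. It is an illustration only, not the actual WireGuard-go extension: the Stack type, Route, and routeString are hypothetical names, and the two prefixes are the example service ranges quoted later in this post.

```go
package main

import (
	"fmt"
	"net/netip"
)

// serviceRanges are the example Border0 service ranges from this post.
var serviceRanges = []netip.Prefix{
	netip.MustParsePrefix("100.126.0.0/24"),
	netip.MustParsePrefix("fd62:6f72:6465:7231::/64"),
}

// Stack names which network stack should handle a decrypted packet.
type Stack string

const (
	Kernel   Stack = "kernel"
	Netstack Stack = "netstack"
)

// Route picks the stack for a packet based on its destination address:
// service addresses go to the userland stack, everything else to the kernel.
func Route(dst netip.Addr) Stack {
	for _, prefix := range serviceRanges {
		if prefix.Contains(dst) {
			return Netstack
		}
	}
	return Kernel
}

// routeString is a convenience wrapper for textual addresses.
func routeString(s string) Stack {
	return Route(netip.MustParseAddr(s))
}

func main() {
	for _, s := range []string{"100.126.0.1", "fd62:6f72:6465:7231::1", "10.0.0.5"} {
		fmt.Printf("%-24s -> %s\n", s, routeString(s))
	}
}
```

The point of the sketch is that the decision is a cheap prefix match per packet, so ordinary VPN traffic is unaffected while service traffic is diverted to the userland stack.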

To summarize, here’s how it works:

• Whenever you create a Border0 socket (like an SSH or database service), we assign a pair of static private IP addresses to it, e.g. 100.126.0.1 and fd62:6f72:6465:7231::1.

• When the connector receives traffic destined to an address in the service ranges, e.g. 100.126.0.0/24 or fd62:6f72:6465:7231::/64, the WireGuard-go code forwards packets to Netstack.

• In Netstack the traffic lands in the TCP listener for the service corresponding to the destination IP of the traffic, where it is handled by a reverse-proxy of the same type as your underlying service e.g. SSH reverse-proxy for SSH servers, MySQL reverse-proxy for MySQL servers, and so on.

• The reverse-proxy performs Border0 policy evaluation against the specific query/request to ensure that the human or entity tied to the source device is allowed to perform the action being performed against the service.

• If the action is allowed, the query/request is forwarded to your upstream server.

The application-layer reverse-proxies served over the secure VPN tunnel are what make Border0 an Application-Aware VPN. Sometimes, it is useful to think of the Border0 Connector as an application-layer firewall, where Border0 Policies are the firewall’s rules.
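
The last two steps, policy evaluation and forwarding, can be sketched as follows. This is a deliberately simplified model and not Border0's policy engine: the Policy type (a map from identity to allowed SQL verbs), the Proxy struct, and the audit log format are all assumptions made for illustration, and the upstream database is a stub function.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Policy is a hypothetical rule set: which SQL verbs each identity may run.
type Policy map[string]map[string]bool

// ErrDenied is returned when policy blocks a query.
var ErrDenied = errors.New("denied by policy")

// Upstream stands in for the connection to the real database server.
type Upstream func(query string) (string, error)

// Proxy evaluates policy for every query, records an audit entry,
// and only then forwards the query upstream.
type Proxy struct {
	Policy   Policy
	Upstream Upstream
	AuditLog []string
}

// Handle processes one query issued by the given identity.
func (p *Proxy) Handle(identity, query string) (string, error) {
	fields := strings.Fields(query)
	if len(fields) == 0 {
		return "", errors.New("empty query")
	}
	verb := strings.ToUpper(fields[0])
	allowed := p.Policy[identity][verb] // missing identity or verb means false
	p.AuditLog = append(p.AuditLog,
		fmt.Sprintf("identity=%s query=%q allowed=%t", identity, query, allowed))
	if !allowed {
		return "", ErrDenied
	}
	return p.Upstream(query)
}

func main() {
	p := &Proxy{
		Policy:   Policy{"jake": {"SELECT": true}},
		Upstream: func(q string) (string, error) { return "upstream ran: " + q, nil },
	}
	fmt.Println(p.Handle("jake", "SELECT * FROM orders"))
	fmt.Println(p.Handle("jake", "DROP TABLE orders"))
	for _, entry := range p.AuditLog {
		fmt.Println(entry)
	}
}
```

Note that the audit entry is written whether or not the query is allowed, which is what makes the log useful for reviewing denied attempts as well as permitted ones.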

Finding Your Peers

We have a solid PAM solution with authentication, authorization, auditing, and session management, all built on top of a secure VPN tunnel. This is awesome, but we aren’t anywhere near done. Now we must move on to the “it must work everywhere” requirement. This is harder than it seems at first… we have to come up with a communication scheme that works for devices and connectors in any type of network.

In short, the way devices communicate with one another is by sharing their public IP address and port used for WireGuard with the Border0 API, which then takes care of distributing these details to other peers in the network. Once peers learn each other’s IP address and port, they can communicate in a peer-to-peer fashion.

But what about when a peer device is behind a NAT gateway and doesn’t know its public IP address? Or even worse, what if outbound UDP, which WireGuard uses under the hood, is blocked by the network administrator? What do we do?

We’ll get there, but before we do, we must briefly touch on NAT.

A Word About NAT

Typically, home-grade router devices are shipped with something called Network Address Translation (NAT) and a stateful firewall. NAT translates private-to-public IP addresses and vice versa, while the stateful firewall ensures your devices don’t receive any unsolicited traffic from the Internet. This is a good thing - it means random things on the Internet cannot talk to your laptop. That is, until your laptop talks to that random thing first!

For example, say your laptop just used port 4242 to talk to YouTube. Your stateful firewall will allow YouTube to talk back to your laptop’s port 4242 for a set amount of time, usually ~30 seconds. That window is extended every time traffic is sent, or received, provided the timer has not already expired. In layman's terms, when your laptop talks to an Internet server, a hole is punched in your stateful firewall for that server to reply to your laptop.
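
The timer behavior is easier to see in code. Below is a toy Go model of a stateful firewall, written purely for illustration: real firewalls track more state than this, and the Firewall type, the 30-second defaultTimeout, and the at helper are assumptions of this sketch.

```go
package main

import (
	"fmt"
	"time"
)

// defaultTimeout mirrors the ~30 seconds mentioned above (an assumption;
// real devices vary).
const defaultTimeout = 30 * time.Second

// flow identifies a conversation: a local port and a remote endpoint.
type flow struct {
	localPort int
	remote    string
}

// Firewall is a toy stateful firewall that remembers outbound flows.
type Firewall struct {
	timeout time.Duration
	expiry  map[flow]time.Time
}

func NewFirewall(timeout time.Duration) *Firewall {
	return &Firewall{timeout: timeout, expiry: make(map[flow]time.Time)}
}

// Outbound records traffic leaving the network and (re)starts the timer:
// this is the "hole punch".
func (f *Firewall) Outbound(localPort int, remote string, now time.Time) {
	f.expiry[flow{localPort, remote}] = now.Add(f.timeout)
}

// Inbound reports whether a packet from remote to localPort is let through.
// Allowed inbound traffic also extends the timer.
func (f *Firewall) Inbound(localPort int, remote string, now time.Time) bool {
	key := flow{localPort, remote}
	deadline, ok := f.expiry[key]
	if !ok || now.After(deadline) {
		delete(f.expiry, key) // unsolicited, or the hole has closed
		return false
	}
	f.expiry[key] = now.Add(f.timeout)
	return true
}

// at returns a timestamp the given number of seconds after a fixed start.
func at(seconds int) time.Time {
	return time.Unix(int64(seconds), 0)
}

func main() {
	fw := NewFirewall(defaultTimeout)
	fmt.Println("before any outbound traffic:", fw.Inbound(4242, "youtube", at(0)))
	fw.Outbound(4242, "youtube", at(0))
	fmt.Println("reply after 20s:", fw.Inbound(4242, "youtube", at(20)))
	fmt.Println("reply after 100s:", fw.Inbound(4242, "youtube", at(100)))
}
```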

NAT “hole-punching” is easy and always works well when your device is talking to something on the Internet (i.e. with a public IP address), like most consumer APIs or web applications. Similarly, assume two WireGuard peers (i.e. Border0 devices/connectors) want to communicate with one another while one of them has a public IP and one of them sits behind a NAT gateway. They may do so easily after the peer behind NAT has “hole-punched” the gateway. The peer with the public IP can simply reply to the peer behind NAT using the address and port that it saw on the incoming traffic.

WireGuard actually does some work in terms of aiding NAT-traversal. Whenever a “connection” between two peers is being established (or re-established), peers will send handshake messages repeatedly on an interval in case the message is lost to the network for reasons including being dropped by the remote peer’s NAT gateway. This helps with NAT traversal because, eventually, the remote peer will have punched a hole in its own NAT gateway and the next message will arrive at the peer. WireGuard also has built-in support for peers sending “persistent-keepalive” messages to one another, which helps peers maintain their NAT / stateful-firewall mapping (i.e. their hole punched) even if there is no real WireGuard traffic being sent or received.

We’ll just have the Border0 device or connector with the public IP address (and port) send its details to the Border0 API; the Border0 API will distribute that to the other peer, and they’ll be able to establish a WireGuard connection. That ought to work, right? Well… what if both device and connector are behind NAT (and a stateful firewall)? Do we just give up? Of course not.

Let’s frame our challenge as a question: “how can devices behind NAT learn what public IP address and port are mapped to their private IP address and port used for WireGuard?”

STUN

The answer is a protocol called Session Traversal Utilities for NAT (STUN). Despite the fancy name, STUN is one of the simplest protocols you could ever imagine. A STUN server can tell you what your public IP address and port are (from the perspective of the STUN server).

It goes something like this: A STUN client sends a STUN message containing a “Binding Request” to a STUN server, which responds with a STUN message containing a “Binding Response”. The response includes the client’s public IP address and port from the perspective of the STUN server. This is how each device or connector learns its own public IP address and port even when it is in a private network behind a NAT gateway.
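
To show how small the protocol really is, here is a Go sketch of the message formats involved, following the STUN wire format from RFC 5389: a 20-byte header with a magic cookie, and an XOR-MAPPED-ADDRESS attribute carrying the address. It handles IPv4 only and does no network I/O. The function names are this example's own, and encodeResponse exists only so the parser can be demonstrated without a real server.

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
	"errors"
	"fmt"
	"net/netip"
)

const (
	magicCookie       = 0x2112A442 // fixed value in every STUN header
	bindingRequest    = 0x0001
	bindingSuccess    = 0x0101
	attrXORMappedAddr = 0x0020
)

// NewBindingRequest builds a 20-byte Binding Request with no attributes.
func NewBindingRequest() ([]byte, error) {
	msg := make([]byte, 20)
	binary.BigEndian.PutUint16(msg[0:2], bindingRequest)
	binary.BigEndian.PutUint16(msg[2:4], 0) // attribute length
	binary.BigEndian.PutUint32(msg[4:8], magicCookie)
	_, err := rand.Read(msg[8:20]) // random 96-bit transaction ID
	return msg, err
}

// ParseMappedAddr extracts the IPv4 XOR-MAPPED-ADDRESS from a
// Binding Success Response.
func ParseMappedAddr(msg []byte) (netip.AddrPort, error) {
	if len(msg) < 20 || binary.BigEndian.Uint16(msg[0:2]) != bindingSuccess ||
		binary.BigEndian.Uint32(msg[4:8]) != magicCookie {
		return netip.AddrPort{}, errors.New("not a STUN binding success response")
	}
	attrs := msg[20:]
	for len(attrs) >= 4 {
		typ := binary.BigEndian.Uint16(attrs[0:2])
		size := int(binary.BigEndian.Uint16(attrs[2:4]))
		if len(attrs) < 4+size {
			break
		}
		val := attrs[4 : 4+size]
		if typ == attrXORMappedAddr && size == 8 && val[1] == 0x01 { // 0x01 = IPv4
			// Port and address are XORed with the magic cookie on the wire.
			port := binary.BigEndian.Uint16(val[2:4]) ^ uint16(magicCookie>>16)
			var ip [4]byte
			binary.BigEndian.PutUint32(ip[:], binary.BigEndian.Uint32(val[4:8])^magicCookie)
			return netip.AddrPortFrom(netip.AddrFrom4(ip), port), nil
		}
		next := 4 + (size+3)/4*4 // attribute values are padded to 4 bytes
		if next > len(attrs) {
			break
		}
		attrs = attrs[next:]
	}
	return netip.AddrPort{}, errors.New("no IPv4 XOR-MAPPED-ADDRESS attribute")
}

// encodeResponse builds a minimal response the way a server would.
// Demo helper only; addr must be IPv4.
func encodeResponse(addr netip.AddrPort) []byte {
	msg := make([]byte, 32)
	binary.BigEndian.PutUint16(msg[0:2], bindingSuccess)
	binary.BigEndian.PutUint16(msg[2:4], 12) // one attribute: 4-byte header + 8-byte value
	binary.BigEndian.PutUint32(msg[4:8], magicCookie)
	binary.BigEndian.PutUint16(msg[20:22], attrXORMappedAddr)
	binary.BigEndian.PutUint16(msg[22:24], 8)
	msg[25] = 0x01 // address family: IPv4
	binary.BigEndian.PutUint16(msg[26:28], addr.Port()^uint16(magicCookie>>16))
	ip := addr.Addr().As4()
	binary.BigEndian.PutUint32(msg[28:32], binary.BigEndian.Uint32(ip[:])^magicCookie)
	return msg
}

// demo round-trips a textual address through encodeResponse and ParseMappedAddr.
func demo(addr string) (string, error) {
	got, err := ParseMappedAddr(encodeResponse(netip.MustParseAddrPort(addr)))
	return got.String(), err
}

func main() {
	req, err := NewBindingRequest()
	if err != nil {
		panic(err)
	}
	fmt.Printf("binding request: %d bytes\n", len(req))
	fmt.Println(demo("203.0.113.7:51820"))
}
```

In a real client the request would be written to a UDP socket and the response read back from it; the address 203.0.113.7:51820 above is a documentation-range placeholder.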

Each device figures out its own UDP addresses (i.e. IP:port) for IPv4 and, when available, also IPv6, and then publishes them to the Border0 API, which takes care of distributing them to the relevant peers in your Border0 network.

However, we’ve now run into a bit of a pickle: most NAT gateways employ what is known as Source NAT (SNAT). In SNAT, a private ${PRIVATE_IP}:${PRIVATE_PORT} is mapped to a ${PUBLIC_IP}:${PUBLIC_PORT}, and a different ${PRIVATE_IP}:${PRIVATE_PORT} will be mapped to the same public IP (in the basic SNAT case) but a different public port. And to top it off, the ports on the public side of things are not guaranteed to be deterministic. For example:

• ${PRIVATE_IP}:1234 maps to ${PUBLIC_IP}:9876

• ${PRIVATE_IP}:2345 maps to ${PUBLIC_IP}:8765

For WireGuard traffic, we need to know the exact port on the public side of the NAT gateway (there are 2^16 possible ports! Guessing is extremely resource heavy, and simply infeasible when there’s a stateful firewall added to the mix).

The way to make sure devices can learn their own public address (along with the right port for WireGuard) is to ensure that STUN probes leave the device/connector via the same port that will be used for WireGuard. Otherwise, each peer would have to guess which port the other peer’s NAT gateway assigned to its WireGuard port. Luckily, WireGuard-go is extremely extensible, and we can modify it so that WireGuard’s Bind implementation also handles sending and receiving STUN messages from the same port used for WireGuard traffic.
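
Sharing one UDP port means inbound packets have to be told apart. One plausible way to do that, sketched below in Go, relies on two facts about the wire formats: STUN messages carry the magic cookie 0x2112A442 at byte offset 4, and WireGuard messages start with a type byte between 1 and 4 followed by three zero bytes. This is an illustrative heuristic, not a description of how Border0's Bind actually classifies packets; Classify and PacketKind are names invented for this example.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// PacketKind labels what an inbound UDP payload appears to be.
type PacketKind int

const (
	Unknown PacketKind = iota
	STUN
	WireGuard
)

const stunMagicCookie = 0x2112A442

// Classify inspects the first bytes of a UDP payload received on the
// shared port and guesses which protocol it belongs to.
func Classify(pkt []byte) PacketKind {
	// STUN: 20-byte header, top two bits of the first byte are zero,
	// and the magic cookie sits at offset 4.
	if len(pkt) >= 20 && pkt[0]&0xC0 == 0 &&
		binary.BigEndian.Uint32(pkt[4:8]) == stunMagicCookie {
		return STUN
	}
	// WireGuard: message type 1-4, then three reserved zero bytes.
	if len(pkt) >= 4 && pkt[0] >= 1 && pkt[0] <= 4 &&
		pkt[1] == 0 && pkt[2] == 0 && pkt[3] == 0 {
		return WireGuard
	}
	return Unknown
}

func main() {
	stun := make([]byte, 20)
	stun[1] = 0x01 // Binding Request
	binary.BigEndian.PutUint32(stun[4:8], stunMagicCookie)

	wg := make([]byte, 32)
	wg[0] = 4 // transport data message

	fmt.Println(Classify(stun) == STUN, Classify(wg) == WireGuard, Classify([]byte{0xff}) == Unknown)
}
```

A dispatcher built on this would hand STUN packets to the address-discovery code and pass everything else to the WireGuard engine, all on the same socket and therefore the same NAT mapping.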

Try All 100+ Flavours Today!

The thing with NAT is… it comes in different flavors - too many of them! We could dedicate a whole blog post (honestly, maybe even a 10 chapter book) to NAT types, traversal techniques, and the various scenarios and workarounds that exist, but we’ll leave that for a different post. Instead, here we are focusing on how to get around the vast majority of the NAT gateways in the wild, without delving too deep into the subject.

*Ideally one would simply traverse a NAT using STUN*

Some NAT implementations are much harder to hole-punch than others, in particular those with “endpoint-dependent mapping” behavior. Endpoint-dependent behavior means that even if your internal port remains constant, communicating with different IP addresses (the “endpoints” in “endpoint-dependent”) will produce different mappings.

For example, say you have a UDP socket on your laptop (let’s say on the local address 192.168.0.6:61726), and you send packets to two different IP addresses, let’s say 1.1.1.1 and 2.2.2.2. The host at 1.1.1.1 might see your packets’ source address as 6.6.6.6:4000, while the host at 2.2.2.2 might see your packets’ source address as 6.6.6.6:5000 (a different port).

This is a problem for us. If we only relied on STUN we would be making the following fatally incorrect assumption: “a client’s address from the perspective of a STUN server is the same as the client’s address from the perspective of the whole Internet”. If the client is behind an endpoint-dependent NAT gateway, this assumption breaks down. The STUN server and your WireGuard peer are two different “endpoints” and thus see different source addresses for your traffic, even if the client uses the same outbound port to communicate with both.
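
The difference between the two mapping behaviors can be simulated in a few lines of Go. This is a toy model only: the NAT type is invented for this example, and it allocates ports sequentially (4000, 5000, ...) to match the numbers used above, whereas real gateways are not that predictable.

```go
package main

import "fmt"

// NAT is a toy source-NAT gateway that maps internal addresses to public ports.
type NAT struct {
	EndpointDependent bool
	nextPort          int
	mappings          map[string]int
}

func NewNAT(endpointDependent bool) *NAT {
	return &NAT{
		EndpointDependent: endpointDependent,
		nextPort:          4000,
		mappings:          make(map[string]int),
	}
}

// PublicPort returns the public source port that the remote host observes
// for traffic sent from the internal address.
func (n *NAT) PublicPort(internal, remote string) int {
	key := internal
	if n.EndpointDependent {
		// The mapping is keyed by destination too, so each remote
		// endpoint gets its own public port.
		key = internal + "->" + remote
	}
	if port, ok := n.mappings[key]; ok {
		return port
	}
	port := n.nextPort
	n.nextPort += 1000
	n.mappings[key] = port
	return port
}

func main() {
	const laptop = "192.168.0.6:61726"
	for _, dependent := range []bool{false, true} {
		nat := NewNAT(dependent)
		seenBySTUN := nat.PublicPort(laptop, "1.1.1.1") // what STUN reports
		seenByPeer := nat.PublicPort(laptop, "2.2.2.2") // what the peer must reach
		fmt.Printf("endpoint-dependent=%t stun=%d peer=%d usable=%t\n",
			dependent, seenBySTUN, seenByPeer, seenBySTUN == seenByPeer)
	}
}
```

With the endpoint-independent gateway the port learned via STUN is the one the peer needs; with the endpoint-dependent gateway the two differ, which is exactly the broken assumption described above.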

There are certain scenarios where, no matter how hard you try, peer-to-peer connectivity is simply not possible. Your NAT might not even be to blame: you may simply be in an environment or network where UDP is blocked.

I guess we’ve failed then? Not even close.

WireGuard Relay Service

For scenarios where peer-to-peer connectivity is not possible, we have built our own custom WireGuard traffic relay service.

The relay is a simple WebSocket-over-TLS server that exposes a mechanism for sending and receiving data to other machines, addressed by their public key. Devices (and connectors) authenticate against the relay by using their private key to solve a challenge. At a high level it goes something like this:

• A device connects to the WebSocket server and sends its public key
• The server replies with a challenge: random bytes encrypted with the public key
• The device decrypts the random bytes (proving it holds the private key) and sends the solution back to the server
• The server verifies that the solution indeed solves the challenge
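
The four steps above can be sketched in Go using only the standard library. The exact encryption scheme used by the relay is not described in this post, so this example swaps in one plausible construction and should be read as an assumption: the relay generates a one-time X25519 key, derives an AES-GCM key from the shared secret with SHA-256, and seals the random bytes with it. A production design would more likely use a vetted construction such as a NaCl sealed box or HKDF-based key derivation. All type and function names are hypothetical.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/ecdh"
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

// Challenge is what the relay sends to the device.
type Challenge struct {
	EphemeralPublic []byte // relay's one-time public key
	Ciphertext      []byte // random bytes sealed for the device's key
}

// newAEAD derives an AES-256-GCM cipher from an X25519 shared secret.
func newAEAD(priv *ecdh.PrivateKey, pub *ecdh.PublicKey) (cipher.AEAD, error) {
	shared, err := priv.ECDH(pub)
	if err != nil {
		return nil, err
	}
	key := sha256.Sum256(shared) // simplified key derivation (assumption)
	block, err := aes.NewCipher(key[:])
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// NewChallenge runs on the relay. It returns the challenge to send and the
// expected solution to keep.
func NewChallenge(devicePub *ecdh.PublicKey) (Challenge, []byte, error) {
	secret := make([]byte, 32)
	if _, err := rand.Read(secret); err != nil {
		return Challenge{}, nil, err
	}
	eph, err := ecdh.X25519().GenerateKey(rand.Reader)
	if err != nil {
		return Challenge{}, nil, err
	}
	aead, err := newAEAD(eph, devicePub)
	if err != nil {
		return Challenge{}, nil, err
	}
	// An all-zero nonce is acceptable here only because each derived key
	// encrypts exactly one message.
	nonce := make([]byte, aead.NonceSize())
	return Challenge{
		EphemeralPublic: eph.PublicKey().Bytes(),
		Ciphertext:      aead.Seal(nil, nonce, secret, nil),
	}, secret, nil
}

// Solve runs on the device: only the matching private key can decrypt.
func Solve(devicePriv *ecdh.PrivateKey, c Challenge) ([]byte, error) {
	ephPub, err := ecdh.X25519().NewPublicKey(c.EphemeralPublic)
	if err != nil {
		return nil, err
	}
	aead, err := newAEAD(devicePriv, ephPub)
	if err != nil {
		return nil, err
	}
	return aead.Open(nil, make([]byte, aead.NonceSize()), c.Ciphertext, nil)
}

// Verify runs on the relay and compares in constant time.
func Verify(expected, solution []byte) bool {
	return subtle.ConstantTimeCompare(expected, solution) == 1
}

// newDeviceKey generates a device key pair for the demo.
func newDeviceKey() *ecdh.PrivateKey {
	key, err := ecdh.X25519().GenerateKey(rand.Reader)
	if err != nil {
		panic(err)
	}
	return key
}

func main() {
	device := newDeviceKey()
	challenge, expected, err := NewChallenge(device.PublicKey())
	if err != nil {
		panic(err)
	}
	solution, err := Solve(device, challenge)
	fmt.Println("authenticated:", err == nil && Verify(expected, solution))
}
```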

At this point the server knows that the connection corresponds to the public key, and any messages received at the relay destined for this key will be sent over this connection. Any messages received at the relay from this connection will be labelled as being from this public key. Note that the traffic we are dealing with is WireGuard (which provides a secure tunnel by itself), so Man-In-The-Middle attacks are not really a concern. Additionally, the handshake is performed over TLS with certificates from a trusted CA, so that clients can be sure they are communicating with the real relay server. After authentication is achieved, a custom binary protocol is used for forwarding WireGuard data messages to peers, along with other control plane messages like keepalives, etc.

The neat thing about using a WebSocket-over-TLS server is that this relayed VPN traffic looks like ordinary HTTPS traffic and gets through most firewalls out there, helping our VPN work everywhere.

Finding the Path of Least Resistance

Let’s recap. We have two peers (a client device and a connector) that have used the Border0 API to learn about each other: their UDP addresses and their public keys (used both for WireGuard and for authenticating with the relay).

They have potentially three ways of communicating:

1) Peer to peer over IPv4

2) Peer to peer over IPv6

3) Over the relay

So, say all three methods are viable: should the peers communicate with each other over IPv4 or IPv6, or should they use the relay? Well of course… peers should use the connection that yields the highest quality!

As mentioned above, WireGuard-go is extremely extensible, so we modified it further such that the Bind implementation also maintains multiple active “connections” to every other peer. Each device/connector periodically sends what we’ve called “Quality of Service (QOS)” probes over every single available connection to every other peer. For each connection, each peer maintains a quality score. This score takes into account the round-trip-time, packet loss, and wire MTU for each connection. For any given peer, the “active” connection is considered to be the one with the highest score. Whenever the WireGuard engine is looking to send WireGuard traffic to another peer, it looks up the peer by public key, and then sends the traffic over whatever is the active connection for that peer.
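
Here is a minimal Go sketch of the selection logic. The post names the inputs to the score (round-trip time, packet loss, and MTU) but not the formula, so the Score function below is entirely made up for illustration, as are the Conn type and the sample measurements.

```go
package main

import (
	"fmt"
	"time"
)

// Conn holds the latest probe measurements for one path to a peer.
type Conn struct {
	Name string        // "udp4", "udp6", or "udpR"
	RTT  time.Duration // round-trip time from the latest probes
	Loss float64       // fraction of probes lost, 0.0 to 1.0
	MTU  int           // usable MTU on this path
}

// Score combines the measurements into one number; higher is better.
// The weighting is a hypothetical formula, not Border0's.
func (c Conn) Score() float64 {
	rttMs := float64(c.RTT) / float64(time.Millisecond)
	latency := 1000 / (1 + rttMs)           // lower RTT scores higher
	delivery := (1 - c.Loss) * (1 - c.Loss) // loss is penalized quadratically
	mtu := float64(c.MTU) / 1500            // larger MTU scores slightly higher
	return latency * delivery * mtu
}

// Active returns the connection with the highest score, if any.
func Active(conns []Conn) (Conn, bool) {
	if len(conns) == 0 {
		return Conn{}, false
	}
	best := conns[0]
	for _, c := range conns[1:] {
		if c.Score() > best.Score() {
			best = c
		}
	}
	return best, true
}

// ms converts whole milliseconds to a time.Duration.
func ms(n int) time.Duration {
	return time.Duration(n) * time.Millisecond
}

func main() {
	conns := []Conn{
		{Name: "udp4", RTT: ms(20), Loss: 0, MTU: 1420},
		{Name: "udp6", RTT: ms(25), Loss: 0, MTU: 1420},
		{Name: "udpR", RTT: ms(40), Loss: 0, MTU: 1420},
	}
	best, _ := Active(conns)
	fmt.Println("active:", best.Name)

	conns[0].Loss = 0.3 // packet loss spikes on the IPv4 path
	best, _ = Active(conns)
	fmt.Println("active after loss spike:", best.Name)
}
```

Re-running Active after every round of probes is what lets traffic move to a healthier path without any manual intervention.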

We’ve exposed all this QOS fun behind the command “border0 node debug peers”, which will show you every other device or connector that your device is peered with, along with every available connection to that peer and each connection’s score and its components.

In the command’s output, each peer has multiple connections: udp4 (UDP over IPv4), udp6 (UDP over IPv6), and udpR (over our relay).

The real magic here is that we're constantly monitoring all available paths and dynamically switching to the best one. So, if you experience a sudden spike in packet loss over the udp4 connection, for example, your client will automatically change to using udp6 or the relay. This happens seamlessly without manual intervention, and it ensures you always have the best experience possible.

Making it Work Everywhere, Fast

You didn’t think we had a single relay server to serve all of our customers globally, did you? Well… that’s how the relay project began, but of course it couldn’t stay that way for long. We couldn’t really tell our customers in Australia that their traffic had to go all the way to Ohio and back to them just to escape NAT or a firewall!

So, we came up with a way of meshing relay servers together such that all of our customers had a relay server relatively close to them, and simultaneously would be able to connect two far-apart devices faster than the Internet ever could. That sounds too good to be true, doesn’t it? Well, it is too good, but it is also true!

Here is how it works:
• We have relay servers distributed globally, deployed using anycast, with even more planned to come.
• Every device/connector always connects to their closest relay server.
• If two peered devices are close to one another, they likely use the same relay, adding minimal latency (compared to peer-to-peer).
• If two peered devices are far apart, they likely use different relays. Traffic is forwarded between relays over UDP, leveraging a reliable backbone network, sometimes yielding lower latency than peer-to-peer ("faster than the Internet").

Whenever a peer connects to a relay, the relay announces this connection to all other relays. This ensures all relay servers know to send traffic destined for that peer to the correct relay.
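
The announcement scheme boils down to each relay keeping a table of which relay every peer is connected to. The Go sketch below models that in memory; it is an illustration under simplifying assumptions (no networking, no disconnect handling), and the Relay type, its methods, and the relay names are hypothetical.

```go
package main

import "fmt"

// Relay is a toy model of one relay server in the mesh.
type Relay struct {
	Name   string
	local  map[string]bool   // public keys connected directly to this relay
	remote map[string]*Relay // public key -> relay that announced it
	mesh   []*Relay          // every relay in the mesh, including this one
}

// NewMesh creates relays that all know about each other.
func NewMesh(names ...string) []*Relay {
	mesh := make([]*Relay, len(names))
	for i, name := range names {
		mesh[i] = &Relay{Name: name, local: map[string]bool{}, remote: map[string]*Relay{}}
	}
	for _, r := range mesh {
		r.mesh = mesh
	}
	return mesh
}

// Connect registers a peer on this relay and announces it to all others.
func (r *Relay) Connect(peerKey string) {
	r.local[peerKey] = true
	delete(r.remote, peerKey)
	for _, other := range r.mesh {
		if other != r {
			other.remote[peerKey] = r
			delete(other.local, peerKey) // the peer moved away from that relay
		}
	}
}

// NextHop describes what this relay does with traffic destined for peerKey.
func (r *Relay) NextHop(peerKey string) string {
	if r.local[peerKey] {
		return "deliver locally"
	}
	if next, ok := r.remote[peerKey]; ok {
		return "forward to " + next.Name
	}
	return "drop: unknown peer"
}

func main() {
	mesh := NewMesh("sydney", "ohio")
	mesh[0].Connect("device-public-key")
	mesh[1].Connect("connector-public-key")

	fmt.Println("sydney ->", mesh[0].NextHop("connector-public-key"))
	fmt.Println("ohio   ->", mesh[1].NextHop("connector-public-key"))
}
```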

We monitor our network very closely, measuring packet loss and latency both within each relay server and from monitoring clients all around the world. This gives us confidence that customer traffic is always going over the best path, and lets us automatically take unhealthy sites out of rotation if we detect network unreliability.

The Client Portal

We have robust clients for macOS, Linux, and Windows. However, recognizing the need for flexibility, we’ve also built a web browser client. This allows users to quickly connect to their services using their browser of choice, even from mobile devices, and is ideal for situations that call for quick access.

Now… there’s a ton of magic behind our (web) client portal. I mean… It's literally an ephemeral WireGuard VPN client compiled to WebAssembly in order to run in the browser. Browsers don’t natively speak UDP (which WireGuard typically uses for transport). Luckily, we already had the perfect solution in place! Remember that relay server I mentioned? It was designed for sending WireGuard traffic over WebSockets in order to solve for the browser-based client use case. Wrapping our state-of-the-art browser-based VPN client, we’ve built elegant protocol-specific clients for MySQL, PostgreSQL, Shells (ssh, etc), and VNC that provide efficient and enjoyable, secretless access to your remote services, all just one click away - that’s truly crazy if you think about it.

We have built such a good web client that we’ve seen customers who also have the application installed use the web client simply because it is often much more fun to use than the native tooling available. In particular, the database clients (e.g. MySQL, PostgreSQL, MS-SQL) feel so much nicer than any other native CLI or GUI based client out there. Don’t take my word for it, try it!

The client portal is the perfect solution for hybrid cloud organizations in particular. Every server/database/desktop that you work with is just another pane in the dashboard; all just a click away. Under the hood you could have an organization with servers/databases/desktops in AWS, GCP, Azure, any other cloud provider or on-premises, and they are all exposed behind the same standardized user interface.

Wrap-Up

There you have it folks, that’s essentially how the Border0 sausage is made. We started by laying out a set of product requirements and usability goals. We walked through our process, going over the challenges we faced and how we solved them, one at a time. We've pulled back the curtain on Border0's architecture, showcasing how we've combined cutting-edge technologies like WireGuard®, gVisor, and WebAssembly to create a truly innovative access management solution. We're not just building a better VPN; we're building the future of access. Join us on this journey! Sign up for a free trial here and see what's possible.
