Architecture

An overview of the Pritunl Cloud design and architecture.

Declarative State Based Design

Pritunl Cloud uses a state based design similar to React. Every 3 seconds the node pulls the full state of all resources running on the node. This is then compared to the existing state and any changes are applied. This has a significant reliability advantage over event based designs: no disruptive event will ever result in a loss of the expected configuration state. Even if the main process crashes and remains exited for several hours, the virtual machines will continue to run, and when the process is started again it will pull the latest state and apply any changes that occurred while it was down.

Each time the main process is started, the current state is fully rebuilt by scanning the full system configuration. If the main process were stopped and some iptables rules were manually modified in an instance's namespace, this change would be detected and corrected when the main process starts. The reliability of this behavior is extensively tested in development, where the main process is frequently closed and recompiled.
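
The reconciliation pattern can be illustrated with a minimal Go sketch. The ReadDesiredState, ScanCurrentState and Apply helpers below are hypothetical stand-ins, not the actual Pritunl Cloud implementation; they show the shape of the loop, where the full desired state is compared to the live state on every pass and only the differences are applied.

package main

import (
	"fmt"
	"time"
)

// State is a simplified stand-in for the full set of resources on a node.
type State map[string]string

// ReadDesiredState would pull the expected configuration from MongoDB.
// Hypothetical helper for this sketch.
func ReadDesiredState() State {
	return State{"instance-a": "running"}
}

// ScanCurrentState would rebuild the live state by scanning the system
// (processes, namespaces, iptables rules). Hypothetical helper.
func ScanCurrentState() State {
	return State{}
}

// Apply corrects one resource so the system matches the desired state.
func Apply(name, desired string) {
	fmt.Printf("reconciling %s -> %s\n", name, desired)
}

func main() {
	// Every 3 seconds the full desired state is compared to the live
	// state and only the differences are applied. A crash or restart
	// loses nothing: the next pass rebuilds and reconciles the state.
	for range time.Tick(3 * time.Second) {
		desired := ReadDesiredState()
		current := ScanCurrentState()
		for name, want := range desired {
			if current[name] != want {
				Apply(name, want)
			}
		}
	}
}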

Nodes

Each node runs independently of the other nodes. A Pritunl Cloud cluster has no single point of failure, assuming a MongoDB replica set is also configured. For scheduled tasks that should only run once, a reservation system allows any running node to reserve and run the task. No single node is designated as a primary node.
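
A minimal sketch of how such a reservation could be implemented with a single atomic MongoDB update is shown below. The tasks collection and its key, timestamp and node fields are illustrative assumptions, not the actual schema; the point is that when every node attempts the same reservation, only one node's findOneAndUpdate matches, so the task runs exactly once.

package scheduler

import (
	"context"
	"time"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
)

// reserveTask attempts to atomically claim a scheduled task for this
// node. The filter only matches a task that has not run recently; the
// first node to update wins and the rest match no document.
func reserveTask(coll *mongo.Collection, key, nodeID string) bool {
	res := coll.FindOneAndUpdate(
		context.Background(),
		bson.M{
			"key": key,
			// Illustrative condition: claim only if the task has
			// not run in the last minute.
			"timestamp": bson.M{"$lt": time.Now().Add(-time.Minute)},
		},
		bson.M{"$set": bson.M{
			"node":      nodeID,
			"timestamp": time.Now(),
		}},
	)
	// Err is mongo.ErrNoDocuments when another node won the reservation.
	return res.Err() == nil
}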

Instance Networking

Instances run in network namespaces to provide full network isolation. Each instance network namespace contains the VPC default gateway; this address is assigned to the VPC bridge in the namespace, which prevents the single point of failure of having one default gateway for the entire VPC. A VLAN interface is then attached to the bridge and bridged to the host internal interface. Each VPC is assigned a unique VLAN ID in the range of 1001-3999, allowing for about 3000 VPCs in a Pritunl Cloud cluster.
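
A minimal sketch of this layout using the ip command is shown below. The interface names, gateway address and exact sequence of commands are illustrative assumptions; the actual Pritunl Cloud provisioning differs in detail.

package netconf

import (
	"fmt"
	"os/exec"
)

// run executes a single ip command and returns any output on failure.
func run(args ...string) error {
	out, err := exec.Command("ip", args...).CombinedOutput()
	if err != nil {
		return fmt.Errorf("ip %v: %v: %s", args, err, out)
	}
	return nil
}

// SetupNamespace creates an instance namespace containing a VPC bridge
// that holds the VPC default gateway, with a VLAN interface on the host
// internal interface moved into the namespace and attached to the bridge.
func SetupNamespace(ns, hostIface, gatewayCIDR string, vlanID int) error {
	vlan := fmt.Sprintf("%s.%d", hostIface, vlanID)
	steps := [][]string{
		{"netns", "add", ns},
		// VPC bridge inside the namespace; it carries the VPC default
		// gateway so no single gateway serves the whole VPC.
		{"netns", "exec", ns, "ip", "link", "add", "br0", "type", "bridge"},
		{"netns", "exec", ns, "ip", "addr", "add", gatewayCIDR, "dev", "br0"},
		// VLAN interface tagged with the VPC's VLAN ID.
		{"link", "add", "link", hostIface, "name", vlan,
			"type", "vlan", "id", fmt.Sprint(vlanID)},
		{"link", "set", vlan, "netns", ns},
		{"netns", "exec", ns, "ip", "link", "set", vlan, "master", "br0"},
		{"netns", "exec", ns, "ip", "link", "set", "br0", "up"},
		{"netns", "exec", ns, "ip", "link", "set", vlan, "up"},
	}
	for _, step := range steps {
		if err := run(step...); err != nil {
			return err
		}
	}
	return nil
}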

This provides a network layout very similar to the one common on cloud providers. The instance has one interface with the private VPC IP address, and the public IP is accessible through a static NAT.

The primary difference is the use of a private VPC IPv6 range instead of directly applying the public IPv6 address to the instance interface. This has the benefit of supporting DHCP IPv6 environments where the public IPv6 address changes. If the network provider reassigns the DHCP IPv6 range, instances will regain IPv6 connectivity in about 1 minute without the instance seeing any changes to its network interface.
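
A minimal sketch of the static NAT described above, applied with iptables inside the instance namespace, is shown below. The addresses and rule layout are illustrative assumptions rather than the actual generated rules.

package netconf

import "os/exec"

// ApplyStaticNAT installs a 1:1 NAT inside the instance namespace so
// traffic sent to the public IP reaches the private VPC IP and replies
// leave with the public IP as their source.
func ApplyStaticNAT(ns, publicIP, privateIP string) error {
	rules := [][]string{
		// Inbound: rewrite the public destination to the private IP.
		{"-t", "nat", "-A", "PREROUTING", "-d", publicIP,
			"-j", "DNAT", "--to-destination", privateIP},
		// Outbound: rewrite the private source back to the public IP.
		{"-t", "nat", "-A", "POSTROUTING", "-s", privateIP,
			"-j", "SNAT", "--to-source", publicIP},
	}
	for _, rule := range rules {
		args := append([]string{"netns", "exec", ns, "iptables"}, rule...)
		if err := exec.Command("ip", args...).Run(); err != nil {
			return err
		}
	}
	return nil
}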

Firewall rules are applied using iptables and ipset inside the namespace.
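
A minimal sketch of such rules is shown below. The set name, default-drop policy and chain layout are illustrative assumptions, not the actual rules Pritunl Cloud generates; the pattern is that allowed networks go into an ipset that a small number of iptables rules reference inside the namespace.

package netconf

import "os/exec"

// nsExec runs a command inside the instance network namespace.
func nsExec(ns string, cmd ...string) error {
	args := append([]string{"netns", "exec", ns}, cmd...)
	return exec.Command("ip", args...).Run()
}

// ApplyFirewall creates an ipset of allowed source networks and
// iptables rules that reference it, all inside the namespace so rules
// from different instances never collide.
func ApplyFirewall(ns string, allowed []string) error {
	if err := nsExec(ns, "ipset", "create", "fw-allow", "hash:net"); err != nil {
		return err
	}
	for _, network := range allowed {
		if err := nsExec(ns, "ipset", "add", "fw-allow", network); err != nil {
			return err
		}
	}
	rules := [][]string{
		// Keep established connections working.
		{"iptables", "-A", "FORWARD", "-m", "conntrack",
			"--ctstate", "ESTABLISHED,RELATED", "-j", "ACCEPT"},
		// Accept traffic from networks in the ipset, drop the rest.
		{"iptables", "-A", "FORWARD", "-m", "set",
			"--match-set", "fw-allow", "src", "-j", "ACCEPT"},
		{"iptables", "-A", "FORWARD", "-j", "DROP"},
	}
	for _, rule := range rules {
		if err := nsExec(ns, rule...); err != nil {
			return err
		}
	}
	return nil
}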

┌──────────────────────────────┐
│         KVM Instance         │
│      ┌────────────────┐      │
│      │   IMDS Agent   │      │
│      └───────▼────────┘      │
│      ┌────────────────┐      │
│      │ Instance Iface │      │
│      └───────┬────────┘      │
└──────────────┼───────────────┘


┌──────────────┼─────────────────────────────────────┐
│      Network Namespace       ┌───────▲────────┐    │
│              │               │   IMDS Server  │    │
│     ┌────────▼─────────┐     └───────▼────────┘    │
│     │                  │                           │
│     │    VPC Bridge    │─────────Static NAT        │
│     │                  │             │             │
│     └────────┬─────────┘             │             │
│              │                       │             │
│     ┌────────▼─────────┐     ┌───────▲────────┐    │
│     │  Internal Iface  │     │ External Iface │    │
│     └────────┬─────────┘     └───────┬────────┘    │
└──────────────┼───────────────────────┼─────────────┘
               │                       │
               │                       │
      ┌────────▼──────────┐  ┌─────────▲─────────┐
      │Host Internal Iface│  │Host External Iface│
      └───────────────────┘  └───────────────────┘

Instance Virtual Machine

The virtual machines run in systemd units with sandboxing options. The QEMU process runs with dropped permissions under a user that is created for each instance.
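
A minimal sketch of what such a unit could look like is shown below. The unit name, paths, QEMU arguments and the exact set of sandboxing options are illustrative assumptions rather than the actual unit Pritunl Cloud generates.

package vm

import (
	"fmt"
	"os"
)

// unitTemplate is an illustrative sandboxed unit, not the unit Pritunl
// Cloud actually generates. The QEMU arguments are abbreviated.
const unitTemplate = `[Unit]
Description=Pritunl Cloud instance %s

[Service]
Type=simple
# Dropped permissions: QEMU runs as a user created for this instance.
User=%s
ExecStart=/usr/bin/qemu-system-x86_64 -name %s -enable-kvm
# Sandboxing options limiting what the process can reach.
NoNewPrivileges=yes
PrivateTmp=yes
ProtectSystem=strict
ProtectHome=yes

[Install]
WantedBy=multi-user.target
`

// WriteUnit renders and writes the unit file for one instance.
func WriteUnit(instance, user string) error {
	unit := fmt.Sprintf(unitTemplate, instance, user, instance)
	path := fmt.Sprintf(
		"/etc/systemd/system/pritunl-instance-%s.service", instance)
	return os.WriteFile(path, []byte(unit), 0600)
}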

Instance IMDS Service

Each instance runs the IMDS service in a systemd unit with sandboxing options. The IMDS server binds to the IP 169.254.169.254 in the namespace and to a unix socket. The unix socket allows the Pritunl Cloud host to communicate with the IMDS service without needing to create a network path. Host and client requests are authenticated with different secrets. The client secret is stored in /etc/pritunl-imds.json with root read permissions and is read by the pci CLI tool used to make IMDS requests.
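
A minimal sketch of a server with this dual-listener layout is shown below. The header name, socket path and environment variables are illustrative assumptions, not the actual Pritunl Cloud IMDS protocol; it shows one handler serving both the link-local TCP listener and the unix socket, each checking a different secret.

package main

import (
	"net"
	"net/http"
	"os"
)

// handler authenticates every request with a shared secret before
// serving metadata.
func handler(secret string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Auth-Token") != secret {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		w.Write([]byte("instance metadata\n"))
	})
}

func main() {
	clientSecret := os.Getenv("IMDS_CLIENT_SECRET")
	hostSecret := os.Getenv("IMDS_HOST_SECRET")

	// TCP listener reached by the instance at 169.254.169.254 inside
	// the network namespace.
	tcpLn, err := net.Listen("tcp", "169.254.169.254:80")
	if err != nil {
		panic(err)
	}
	go http.Serve(tcpLn, handler(clientSecret))

	// Unix socket listener for the host, which avoids creating any
	// network path into the namespace.
	unixLn, err := net.Listen("unix", "/var/run/imds.sock")
	if err != nil {
		panic(err)
	}
	http.Serve(unixLn, handler(hostSecret))
}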

VPCs

Instance VPCs are always a VLAN; each VPC is given a random VLAN ID in the range 1001-3999. This requires layer 2 networking and unrestricted VLAN access between Pritunl Cloud nodes. To allow VPC networking to function in layer 3 only environments, a VXLAN network mode is available and can be enabled in the datacenter settings. When this mode is used, a VXLAN with ID 9417 is created between all the Pritunl Cloud nodes, each node is given an IP address on that VXLAN, and the VPC VLANs are routed through the VXLAN. This results in a 54 byte MTU overhead when using the VXLAN network mode and no additional overhead with the default VLAN only networking mode.
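
A minimal sketch of creating the shared VXLAN interface with the ip command is shown below. The device name, destination port and addressing are illustrative assumptions; only the VXLAN ID 9417 comes from the description above.

package netconf

import (
	"fmt"
	"os/exec"
)

// SetupVXLAN creates the shared VXLAN interface on a node and assigns
// it the node's overlay address so VPC VLAN traffic can be routed
// between nodes over layer 3.
func SetupVXLAN(hostIface, nodeAddr string) error {
	cmds := [][]string{
		// VXLAN ID 9417 matches the documented network mode.
		{"link", "add", "vxlan9417", "type", "vxlan", "id", "9417",
			"dev", hostIface, "dstport", "4789"},
		{"addr", "add", nodeAddr, "dev", "vxlan9417"},
		{"link", "set", "vxlan9417", "up"},
	}
	for _, cmd := range cmds {
		out, err := exec.Command("ip", cmd...).CombinedOutput()
		if err != nil {
			return fmt.Errorf("ip %v: %s", cmd, out)
		}
	}
	return nil
}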
