Pritunl cluster infrastructure

Pritunl is designed for high availability with no master servers and no single point of failure. All servers in a Pritunl cluster are independent and do not rely on other servers. Below is a diagram of an Enterprise cluster setup with a MongoDB replica set.

Simple Design

Pritunl is designed to keep the configuration simple. All hosts in a cluster are equal and there is no master server. Adding a host to a cluster only requires connecting the new host to the database used by the existing hosts. When a new host connects to an existing Pritunl database, the hosts already using that database automatically become aware of it. All inter-host communication is done through the database. Network connectivity between hosts is not needed unless a VPN configuration, such as a replicated VPN, requires it.

Dual Web Server

The Pritunl web console uses an internal and an external web server. This was done to protect the internal Python web server with the external Golang web server and to validate incoming JSON with static types. The Golang HTTP server is secure, regularly maintained and properly implements SSL. The external web server process has SELinux policies installed to secure the process when SELinux is available. When configured with Let's Encrypt or a signed SSL certificate, the Pritunl web console will pass all checks in the Qualys SSL Test.

External Web Server

The external web server is a Golang server that runs in a separate process. Other than the OpenVPN process, this is the only process with open ports. All valid paths and their JSON data structures are defined on this server with static types. When a request is received, the server checks the path and parses the JSON data into a struct with static types. This ensures that the path is valid and that the received JSON data matches the expected format for that path. The server then creates a new HTTP request and encodes the struct back to JSON. A specific set of headers is also copied to the new request. The new request is then sent to the internal server and the response is returned to the client. This server also binds to port 80 to redirect requests to HTTPS and to respond to Let's Encrypt domain verification requests. The source code for this server can be found in the pritunl/pritunl-web repository.
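The validate-and-forward flow described above can be sketched as follows. This is only an illustration of the idea, written in Python rather than Go, and it is not the code from pritunl/pritunl-web. The `/user` path, the `UserPost` fields and the header allowlist are hypothetical names chosen for the example.

```python
import json
from dataclasses import asdict, dataclass, fields

# Hypothetical allowlist; the real server's header set is not listed in this document.
FORWARDED_HEADERS = ("User-Agent", "Content-Type", "Cookie")


@dataclass
class UserPost:
    """Hypothetical typed body for a hypothetical 'POST /user' path."""
    name: str
    email: str
    disabled: bool = False


# Hypothetical route table: (method, path) -> expected body type.
ROUTES = {("POST", "/user"): UserPost}


def build_internal_request(method, path, headers, body):
    """Validate a client request and return (headers, json_body) for the internal server."""
    schema = ROUTES.get((method, path))
    if schema is None:
        raise LookupError(f"unknown path: {method} {path}")

    raw = json.loads(body)
    if not isinstance(raw, dict):
        raise ValueError("expected a JSON object")

    # Copy only known fields and reject values of the wrong type.
    kwargs = {}
    for field in fields(schema):
        if field.name in raw:
            value = raw[field.name]
            if not isinstance(value, field.type):
                raise ValueError(f"bad type for field {field.name!r}")
            kwargs[field.name] = value

    try:
        parsed = schema(**kwargs)
    except TypeError as exc:  # a required field is missing
        raise ValueError(str(exc)) from exc

    # Re-encode from the typed value, so unknown keys never reach the internal server.
    new_headers = {k: headers[k] for k in FORWARDED_HEADERS if k in headers}
    return new_headers, json.dumps(asdict(parsed))
```

Because the forwarded body is re-encoded from the typed value instead of being passed through, anything the client sent outside the expected structure is dropped before the internal server sees it.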

Internal Web Server

The internal web server is a Python server and is part of the main codebase. This server is bound only to a local interface and will never receive requests directly from clients. Some additional type checking and filtering is done on this server for important path handlers.

Inter-Server Messaging System

All communication between servers is done through the MongoDB database. This allows servers in a cluster to function without local network access. A tailable cursor in MongoDB is used to create a publish/subscribe messaging pattern allowing for fast and efficient communication across all servers in the cluster. This is also used for an event system to notify the web console that changes have occurred allowing for live updates.
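The publish/subscribe pattern can be illustrated with a small in-memory model. This sketch does not talk to MongoDB and is not Pritunl's implementation: `CappedLog` is a hypothetical stand-in for a capped collection, and `Subscriber.poll` stands in for reading from a tailable cursor, which in a real deployment would wait for new documents rather than being polled.

```python
from collections import deque
from itertools import count


class CappedLog:
    """In-memory stand-in for a capped collection: bounded and insertion ordered."""

    def __init__(self, max_docs):
        self._docs = deque(maxlen=max_docs)  # oldest documents fall off the front
        self._ids = count(1)                 # monotonically increasing ids

    def publish(self, channel, message):
        doc = {"_id": next(self._ids), "channel": channel, "message": message}
        self._docs.append(doc)
        return doc["_id"]

    def read_after(self, last_id, channels):
        """Return documents newer than last_id on any of the given channels."""
        return [
            d for d in self._docs
            if d["_id"] > last_id and d["channel"] in channels
        ]


class Subscriber:
    """Each host keeps its own position in the log, like a tailable cursor."""

    def __init__(self, log, channels):
        self._log = log
        self._channels = set(channels)
        self._last_id = 0

    def poll(self):
        docs = self._log.read_after(self._last_id, self._channels)
        if docs:
            self._last_id = docs[-1]["_id"]
        return [(d["channel"], d["message"]) for d in docs]
```

Every subscriber sees every message on its channels exactly once, and publishers never need to know which hosts are listening. That is the property that lets hosts communicate without direct network access to each other.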

Resource Pooling System

Pritunl performs several resource-intensive tasks that often take a long time to complete. These tasks include generating private keys and DH parameters that are then used by users, organizations and servers. Pritunl creates these resources in advance and keeps them in a pool for future use. This allows tasks such as creating a user, organization or server to occur almost instantly. In the event that the pool is empty, the resources will be created on demand. The pooling system will also allow reserving a resource that is partially completed as opposed to generating a new one on demand. This is useful for DH parameters, which can often take several minutes to generate.
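A minimal sketch of the pool-with-fallback idea follows. The `ResourcePool` class and its method names are hypothetical, and the generator passed in is a stand-in for an expensive operation such as key or DH parameter generation. Reserving partially completed resources is not modeled here.

```python
from collections import deque


class ResourcePool:
    """Keeps pre-generated resources ready and falls back to on-demand generation."""

    def __init__(self, generate, target_size):
        self._generate = generate        # expensive callable, e.g. key generation
        self._target_size = target_size  # how many resources to keep ready
        self._ready = deque()

    def fill(self):
        """Background step: top the pool up to its target size."""
        while len(self._ready) < self._target_size:
            self._ready.append(self._generate())

    def acquire(self):
        """Return a pooled resource if one is ready, else generate one now."""
        if self._ready:
            return self._ready.popleft()
        return self._generate()

    def __len__(self):
        return len(self._ready)
```

With a filled pool, `acquire` returns immediately. The slow path only runs when demand outpaces the background fill.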

Queue System

A distributed queue system is available for resource-intensive tasks, which include generating private keys, generating DH parameters and re-assigning IP addresses to a server when its network address changes. The queue system evenly distributes the tasks across all the servers in the cluster. This makes full use of the resources in a large Pritunl cluster.

Task System

Pritunl is designed to handle an inconsistent state caused by server failures. Scheduled tasks are run to repair inconsistencies in the database. These tasks include removing lost users, repairing a VPN server's IP pool and removing lost network links. The task system is also used to ensure all VPN servers are running, and in the event of a server failure the task will perform the configured recovery. Tasks are reserved and run by only one server in the cluster to prevent unnecessary overlap. No master server is selected to schedule the tasks. Instead, all servers in a cluster attempt to reserve each scheduled task after a random delay. All the servers also check for incomplete tasks and run a task again when it is left uncompleted past the task's timeout.
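The reservation scheme can be sketched with an in-memory model. `TaskStore` is a hypothetical stand-in for the shared database, and the delay range and timeout values are illustrative only. In a real cluster the reserve step would have to be a single atomic database operation.

```python
import random


class TaskStore:
    """In-memory stand-in for the shared database that records task reservations."""

    def __init__(self):
        self._tasks = {}

    def reserve(self, task_id, host, now, timeout):
        """Reserve a task unless it is done or still within its timeout."""
        current = self._tasks.get(task_id)
        if current is not None:
            if current["done"] or now - current["started"] <= timeout:
                return False
        # Either unreserved, or left incomplete past its timeout: take it over.
        self._tasks[task_id] = {"host": host, "started": now, "done": False}
        return True

    def complete(self, task_id, host):
        task = self._tasks.get(task_id)
        if task is not None and task["host"] == host:
            task["done"] = True


def run_round(store, task_id, hosts, now, timeout, rng):
    """Every host tries to reserve after its own random delay; return the winners."""
    attempts = sorted((rng.uniform(0.0, 1.0), host) for host in hosts)
    return [
        host for delay, host in attempts
        if store.reserve(task_id, host, now + delay, timeout)
    ]
```

The random delay spreads the attempts out so that no fixed host always wins. The timeout check lets any surviving host pick up a task whose owner failed before completing it.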