Helpy Pro runs a PUMA web server with multiple worker processes to handle web and API requests. To avoid exhausting available resources, a supervisory process called "PUMA worker killer" (PWK) runs in the background and recycles workers that consume too much memory or stay alive too long.
To get the most out of Helpy Pro, you may need to tune the server by setting environment variables. Helpy Pro ships with defaults that roughly match the base-level VM for each cloud environment (Azure, AWS, GCP). If you are running on a larger or smaller VM, or find that you need to scale vertically, adjust these variables as described below.
Getting Started
To get started, you will need terminal access to the VM(s) where the Helpy service is running. You will also need to know how much memory (RAM) the instance(s) have, and whether they are running any other services (such as a database server). Tuning the instance(s) follows these steps:
1. Set the maximum memory available
2. Set the number of worker instances
3. Configure the number of threads each worker can use
4. Restart the instance(s)
Available Memory
Helpy Pro runs a worker killer process (PWK) that keeps the VM(s) from running out of RAM and crashing. It does this by monitoring memory consumption against a configured amount of available RAM. You can change this amount with the following command:
sudo helpy config:set PWK_RAM=[VM RAM in megabytes]
The value is the number of megabytes allocated to the Helpy service. Typically you would use the full RAM of the VM, as a 10% buffer for system processes is already built in.
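If you are not sure how much RAM the VM has, the following is a minimal sketch, assuming a Linux VM where /proc/meminfo is available. The helper name total_ram_mb is hypothetical and not part of Helpy. It reads MemTotal (reported in kB) and converts it to whole megabytes, then only prints the config command so you can review it before running it yourself.

```shell
# Print total RAM in whole megabytes, read from a meminfo-style file.
# Assumption: Linux host, where /proc/meminfo reports MemTotal in kB.
# total_ram_mb is a hypothetical helper name, not a Helpy command.
total_ram_mb() {
  meminfo_file=${1:-/proc/meminfo}
  awk '/^MemTotal:/ { print int($2 / 1024) }' "$meminfo_file"
}

# Dry run: print the command instead of executing it.
echo "sudo helpy config:set PWK_RAM=$(total_ram_mb)"
```

The kernel reserves a small amount of memory, so the reported figure is usually slightly below the nominal VM size.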
Worker Concurrency
The PUMA web server runs multiple processes to handle requests and messages from your users. To configure the number of workers, use the following command:
sudo helpy config:set PUMA_CONCURRENCY=[number of workers]
To determine the ideal number of workers, divide the available RAM on the VM (in megabytes) by 300 and round down. This gives a conservative number, as Helpy processes rarely use this much memory. If you notice that the PWK is killing processes frequently, or htop shows RAM usage that is lower or higher than expected, adjust the number accordingly.
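The rule of thumb above can be sketched as a small shell function. The name workers_for_ram is hypothetical, and the floor of one worker is an assumption added here so the result is never zero on very small VMs.

```shell
# Estimate the worker count from the RAM (in megabytes) given to Helpy.
# workers_for_ram is a hypothetical helper name, not a Helpy command.
workers_for_ram() {
  ram_mb=$1
  per_worker_mb=300                       # conservative per-worker estimate
  workers=$(( ram_mb / per_worker_mb ))   # integer division rounds down
  # Assumption: always keep at least one worker.
  if [ "$workers" -lt 1 ]; then
    workers=1
  fi
  echo "$workers"
}

workers_for_ram 4096   # prints 13
```

Treat the result as a starting point and adjust it based on what you observe on the running VM.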
Note:
If you set worker concurrency too high compared to the available memory setting, you will see processes killed repeatedly and general instability.
Threads Per Worker
The PUMA server can run in a multi-threaded mode. This means that each worker process can spawn multiple threads to handle additional requests. Increasing the number of threads will result in the process using more memory, and there is little benefit to scaling above 5. If your server is memory constrained, reducing the number of threads can help, and in fact setting this to 1 thread per worker will perform fine in many cases.
You can adjust the number of threads per worker with the following:
sudo helpy config:set PUMA_THREADS=[number of threads]
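To see the three settings together, here is a dry-run sketch that prints the config commands for a given amount of RAM and thread count without executing anything. The function name print_tuning_commands is hypothetical, as are the default of 1 thread and the minimum of one worker; the divide-by-300 rule comes from the Worker Concurrency section.

```shell
# Print (do not run) the three tuning commands for a VM.
# print_tuning_commands is a hypothetical helper name, not a Helpy command.
# Usage: print_tuning_commands RAM_MB [THREADS]
print_tuning_commands() {
  ram_mb=$1
  threads=${2:-1}               # assumption: default to 1 thread per worker
  workers=$(( ram_mb / 300 ))   # rule of thumb: RAM / 300, rounded down
  if [ "$workers" -lt 1 ]; then
    workers=1                   # assumption: never suggest zero workers
  fi
  echo "sudo helpy config:set PWK_RAM=$ram_mb"
  echo "sudo helpy config:set PUMA_CONCURRENCY=$workers"
  echo "sudo helpy config:set PUMA_THREADS=$threads"
}

print_tuning_commands 2048 2
```

For a 2048 MB VM with 2 threads, this prints PWK_RAM=2048, PUMA_CONCURRENCY=6 and PUMA_THREADS=2. Review the printed commands, run them, and then restart the instance(s) for the changes to take effect.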
Caution!
When you start the Helpy service, you use the following command:
sudo helpy scale web=1
Do not set web higher than 1. A higher value does not work with our PUMA configuration, and it will overwhelm your system and prevent normal operation. To scale, use the worker concurrency and threads settings above.
Adding a Dedicated Background Worker
When you originally set up Helpy, you started the server with the command scale web=1 worker=1. This runs both the web service and the background worker service. At some point, you may need to scale your implementation by dividing these services between two or more separate VMs. In this case, you can run a dedicated background worker with:
sudo helpy scale web=0 worker=1
Likewise, you can run a dedicated web service VM with:
sudo helpy scale web=1 worker=0