Application Isolation via Systemd Security Flags

When hardening a Linux server, we often stop after configuring firewalls, tightening SSH, and managing standard user permissions. However, if a web application or network service (like Nginx, Apache, or a Node.js API) is compromised via a remote code execution (RCE) vulnerability, standard user boundaries might not be enough. If the process runs as www-data, the attacker instantly inherits all privileges of www-data across the entire system.

To mitigate this, modern Linux distributions allow us to implement process-level sandboxing using systemd. By modifying a service’s unit file, we can restrict its view of the filesystem, strip its kernel privileges, and block unauthorized network access, limiting the damage even if an attacker gains code execution.

Here is how to turn systemd into a powerful application sandbox.

The Concept of Sandboxing with Systemd

Traditional hardening relies heavily on Discretionary Access Control (DAC), the classic owner/group file permissions. Systemd sandboxing layers additional restrictions on top of DAC, using Linux kernel features such as namespaces, control groups (cgroups), and seccomp filtering, all configured directly in the service’s unit file.

Instead of rewriting your application’s code, you can declaratively restrict what the application can see and do in the operating system environment.

1. Restricting Filesystem Access

By default, a compromised process can browse directories like /tmp, /home, or /var looking for sensitive data or configuration files. Systemd provides flags to render these areas completely invisible or read-only.

ProtectSystem

This directive protects the OS directory tree from being modified by the service.

  • ProtectSystem=true: Mounts /usr and the boot directories (/boot, /efi) as read-only.
  • ProtectSystem=full: Additionally mounts /etc as read-only. This is highly recommended for web servers that only need to read configuration files, not change them.
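  • ProtectSystem=strict: The most restrictive setting. It mounts the entire filesystem hierarchy read-only for the service, except for the API filesystems /dev, /proc, and /sys (which are covered by PrivateDevices=, ProtectKernelTunables=, and ProtectControlGroups=) and any directories explicitly whitelisted using ReadWritePaths=.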

ProtectHome

Prevents the service from accessing user data.

  • ProtectHome=true: Makes /home, /root, and /run/user completely empty and inaccessible to the service.

Example Implementation:

[Service]
ExecStart=/usr/bin/node /var/www/my-api/index.js
User=node-user

# Filesystem Isolation
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/www/my-api/logs /var/www/my-api/uploads

In this setup, if an attacker hijacks the Node.js application, they cannot write to /etc, they cannot see /home, and they can only write files to the specific logs and uploads paths.
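One practical caveat: a path listed in ReadWritePaths= that does not exist will typically make the unit fail to start. The fragment below is a minimal sketch of a more forgiving variant, assuming the same /var/www/my-api layout as the example above. The backups directory in InaccessiblePaths= is a hypothetical addition for illustration, not part of the original setup.

```ini
[Service]
ProtectSystem=strict
ProtectHome=true

# A leading "-" tells systemd to skip the path if it does not exist,
# instead of failing the unit at start-up.
ReadWritePaths=-/var/www/my-api/logs -/var/www/my-api/uploads

# Hypothetical directory that the service should not be able to see at all.
InaccessiblePaths=-/var/www/my-api/backups
```

Note that systemd unit files only support comments on their own lines, so the explanations cannot be appended to the end of a directive.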

2. Locking Down Kernel Namespaces

Attackers often look for local privilege escalation vulnerabilities by interacting with system hardware, kernel tunables, or shared temporary files. We can abstract these away using kernel namespaces.

  • PrivateTmp=true: Gives the process its own isolated, ephemeral /tmp and /var/tmp directory. It prevents the service from seeing or tampering with temporary files belonging to other processes.
  • PrivateDevices=true: Generates a private /dev for the service that excludes physical devices (like raw disk drives, system memory interfaces, and USB devices), leaving only pseudo-devices such as /dev/null, /dev/random, and /dev/zero.
  • ProtectKernelTunables=true: Mounts kernel variables configurable via sysctl (/proc/sys, /sys) as read-only, preventing the process from altering kernel behavior.

# Kernel & Environment Isolation
PrivateTmp=true
PrivateDevices=true
ProtectKernelTunables=true
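The three flags above have several closely related companions in systemd.exec. The following is a sketch of how they might be combined, not a definitive list; some of these directives (ProtectKernelLogs= in particular) only exist in newer systemd releases, so it is worth checking them against the version on your distribution.

```ini
[Service]
PrivateTmp=true
PrivateDevices=true
ProtectKernelTunables=true

# Block loading and unloading of kernel modules.
ProtectKernelModules=true
# Deny access to the kernel log ring buffer.
ProtectKernelLogs=true
# Make the cgroup hierarchy under /sys/fs/cgroup read-only.
ProtectControlGroups=true
# Prevent the service from creating new namespaces of its own.
RestrictNamespaces=true
```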

3. Restricting Network and Address Families

Does your backend processing service actually need to access the internet? Does your database need to initiate outbound TCP connections? Often, the answer is no. Systemd can restrict network sockets down to exactly what the application requires.

  • RestrictAddressFamilies=: Restricts the socket address families the application can use. For a standard web service, you typically only need AF_INET (IPv4) and AF_INET6 (IPv6), plus AF_UNIX if the service talks to local sockets. This prevents the application from utilizing obscure network protocols that might have unpatched kernel vulnerabilities.
  • IPAddressDeny=any: Blocks all IPv4 and IPv6 traffic for the service, inbound as well as outbound. You can pair this with IPAddressAllow= to whitelist only specific internal IP addresses or databases (e.g., IPAddressAllow=127.0.0.1 10.0.0.5). A public-facing service must also allow the addresses of its clients.

# Network Isolation
RestrictAddressFamilies=AF_INET AF_INET6
IPAddressDeny=any
IPAddressAllow=127.0.0.1
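Because the IP filters apply to received packets as well as sent ones, the snippet above suits a backend that only talks to the local machine. The variant below is a sketch for a service that also needs one internal database; the 10.0.0.5 address is the illustrative host from the text above, and including AF_UNIX is an assumption that the service uses local sockets.

```ini
[Service]
# AF_UNIX is included on the assumption that the service uses local
# sockets (for example, to reach a local database or logging socket).
RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6

IPAddressDeny=any
# "localhost" covers both 127.0.0.0/8 and ::1/128.
IPAddressAllow=localhost
# Illustrative internal database host; repeated assignments are combined.
IPAddressAllow=10.0.0.5
```

For a worker that needs no network at all, PrivateNetwork=true is a simpler alternative: it places the service in its own network namespace containing only a loopback interface.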

4. Stripping Root Privileges (Capability Bounding)

In Linux, the root user’s power is broken down into distinct permissions called Capabilities. For instance, binding to a port lower than 1024 requires CAP_NET_BIND_SERVICE.

If a service must start as root (perhaps to bind to port 80/443 before dropping privileges to a regular user), it carries a window of vulnerability. We can use systemd to drop all capabilities except the absolute bare essentials.

# Allow binding to privileged ports, drop everything else
CapabilityBoundingSet=CAP_NET_BIND_SERVICE
AmbientCapabilities=CAP_NET_BIND_SERVICE
NoNewPrivileges=true

Note: NoNewPrivileges=true is one of the most critical security flags. It ensures that the service and any child processes it spawns can never gain additional privileges through execve(), so setuid and setgid bits and file capabilities on binaries no longer take effect. This defeats standard SUID binary exploitation techniques.
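As a sketch of how this might look in practice, the fragment below skips the root start-up phase entirely by running as an unprivileged account that keeps only the port-binding capability. It assumes the node-user account from the earlier example and an application that does not need to start as root.

```ini
[Service]
# Variant A: run as an unprivileged user that may still bind to ports
# below 1024.
User=node-user
CapabilityBoundingSet=CAP_NET_BIND_SERVICE
AmbientCapabilities=CAP_NET_BIND_SERVICE
NoNewPrivileges=true

# Variant B: for a service listening on an unprivileged port, an empty
# assignment clears the capability sets completely. Uncomment these two
# lines and remove the two capability lines above.
#CapabilityBoundingSet=
#AmbientCapabilities=
```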

Putting It Together: A Hardened Template

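To apply these settings, create a drop-in override for the service instead of editing the packaged unit file:

sudo systemctl edit my-service.service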

Inside the text editor, paste your hardening block:

[Service]
# Filesystem
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/log/my-service
PrivateTmp=true

# Kernel & Devices
PrivateDevices=true
ProtectKernelTunables=true
ProtectControlGroups=true
NoNewPrivileges=true

# Privileges & Architecture
CapabilityBoundingSet=CAP_NET_BIND_SERVICE
AmbientCapabilities=CAP_NET_BIND_SERVICE
RestrictAddressFamilies=AF_INET AF_INET6
SystemCallArchitectures=native

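Save the file and restart your service. systemctl edit reloads the unit configuration when you save, so the explicit daemon-reload is only strictly needed if you edited the file by hand, but it is harmless: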

sudo systemctl daemon-reload
sudo systemctl restart my-service.service
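The template sets SystemCallArchitectures=native but does not filter individual system calls, even though seccomp filtering is one of the kernel features systemd exposes. A possible extension to the same drop-in is sketched below. Whether the @system-service group covers everything your application needs is an assumption to test in staging first.

```ini
[Service]
SystemCallArchitectures=native
# Allow only the predefined "@system-service" group of system calls.
SystemCallFilter=@system-service
# Return EPERM for blocked calls instead of killing the process.
SystemCallErrorNumber=EPERM
```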

Verifying Your Hardening: systemd-analyze

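Unsure if your configuration is secure enough? Systemd includes a built-in auditing tool that assigns each service an exposure score from 0 (fully sandboxed) to 10 (completely exposed). The score reflects only which sandboxing directives are enabled, not the security of the application itself.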

Run the following command to check your target service:

systemd-analyze security my-service.service

This will output a line-by-line breakdown of every security flag you missed, giving you an immediate roadmap to lock down your application layer effectively.
