Overview
You’ve deployed your application. Now you need to know:
- Is it running?
- Are resources OK (CPU, RAM, disk)?
- Are services healthy?
- Is it responding fast?
This guide covers FOSS monitoring tools from simple to comprehensive. You don’t need to run Prometheus if a lightweight uptime checker suffices.
Difficulty: Intermediate (some tools require Docker or manual setup)
Time to implement: 15 minutes (lightweight) to 2 hours (full stack)
Lightweight: Uptime Kuma
For most personal projects and small VPS setups, Uptime Kuma is the sweet spot. It’s lightweight, self-hosted, and monitors HTTP/TCP/DNS/ping endpoints with a beautiful UI.
Install with Docker
docker run -d \
  --name uptime-kuma \
  -p 3001:3001 \
  -v uptime-kuma-data:/app/data \
  --restart unless-stopped \
  louislam/uptime-kuma:1
Access at http://your-server-ip:3001.
Set Up Your First Monitor
- Click Add New Monitor
- Choose HTTP(s) for a web service
- Enter the URL (e.g., https://yourapp.com)
- Set Heartbeat Interval (every 60 seconds is good)
- Enable TLS Info to check certificate expiration
- Add notification channels (email, Gotify, Telegram, etc.)
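For intuition, an HTTP(s) monitor boils down to a timed request plus a status-code check. The sketch below approximates that with curl. It is not Uptime Kuma's actual code: `is_up` and `check_http` are hypothetical helper names, and the 200-299 range is assumed to mirror the monitor's default accepted status codes.

```shell
#!/bin/sh
# Rough approximation of an HTTP(s) monitor; helper names are hypothetical.

# is_up STATUS: succeed when the HTTP status is in the 200-299 range.
is_up() {
  case "$1" in
    2[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# check_http URL [TIMEOUT_SECONDS]: print "UP|DOWN <status> <seconds>".
check_http() {
  url="$1"
  timeout="${2:-10}"
  # -w prints the status code and total request time after the transfer.
  result="$(curl -s -o /dev/null --max-time "$timeout" \
    -w '%{http_code} %{time_total}' "$url")"
  # Treat an empty result (for example, curl missing) as a failed request.
  [ -n "$result" ] || result="000 0"
  status="${result%% *}"
  seconds="${result#* }"
  if is_up "$status"; then
    echo "UP $status $seconds"
  else
    echo "DOWN $status $seconds"
  fi
}

# Example: check_http https://yourapp.com 10
```

Running this from cron would give you a crude uptime log, but without the retries, history, and notifications that make Uptime Kuma worth installing.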
Features You’ll Use
- Certificate monitoring — Get alerted 30 days before SSL expires
- Response time history — Spot performance degradation
- Status pages — Public status.yourdomain.com for your users
- Multiple notification channels — Don’t rely on email alone
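If you want to see what certificate monitoring checks under the hood, the following sketch reads a certificate's expiry date with openssl and converts it to days remaining. It assumes openssl and GNU date (`date -d`) are installed; `days_left` and `cert_expiry_epoch` are hypothetical helper names, and yourapp.com is a placeholder.

```shell
#!/bin/sh
# days_left EXPIRY_EPOCH NOW_EPOCH: whole days between two Unix timestamps.
days_left() {
  echo $(( ($1 - $2) / 86400 ))
}

# cert_expiry_epoch HOST [PORT]: print the certificate's notAfter time
# as a Unix timestamp. Assumes openssl and GNU date are available.
cert_expiry_epoch() {
  host="$1"
  port="${2:-443}"
  end="$(openssl s_client -connect "$host:$port" -servername "$host" \
    </dev/null 2>/dev/null | openssl x509 -noout -enddate | cut -d= -f2)"
  date -d "$end" +%s
}

# Example: warn when fewer than 30 days remain.
#   left="$(days_left "$(cert_expiry_epoch yourapp.com)" "$(date +%s)")"
#   [ "$left" -lt 30 ] && echo "Certificate expires in $left days"
```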
Resource Usage
Uptime Kuma uses ~50-100 MB RAM and almost no CPU at idle. Perfect for a small VPS.
Lightweight: uptimed and Shell Scripts (For Quick Checks)
For basic server monitoring without a full UI, use uptimed and custom scripts:
Server Uptime Daemon
sudo apt install uptimed -y
sudo systemctl enable uptimed
sudo systemctl start uptimed
# Check uptime records
uprecords
Quick Resource Check Script
Create ~/health.sh:
#!/bin/bash
echo "=== Server Health Check ==="
echo "Uptime: $(uptime -p)"
echo "Load: $(cat /proc/loadavg)"
echo "Memory: $(free -h | grep Mem)"
echo "Disk: $(df -h / | tail -1)"
echo "Top 5 CPU processes:"
ps aux --sort=-%cpu | head -6
echo "Top 5 RAM processes:"
ps aux --sort=-%mem | head -6
Run with bash ~/health.sh or schedule via cron.
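A report only helps if someone reads it, so a threshold check is a natural companion. Here is a minimal sketch, assuming an 80% disk limit and an hourly schedule; `root_disk_pct` and `over_threshold` are hypothetical helper names, and the cron line and log path are illustrative.

```shell
#!/bin/sh
# root_disk_pct: print the used percentage of / as a bare integer.
root_disk_pct() {
  df -P / | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# over_threshold USED LIMIT: succeed when USED is at or above LIMIT.
over_threshold() {
  [ "$1" -ge "$2" ]
}

# Hypothetical cron entry (add with crontab -e) that runs the report
# hourly and appends it to a log file:
#   0 * * * * /bin/bash "$HOME/health.sh" >> "$HOME/health.log" 2>&1

# Example: warn when / is at least 80% full.
#   over_threshold "$(root_disk_pct)" 80 && echo "Disk usage high: $(root_disk_pct)%"
```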
Medium: Netdata
Netdata is a distributed, real-time performance and health monitoring system. It gives you second-level metrics for CPU, RAM, disk, network, and running processes — with beautiful charts and zero configuration.
Install
bash <(curl -Ss https://my-netdata.io/kickstart.sh)
Netdata auto-detects everything and starts collecting immediately. Access at http://your-server-ip:19999.
Key Features
- Per-second metrics — See exactly what’s happening right now
- Alarms — Netdata ships with ~100 pre-configured alarms
- Long-term storage — Store metrics for weeks/months
- Docker monitoring — Container-level CPU, memory, network stats
- Nginx/Apache monitoring — Real-time request stats
Configure Alarms
Netdata's alarm notification settings live in /etc/netdata/health_alarm_notify.conf. To enable email:
# Edit the alarms config
sudo nano /etc/netdata/health_alarm_notify.conf
# Set receiver email
DEFAULT_RECIPIENT_EMAIL="you@yourdomain.com"
Resource Usage
Netdata uses ~5-10% of one CPU core at idle and ~150-200 MB RAM. On a small VPS, this is noticeable but manageable.
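# Limit Netdata resources if needed
sudo nano /etc/netdata/netdata.conf
[global]
    # Collect every 5 seconds instead of every second
    update every = 5
    # Number of samples kept per chart (used by the older in-memory database modes)
    history = 3600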
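The two settings interact, so it is worth doing the arithmetic before you change them. This sketch assumes `history` counts stored samples per chart, which is how Netdata's older in-memory database modes treat it; newer releases that use the dbengine store size retention differently, so read the result as an estimate. `retention_hours` is a hypothetical helper name.

```shell
#!/bin/sh
# retention_hours HISTORY UPDATE_EVERY: hours of data kept when HISTORY
# samples are stored at one sample every UPDATE_EVERY seconds.
retention_hours() {
  echo $(( $1 * $2 / 3600 ))
}

retention_hours 3600 5   # settings above: prints 5 (hours, at 5-second resolution)
retention_hours 3600 1   # same history at per-second collection: prints 1
```

In other words, slowing collection to every 5 seconds trades resolution for both lower CPU usage and a longer window from the same memory.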
Full Stack: Grafana + Prometheus
For comprehensive observability, you want the full monitoring stack:
- Prometheus — Time-series database that scrapes metrics
- Grafana — Visualization and dashboards
- Node Exporter — Server-level metrics (CPU, RAM, disk, network)
- cAdvisor — Docker container metrics
Architecture
Node Exporter (port 9100) ──┐
cAdvisor (port 8080) ───────┼──→ Prometheus (port 9090) ──→ Grafana (port 3000)
                            │
[Your App] (port 8000) ─────┘
Docker Compose Setup
Create monitoring/docker-compose.yml:
version: '3.8'
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    restart: unless-stopped
    ports:
      - "3000:3000"
    volumes:
      - grafana_data:/var/lib/grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=YOUR_STRONG_PASSWORD
  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    restart: unless-stopped
    ports:
      - "9100:9100"
    volumes:
      # Host paths that the --path.* flags below point at
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'
      - '--path.rootfs=/rootfs'
      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    container_name: cadvisor
    restart: unless-stopped
    ports:
      - "8080:8080"
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
      - /dev/disk/:/dev/disk:ro
volumes:
  prometheus_data:
  grafana_data:
Prometheus Config
Create monitoring/prometheus.yml:
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets: ['node-exporter:9100']
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
Start Monitoring Stack
cd monitoring
docker compose up -d
# Access Grafana at http://your-server-ip:3000
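# Log in as admin with the GF_SECURITY_ADMIN_PASSWORD value from docker-compose.yml
# (without that variable, Grafana defaults to admin / admin; change it immediately!)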
# Prometheus at http://your-server-ip:9090
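Once the containers are up, it is worth confirming that Prometheus can actually reach its scrape targets. The sketch below counts targets by health in the JSON returned by Prometheus's /api/v1/targets endpoint. `count_targets` is a hypothetical helper, and the grep pattern is a shortcut that assumes the usual `"health":"up"` field layout (a JSON parser such as jq would be more robust).

```shell
#!/bin/sh
# count_targets STATE: read a /api/v1/targets JSON response on stdin and
# print how many targets report the given health (up, down, or unknown).
count_targets() {
  grep -oE "\"health\"[[:space:]]*:[[:space:]]*\"$1\"" | wc -l | tr -d ' '
}

# Example (assumes Prometheus is reachable on localhost:9090):
#   curl -s http://localhost:9090/api/v1/targets | count_targets up
#   curl -s http://localhost:9090/-/healthy
```

With the three jobs configured above, you would expect the first command to report 3 after a scrape interval or two. A lower number usually points to a typo in a target name or a container that failed to start.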
Import Dashboards
In Grafana:
- Click + → Import
- Enter dashboard ID (e.g., 1860 for Node Exporter Full)
- Select Prometheus data source
- Click Import
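The import dialog can only offer a Prometheus data source that already exists in Grafana. You can add one in the UI, or script it through Grafana's HTTP API as sketched below. This assumes basic auth against the API is enabled and that Grafana reaches Prometheus by its Compose service name; `datasource_json` is a hypothetical helper.

```shell
#!/bin/sh
# datasource_json NAME URL: print the JSON body for a Prometheus data source.
datasource_json() {
  printf '{"name":"%s","type":"prometheus","url":"%s","access":"proxy","isDefault":true}\n' "$1" "$2"
}

# Example (use the admin password set in docker-compose.yml):
#   datasource_json Prometheus http://prometheus:9090 |
#     curl -s -u "admin:YOUR_STRONG_PASSWORD" -H 'Content-Type: application/json' \
#       -X POST -d @- http://localhost:3000/api/datasources
```

The URL uses `prometheus` rather than `localhost` because both containers share the Compose network, where service names resolve as hostnames.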
Log Monitoring: Loki + Grafana
For log aggregation, add Loki:
  loki:
    image: grafana/loki:latest
    container_name: loki
    restart: unless-stopped
    ports:
      - "3100:3100"
    volumes:
      - ./loki-config.yml:/etc/loki/local-config.yaml
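Create ./loki-config.yml before starting the container, since the bind mount expects that file to exist. Then add Loki to Grafana as a data source (URL http://loki:3100) rather than to Prometheus as a scrape target: scraping Loki only collects Loki's own operational metrics, not your logs. Loki also needs an agent such as Promtail to ship logs to it.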
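As a smoke test, you can push a single log line by hand through Loki's HTTP push API and query it back. The sketch below builds the request body. `loki_push_json` is a hypothetical helper, the `smoke-test` job label is made up for the example, and the nanosecond timestamp relies on GNU date.

```shell
#!/bin/sh
# loki_push_json JOB TIMESTAMP_NS LINE: print a push-API body with one log line.
# LINE must not contain double quotes or backslashes (no JSON escaping is done).
loki_push_json() {
  printf '{"streams":[{"stream":{"job":"%s"},"values":[["%s","%s"]]}]}\n' "$1" "$2" "$3"
}

# Example (assumes Loki on localhost:3100):
#   loki_push_json smoke-test "$(date +%s%N)" "hello from curl" |
#     curl -s -H 'Content-Type: application/json' -X POST -d @- \
#       http://localhost:3100/loki/api/v1/push
#   curl -s -G http://localhost:3100/loki/api/v1/query_range \
#     --data-urlencode 'query={job="smoke-test"}'
```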
Quick Reference: Which Tool When?
| Use Case | Tool | RAM Usage | Setup Time |
|---|---|---|---|
| Uptime monitoring | Uptime Kuma | ~100 MB | 5 min |
| Real-time metrics | Netdata | ~150 MB | 5 min |
| Full observability | Grafana + Prometheus | ~500 MB | 1-2 hr |
| Log aggregation | Loki + Grafana | ~300 MB | 1 hr |
| Basic health check | Custom script | ~0 MB | 5 min |
Alerting: Don’t Just Monitor, Alert
Monitoring without alerting is just pretty charts. Configure notifications:
Uptime Kuma Alerts
- Email, Telegram, Gotify, Slack, Pushover, and more
- Set alert threshold: notify when response time > X ms or downtime > X min
Netdata Alarms
Pre-configured for: CPU usage > 80%, RAM > 80%, disk > 80%, service down, etc.
Grafana Alerting
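# Simplified sketch of a Grafana alert rule (not literal provisioning syntax)
- name: HighCPU
  condition: A > 80
  data:
    - refId: A
      # CPU busy %, derived from the idle counter's 5-minute rate
      query: 100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)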
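One thing to keep in mind when writing CPU alerts: node_cpu_seconds_total is a cumulative counter, so a percentage has to be derived from the change between two samples (in PromQL, that is what rate() does). The sketch below does the same calculation by hand from /proc/stat to show the idea; `cpu_busy_pct` and `sample_cpu` are hypothetical helpers, and it uses integer arithmetic only.

```shell
#!/bin/sh
# cpu_busy_pct IDLE1 TOTAL1 IDLE2 TOTAL2: busy percentage between two
# samples of cumulative idle and total CPU time.
cpu_busy_pct() {
  d_idle=$(( $3 - $1 ))
  d_total=$(( $4 - $2 ))
  echo $(( 100 - d_idle * 100 / d_total ))
}

# sample_cpu: print "idle total" from the aggregate cpu line of /proc/stat.
# Fields 2-9 are summed; guest time is already counted inside user time.
sample_cpu() {
  awk '/^cpu / { t = 0; for (i = 2; i <= 9; i++) t += $i; print $5, t }' /proc/stat
}

# Example: compare two samples taken five seconds apart.
#   set -- $(sample_cpu); sleep 5; set -- "$@" $(sample_cpu)
#   cpu_busy_pct "$1" "$2" "$3" "$4"
```

Comparing the raw counter against 80 would fire (or never fire) based on how long the server has been up, not on how busy it is.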
Pro Tips
Use a Reverse Proxy
All these tools have web UIs. Don’t expose them directly to the internet — put them behind a reverse proxy with authentication:
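# Example Caddy config for Uptime Kuma
yourstatus.com {
    reverse_proxy localhost:3001
    # Generate HASHED_PASSWORD with: caddy hash-password
    # (Caddy 2.8+ prefers the spelling basic_auth)
    basicauth /* {
        user HASHED_PASSWORD
    }
}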
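A reverse proxy only helps if the original ports are no longer reachable from outside. As a quick audit, the sketch below lists TCP listeners bound to all interfaces, based on the output of `ss -tlnH`. `public_listeners` is a hypothetical helper, and it assumes the local address is the fourth column, as it is when ss is run with only the -t socket filter.

```shell
#!/bin/sh
# public_listeners: read `ss -tlnH` output on stdin and print the local
# address of every TCP listener bound to all interfaces.
public_listeners() {
  awk '$4 ~ /^(0\.0\.0\.0|\*|\[::\]):[0-9]+$/ { print $4 }'
}

# Example: ss -tlnH | public_listeners
# Publishing a container port on loopback only (for instance
# -p 127.0.0.1:3001:3001) keeps it out of this list.
```

Anything that shows up here besides ports 80 and 443 (and SSH) deserves a second look.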
Prevent Alert Fatigue
- Alert on symptoms, not causes (e.g., “site down” not “process crashed”)
- Use PagerDuty/Opsgenie-style escalation for critical alerts
- Have “quiet hours” for non-critical alerts
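If your notification path runs through a script you control, quiet hours can be a few lines of logic. This is a minimal sketch, assuming a 22:00 to 07:00 window and a severity label of `critical`; both are illustrative choices, and `in_quiet_hours` and `should_notify` are hypothetical helpers.

```shell
#!/bin/sh
# in_quiet_hours HOUR START END: succeed when HOUR (0-23) falls inside the
# window [START, END), including windows that wrap past midnight.
in_quiet_hours() {
  if [ "$2" -le "$3" ]; then
    [ "$1" -ge "$2" ] && [ "$1" -lt "$3" ]
  else
    [ "$1" -ge "$2" ] || [ "$1" -lt "$3" ]
  fi
}

# should_notify SEVERITY HOUR: critical alerts always go out; everything
# else is held back between 22:00 and 07:00.
should_notify() {
  [ "$1" = "critical" ] || ! in_quiet_hours "$2" 22 7
}

# Example: should_notify warning "$(date +%H)" && echo "send it"
```

The wrap-around branch is the part that is easy to get wrong: a window from 22 to 7 has a start greater than its end, so the two comparisons have to be joined with "or" instead of "and".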
Resource Budget for Monitoring
On a 2 GB VPS:
- Uptime Kuma: ~100 MB (negligible)
- Netdata: ~150 MB (fine)
- Full Prometheus stack: ~500-800 MB (significant; keep an eye on the monitoring stack's own footprint)
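To put those numbers in perspective, here is the same budget as a share of a 2 GB (2048 MB) server, using the estimates above and the upper bound for the Prometheus stack. `budget_pct` is a hypothetical helper and rounds down to a whole percentage.

```shell
#!/bin/sh
# budget_pct USED_MB TOTAL_MB: share of RAM used, as a whole percentage.
budget_pct() {
  echo $(( $1 * 100 / $2 ))
}

budget_pct 100 2048                      # Uptime Kuma alone: prints 4
budget_pct $(( 100 + 150 )) 2048         # plus Netdata: prints 12
budget_pct $(( 100 + 150 + 800 )) 2048   # plus the full stack: prints 51

# For live per-container numbers instead of estimates:
#   docker stats --no-stream --format 'table {{.Name}}\t{{.MemUsage}}'
```

Running all three tiers at once would hand roughly half the server to monitoring, which is a good argument for picking one tier per machine.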