Netwatch (Host Monitoring)
TL;DR (quick start)
For the impatient: monitor a host and run scripts on state change.

    # Basic host monitoring
    /tool/netwatch/add host=8.8.8.8 interval=30s comment="Google DNS"

    # With failover scripts
    /tool/netwatch/add host=8.8.8.8 interval=10s \
        down-script="/log warning \"Primary WAN down\"" \
        up-script="/log warning \"Primary WAN up\""

Verify with:

    /tool/netwatch/print

Overview
How Netwatch Works
What this does: Netwatch monitors network host availability using various probe types (ICMP, TCP, HTTP, DNS) and executes scripts when hosts go up or down. It’s essential for WAN failover, service health monitoring, and automated alerting.
When to use this:
- WAN failover detection and automatic route switching
- Service availability monitoring (web servers, DNS, etc.)
- Automated alerts via email or log when hosts fail
- Triggering configuration changes based on network conditions
Prerequisites:
- Network connectivity to monitored hosts
- Scripts must use only the policies Netwatch is allowed to run (read,write,test,reboot)
- For email alerts: /tool/e-mail configured
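
Before relying on email alerts, the email prerequisite can be checked by sending a message by hand. This is a minimal sketch: it assumes the SMTP settings are already in place, and the recipient address and message text are placeholders rather than values from this guide.

```routeros
# Send a one-off test message (recipient is a placeholder address)
/tool/e-mail/send to="admin@example.com" \
    subject="Netwatch test" \
    body="Test message from the router"
```

If the message does not arrive, fix the email configuration first. A Netwatch script that sends mail will fail in the same way.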
Probe types
| Type | Use Case | Version |
|---|---|---|
| simple | Basic ICMP ping (legacy) | 6.x+ |
| icmp | Advanced ping with thresholds | 7.4+ |
| tcp-conn | TCP port connectivity | 7.4+ |
| http-get | HTTP service check | 7.4+ |
| https-get | HTTPS with certificate validation | 7.4+ |
| dns | DNS server response | 7.4+ |
Configuration steps
Step 1: Create basic monitor
Monitor a host with simple ICMP:

    /tool/netwatch/add host=8.8.8.8 interval=30s timeout=3s comment="Google DNS"

Step 2: Add scripts for state changes
Add scripts that run when the host goes down or comes back up:

    /tool/netwatch/set [find host=8.8.8.8] \
        down-script="/log error \"Host 8.8.8.8 is DOWN\"" \
        up-script="/log info \"Host 8.8.8.8 is UP\""

Step 3: Verify status
Check the current status:

    /tool/netwatch/print

Expected output:

    Flags: X - disabled
     #   HOST      TYPE     INTERVAL   TIMEOUT   STATUS   SINCE
     0   8.8.8.8   simple   30s        3s        up       jan/16/2026 10:00:00

Common scenarios
Scenario: WAN failover
Automatically switch routes when the primary WAN fails:
Step 1: Configure routes with comments

    /ip/route/add dst-address=0.0.0.0/0 gateway=192.168.1.1 distance=1 comment=primary
    /ip/route/add dst-address=0.0.0.0/0 gateway=192.168.2.1 distance=2 comment=backup

Step 2: Create netwatch with failover scripts

    /tool/netwatch/add host=8.8.8.8 interval=10s timeout=2s \
        down-script={
            /ip/route/set [find comment=primary] disabled=yes
            /log warning "Primary WAN down - switched to backup"
        } \
        up-script={
            /ip/route/set [find comment=primary] disabled=no
            /log warning "Primary WAN up - restored"
        }

Scenario: Advanced ICMP with thresholds (v7.4+)
Monitor with latency and packet loss thresholds:

    /tool/netwatch/add host=10.0.0.1 type=icmp interval=30s \
        packet-count=5 \
        packet-interval=100ms \
        thr-avg=50ms \
        thr-max=200ms \
        thr-loss-percent=20 \
        down-script="/log error \"High latency or packet loss on link\""

The host is marked “down” if any of the following is true:
- Average latency exceeds 50ms
- Maximum latency exceeds 200ms
- Packet loss exceeds 20%
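
The same kind of entry can also bound jitter and log recovery, so that a return below the thresholds is visible in the log. This is a sketch only: it uses properties listed in the reference tables of this guide, and the 20ms jitter value and the log wording are illustrative assumptions.

```routeros
# ICMP probe with a jitter threshold and a matching up-script
# (threshold values are illustrative, not recommendations)
/tool/netwatch/add host=10.0.0.1 type=icmp interval=30s \
    packet-count=5 \
    thr-loss-percent=20 \
    thr-jitter=20ms \
    down-script="/log error \"Link degraded\"" \
    up-script="/log info \"Link back within thresholds\""
```

Pairing each down-script with an up-script makes it easier to see how long a degraded period lasted.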
Scenario: HTTP service monitoring
Monitor a web server:

    /tool/netwatch/add host=www.example.com type=http-get port=80 \
        interval=1m timeout=10s \
        http-code-min=200 http-code-max=299 \
        down-script="/log error \"Web server not responding\""

Scenario: HTTPS with certificate validation
Monitor HTTPS and validate the certificate:

    /tool/netwatch/add host=secure.example.com type=https-get port=443 \
        check-certificate=yes \
        interval=5m \
        down-script="/log error \"HTTPS service or certificate issue\""

Scenario: DNS server monitoring
Monitor a DNS server’s ability to resolve queries:

    /tool/netwatch/add host=example.com type=dns \
        dns-server=192.168.1.53 \
        record-type=A \
        interval=1m \
        down-script="/log error \"DNS server not responding\""

Scenario: Email alerts
Send email when a host goes down:
Step 1: Configure email

    /tool/e-mail/set server=smtp.example.com port=587 \

Step 2: Create netwatch with email scripts

    /tool/netwatch/add host=192.168.1.100 interval=30s \
        down-script={
            subject="Server DOWN" \
            body="Host 192.168.1.100 is unreachable"
        } \
        up-script={
            subject="Server UP" \
            body="Host 192.168.1.100 recovered"
        }

Scenario: Internet path monitoring
Monitor internet connectivity without relying on a specific endpoint (using low TTL):

    /tool/netwatch/add host=1.1.1.1 type=icmp \
        ttl=3 \
        accept-icmp-time-exceeded=yes \
        interval=15s \
        comment="Internet path check"

This marks the host “up” if any router along the path responds, even with “TTL exceeded.”
Scenario: Suppress initial state scripts
Prevent scripts from running when the router first boots:

    /tool/netwatch/add host=8.8.8.8 interval=30s \
        ignore-initial-down=yes \
        startup-delay=5m \
        down-script="/log warning \"Host down\""

- ignore-initial-down=yes: Don’t run down-script for the Unknown→Down transition
- startup-delay=5m: Wait 5 minutes after boot before monitoring
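
The up direction can be suppressed in the same way. The sketch below assumes the ignore-initial-up property described in the script properties table of this guide; the log messages are illustrative.

```routeros
# Suppress both initial transitions so only real state changes run scripts
/tool/netwatch/add host=8.8.8.8 interval=30s \
    ignore-initial-up=yes \
    ignore-initial-down=yes \
    up-script="/log info \"Host up\"" \
    down-script="/log warning \"Host down\""
```

With both flags set, the first result after boot only establishes the baseline state. Scripts run from the next Up→Down or Down→Up change onward.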
Scenario: TCP port check
Monitor if a specific TCP port is accepting connections:

    /tool/netwatch/add host=192.168.1.50 type=tcp-conn port=3306 \
        interval=30s \
        down-script="/log error \"MySQL server not accepting connections\""

Verification
Confirm Netwatch is working correctly:
Check 1: View all monitors

    /tool/netwatch/print

Expected: Entries listed with status (up/down/unknown).
Check 2: View detailed status

    /tool/netwatch/print detail

Shows all properties including scripts and thresholds.
Check 3: View statistics

    /tool/netwatch/print stats

Shows done-tests, failed-tests, and latency statistics.
Check 4: Filter by status

    /tool/netwatch/print where status=down

Shows only hosts that are currently unreachable.
Check 5: Check script execution in logs

    /log/print where topics~"script"

Look for script execution entries.
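
The status filter from Check 4 can also be used inside a script, for example to write one log line per failed host. This is a minimal sketch using standard find/get scripting commands; the variable names and log text are hypothetical.

```routeros
# Log every Netwatch entry that is currently down
:foreach id in=[/tool/netwatch/find where status=down] do={
    # Read the monitored host of this entry
    :local hostName [/tool/netwatch/get $id host]
    :log warning ("Netwatch: " . $hostName . " is down")
}
```

Run it from the terminal or a scheduled script when a summary of failed hosts is more useful than the full table.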
Troubleshooting
| Symptom | Cause | Solution |
|---|---|---|
| Scripts not executing | Script policy exceeds allowed permissions | Set policy to read,write,test,reboot only |
| Down-script triggers at boot | Network not ready during startup | Increase startup-delay or use ignore-initial-down=yes |
| Monitoring doesn’t detect WAN failure | Monitoring target uses the failing link | Create static route for target via specific gateway |
| Status flapping rapidly | Interval too short or link unstable | Increase interval and packet-count |
| Inline script syntax errors | Quote escaping issues | Use named scripts instead of inline |
| Global variables not accessible | Netwatch runs as *sys user | Use environment variables or write to file |
| DNS probe fails unexpectedly | Uses /ip/dns settings | Set explicit dns-server property |
| HTTP fails on redirects | Default http-code-max is 299, so 3xx responses count as failures | Set http-code-max=399 to accept redirects |
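
As an illustration of the last row, an existing HTTP monitor can be adjusted in place. This sketch assumes an entry for www.example.com already exists, as in the HTTP scenario of this guide.

```routeros
# Accept 3xx redirects as a passing response for this entry
/tool/netwatch/set [find host=www.example.com] http-code-max=399
```

Pointing the probe at the final URL instead keeps the stricter default range and still avoids the redirect.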
Debug: Test script manually
Create the script separately and test it:

    /system/script/add name=test-script source="/log info \"Test\""
    /system/script/run test-script

If it works, reference it in netwatch:

    /tool/netwatch/set 0 down-script="/system/script/run test-script"

Debug: Check script permissions
Scripts must have limited permissions:

    /system/script/print where name=myscript

Ensure policy shows only: read,write,test,reboot
Debug: Monitor through correct interface
For WAN failover, ensure monitoring traffic uses the correct path:

    # Force monitoring traffic through primary WAN
    /ip/route/add dst-address=8.8.8.8/32 gateway=192.168.1.1 comment="monitor-route"

Common Mistakes
- Monitoring target through the link being monitored - If 8.8.8.8 is reached via primary WAN, and primary WAN fails, the monitor can’t reach 8.8.8.8 to detect the failure. Use a static route or monitor the gateway directly.
- Script permissions too broad - Netwatch can only run scripts with read,write,test,reboot policies. A script with additional policies won’t run.
- Complex inline scripts - Quote escaping in inline scripts is error-prone. Use named scripts for anything complex.
- No startup delay - Scripts may trigger during boot before the network is ready. Use startup-delay and ignore-initial-down.
- HTTP checking redirect URLs - Sites that redirect (301/302) fail with default settings. Adjust http-code-max or use the final URL.
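
Two of the points above, restricted policies and named scripts, can be combined. The following is a sketch under stated assumptions: the script name wan-down is hypothetical, and the route comment and monitored host are the ones from the WAN failover scenario.

```routeros
# Named script limited to the policies Netwatch is allowed to use
# (the name "wan-down" is a hypothetical example)
/system/script/add name=wan-down policy=read,write,test,reboot \
    source="/ip/route/set [find comment=primary] disabled=yes; /log warning \"Primary WAN down\""

# Call the named script from the monitor instead of inlining the commands
/tool/netwatch/set [find host=8.8.8.8] down-script="/system/script/run wan-down"
```

The script can then be run by hand with /system/script/run to test it, without waiting for a real outage.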
Script variables
Netwatch provides variables accessible in scripts:
| Variable | Description |
|---|---|
| $"output" | Probe output/response |
| $"status" | Current status (up/down) |
| $"since" | Time of last status change |
Example using variables:

    down-script="/log warning (\"Host down since: \" . \$since)"

Related topics
Action scripts
- Scheduler - scripting fundamentals
- Email Tool - send alerts on state change
- Logging - log monitoring events
Failover use cases
- Static Routes - netwatch-triggered route changes
- VRRP - gateway failover
- Firewall Mangle - policy routing failover
Related monitoring
- Torch - real-time traffic analysis
- Bandwidth Test - link capacity testing
- Ping Tool - basic connectivity testing
Reference
Key commands reference
| Command | Description |
|---|---|
| /tool/netwatch/add | Create monitoring entry |
| /tool/netwatch/print | View all entries with status |
| /tool/netwatch/print stats | View performance statistics |
| /tool/netwatch/set | Modify existing entry |
| /tool/netwatch/remove | Delete monitoring entry |
Core properties
| Property | Type | Default | Description |
|---|---|---|---|
| host | string | - | Target IP, domain, or VRF address (required) |
| type | enum | simple | Probe type: simple/icmp/tcp-conn/http-get/https-get/dns |
| interval | time | 10s | Time between probes |
| timeout | time | 3s | Maximum wait for response |
| src-address | IP | - | Source address for probes |
| startup-delay | time | 5m | Delay after boot before monitoring |
| start-delay | time | 3s | Delay before first probe |
Script properties
| Property | Type | Description |
|---|---|---|
| up-script | string | Script on Down→Up transition |
| down-script | string | Script on Up→Down transition |
| test-script | string | Script after every probe |
| ignore-initial-up | yes/no | Suppress up-script for Unknown→Up |
| ignore-initial-down | yes/no | Suppress down-script for Unknown→Down |
ICMP probe properties (v7.4+)
| Property | Type | Default | Description |
|---|---|---|---|
| packet-count | integer | 10 | Packets per test |
| packet-interval | time | 50ms | Delay between packets |
| packet-size | integer | 54 | Packet size in bytes |
| thr-avg | time | 100ms | Average RTT threshold |
| thr-max | time | 1s | Maximum RTT threshold |
| thr-loss-percent | percent | 85% | Packet loss threshold |
| thr-jitter | time | 1s | Jitter threshold |
HTTP/HTTPS properties (v7.4+)
| Property | Type | Default | Description |
|---|---|---|---|
| port | integer | 80/443 | HTTP(S) port |
| http-code-min | integer | 100 | Minimum acceptable status code |
| http-code-max | integer | 299 | Maximum acceptable status code |
| check-certificate | yes/no | no | Validate SSL certificate |
| thr-http-time | time | 10s | Response time threshold |
Status flags
| Flag | Meaning |
|---|---|
| X | Disabled |
| status=up | Host reachable |
| status=down | Host unreachable |
| status=unknown | Not yet tested |