Netwatch (Host Monitoring)
TL;DR (Quick Start)
For the impatient: monitor a host and run scripts on state change.

    # Basic host monitoring
    /tool/netwatch/add host=8.8.8.8 interval=30s comment="Google DNS"

    # With failover scripts
    /tool/netwatch/add host=8.8.8.8 interval=10s \
        down-script="/log warning \"Primary WAN down\"" \
        up-script="/log warning \"Primary WAN up\""

Verify with:

    /tool/netwatch/print

Overview
What this does: Netwatch monitors network host availability using various probe types (ICMP, TCP, HTTP, DNS) and executes scripts when hosts go up or down. It's essential for WAN failover, service health monitoring, and automated alerting.
When to use this:
- WAN failover detection and automatic route switching
- Service availability monitoring (web servers, DNS, etc.)
- Automated alerts via email or log when hosts fail
- Triggering configuration changes based on network conditions
Prerequisites:
- Network connectivity to monitored hosts
- Scripts must have appropriate permissions (read,write,test,reboot)
- For email alerts: /tool/e-mail configured
Probe Types
| Type | Use Case | Version |
|---|---|---|
| simple | Basic ICMP ping (legacy) | 6.x+ |
| icmp | Advanced ping with thresholds | 7.4+ |
| tcp-conn | TCP port connectivity | 7.4+ |
| http-get | HTTP service check | 7.4+ |
| https-get | HTTPS with certificate validation | 7.4+ |
| dns | DNS server response | 7.4+ |
Configuration Steps
Step 1: Create Basic Monitor
Monitor a host with simple ICMP:

    /tool/netwatch/add host=8.8.8.8 interval=30s timeout=3s comment="Google DNS"

Step 2: Add Scripts for State Changes
Add scripts that run when the host goes down or comes back up:

    /tool/netwatch/set [find host=8.8.8.8] \
        down-script="/log error \"Host 8.8.8.8 is DOWN\"" \
        up-script="/log info \"Host 8.8.8.8 is UP\""

Step 3: Verify Status
Check the current status:

    /tool/netwatch/print

Expected output:

    Flags: X - disabled
     #  HOST     TYPE    INTERVAL  TIMEOUT  STATUS  SINCE
     0  8.8.8.8  simple  30s       3s       up      jan/16/2026 10:00:00

Common Scenarios
Scenario: WAN Failover
Automatically switch routes when primary WAN fails:
Step 1: Configure routes with comments

    /ip/route/add dst-address=0.0.0.0/0 gateway=192.168.1.1 distance=1 comment=primary
    /ip/route/add dst-address=0.0.0.0/0 gateway=192.168.2.1 distance=2 comment=backup

Step 2: Create netwatch with failover scripts

    /tool/netwatch/add host=8.8.8.8 interval=10s timeout=2s \
        down-script={
            /ip/route/set [find comment=primary] disabled=yes
            /log warning "Primary WAN down - switched to backup"
        } \
        up-script={
            /ip/route/set [find comment=primary] disabled=no
            /log warning "Primary WAN up - restored"
        }

Scenario: Advanced ICMP with Thresholds (v7.4+)
Monitor with latency and packet loss thresholds:

    /tool/netwatch/add host=10.0.0.1 type=icmp interval=30s \
        packet-count=5 \
        packet-interval=100ms \
        thr-avg=50ms \
        thr-max=200ms \
        thr-loss-percent=20 \
        down-script="/log error \"High latency or packet loss on link\""

The host is marked "down" if:
- Average latency exceeds 50ms
- Maximum latency exceeds 200ms
- Packet loss exceeds 20%
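
Exceeding any one of these thresholds is enough to mark the host down. As a hedged sketch, the variant below keeps the same thresholds and adds a jitter limit plus a recovery script. The `thr-jitter` value, the comment, and both log messages are illustrative assumptions, not part of the original scenario; use it in place of the entry above rather than alongside it.

```routeros
# Sketch only: same link check as above, plus a jitter limit and a recovery log entry.
# thr-jitter=30ms and the log texts are illustrative values, adjust to your link.
/tool/netwatch/add host=10.0.0.1 type=icmp interval=30s \
    packet-count=5 packet-interval=100ms \
    thr-avg=50ms thr-max=200ms thr-loss-percent=20 thr-jitter=30ms \
    down-script="/log error \"Link to 10.0.0.1 degraded\"" \
    up-script="/log info \"Link to 10.0.0.1 back within thresholds\"" \
    comment="Link quality check"
```

Pairing the down-script with an up-script makes it possible to see in the log how long the link stayed outside its thresholds.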
Scenario: HTTP Service Monitoring
Monitor a web server:

    /tool/netwatch/add host=www.example.com type=http-get port=80 \
        interval=1m timeout=10s \
        http-code-min=200 http-code-max=299 \
        down-script="/log error \"Web server not responding\""

Scenario: HTTPS with Certificate Validation
Monitor HTTPS and validate the certificate:

    /tool/netwatch/add host=secure.example.com type=https-get port=443 \
        check-certificate=yes \
        interval=5m \
        down-script="/log error \"HTTPS service or certificate issue\""

Scenario: DNS Server Monitoring
Monitor a DNS server's ability to resolve queries:

    /tool/netwatch/add host=example.com type=dns \
        dns-server=192.168.1.53 \
        record-type=A \
        interval=1m \
        down-script="/log error \"DNS server not responding\""

Scenario: Email Alerts
Send email when a host goes down:
Step 1: Configure email

    /tool/e-mail/set server=smtp.example.com port=587 \
        start-tls=yes [email protected] password=secret

Step 2: Create netwatch with email scripts

    /tool/netwatch/add host=192.168.1.100 interval=30s \
        down-script={
            /tool/e-mail/send [email protected] \
                subject="Server DOWN" \
                body="Host 192.168.1.100 is unreachable"
        } \
        up-script={
            /tool/e-mail/send [email protected] \
                subject="Server UP" \
                body="Host 192.168.1.100 recovered"
        }

Scenario: Internet Path Monitoring
Monitor internet connectivity without relying on a specific endpoint (using low TTL):

    /tool/netwatch/add host=1.1.1.1 type=icmp \
        ttl=3 \
        accept-icmp-time-exceeded=yes \
        interval=15s \
        comment="Internet path check"

This marks "up" if any router along the path responds, even with "TTL exceeded."
Scenario: Suppress Initial State Scripts
Prevent scripts from running when the router first boots:

    /tool/netwatch/add host=8.8.8.8 interval=30s \
        ignore-initial-down=yes \
        startup-delay=5m \
        down-script="/log warning \"Host down\""

- ignore-initial-down=yes: Don't run down-script for the Unknown→Down transition
- startup-delay=5m: Wait 5 minutes after boot before monitoring
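
The Script Properties reference also lists `ignore-initial-up`, the counterpart for the Unknown→Up transition. A minimal sketch that suppresses both initial transitions, so that scripts only fire on real state changes after boot; the up-script text is an illustrative assumption:

```routeros
# Sketch: suppress scripts for both Unknown->Up and Unknown->Down after boot
/tool/netwatch/add host=8.8.8.8 interval=30s \
    ignore-initial-up=yes \
    ignore-initial-down=yes \
    startup-delay=5m \
    up-script="/log info \"Host up\"" \
    down-script="/log warning \"Host down\""
```

This is mainly useful when the up-script has side effects, such as re-enabling a route, that should not run on every reboot.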
Scenario: TCP Port Check
Monitor if a specific TCP port is accepting connections:

    /tool/netwatch/add host=192.168.1.50 type=tcp-conn port=3306 \
        interval=30s \
        down-script="/log error \"MySQL server not accepting connections\""

Verification
Confirm Netwatch is working correctly:
Check 1: View All Monitors

    /tool/netwatch/print

Expected: Entries listed with status (up/down/unknown).
Check 2: View Detailed Status

    /tool/netwatch/print detail

Shows all properties including scripts and thresholds.
Check 3: View Statistics

    /tool/netwatch/print stats

Shows done-tests, failed-tests, and latency statistics.
Check 4: Filter by Status

    /tool/netwatch/print where status=down

Shows only hosts that are currently unreachable.
Check 5: Check Script Execution in Logs

    /log/print where topics~"script"

Look for script execution entries.
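
For a quick summary rather than a full listing, the same `status=down` filter can be used from the scripting side. A small sketch, assuming it is pasted into the terminal or saved as a script; the message wording is illustrative:

```routeros
# Print how many entries are currently reported as down
:put ("Hosts down: " . [:len [/tool/netwatch/find where status=down]])

# Write one log line per down host
:foreach id in=[/tool/netwatch/find where status=down] do={
    :log warning ("Netwatch: " . [/tool/netwatch/get $id host] . " is down")
}
```

The `find` query is repeated on purpose instead of being stored in a `:local` variable, because local variables do not persist between separate terminal lines.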
Troubleshooting
| Symptom | Cause | Solution |
|---|---|---|
| Scripts not executing | Script policy exceeds allowed permissions | Set policy to read,write,test,reboot only |
| Down-script triggers at boot | Network not ready during startup | Increase startup-delay or use ignore-initial-down=yes |
| Monitoring doesn't detect WAN failure | Monitoring target uses the failing link | Create static route for target via specific gateway |
| Status flapping rapidly | Interval too short or link unstable | Increase interval and packet-count |
| Inline script syntax errors | Quote escaping issues | Use named scripts instead of inline |
| Global variables not accessible | Netwatch runs as *sys user | Use environment variables or write to file |
| DNS probe fails unexpectedly | Uses /ip/dns settings | Set explicit dns-server property |
| HTTP fails on redirects | Default range (100-299) excludes 3xx codes | Set http-code-max=399 to accept redirects |
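
Two rows above (script policy and inline quoting) point to the same fix: move the logic into named scripts with a restricted policy and call them from Netwatch. A sketch for the WAN failover case follows. The script names `wan-down` and `wan-up` are hypothetical, and the route comment `primary` is the one used in the WAN Failover scenario.

```routeros
# Named scripts (hypothetical names) limited to the policies Netwatch may use
/system/script/add name=wan-down policy=read,write,test,reboot \
    source="/ip/route/set [find comment=primary] disabled=yes; /log warning \"Primary WAN down\""
/system/script/add name=wan-up policy=read,write,test,reboot \
    source="/ip/route/set [find comment=primary] disabled=no; /log warning \"Primary WAN up\""

# Netwatch only calls the scripts, so no nested quoting is needed here
/tool/netwatch/add host=8.8.8.8 interval=10s timeout=2s \
    down-script="/system/script/run wan-down" \
    up-script="/system/script/run wan-up"
```

Keeping the logic in named scripts also means each one can be run by hand with `/system/script/run` before Netwatch depends on it.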
Debug: Test Script Manually
Create the script separately and test it:

    /system/script/add name=test-script source="/log info \"Test\""
    /system/script/run test-script

If it works, reference it in netwatch:

    /tool/netwatch/set 0 down-script="/system/script/run test-script"

Debug: Check Script Permissions
Scripts must have limited permissions:

    /system/script/print where name=myscript

Ensure policy shows only: read,write,test,reboot
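
If the policy list turns out to be broader than that, it can be narrowed in place. A minimal sketch, reusing the placeholder script name `myscript` from the command above:

```routeros
# Restrict the script to the policies Netwatch is allowed to use
/system/script/set [find name=myscript] policy=read,write,test,reboot

# Confirm the change
/system/script/print detail where name=myscript
```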
Debug: Monitor Through Correct Interface
For WAN failover, ensure monitoring traffic uses the correct path:

    # Force monitoring traffic through primary WAN
    /ip/route/add dst-address=8.8.8.8/32 gateway=192.168.1.1 comment="monitor-route"

Common Mistakes
- Monitoring target through the link being monitored - If 8.8.8.8 is reached via primary WAN, and primary WAN fails, the monitor can't reach 8.8.8.8 to detect the failure. Use a static route or monitor the gateway directly.
- Script permissions too broad - Netwatch can only run scripts with read,write,test,reboot policies. More permissions = script won't run.
- Complex inline scripts - Quote escaping in inline scripts is error-prone. Use named scripts for anything complex.
- No startup delay - Scripts may trigger during boot before network is ready. Use startup-delay and ignore-initial-down.
- HTTP checking redirect URLs - Sites that redirect (301/302) fail with default settings. Adjust http-code-max or use final URL.
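
The first mistake has two fixes: pin the monitored target to the primary gateway with a static route, or probe the gateway itself. A sketch of the second option, assuming the primary gateway is 192.168.1.1 as in the WAN Failover scenario; the comment and log messages are illustrative. A reachable gateway does not prove that the upstream path works, so treat this as a trade-off rather than a drop-in replacement.

```routeros
# Probe the primary gateway directly instead of a remote host
/tool/netwatch/add host=192.168.1.1 interval=10s timeout=2s \
    startup-delay=5m ignore-initial-down=yes \
    down-script="/log warning \"Primary gateway unreachable\"" \
    up-script="/log warning \"Primary gateway reachable\"" \
    comment="Primary gateway check"
```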
Script Variables
Netwatch provides variables accessible in scripts:
| Variable | Description |
|---|---|
| $"output" | Probe output/response |
| $"status" | Current status (up/down) |
| $"since" | Time of last status change |
Example using variables:

    down-script="/log warning (\"Host down since: \" . \$since)"

Related Topics
Action Scripts
- Scheduler - scripting fundamentals
- Email Tool - send alerts on state change
- Logging - log monitoring events
Failover Use Cases
- Static Routes - netwatch-triggered route changes
- VRRP - gateway failover
- Firewall Mangle - policy routing failover
Related Monitoring
- Torch - real-time traffic analysis
- Bandwidth Test - link capacity testing
- Ping Tool - basic connectivity testing
Reference
Key Commands Reference
| Command | Description |
|---|---|
| /tool/netwatch/add | Create monitoring entry |
| /tool/netwatch/print | View all entries with status |
| /tool/netwatch/print stats | View performance statistics |
| /tool/netwatch/set | Modify existing entry |
| /tool/netwatch/remove | Delete monitoring entry |
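
As a hedged illustration of how these commands combine, the sketch below adjusts, temporarily disables, and finally removes an entry. It selects the entry by the comment `Google DNS` used in the quick start. `disable` and `enable` are the generic RouterOS item commands and are not listed in the table above.

```routeros
# Change the probe interval of an existing entry
/tool/netwatch/set [find comment="Google DNS"] interval=1m

# Temporarily stop monitoring (the entry shows the X flag), then resume
/tool/netwatch/disable [find comment="Google DNS"]
/tool/netwatch/enable [find comment="Google DNS"]

# Delete the entry
/tool/netwatch/remove [find comment="Google DNS"]
```

Selecting by comment or host with `find` is safer than using item numbers such as `0`, which depend on the most recent `print`.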
Core Properties
| Property | Type | Default | Description |
|---|---|---|---|
| host | string | - | Target IP, domain, or VRF address (required) |
| type | enum | simple | Probe type: simple/icmp/tcp-conn/http-get/https-get/dns |
| interval | time | 10s | Time between probes |
| timeout | time | 3s | Maximum wait for response |
| src-address | IP | - | Source address for probes |
| startup-delay | time | 5m | Delay after boot before monitoring |
| start-delay | time | 3s | Delay before first probe |
Script Properties
| Property | Type | Description |
|---|---|---|
| up-script | string | Script on Down→Up transition |
| down-script | string | Script on Up→Down transition |
| test-script | string | Script after every probe |
| ignore-initial-up | yes/no | Suppress up-script for Unknown→Up |
| ignore-initial-down | yes/no | Suppress down-script for Unknown→Down |
ICMP Probe Properties (v7.4+)
| Property | Type | Default | Description |
|---|---|---|---|
| packet-count | integer | 10 | Packets per test |
| packet-interval | time | 50ms | Delay between packets |
| packet-size | integer | 54 | Packet size in bytes |
| thr-avg | time | 100ms | Average RTT threshold |
| thr-max | time | 1s | Maximum RTT threshold |
| thr-loss-percent | percent | 85% | Packet loss threshold |
| thr-jitter | time | 1s | Jitter threshold |
HTTP/HTTPS Properties (v7.4+)
| Property | Type | Default | Description |
|---|---|---|---|
| port | integer | 80/443 | HTTP(S) port |
| http-code-min | integer | 100 | Minimum acceptable status code |
| http-code-max | integer | 299 | Maximum acceptable status code |
| check-certificate | yes/no | no | Validate SSL certificate |
| thr-http-time | time | 10s | Response time threshold |
Status Flags
| Flag | Meaning |
|---|---|
| X | Disabled |
| status=up | Host reachable |
| status=down | Host unreachable |
| status=unknown | Not yet tested |