# RouterOS Performance Tuning

RouterOS provides multiple layers of performance optimization, from connection-level acceleration through FastTrack to hardware-assisted forwarding via switch chip offloading. Understanding where CPU time is spent, how to shift traffic onto faster paths, and how to measure the results are the core competencies for tuning a RouterOS device.
This guide covers CPU load analysis, FastTrack connection optimization, hardware offloading, IRQ affinity, and throughput benchmarking using built-in RouterOS tools.
## CPU Load Analysis

### Overview: Slow Path vs Fast Path

RouterOS processes packets at two levels:
- Slow path (software path): Packets are handled by the RouterOS kernel process, consuming CPU cycles. Firewall rules, NAT, queuing, and features such as the packet sniffer and Torch all run in the slow path.
- Fast path: Traffic that qualifies for FastPath or FastTrack bypasses most software processing, greatly reducing CPU load and increasing throughput.
Understanding which packets remain in the slow path is the starting point for any performance investigation.
### Checking Overall CPU Usage

`/system resource` gives the top-level view of CPU utilization:

```
/system resource print
```

For multi-core devices, a per-core breakdown is available:

```
/system resource cpu print
```

Example output on a quad-core device:

```
 #  CPU  LOAD  IRQ  DISK
 0  0    23%   0%   0%
 1  1    18%   0%   0%
 2  2    12%   0%   0%
 3  3    41%   0%   0%
```

High load on a single core while others are idle often indicates a forwarding bottleneck or a specific process pinned to one core.
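For a continuously updating view between full profiler runs, the resource monitor can be left running in a terminal. A quick sketch (available on recent RouterOS versions; output fields may vary by version):

```
/system resource monitor
```

The display refreshes in place with current CPU and free-memory figures; press `Q` to stop. This is handy for watching load while a bandwidth test runs in another session.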
### Per-Process Profiling with /tool profile

The Profiler identifies which RouterOS processes consume CPU. It is the primary diagnostic tool for understanding where time is spent:

```
/tool profile
```

Run for a fixed duration and show all CPU cores:

```
/tool profile cpu=all duration=30s
```

Key columns in the profiler output:
| Column | Meaning |
|---|---|
| NAME | Process or subsystem name |
| CPU | CPU core the process runs on (or “all”) |
| USAGE | Percentage of CPU consumed during the sample |
Common high-usage processes and what they indicate:
| Process | Likely cause |
|---|---|
| firewall | Complex or large firewall ruleset |
| routing | Large routing table, rapid BGP churn |
| bridging | L2 forwarding in software (HW offload disabled) |
| queue | QoS/HTB shaping with many queues |
| ethernet | Interrupt pressure from high pps traffic |
### Scripted Sampling

For sustained monitoring, use a script to sample repeatedly:

```
/system script add name=cpu-sample source={
    :for i from=1 to=10 do={
        :put ("=== sample " . $i . " ===")
        /system resource cpu print
        /tool profile duration=5s
        :delay 1s
    }
}
/system script run cpu-sample
```

### Fast Path Eligibility

Traffic is eligible for FastPath when:
- It is plain IP forwarding (no NAT, no firewall action for that connection)
- No diagnostic tools are active (Sniffer, Torch, IP Scan, MAC Scan disable FastPath while running)
- No L4 features are applied (no mangle marks needed, no per-connection accounting required)
Once FastTrack is configured (see next section), established connections bypass the firewall entirely and consume near-zero CPU.
## FastTrack Connection Optimization

### How FastTrack Works

FastTrack is a connection-level acceleration feature built on top of connection tracking. When a connection is marked as fasttrack, RouterOS skips the full firewall ruleset for subsequent packets in that connection and forwards them via a fast software path. On hardware that supports it, FastTrack connections can also be offloaded to the switch chip for near-wire-speed forwarding.
FastTrack applies to forwarded traffic (traffic passing through the router, not to/from the router itself).
Supported protocols: TCP and UDP
### Prerequisites

FastTrack requires:

- Connection tracking enabled (`/ip firewall connection tracking set enabled=yes`)
- FastPath/route cache active (not disabled via `/ip settings`)
- No active diagnostic tools: running Sniffer, Torch, IP Scan, or MAC Scan disables FastPath for the duration
- No IPsec policies matching the same traffic (IPsec requires full packet inspection; exclude IPsec subnets from FastTrack rules)
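One way to satisfy the IPsec prerequisite is to accept IPsec traffic before the FastTrack rule ever sees it. A sketch, assuming a hypothetical remote IPsec subnet `10.99.0.0/16` and local LAN `192.168.88.0/24` (substitute the subnets from your own IPsec policies):

```
/ip firewall filter
add chain=forward action=accept \
    src-address=192.168.88.0/24 dst-address=10.99.0.0/16 \
    comment="IPsec-bound traffic - keep out of FastTrack"
add chain=forward action=accept \
    src-address=10.99.0.0/16 dst-address=192.168.88.0/24 \
    comment="IPsec return traffic - keep out of FastTrack"
```

Both rules must sit above the FastTrack rule so that matching packets are accepted on the normal path before the `fasttrack-connection` action can apply to them.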
### Enabling FastTrack

Add a FastTrack rule in the forward chain, then an accept rule for the same traffic. Both rules are required; FastTrack does not implicitly accept:

```
/ip firewall filter
add chain=forward action=fasttrack-connection \
    connection-state=established,related \
    comment="FastTrack established/related"

add chain=forward action=accept \
    connection-state=established,related \
    comment="Accept established/related"
```

Place these rules before any other forward chain rules that would match the same traffic. Rules are evaluated top to bottom; FastTrack only applies if the packet reaches the FastTrack rule.
### FastTrack Limitations

| Feature | Behavior with FastTrack |
|---|---|
| Firewall rules (forward chain) | Bypassed for FastTracked connections |
| Connection accounting (`/ip firewall connection` byte counters) | Not updated |
| Simple queues / HTB | Not applied — FastTracked traffic bypasses queue trees |
| Mangle rules | Not applied after connection is FastTracked |
| IPsec | Incompatible — exclude IPsec subnets explicitly |
| Packet sniffer / Torch | Disables FastPath while active |
### Excluding Traffic from FastTrack

Traffic that requires QoS, accounting, or deep inspection must be excluded. Use connection marks to prevent FastTracking specific flows:

```
# Mark connections that need QoS
/ip firewall mangle
add chain=prerouting src-address=192.168.10.0/24 \
    action=mark-connection new-connection-mark=qos-traffic passthrough=yes

# FastTrack only unmarked traffic
/ip firewall filter
add chain=forward action=fasttrack-connection \
    connection-state=established,related \
    connection-mark=!qos-traffic \
    comment="FastTrack non-QoS traffic"

add chain=forward action=accept \
    connection-state=established,related \
    comment="Accept established/related"
```

### Verifying FastTrack Activity
Check the connection table for fasttrack flags:

```
/ip firewall connection print where fasttrack=yes
```

A non-zero count confirms FastTrack is active. If all connections show `fasttrack=no`, verify rule placement and that no diagnostic tools are running.
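For a quick numeric check, for example from a monitoring script, `print count-only` returns just the number of FastTracked connections instead of the full table:

```
/ip firewall connection print count-only where fasttrack=yes
```

Comparing this count against the total connection count gives a rough measure of how much of the connection load is actually being accelerated.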
## Hardware Offloading

### Bridge Hardware Offloading

On devices with a built-in switch chip (most MikroTik home and SOHO routers, and the CRS series), bridge forwarding can be offloaded to the switch chip. Without offloading, every bridged packet consumes CPU. With offloading, the switch chip forwards packets between bridge ports at wire speed with zero CPU involvement.
Enable hardware offloading per bridge port:

```
/interface bridge
add name=bridge1

/interface bridge port
add bridge=bridge1 interface=ether2 hw=yes
add bridge=bridge1 interface=ether3 hw=yes
add bridge=bridge1 interface=ether4 hw=yes
add bridge=bridge1 interface=ether5 hw=yes
```

The `hw=yes` flag instructs RouterOS to use the switch chip for that port when possible. Verify actual offload status:

```
/interface bridge port print
```

The HW column shows whether the port is actively offloaded. A port may show `hw=yes` in configuration but not be offloaded if the bridge has conflicting features enabled.
### Switch Chip Limitations

Bridge hardware offloading has constraints that affect whether it activates:
| Constraint | Detail |
|---|---|
| Single HW bridge | Most single-chip devices only support one hardware-offloaded bridge. A second bridge forces software forwarding for one of them |
| VLAN filtering | Certain VLAN filtering configurations require specific setup to remain hardware-offloaded |
| STP/RSTP | Supported on most switch chips, but verify per platform |
| Bridge firewall | Enabling the bridge firewall (`/interface bridge settings set use-ip-firewall=yes`) moves forwarding back to software |
| IGMP snooping | May require software processing depending on platform |
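To check which switch chip a given device actually has, and therefore which of these constraints apply, list the switch entries:

```
/interface ethernet switch print
```

The output names the chip model; this can be compared against MikroTik's per-chip feature documentation to see which offload features the hardware supports.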
### L3 Hardware Offloading

L3 hardware offloading extends switch chip processing to routed traffic, enabling line-rate IP forwarding between VLANs and routed interfaces without CPU involvement. It is available on CRS3xx, CRS5xx, and some CCR platforms.
L3 HW offloading requires L2 HW offloading to be active first. The routing decision is handled by the switch chip’s L3 table, which is built from RouterOS routing and ARP tables.
Enable L3 HW offloading on the bridge:

```
/interface bridge
set bridge1 l3-hw-offloading=yes
```

Check offload status:

```
/interface bridge port print detail
```

Ports with `hw=yes` on a bridge that has `l3-hw-offloading=yes` will offload routed traffic between segments.
### L3 HW Offloading Limitations

| Limitation | Notes |
|---|---|
| Table size | L3 HW table is fixed and smaller than the software routing table. Overflow causes software fallback |
| FastTrack integration | FastTrack connections can be offloaded to HW on supported platforms, providing near-wire-speed NAT/routing |
| NAT | Hardware-assisted NAT is supported on some CRS platforms; verify per model |
| IPv6 | L3 HW offloading support for IPv6 varies by platform and RouterOS version |
| Multiple bridges | Only one HW bridge is supported; multiple bridges degrade to software forwarding |
### VLAN Hardware Offloading

VLAN operations (tagging/untagging) can also be offloaded to the switch chip:

```
/interface bridge
set bridge1 vlan-filtering=yes

/interface bridge vlan
add bridge=bridge1 tagged=ether1 untagged=ether2,ether3 vlan-ids=100

/interface bridge port
set [find interface=ether2] pvid=100
```

With `vlan-filtering=yes` and proper port/VLAN configuration, the switch chip handles VLAN operations at wire speed. The key requirement is that all VLAN operations stay within the single HW bridge.
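To confirm the VLAN table the bridge (and switch chip) will enforce, print the bridge VLAN entries; dynamic rows show what RouterOS derived from the `pvid` settings:

```
/interface bridge vlan print
```

If a port is missing from the expected tagged or untagged membership here, VLAN traffic on that port will be dropped once `vlan-filtering=yes` is active.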
## IRQ Balancing

### What IRQ Affinity Does

On multi-core RouterOS devices, network interface interrupts (IRQs) are serviced by CPU cores. By default, all IRQs may be handled by a single core, creating a bottleneck at high packet rates even while other cores sit idle. IRQ affinity binds specific interfaces to specific cores, distributing interrupt load across the available CPUs.
### Viewing IRQ Assignments

Check current IRQ-to-CPU assignments:

```
/system resource irq print
```

Example output:

```
 #  USERS  CPU  ACTIVE  IRQ
 0  eth0   1    yes     20
 1  eth1   1    yes     21
 2  eth2   1    yes     22
 3  eth3   0    yes     23
```

With nearly all IRQs on CPU 1, that core will saturate at high pps while CPU 0 sits mostly idle.
### Setting IRQ CPU Affinity

Manually assign IRQs to different cores:

```
/system resource irq set 0 cpu=0
/system resource irq set 1 cpu=1
/system resource irq set 2 cpu=2
/system resource irq set 3 cpu=3
```

Or use `cpu=auto` to let RouterOS balance automatically:

```
/system resource irq set [find] cpu=auto
```

### IRQ Balancing Strategy

For a router with two high-traffic interfaces (WAN and LAN):

```
# Assign WAN interface IRQ to core 0, LAN to core 1
/system resource irq set [find users~"ether1"] cpu=0
/system resource irq set [find users~"ether2"] cpu=1
```

For devices with many interfaces and evenly distributed traffic, spread IRQs across the cores:

```
# Distribute across 4 cores
:local cores 4
:local i 0
:foreach irq in=[/system resource irq find] do={
    /system resource irq set $irq cpu=($i % $cores)
    :set i ($i + 1)
}
```

### IRQ Tuning Notes
- Changes take effect immediately but are not persistent across reboots by default. Add the commands to a script run at boot via a `/system scheduler` entry with a startup `on-event`.
- After changing IRQ affinity, re-profile with `/tool profile` to verify that CPU load distribution has improved.
- On RouterOS CHR (Cloud Hosted Router), IRQ assignment depends on the hypervisor; virtio drivers expose IRQs differently than physical hardware.
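The persistence point above can be implemented with a scheduler entry that fires once at boot. A sketch reusing the WAN/LAN assignment from the earlier strategy example (the `ether1`/`ether2` names are assumptions; match them to your own interfaces):

```
/system script add name=set-irq-affinity source={
    /system resource irq set [find users~"ether1"] cpu=0
    /system resource irq set [find users~"ether2"] cpu=1
}

/system scheduler add name=irq-affinity-at-boot \
    start-time=startup interval=0 \
    on-event=set-irq-affinity
```

`start-time=startup` with `interval=0` runs the script exactly once each time the router boots.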
## Throughput Benchmarking

### Built-in Bandwidth Test

RouterOS includes a bandwidth test tool (`/tool bandwidth-test`) that measures achievable throughput between two RouterOS devices. One device acts as the server, the other as the client.
On the server (receiving end):

```
/tool bandwidth-server set enabled=yes authenticate=yes
```

On the client (initiating end):

```
/tool bandwidth-test address=192.168.1.1 \
    user=admin password=yourpassword \
    direction=both protocol=tcp duration=30s
```

Key parameters:
| Parameter | Description |
|---|---|
| `direction=receive` | Test download only |
| `direction=transmit` | Test upload only |
| `direction=both` | Bidirectional simultaneous test |
| `protocol=tcp` | TCP test (affected by TCP flow control) |
| `protocol=udp` | UDP test with fixed packet size |
| `local-tx-speed` | Cap the transmit rate (useful for comparing with/without FastTrack) |
### Interpreting Bandwidth Test Results

- TCP results reflect achievable throughput after retransmits and flow control. They are more representative of real application performance.
- UDP results measure raw packet throughput without flow control. High UDP rates with packet loss indicate interface or switch chip saturation.
- A result significantly below line rate (e.g., 200 Mbps on a Gigabit interface) suggests a CPU bottleneck in the slow path. Enable FastTrack and re-test to confirm.
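When translating a bits-per-second result into packet load, remember that CPU cost scales with packet rate, not bit rate, and that Ethernet adds 20 bytes of preamble and inter-frame gap per frame on the wire. For a full frame size (headers and FCS included), the maximum frame rate works out to:

```
max fps = line rate / ((frame bytes + 20) × 8)

1 Gbps, 1518-byte frames:  10^9 / (1538 × 8) ≈ 81,274 fps
1 Gbps,   64-byte frames:  10^9 / (  84 × 8) ≈ 1,488,095 fps
```

The roughly 18× higher packet rate at minimum frame size is why a router can pass 1 Gbps of bulk TCP traffic yet choke on the same bit rate of small UDP packets.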
### Traffic Generator

For synthetic load testing without a second RouterOS device, use the traffic generator:

```
/tool traffic-generator packet-template add name=t1 \
    mac-dst=FF:FF:FF:FF:FF:FF ip-dst=10.0.0.2 \
    ip-src=10.0.0.1 l4-protocol=udp \
    udp-dst-port=5000 size=1500

/tool traffic-generator stream add name=s1 \
    template=t1 tx-speed=1000000000 count=0

/tool traffic-generator start
```

The traffic generator injects packets directly at the interface level, bypassing the RouterOS software stack. This is useful for testing switch chip throughput independently of routing performance.

Stop with:

```
/tool traffic-generator stop
```

### Measuring FastPath vs Slow Path Impact
The clearest way to measure FastTrack impact is a before/after comparison.

**Step 1: Establish a baseline (slow path)**

Temporarily disable the FastTrack rule so all traffic takes the slow path, then run the test:

```
/ip firewall filter disable [find comment="FastTrack established/related"]
/tool bandwidth-test address=192.168.1.1 user=admin direction=both duration=30s
```

Record the throughput result and the CPU usage during the test (`/system resource print` in a second terminal, or via The Dude / Zabbix).
**Step 2: Enable FastTrack and re-test**

```
/ip firewall filter enable [find comment="FastTrack established/related"]
/tool bandwidth-test address=192.168.1.1 user=admin direction=both duration=30s
```

On a heavily loaded device, enabling FastTrack typically reduces CPU usage by 30–70% for forwarded TCP/UDP traffic and can increase throughput significantly on CPU-limited hardware.
### Benchmarking Caveats

| Caveat | Explanation |
|---|---|
| Diagnostic tools affect results | Torch, Sniffer, IP Scan disable FastPath during operation — stop them before benchmarking |
| Bandwidth test traffic is router-originated | The bandwidth test tool generates traffic to/from the router itself, not forwarded traffic. For forwarding benchmarks, use an external traffic source or the traffic generator |
| TCP window size | Default TCP window size may limit bandwidth-test results on high-latency links. Adjust with tcp-window-scaling on both devices |
| Hardware limits | A CCR1009 at 100% CPU will plateau regardless of optimization; know the device’s theoretical line rate limits |
| Multiple streams | Single-stream bandwidth tests often underutilize multi-queue hardware. Test with multiple parallel sessions for realistic measurements |
## Performance Tuning Checklist

Use this checklist when optimizing a RouterOS deployment:

**CPU Analysis**

- Check overall load: `/system resource print`
- Check per-core load: `/system resource cpu print`
- Profile processes: `/tool profile cpu=all duration=30s`
- Identify top CPU consumers and their cause

**FastTrack**

- FastTrack rule present in the forward chain, above other rules
- Corresponding accept rule immediately after the FastTrack rule
- IPsec traffic excluded from FastTrack
- No diagnostic tools running during normal operation
- Verify active FastTrack connections: `/ip firewall connection print where fasttrack=yes`

**Hardware Offloading**

- Bridge ports set `hw=yes` on capable hardware
- Single HW bridge (avoid multiple bridges on single-chip devices)
- VLAN filtering configured for HW offload compatibility
- L3 HW offloading enabled on supported CRS platforms
- Bridge firewall disabled (it forces forwarding back to software)

**IRQ Balancing**

- Check IRQ distribution: `/system resource irq print`
- Redistribute IRQs across cores if unbalanced
- Add IRQ assignments to a startup scheduler entry for persistence
- Re-profile after changes to confirm improvement

**Benchmarking**

- Baseline measurement taken with diagnostic tools stopped
- Bandwidth test run in both directions
- CPU usage recorded during the test
- FastTrack before/after comparison performed
- Results documented per firmware and hardware version