
RouterOS Performance Tuning

RouterOS provides multiple layers of performance optimization, from connection-level acceleration through FastTrack to hardware-assisted forwarding via switch chip offloading. Understanding where CPU time is spent, how to shift traffic onto faster paths, and how to measure the results are the core competencies for tuning a RouterOS device.

This guide covers CPU load analysis, FastTrack connection optimization, hardware offloading, IRQ affinity, and throughput benchmarking using built-in RouterOS tools.


RouterOS processes packets at two levels:

  • Slow path (software path): Packets are processed in software by the RouterOS kernel, consuming CPU cycles for every packet. Firewall rules, NAT, queuing, and features such as the packet sniffer and Torch all run in the slow path.
  • Fast path: Traffic that qualifies for FastPath or FastTrack bypasses most software processing, greatly reducing CPU load and increasing throughput.

Understanding which packets remain in the slow path is the starting point for any performance investigation.

/system resource gives the top-level view of CPU utilization:

/system resource print

For multi-core devices, per-core breakdown is available:

/system resource cpu print

Example output on a quad-core device:

#  CPU  LOAD  IRQ  DISK
0  0    23%   0%   0%
1  1    18%   0%   0%
2  2    12%   0%   0%
3  3    41%   0%   0%

High load on a single core while others are idle often indicates a forwarding bottleneck or a specific process pinned to one core.
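A quick scripted check can flag an overloaded core. This is a sketch assuming the cpu and load properties exposed by /system resource cpu; the exact value format may vary by RouterOS version, and the 80% threshold is arbitrary:

```
# Warn when any single core exceeds 80% load (threshold is arbitrary)
:foreach c in=[/system resource cpu find] do={
    :local l [/system resource cpu get $c load]
    :if ($l > 80) do={
        :log warning ("core " . [/system resource cpu get $c cpu] . " at " . $l . "% load")
    }
}
```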

The Profiler identifies which RouterOS processes consume CPU. It is the primary diagnostic tool for understanding where time is spent:

/tool profile

Run for a fixed duration and show all CPU cores:

/tool profile cpu=all duration=30s

Key columns in the profiler output:

Column  Meaning
NAME    Process or subsystem name
CPU     CPU core the process runs on (or "all")
USAGE   Percentage of CPU consumed during the sample

Common high-usage processes and what they indicate:

Process   Likely cause
firewall  Complex or large firewall ruleset
routing   Large routing table, rapid BGP churn
bridging  L2 forwarding in software (HW offload disabled)
queue     QoS/HTB shaping with many queues
ethernet  Interrupt pressure from high pps traffic
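If the firewall process dominates the profile, a quick first check is ruleset size; large rulesets can often be collapsed with address lists or jump chains:

```
# Count rules per firewall table
/ip firewall filter print count-only
/ip firewall nat print count-only
/ip firewall mangle print count-only
```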

For sustained monitoring, use a script to sample repeatedly:

/system script add name=cpu-sample source={
    :for i from=1 to=10 do={
        :put ("=== sample " . $i . " ===")
        /system resource cpu print
        /tool profile duration=5s
        :delay 1s
    }
}
/system script run cpu-sample

Traffic is eligible for FastPath when:

  • It is plain IP forwarding (no NAT, no firewall action for that connection)
  • No diagnostic tools are active (Sniffer, Torch, IP Scan, MAC Scan disable FastPath while running)
  • No L4 features are applied (no mangle marks needed, no per-connection accounting required)

Once FastTrack is configured (see next section), established connections bypass the firewall entirely and consume near-zero CPU.
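Whether the fast path is currently in use can be confirmed from the read-only state fields under /ip settings; the field names noted below appear on recent RouterOS releases and may vary by version:

```
# Show FastPath/FastTrack state and packet counters
/ip settings print
# Expect fields such as ipv4-fast-path-active,
# ipv4-fast-path-packets and ipv4-fasttrack-active
```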


FastTrack is a connection-level acceleration feature built on top of connection tracking. When a connection is marked as fasttrack, RouterOS skips the full firewall ruleset for subsequent packets in that connection and forwards them via a fast software path. On hardware that supports it, FastTrack connections can also be offloaded to the switch chip for near-wire-speed forwarding.

FastTrack applies to forwarded traffic (traffic passing through the router, not to/from the router itself).

Supported protocols: TCP and UDP

FastTrack requires:

  1. Connection tracking enabled (/ip firewall connection tracking set enabled=yes)
  2. FastPath/route cache active (not disabled via /ip settings)
  3. No active diagnostic tools — running Sniffer, Torch, IP Scan, or MAC Scan disables FastPath for the duration
  4. No IPsec policies matching the same traffic (IPsec requires full packet inspection; exclude IPsec subnets from FastTrack rules)

Add a FastTrack rule in the forward chain, then an accept rule for the same traffic. Both rules are required — FastTrack does not implicitly accept:

/ip firewall filter
add chain=forward action=fasttrack-connection \
    connection-state=established,related \
    comment="FastTrack established/related"
add chain=forward action=accept \
    connection-state=established,related \
    comment="Accept established/related"

Place these rules before any other forward chain rules that would match the same traffic. Rules are evaluated top-to-bottom; FastTrack only applies if the packet reaches the FastTrack rule.
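Correct placement can be confirmed from rule counters: once traffic is flowing, the FastTrack rule's packet/byte counters should climb quickly while later forward-chain rules stay nearly static:

```
# Per-rule packet/byte counters for the forward chain
/ip firewall filter print stats where chain=forward
```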

Feature                          Behavior with FastTrack
Firewall rules (forward chain)   Bypassed for FastTracked connections
Connection accounting            Byte counters in /ip firewall connection are not updated
Simple queues / HTB              Not applied — FastTracked traffic bypasses queue trees
Mangle rules                     Not applied after the connection is FastTracked
IPsec                            Incompatible — exclude IPsec subnets explicitly
Packet sniffer / Torch           Disables FastPath while active

Traffic that requires QoS, accounting, or deep inspection must be excluded. Use connection marks to prevent FastTracking specific flows:

# Mark connections that need QoS
/ip firewall mangle
add chain=prerouting src-address=192.168.10.0/24 \
    action=mark-connection new-connection-mark=qos-traffic passthrough=yes

# FastTrack only unmarked traffic
/ip firewall filter
add chain=forward action=fasttrack-connection \
    connection-state=established,related \
    connection-mark=!qos-traffic \
    comment="FastTrack non-QoS traffic"
add chain=forward action=accept \
    connection-state=established,related \
    comment="Accept established/related"

Check the connection table for fasttrack flags:

/ip firewall connection print where fasttrack=yes

A non-zero count confirms FastTrack is active. If all connections show fasttrack=no, verify rule placement and that no diagnostic tools are running.
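For a quick numeric summary, the counts can be compared directly in a short script:

```
# Compare FastTracked connections against the total tracked count
:local ft [/ip firewall connection print count-only where fasttrack=yes]
:local total [/ip firewall connection print count-only]
:put ("fasttracked: " . $ft . " of " . $total . " connections")
```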


On devices with a built-in switch chip (most MikroTik home and SOHO routers, CRS series), bridge forwarding can be offloaded to the switch chip. Without offloading, all bridged packets consume CPU. With offloading, the switch chip forwards packets between bridge ports at wire speed with zero CPU involvement.

Enable hardware offloading per bridge port:

/interface bridge
add name=bridge1
/interface bridge port
add bridge=bridge1 interface=ether2 hw=yes
add bridge=bridge1 interface=ether3 hw=yes
add bridge=bridge1 interface=ether4 hw=yes
add bridge=bridge1 interface=ether5 hw=yes

The hw=yes flag instructs RouterOS to use the switch chip for that port when possible. Verify actual offload status:

/interface bridge port print

The HW column shows whether the port is actively offloaded. A port may show hw=yes in configuration but not be offloaded if the bridge has conflicting features enabled.

Bridge hardware offloading has constraints that affect whether it activates:

Constraint        Detail
Single HW bridge  Most single-chip devices support only one hardware-offloaded bridge; a second bridge forces software forwarding for one of them
VLAN filtering    Certain VLAN filtering configurations require specific setup to remain hardware-offloaded
STP/RSTP          Supported on most switch chips, but verify per platform
Bridge firewall   Enabling the bridge firewall (/interface bridge settings set use-ip-firewall=yes) moves forwarding back to software
IGMP snooping     May require software processing depending on platform

L3 hardware offloading extends switch chip processing to routed traffic, enabling line-rate IP forwarding between VLANs and routed interfaces without CPU involvement. This is available on CRS3xx, CRS5xx, and some CCR platforms.

L3 HW offloading requires L2 HW offloading to be active first. The routing decision is handled by the switch chip’s L3 table, which is built from RouterOS routing and ARP tables.

Enable L3 HW offloading on the bridge:

/interface bridge
set bridge1 l3-hw-offloading=yes

Check offload status:

/interface bridge port print detail

Ports with both hw=yes and the bridge having l3-hw-offloading=yes will offload routed traffic between segments.

Limitation             Notes
Table size             The L3 HW table is fixed and smaller than the software routing table; overflow causes software fallback
FastTrack integration  FastTrack connections can be offloaded to HW on supported platforms, providing near-wire-speed NAT/routing
NAT                    Hardware-assisted NAT is supported on some CRS platforms; verify per model
IPv6                   L3 HW offloading support for IPv6 varies by platform and RouterOS version
Multiple bridges       Only one HW bridge is supported; multiple bridges degrade to software forwarding
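When table overflow is suspected, the active route count can be compared against the platform's documented L3 table capacity (capacity figures are per-model; check MikroTik's documentation for your device):

```
# Number of active routes competing for L3 HW table entries
/ip route print count-only
```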

VLAN operations (tagging/untagging) can also be offloaded to the switch chip:

/interface bridge
set bridge1 vlan-filtering=yes
/interface bridge vlan
add bridge=bridge1 tagged=ether1 untagged=ether2,ether3 vlan-ids=100
/interface bridge port
set [find interface=ether2] pvid=100
set [find interface=ether3] pvid=100

With vlan-filtering=yes and proper port/VLAN configuration, the switch chip handles VLAN operations at wire speed. The key requirement is that all VLAN operations stay within the single HW bridge.


On multi-core RouterOS devices, network interface interrupts (IRQs) are serviced by CPU cores. By default, all IRQs may be handled by a single core, creating a bottleneck at high packet rates even if other cores are idle. IRQ affinity allows binding specific interfaces to specific cores, distributing interrupt load across the available CPUs.

Check current IRQ-to-CPU assignments:

/system resource irq print

Example output:

#  USERS  CPU  ACTIVE  IRQ
0  eth0   1    yes     20
1  eth1   1    yes     21
2  eth2   1    yes     22
3  eth3   0    yes     23

With three of the four interface IRQs pinned to CPU 1, that core saturates at high pps while CPU 0 sits nearly idle.

Manually assign IRQs to different cores:

/system resource irq set 0 cpu=0
/system resource irq set 1 cpu=1
/system resource irq set 2 cpu=2
/system resource irq set 3 cpu=3

Or use cpu=auto to let RouterOS balance automatically:

/system resource irq set [find] cpu=auto

For a router with two high-traffic interfaces (WAN and LAN):

# Assign WAN interface IRQ to core 0, LAN to core 1
/system resource irq set [find users~"ether1"] cpu=0
/system resource irq set [find users~"ether2"] cpu=1

For devices with many interfaces and equal traffic distribution, spread IRQs evenly:

# Distribute across 4 cores (RouterOS scripting lacks a modulo
# operator, so compute i mod cores via integer division)
:local cores 4
:local i 0
:foreach irq in=[/system resource irq find] do={
    :local core ($i - (($i / $cores) * $cores))
    /system resource irq set $irq cpu=$core
    :set i ($i + 1)
}
  • Changes take effect immediately but are not persistent across reboot by default. Add the commands to a startup script under /system scheduler with on-event triggered at boot.
  • After changing IRQ affinity, re-profile with /tool profile to verify CPU load distribution has improved.
  • On RouterOS CHR (Cloud Hosted Router), IRQ assignment depends on the hypervisor; virtio drivers expose IRQs differently than physical hardware.
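The persistence note above can be sketched as a boot-time scheduler entry; the script name and IRQ assignments here are examples to adapt:

```
# Re-apply IRQ affinity at every boot
/system script add name=irq-affinity source={
    /system resource irq set [find users~"ether1"] cpu=0
    /system resource irq set [find users~"ether2"] cpu=1
}
/system scheduler add name=irq-affinity-boot start-time=startup \
    on-event=irq-affinity
```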

RouterOS includes a bandwidth test tool (/tool bandwidth-test) that measures achievable throughput between two RouterOS devices. One device acts as server, the other as client.

On the server (receiving end):

/tool bandwidth-server set enabled=yes authenticate=yes

On the client (initiating end):

/tool bandwidth-test address=192.168.1.1 \
    user=admin password=yourpassword \
    direction=both protocol=tcp duration=30s

Key parameters:

Parameter           Description
direction=receive   Test download only
direction=transmit  Test upload only
direction=both      Bidirectional simultaneous test
protocol=tcp        TCP test (affected by TCP flow control)
protocol=udp        UDP test with fixed packet size
local-tx-speed      Cap the transmit rate (useful for comparing with/without FastTrack)
  • TCP results reflect achievable throughput considering retransmits and flow control. They are more representative of real application performance.
  • UDP results measure raw packet throughput without flow control. High UDP rates with packet loss indicate interface or switch chip saturation.
  • A result significantly below line rate (e.g., 200 Mbps on a Gigabit interface) suggests a CPU bottleneck in the slow path. Enable FastTrack and re-test to confirm.
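To probe raw packet throughput directly, run a UDP test with a fixed packet size and watch for loss in the output (the address and size below are illustrative):

```
# UDP upload test with 1500-byte packets; loss at high rates
# indicates interface or switch chip saturation
/tool bandwidth-test address=192.168.1.1 user=admin \
    protocol=udp direction=transmit \
    local-udp-tx-size=1500 duration=30s
```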

For synthetic load testing without a second RouterOS device, use the traffic generator:

/tool traffic-generator packet-template add name=t1 \
    mac-dst=FF:FF:FF:FF:FF:FF ip-dst=10.0.0.2 \
    ip-src=10.0.0.1 l4-protocol=udp \
    udp-dst-port=5000 size=1500
/tool traffic-generator stream add name=s1 \
    template=t1 tx-speed=1000000000 count=0
/tool traffic-generator start

The traffic generator injects packets directly at the interface level, bypassing the RouterOS software stack. This is useful for testing switch chip throughput independently of routing performance.

Stop with:

/tool traffic-generator stop

The clearest way to measure FastTrack impact is before/after comparison:

Step 1 — Establish baseline (slow path)

Disable the FastTrack rule temporarily so all traffic takes the slow path, then run the test:

/ip firewall filter disable [find comment="FastTrack established/related"]
/tool bandwidth-test address=192.168.1.1 user=admin direction=both duration=30s

Record the throughput result and CPU usage during the test (/system resource print in a second terminal or via The Dude / Zabbix).
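CPU load during the test can also be sampled from a second terminal on the device itself; a minimal loop (interval and count are arbitrary):

```
# Print overall CPU load once per second for 30 seconds
:for i from=1 to=30 do={
    :put ("cpu-load: " . [/system resource get cpu-load] . "%")
    :delay 1s
}
```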

Step 2 — Enable FastTrack and re-test

/ip firewall filter enable [find comment="FastTrack established/related"]
/tool bandwidth-test address=192.168.1.1 user=admin direction=both duration=30s

On a heavily loaded device, enabling FastTrack typically reduces CPU usage by 30–70% for forwarded TCP/UDP traffic and can increase throughput significantly on CPU-limited hardware.

Caveat                                       Explanation
Diagnostic tools affect results              Torch, Sniffer, and IP Scan disable FastPath during operation — stop them before benchmarking
Bandwidth test traffic is router-originated  The bandwidth test tool generates traffic to/from the router itself, not forwarded traffic; for forwarding benchmarks, use an external traffic source or the traffic generator
TCP window size                              The default TCP window size may limit bandwidth-test results on high-latency links; adjust with tcp-window-scaling on both devices
Hardware limits                              A CCR1009 at 100% CPU will plateau regardless of optimization; know the device's theoretical line-rate limits
Multiple streams                             Single-stream bandwidth tests often underutilize multi-queue hardware; test with multiple parallel sessions for realistic measurements

Use this checklist when optimizing a RouterOS deployment:

CPU Analysis

  • Check overall load: /system resource print
  • Check per-core load: /system resource cpu print
  • Profile processes: /tool profile cpu=all duration=30s
  • Identify top CPU consumers and their cause

FastTrack

  • FastTrack rule present in forward chain, above other rules
  • Corresponding accept rule immediately after FastTrack rule
  • IPsec traffic excluded from FastTrack
  • No diagnostic tools running during normal operation
  • Verify active FastTrack connections: /ip firewall connection print where fasttrack=yes

Hardware Offloading

  • Bridge ports set hw=yes on capable hardware
  • Single HW bridge (avoid multiple bridges on single-chip devices)
  • VLAN filtering configured for HW offload compatibility
  • L3 HW offloading enabled on supported CRS platforms
  • No bridge firewall enabled (disables HW offloading)

IRQ Balancing

  • Check IRQ distribution: /system resource irq print
  • Redistribute IRQs across cores if unbalanced
  • Add IRQ assignment to startup scheduler for persistence
  • Re-profile after changes to confirm improvement

Benchmarking

  • Baseline measurement taken with diagnostic tools stopped
  • Bandwidth test run in both directions
  • CPU usage recorded during test
  • FastTrack before/after comparison performed
  • Results documented per firmware and hardware version