I'm scaling a security inspection device. The rate-limiting step is a single-threaded process that analyses and extracts data from network flows.
As a short-term goal, I would like to scale this by load-balancing from the physical capture interface to internal dummy or tap interfaces, and then run multiple instances of the single-threaded process, each using one of those interfaces as input.
Each instance of the single-threaded process requires complete network flows (i.e. all packets with a matching TCP or UDP tuple). I don't care about the traffic after it hits the dummy interfaces, since it is then dropped, so I don't need to worry about return paths or anything related.
After some reading, I think that flow-based hashing and the tc-mirred redirect action might be the solution I am looking for.
So, I've got this far:
# Set up the interfaces
PHYS="eth1"
INT_COUNT=4 # Derived from number of CPU cores
# Create some dummy interfaces
for i in $(seq 1 "$INT_COUNT")  # {1..$INT_COUNT} is not expanded by bash when the bound is a variable
do ip link add dummy$i type dummy
ip link set dummy$i up
done
# Create the qdisc
tc qdisc add dev $PHYS ingress
# Filter based on flow tuple. Classes are created automatically due to divisor
tc filter add dev $PHYS parent ffff: handle 1 \
flow hash keys src,dst,proto,proto-src,proto-dst \
divisor $INT_COUNT
# Now somehow apply the redirect action to the created classes
# for each of the dummy interfaces
for i in $(seq 1 "$INT_COUNT")
do tc filter add dev $PHYS parent ffff: protocol ip u32 match u32 0 0 \
action mirred egress redirect dev dummy$i
done
It's that last part I cannot seem to understand.
As I understand it, the classes are created automatically by the divisor statement in the filter. Since divisor takes the hash modulo $INT_COUNT, I should get classes that I can reference.
How can I say "class N, go to dummyN"?
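To make the mapping I have in mind concrete, here is a toy sketch. It is not what the kernel does: cksum merely stands in for the real flow hash, and bucket_for is a made-up helper. It only shows that the same tuple always lands in the same bucket, and that the bucket number is what I would like to turn into a dummy interface name.

```shell
#!/usr/bin/env bash
# Toy illustration only. cksum is an assumed stand-in for the kernel's
# flow hash, and bucket_for is a hypothetical helper, not a tc feature.
INT_COUNT=4

bucket_for() {
    # $1 = "src dst proto sport dport"
    local hash
    hash=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
    # Same idea as "divisor": reduce the hash modulo the bucket count
    echo $(( hash % INT_COUNT ))
}

a=$(bucket_for "10.0.0.1 10.0.0.2 6 40000 443")
b=$(bucket_for "10.0.0.1 10.0.0.2 6 40000 443")

# Buckets run 0..INT_COUNT-1, the dummy interfaces 1..INT_COUNT
echo "first packet  -> bucket $a -> dummy$(( a + 1 ))"
echo "second packet -> bucket $b -> dummy$(( b + 1 ))"
```

Both lines print the same bucket because the tuple is identical. What I am missing is the tc equivalent of that last step, going from the bucket to the interface.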
I've done a lot of reading on this but I think there's some key part that I'm not understanding. I'm thinking I need flowid in the action but am not sure.
Any tips or suggestions on this would be greatly appreciated.