I am testing the communication latency between a microcontroller (Teensy 4.1) and my desktop computer (Ubuntu 20.04.3 with a 5.15 kernel patched with PREEMPT-RT).
The microcontroller is directly connected to the computer via a USB cable. The serial communication on the computer is done by reading and writing to /dev/ttyACM0.
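The post does not say how /dev/ttyACM0 is configured, so the following is only a sketch under an assumption: the port is put into raw mode so that a blocking read() returns as soon as a single byte arrives. The helper names make_raw_blocking and open_serial_raw are hypothetical and not taken from the test program.

```cpp
#include <cassert>
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

// Put a termios configuration into raw mode with a blocking one-byte read:
// read() returns as soon as at least one byte is available.
void make_raw_blocking(termios& tio) {
    cfmakeraw(&tio);       // no line buffering, echo, or character translation
    tio.c_cc[VMIN] = 1;    // block until at least 1 byte has arrived
    tio.c_cc[VTIME] = 0;   // no inter-byte timeout
}

// Hypothetical helper: open the port and apply the settings above.
// Returns the file descriptor, or -1 on failure.
int open_serial_raw(const char* path) {
    assert(path != nullptr);
    int fd = open(path, O_RDWR | O_NOCTTY);
    if (fd < 0) return -1;
    termios tio{};
    if (tcgetattr(fd, &tio) != 0) { close(fd); return -1; }
    make_raw_blocking(tio);
    if (tcsetattr(fd, TCSANOW, &tio) != 0) { close(fd); return -1; }
    return fd;
}
```

With this sketch, open_serial_raw("/dev/ttyACM0") would return a descriptor suitable for one-byte blocking reads, or -1 if the device cannot be opened or configured.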
The code is written in C++ and is set to run exclusively on core #8. That core is also isolated with isolcpus by modifying /etc/default/grub. I confirmed the core isolation by checking the processes with htop, which shows that only my test program is running on core #8. The program's priority is set to 99 with SCHED_FIFO as the scheduling policy. The kernel tick frequency is set to 1000 Hz. The CPU frequency governor is set to performance. I confirmed that there is no CPU throttling, as the core frequency always stays around the maximum (4.1 GHz).
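For reference, the pinning and scheduling setup described above could look roughly like the sketch below. It is not the actual test code: the function names are hypothetical, and it assumes the setup is done from inside the program with sched_setaffinity and sched_setscheduler rather than with external tools such as taskset or chrt. Kernel CPU indices start at 0, so the index to pass for "core #8" depends on whether the tool displaying it counts from 0 or from 1.

```cpp
#include <cassert>
#include <sched.h>

// Build an affinity mask containing only the given kernel CPU index.
cpu_set_t single_cpu_mask(int cpu) {
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);
    return mask;
}

// Pin the calling thread to one CPU and switch it to SCHED_FIFO.
// Returns true only if both calls succeed. SCHED_FIFO normally requires
// root, CAP_SYS_NICE, or a suitable RLIMIT_RTPRIO.
bool pin_and_make_fifo(int cpu, int priority) {
    cpu_set_t mask = single_cpu_mask(cpu);
    if (sched_setaffinity(0, sizeof(mask), &mask) != 0) return false;
    sched_param param{};
    param.sched_priority = priority;
    return sched_setscheduler(0, SCHED_FIFO, &param) == 0;
}
```

A call such as pin_and_make_fifo(cpu_index, 99) at program start would then match the priority described above; checking the return value shows whether the kernel actually accepted both requests.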
EDIT: The CPU is an 8th-gen Intel i5 with 4 physical cores and 8 total threads.
The test is very simple:
Computer side:
1. The computer calls read() in blocking mode and waits until it returns.
2. The time elapsed between consecutive read() returns is measured. This value is referred to as Te for simplicity.
3. Repeat 1-2 in a while loop.
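The computer-side steps can be sketched as follows. This is a minimal illustration, not the original test program: it assumes timestamps come from CLOCK_MONOTONIC and that one byte is read per call, and the names now_us and measure_te are hypothetical.

```cpp
#include <cstdint>
#include <ctime>
#include <unistd.h>
#include <vector>

// Current CLOCK_MONOTONIC time in microseconds.
int64_t now_us() {
    timespec ts{};
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return static_cast<int64_t>(ts.tv_sec) * 1000000 + ts.tv_nsec / 1000;
}

// Perform `reads` blocking one-byte reads on fd and return the elapsed time
// in microseconds between consecutive read() returns (reads - 1 values of Te).
// Stops early if read() fails or hits end of file.
std::vector<int64_t> measure_te(int fd, int reads) {
    std::vector<int64_t> te;
    if (reads > 1) te.reserve(reads - 1);  // avoid allocating inside the loop
    int64_t previous = 0;
    for (int i = 0; i < reads; ++i) {
        unsigned char byte = 0;
        if (read(fd, &byte, 1) != 1) break;
        int64_t current = now_us();
        if (i > 0) te.push_back(current - previous);
        previous = current;
    }
    return te;
}
```

Storing the samples in a pre-sized vector and printing them only after the loop keeps I/O and allocation out of the timed path, so the measurement itself adds as little jitter as possible.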
Microcontroller side:
1. The microcontroller sends 1 byte to the computer by writing to Serial.
2. Repeat 1 at 1000 Hz.
Expected results:
Ideally, Te should be exactly 1000 microseconds.
Test results:
Te is almost always around 1000 microseconds (950 to 1050). This is good enough for my application.
However, I observed 2 problems:
- Occasionally, Te spikes to around 50000 microseconds, even when the computer is idle.
- Te spikes become much more frequent when the workload on the computer increases. They are especially prevalent when CPU usage on the non-isolated cores peaks at ~90%.
I have confirmed with an oscilloscope that the microcontroller sends out bytes with precise timing, so it is unlikely that the spikes originate on the microcontroller. The correlation between Te spikes and computer workload also suggests this.
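To quantify how often the spikes occur at different workloads, the recorded Te samples could be summarized along these lines. This is only a sketch: TeSummary and summarize_te are hypothetical names, and the default 950 to 1050 microsecond window is simply the "normal" range reported above.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct TeSummary {
    int64_t min_us = 0;
    int64_t max_us = 0;
    std::size_t outliers = 0;  // samples outside [low_us, high_us]
};

// Summarize Te samples (in microseconds): smallest and largest value, plus
// the number of samples outside the accepted window. Empty input yields zeros.
TeSummary summarize_te(const std::vector<int64_t>& te,
                       int64_t low_us = 950, int64_t high_us = 1050) {
    TeSummary s;
    if (te.empty()) return s;
    auto mm = std::minmax_element(te.begin(), te.end());
    s.min_us = *mm.first;
    s.max_us = *mm.second;
    s.outliers = static_cast<std::size_t>(std::count_if(
        te.begin(), te.end(),
        [=](int64_t v) { return v < low_us || v > high_us; }));
    return s;
}
```

Comparing the outlier count for an idle run against a run with the other cores loaded gives a concrete number for the correlation described above.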
I know that USB doesn't mix well with hard timing requirements, but the test results show that the system can meet my requirements and is responsive most of the time.
I'd really like to know what is causing the timing spikes and how they can be resolved. Any help and suggestions are greatly appreciated!