Significant process slowdown when the CPU is under increased workload

I am testing the communication latency between a microcontroller (Teensy 4.1) and my desktop computer (Ubuntu 20.04.3 with a 5.15 kernel patched with PREEMPT-RT).

The microcontroller is directly connected to the computer via a USB cable. The serial communication on the computer is done by reading and writing to /dev/ttyACM0.

The code is written in C++ and is set to run exclusively on core #8. That core is also isolated with isolcpus, set by modifying /etc/default/grub. I confirmed the isolation by checking the processes with htop, which shows that only my test program is running on core #8. The program's priority is set to 99 with SCHED_FIFO as the scheduling policy. The kernel tick frequency is set to 1000 Hz. The CPU frequency governor is set to performance. I confirmed that there is no CPU throttling, as the core frequency is always close to the maximum (4.1 GHz).
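
For readers who want to reproduce this setup, the pinning and scheduling steps can be sketched roughly as follows. This is a minimal illustration and not the original test code; the helper names `pin_to_cpu` and `set_fifo_priority` are hypothetical. The CPU index is the kernel's zero-based number, so on a machine with 8 logical CPUs (numbered 0 to 7) the "core #8" shown by htop would presumably be CPU 7.

```cpp
#include <cassert>
#include <sched.h>

// Hypothetical helper: pin the calling thread to one logical CPU
// (kernel numbering, starting at 0). Returns true on success.
bool pin_to_cpu(int cpu) {
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    // pid 0 means "the calling thread".
    return sched_setaffinity(0, sizeof(set), &set) == 0;
}

// Hypothetical helper: request SCHED_FIFO at the given priority (1..99).
// Returns false if the kernel refuses, e.g. for lack of privileges.
bool set_fifo_priority(int priority) {
    sched_param param{};
    param.sched_priority = priority;
    return sched_setscheduler(0, SCHED_FIFO, &param) == 0;
}
```

A test program would call something like `pin_to_cpu(7)` and `set_fifo_priority(99)` before entering its read loop and check both return values, since the SCHED_FIFO request fails without sufficient privileges (for example root or CAP_SYS_NICE).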

EDIT: The CPU is an 8th gen Intel i5 with 4 physical cores and 8 total threads.

The test is very simple:

Computer side:

  1. The computer calls read() in blocking mode and waits until it returns.
  2. The time elapsed between each consecutive read() return is measured. This value is going to be referred to as Te for simplicity.
  3. Repeat 1-2 in a while loop.
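
The computer-side steps above can be sketched as follows. This is an illustration under stated assumptions and not the original test code: the function name `measure_intervals_us` is hypothetical, each read() asks for a single byte, and the file descriptor is assumed to be already opened and configured for blocking raw reads.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>
#include <unistd.h>

// Hypothetical helper: perform up to n_reads blocking one-byte reads on fd
// and return the time between consecutive read() returns (Te) in
// microseconds. Stops early on error or end of file.
std::vector<std::int64_t> measure_intervals_us(int fd, std::size_t n_reads) {
    assert(fd >= 0);
    using clock = std::chrono::steady_clock;  // monotonic, unaffected by clock adjustments
    std::vector<std::int64_t> te;
    te.reserve(n_reads);  // avoid allocating inside the timed loop

    bool have_prev = false;
    clock::time_point prev;
    for (std::size_t i = 0; i < n_reads; ++i) {
        unsigned char byte;
        ssize_t got = read(fd, &byte, 1);  // blocks until one byte arrives
        if (got <= 0) break;               // error or EOF
        clock::time_point now = clock::now();
        if (have_prev) {
            te.push_back(std::chrono::duration_cast<std::chrono::microseconds>(
                             now - prev).count());
        }
        prev = now;
        have_prev = true;
    }
    return te;
}
```

With a descriptor obtained from something like `open("/dev/ttyACM0", O_RDONLY | O_NOCTTY)`, values near 1000 correspond to the normal case. Reading one byte per call also means that bytes delivered in a burst after a stall show up as a large Te followed by several very small ones.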

Microcontroller side:

  1. The microcontroller sends 1 byte to the computer by writing to Serial.
  2. Repeat 1 at 1000 Hz.

Expected results:

Ideally, Te should be exactly 1000 microseconds.

Test results:

Te is almost always around 1000 microseconds (950 to 1050). This is good enough for my application.

However, I observed 2 problems:

  1. Occasionally, Te can spike to something like 50000 microseconds even when the computer is idle.
  2. Te spiking becomes much more frequent when the workload on the computer increases. It is especially prevalent when CPU usage peaks at ~90% on the non-isolated cores.

I have confirmed that my microcontroller sends out bytes at precise timing with an oscilloscope. So it's unlikely this is related to the microcontroller. The correlation between Te spiking and computer workload also suggests this.

I know that USB doesn't mix well with hard timing requirements but the test results I got show that the system can meet my requirements and be responsive most of the time.

I'd really like to know what's causing the spiking in timings and what are the possible ways to resolve them. Any help and suggestions are greatly appreciated!

David:
The USB and/or its cable.
Doug Smythies:
Please edit your question to add the CPU make and model. Is it CPU #8 or core #8? If you have 2 CPUs per core, try also isolating the other CPU of the same core, even though you may not use it. I have observed similar delays, but typically only in the 400 to 900 µs range.
KWang:
@David The cable I'm using is really short and the microcontroller is placed near the computer, so I don't think it's related to the cable. I suspect the slowdown is related to the Linux USB driver not running entirely on the isolated CPU core. This could explain the connection between the CPU workload on the non-isolated cores and the performance of the test program on the isolated core. However, I don't know how to pinpoint the exact culprit.
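
One possible starting point for checking this suspicion is to see which CPUs service the USB host controller's interrupts, since isolcpus by itself does not move interrupt handling. The sketch below sums the per-CPU counts from text in the /proc/interrupts format. The function name `irq_counts_for` is hypothetical, and `"xhci_hcd"` is only an assumed driver name; the actual controller driver on a given machine may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: sum per-CPU interrupt counts over every row of
// /proc/interrupts-style text that mentions `driver`.
// The result has one entry per CPU column in the header line.
std::vector<unsigned long long> irq_counts_for(std::istream& in,
                                               const std::string& driver) {
    assert(!driver.empty());
    std::string line;
    if (!std::getline(in, line)) return {};

    // The header line lists one token per CPU: "CPU0 CPU1 ...".
    std::istringstream header(line);
    std::size_t ncpu = 0;
    for (std::string tok; header >> tok;) ++ncpu;

    std::vector<unsigned long long> totals(ncpu, 0);
    while (std::getline(in, line)) {
        if (line.find(driver) == std::string::npos) continue;
        std::istringstream row(line);
        std::string label;
        row >> label;  // IRQ label such as "128:"
        for (std::size_t i = 0; i < ncpu; ++i) {
            unsigned long long n = 0;
            if (!(row >> n)) break;  // row has fewer numeric columns
            totals[i] += n;
        }
    }
    return totals;
}
```

Feeding it a `std::ifstream` opened on `/proc/interrupts` before and after a test run shows on which CPUs the counts grew. If they grow only on the busy, non-isolated CPUs, that would be consistent with the suspicion above, though it would not prove it.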
KWang:
@DougSmythies That's a good suggestion! The CPU I am working with is an 8th gen Intel i5. It has 4 physical cores and supports hyperthreading, which gives me 8 cores as seen by the kernel. I will try to isolate the entire physical core and report back.
KWang:
@DougSmythies I have isolated both core #4 and core #8 and tested `Te` again. The problem persists.
Doug Smythies:
I ran a CPU time gap measuring test for a few hours on one CPU, with the other CPU of the same core loaded to 50% with very high-speed token passing. The CPUs were not isolated, nor am I running an RT kernel. I never saw a delay over 10 µs. I suspect your contention is on the USB bus somehow and/or in the driver, as you mentioned. Not sure how to help further.