Score:5

High CPU usage with while loops running dd command

cn flag

I'm using this command to read continuous data from named pipes:

{ while :; do dd iflag=fullblock iflag=nonblock bs=65536 count=1 2> /dev/null | redis-cli -x PUBLISH myChannel ; done } < myFifo

The problem is, the CPU usage goes very high even when I run 50 of those commands concurrently. That process is supposed to be long-running, and a lot of them should work concurrently.

So, what is the reason, and how can I prevent it? Thanks.

terdon avatar
cn flag
By the way, wouldn't it be the same but much simpler to run `head -c 65536 myFifo | redis-cli -x PUBLISH myChannel` instead of `dd`?
Raffa avatar
jp flag
If I'm not mistaken ... you're using `dd` for buffering ... So why not just do `dd if=myFifo iflag=fullblock bs=65536 2> /dev/null | redis-cli -x PUBLISH myChannel` without a loop?
james hofer avatar
cn flag
@terdon Well I don't know. I'm a JS developer. Is that the same thing?
james hofer avatar
cn flag
@Raffa Exactly. That's for getting 64k buffers from the FIFO. I thought `dd` would just run one time without the `while` loop. So no `while` is needed?
terdon avatar
cn flag
It should be... I am not sure either, but in general, `dd` is almost useless today and is best avoided because its syntax is confusing and very easy to get wrong. I don't know exactly what you want to do here, but as far as I can tell, you are repeatedly taking the first 65536 bytes of the file since you are stopping the `dd` with `count=1`. If so, then yes, the `head -c` approach would do the same thing.
terdon avatar
cn flag
For more on why `dd` isn't very useful today, see [dd vs cat -- is dd still relevant these days?](https://unix.stackexchange.com/q/12532).
james hofer avatar
cn flag
@terdon `redis-cli PUBLISH` only accepts chunks of data (not a continuous stream), and only once the command piping to it finishes its work. So I have to continuously get chunks and send them to publish.
terdon avatar
cn flag
OK, but your `dd` will continuously get the first chunk and exit. If you want to get all chunks, you'd want to remove the `count=1` since that tells `dd` to exit after the first chunk.
user2313067 avatar
la flag
Using both fullblock and nonblock seems strange to me. On the one hand, you say you want to wait for a complete 64k block (fullblock); on the other hand, you say you don't want to block if data is unavailable (nonblock). The nonblock flag in particular makes me think this will become an active wait loop, which is CPU-heavy. Have you tried without that flag? Keep in mind that this will re-spawn a redis-cli process and reconnect to Redis for every block. Doing it in a program (with nodejs for example) with a persistent Redis connection would probably be more efficient.
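
A minimal sketch of that last suggestion: keep the question's loop but drop `iflag=nonblock`, so each `dd` blocks until a full 64k block arrives. This assumes the writer keeps the FIFO open; once all writers close it, `dd` hits EOF immediately and the loop would spin again:

{ while :; do dd iflag=fullblock bs=65536 count=1 2> /dev/null | redis-cli -x PUBLISH myChannel ; done } < myFifo
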
Score:7
jp flag

If I'm not mistaken, you're using `dd` for buffering purposes ... A loop might not be needed in your case ... `dd` will continue reading and feeding blocks of the specified size as long as you don't set a `count` (which instructs `dd` to exit after fulfilling the specified number of reads/writes) or `iflag=nonblock` (you'd want blocking I/O to successfully initiate the read and keep reading from a named pipe with `dd`). Use it like so:

dd if=myFifo iflag=fullblock bs=65536 2> /dev/null | redis-cli -x PUBLISH myChannel

In this case, it should only exit when the end of the input file is reached (e.g. when the writer to the named pipe terminates/closes the pipe).

Or to keep the pipe constantly open waiting for writes, use it like so:

tail -c +1 -F myFifo | dd iflag=fullblock bs=65536 2> /dev/null | redis-cli -x PUBLISH myChannel

Or, if your application expects end of stream/pipe (e.g. EOF or close_write ... which is, BTW, not the best choice for a streaming application), use GNU parallel instead, like so:

tail -c +1 -F myFifo | parallel --max-procs 1 -P 1 -j 1 --pipe --block 64k -k 'redis-cli -x PUBLISH myChannel'

This should resemble your loop, but only where you need it to … It will do so in a rather controlled and resource-aware way … It should also keep the named pipe constantly open even between writes, preserve every bit of the stream, and shorten the pipeline.
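
To sanity-check any of the variants above, you could subscribe from a second terminal and watch the published blocks arrive (channel name taken from the question):

redis-cli SUBSCRIBE myChannel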

terdon avatar
cn flag
Do FIFOs have an end?
james hofer avatar
cn flag
Using your command, the shell exits and I get a "Broken pipe" error in the process that is piping to the FIFO.
Raffa avatar
jp flag
@jameshofer Then try using it like so instead: `tail -F myFifo | dd iflag=nonblock iflag=fullblock bs=65536 2> /dev/null | redis-cli -x PUBLISH myChannel`.
Raffa avatar
jp flag
@terdon Yep, like everything else in life :-) … Please see for example https://unix.stackexchange.com/q/366219 and vice versa see also https://askubuntu.com/q/1456532
Raffa avatar
jp flag
@jameshofer Please check the second command now ... Added `-c +1` to `tail` for bytes instead of lines (*should work better with pipes*) ... And removed `iflag=nonblock` from `dd`.
james hofer avatar
cn flag
@Raffa Alright friend. Let's see what happens.
james hofer avatar
cn flag
@Raffa You know, your logic is correct. I mean that must do the thing. But the problem is `redis-cli -x PUBLISH myChannel` must run for each chunk. It can't receive a data stream.
Raffa avatar
jp flag
@jameshofer I see ... It is in chunks/blocks ... But your application appears to be waiting for the pipe to close before executing (*which is, oddly enough, used for streaming*) :-) ... I have updated the answer for that one ... I hope it works.
Score:2
cn flag

The while : is spamming your CPU. Any command that exits immediately, when run in a while :; loop, will result in high CPU usage. For example:

while :; do echo foo > /dev/null; done

Or, even more blatant, a no-op:

while :; do true; done

The command you run is almost irrelevant: if the command itself doesn't take much time, then the while loop will cause high CPU usage. The while : means that the loop will be relaunched as soon as the command ends, over and over again. And since this dd will finish almost instantaneously, it is being launched over and over, multiple times a second, therefore taking loads of CPU.
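
A rough way to see how quickly each iteration exits (`testFifo` is a hypothetical scratch FIFO; the background `sleep` just holds a write end open so the read hits an empty pipe rather than EOF):

mkfifo testFifo
sleep 1000 > testFifo &
time dd if=testFifo iflag=nonblock bs=65536 count=1 of=/dev/null

With `iflag=nonblock` and no data available, the read returns immediately (typically with a "Resource temporarily unavailable" error), so `time` reports essentially zero; in the loop above, that near-instant exit repeats many times per second.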

The solution is to add a small pause between invocations:

{ 
  while :; do 
    dd iflag=fullblock iflag=nonblock bs=65536 count=1 2> /dev/null | 
      redis-cli -x PUBLISH myChannel 
    sleep 0.1
  done 
} < myFifo

Adding even a 0.1 second sleep between runs will stop this from being a CPU hog.

james hofer avatar
cn flag
Thanks. Let me test it.
james hofer avatar
cn flag
Testing your solution resulted in less CPU usage, which was obvious (40% with my own code and 20% with yours). Thank you. But I think it just postpones the CPU heating; I mean, the while loop still runs a lot. I'm wondering if there's a way to wait for incoming data to be available, then read chunks. I'm not sure.
terdon avatar
cn flag
There might be, @jameshofer, but we'd need to know exactly what you are doing, what is writing to the FIFO, how it's created, etc. You might want to post a new question, either here or on [unix.se], giving more detail. I suggest you post on [unix.se], actually, since there is a higher concentration of command line geeks there, so my bet is that you'd be more likely to get an answer there. It's entirely up to you though; the question would be on topic on both sites (as long as you're using Ubuntu anyway, and not any other Linux).
james hofer avatar
cn flag
It's video data: MPEG-TS generated by ffmpeg. And I want to broadcast it to users. That's the whole case. If you think it's better to ask the question there, I'll do that.
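
For context, the producing side of such a setup might look something like this (`input.mp4`, `-re` and `-c copy` are placeholder choices; the comment only states that ffmpeg writes MPEG-TS into the FIFO):

mkfifo myFifo
ffmpeg -y -re -i input.mp4 -c copy -f mpegts myFifo

ffmpeg blocks on opening the FIFO until a reader (the publishing pipeline) attaches, and then keeps the write end open while the reader consumes 64k blocks from it.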
terdon avatar
cn flag
I think Raffa gave you exactly what you're looking for, so it's probably not worth it, @jameshofer.
james hofer avatar
cn flag
I thought the same, and got excited. Not gonna lie. But, while that works in theory, I'm not receiving data in Redis. So... The main problem is `redis-cli -x PUBLISH mychannel` must run with each chunk of data.
terdon avatar
cn flag
In that case, I would indeed ask over on [unix.se]. Just make sure to include as much detail as you can so the folks there understand the full picture.
terdon avatar
cn flag
@ilkkachu my point is that it isn't the command that is causing the high CPU usage but the fact that the loop is running non-stop. So it isn't the command, it's the loop itself. I never said there's anything intrinsically wrong with infinite loops, only with loops that don't wait between invocations for example because they are running commands that exit immediately.
terdon avatar
cn flag
@ilkkachu I was hoping the next sentence would put it into context, but fair enough. Answer edited, thanks.