Score:5

High CPU usage with while loops running dd command

cn flag

I'm using this command to read continuous data from named pipes:

{ while :; do dd iflag=fullblock iflag=nonblock bs=65536 count=1 2> /dev/null | redis-cli -x PUBLISH myChannel ; done } < myFifo

The problem is, the CPU usage goes very high even when I run 50 of those commands concurrently. That process is supposed to be long-running, and a lot of them should work concurrently.

So, what is the reason, and how can I prevent it? Thanks.

terdon avatar
cn flag
By the way, wouldn't it be the same but much simpler to run `head -c 65536 myFifo | redis-cli -x PUBLISH myChannel` instead of `dd`?
Raffa avatar
jp flag
If I'm not mistaken ... you're using `dd` for buffering ... So why not just do `dd if=myFifo iflag=fullblock bs=65536 2> /dev/null | redis-cli -x PUBLISH myChannel` without a loop?
james hofer avatar
cn flag
@terdon Well I don't know. I'm a JS developer. Is that the same thing?
james hofer avatar
cn flag
@Raffa Exactly. That's for getting 64k buffers from the FIFO. I thought `dd` would just run one time without the `while` loop. So no `while` is needed?
terdon avatar
cn flag
It should be... I am not sure either, but in general, `dd` is almost useless today and is best avoided because its syntax is confusing and very easy to get wrong. I don't know exactly what you want to do here, but as far as I can tell, you are repeatedly taking the first 65536 bytes of the file since you are stopping the `dd` with `count=1`. If so, then yes, the `head -c` approach would do the same thing.
terdon avatar
cn flag
For more on why `dd` isn't very useful today, see [dd vs cat -- is dd still relevant these days?](https://unix.stackexchange.com/q/12532).
james hofer avatar
cn flag
@terdon `redis-cli PUBLISH` only accepts chunks of data (not a continuous stream), and only once the command piping to it finishes its work. So I have to continuously get chunks and send them to publish.
terdon avatar
cn flag
OK, but your `dd` will continuously get the first chunk and exit. If you want to get all chunks, you'd want to remove the `count=1` since that tells `dd` to exit after the first chunk.
user2313067 avatar
la flag
Using both fullblock and nonblock seems strange to me. On the one hand, you say you want to wait for a complete 64k block (fullblock); on the other hand, you say you don't want to block if data is unavailable (nonblock). The nonblock flag in particular makes me think this will become an active wait loop, which is CPU-heavy. Have you tried without that flag? Keep in mind that this will re-spawn a redis-cli process and reconnect to Redis for every block. Doing it in a program (with nodejs for example) with a persistent Redis connection would probably be more efficient.
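
A minimal sketch of that last suggestion: keep the question's loop but drop `iflag=nonblock`, so each `dd` blocks until a full 64k block arrives. This assumes the writer keeps the FIFO open; once all writers close it, `dd` hits EOF immediately and the loop would spin again:

{ while :; do dd iflag=fullblock bs=65536 count=1 2> /dev/null | redis-cli -x PUBLISH myChannel ; done } < myFifo
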
Score:7
jp flag

If I'm not mistaken, you're using `dd` for buffering purposes ... A loop might not be needed in your case ... `dd` will continue reading and feeding blocks of the specified size as long as you don't set a `count` (which instructs `dd` to exit after fulfilling the specified number of reads/writes) or `iflag=nonblock` (you'd want blocking I/O to successfully initiate the read and keep reading from a named pipe with `dd`). Use it like so:

dd if=myFifo iflag=fullblock bs=65536 2> /dev/null | redis-cli -x PUBLISH myChannel

In this case, it should only exit when the end of the input file is reached (e.g. when the writer to the named pipe terminates/closes the pipe).

Or to keep the pipe constantly open waiting for writes, use it like so:

tail -c +1 -F myFifo | dd iflag=fullblock bs=65536 2> /dev/null | redis-cli -x PUBLISH myChannel

Or, if your application expects end of stream/pipe (e.g. EOF or close_write ... which is, BTW, not the best choice for a streaming application), use GNU parallel instead, like so:

tail -c +1 -F myFifo | parallel --max-procs 1 -P 1 -j 1 --pipe --block 64k -k 'redis-cli -x PUBLISH myChannel'

This should resemble your loop, but only where you need it to … It will do so in a rather controlled and resource-aware way … It should also keep the named pipe constantly open even between writes, preserve every bit of the stream, and shorten the pipeline.
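
To sanity-check any of the variants above, you could subscribe from a second terminal and watch the published blocks arrive (channel name taken from the question):

redis-cli SUBSCRIBE myChannel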

terdon avatar
cn flag
Do FIFOs have an end?
james hofer avatar
cn flag
Using your command, the shell exits and I get a "Broken pipe" error in the process that is piping to the FIFO.
Raffa avatar
jp flag
@jameshofer Then try using it like so instead: `tail -F myFifo | dd iflag=nonblock iflag=fullblock bs=65536 2> /dev/null | redis-cli -x PUBLISH myChannel`.
Raffa avatar
jp flag
@terdon Yep, like everything else in life :-) … Please see for example https://unix.stackexchange.com/q/366219 and vice versa see also https://askubuntu.com/q/1456532
Raffa avatar
jp flag
@jameshofer Please check the second command now ... Added `-c +1` to `tail` for bytes instead of lines (*should work better with pipes*) ... And removed `iflag=nonblock` from `dd`.
james hofer avatar
cn flag
@Raffa Alright friend. Let's see what happens.
james hofer avatar
cn flag
@Raffa You know, your logic is correct. I mean that must do the thing. But the problem is `redis-cli -x PUBLISH myChannel` must run for each chunk. It can't receive a data stream.
Raffa avatar
jp flag
@jameshofer I see ... It is in chunks/blocks ... But your application appears to be waiting for the pipe to close before executing (*which is, oddly enough, used for streaming*) :-) ... I have updated the answer for that one ... I hope it works.
Score:2
cn flag

The while : is spamming your CPU. Any command that exits immediately, when run in a while :; loop, will result in high CPU usage. For example:

while :; do echo foo > /dev/null; done

Or, even more blatant, a no-op:

while :; do true; done

The command you run is almost irrelevant: if the command itself doesn't take much time, then the while loop will cause high CPU usage. The while : means that the loop will be relaunched as soon as the command ends, over and over again. And since this dd will finish almost instantaneously, it is being launched over and over, multiple times a second, therefore taking loads of CPU.
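
A rough way to see how quickly each iteration exits (`testFifo` is a hypothetical scratch FIFO; the background `sleep` just holds a write end open so the read hits an empty pipe rather than EOF):

mkfifo testFifo
sleep 1000 > testFifo &
time dd if=testFifo iflag=nonblock bs=65536 count=1 of=/dev/null

With `iflag=nonblock` and no data available, the read returns immediately (typically with a "Resource temporarily unavailable" error), so `time` reports essentially zero; in the loop above, that near-instant exit repeats many times per second.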

The solution is to add a small pause between invocations:

{ 
  while :; do 
    dd iflag=fullblock iflag=nonblock bs=65536 count=1 2> /dev/null | 
      redis-cli -x PUBLISH myChannel 
    sleep 0.1
  done 
} < myFifo

Adding even a 0.1 second sleep between runs will stop this from being a CPU hog.

james hofer avatar
cn flag
Thanks. Let me test it.
james hofer avatar
cn flag
Testing your solution resulted in less CPU usage, which was obvious (40% with my own code and 20% with yours). Thank you. But I think it just postpones the CPU heating; I mean, the while loop still runs a lot. I'm wondering if there's a way to wait for incoming data to be available, then read chunks. I'm not sure.
terdon avatar
cn flag
There might be, @jameshofer, but we'd need to know exactly what you are doing, what is writing to the FIFO, how it's created, etc. You might want to post a new question, either here or on [unix.se], giving more detail. I suggest you post on [unix.se], actually, since there is a higher concentration of command line geeks there, so my bet is that you'd be more likely to get an answer there. It's entirely up to you though; the question would be on topic on both sites (as long as you're using Ubuntu anyway, and not any other Linux).
james hofer avatar
cn flag
It's video data: MPEG-TS generated by ffmpeg. And I want to broadcast it to users. That's the whole case. If you think it's better to ask the question there, I'll do that.
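
For context, the producing side of such a setup might look something like this (`input.mp4`, `-re` and `-c copy` are placeholder choices; the comment only states that ffmpeg writes MPEG-TS into the FIFO):

mkfifo myFifo
ffmpeg -y -re -i input.mp4 -c copy -f mpegts myFifo

ffmpeg blocks on opening the FIFO until a reader (the publishing pipeline) attaches, and then keeps the write end open while the reader consumes 64k blocks from it.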
terdon avatar
cn flag
I think Raffa gave you exactly what you're looking for, so it's probably not worth it, @jameshofer.
james hofer avatar
cn flag
I thought the same, and got excited. Not gonna lie. But, while that works in theory, I'm not receiving data in Redis. So... The main problem is `redis-cli -x PUBLISH mychannel` must run with each chunk of data.
terdon avatar
cn flag
In that case, I would indeed ask over on [unix.se]. Just make sure to include as much detail as you can so the folks there understand the full picture.
terdon avatar
cn flag
@ilkkachu my point is that it isn't the command that is causing the high CPU usage but the fact that the loop is running non-stop. So it isn't the command, it's the loop itself. I never said there's anything intrinsically wrong with infinite loops, only with loops that don't wait between invocations for example because they are running commands that exit immediately.
terdon avatar
cn flag
@ilkkachu I was hoping the next sentence would put it into context, but fair enough. Answer edited, thanks.