When I use rsync to copy a 1TB directory from one USB drive to another, the system goes haywire. The system load climbs rapidly, and processes trying to write to the destination drive go into the D (uninterruptible sleep) state as shown in top. The rsync copy hangs. Even an ls on that drive locks up.
I tried running rsync with --bwlimit, but reducing the bandwidth just slows the rate at which the load rises, and even really slow rates like 1MB/sec don't help. Strangely, other processes still work smoothly. Stopping rsync doesn't help, and the only way to recover is to reboot. After rebooting I tried the same copy with cp and everything ran smoothly, so there is probably no problem with the drive.
I don't know if the drives are on the same internal hub. I'm on an Ubuntu 18.04.5 LTS server. Can anyone help?
EDIT: After 100GB of copying with cp, the cp process went into the D state in top. The load isn't rising, but it is stuck at 3.0, so I will need to reboot. Before the cp process ran, the load was at 0.1, so something is still loading the system. Could the problem be in the drive? This drive has run OK with a lot of activity before this problem and is still OK.
EDIT2: Here are the kernel messages for the USB lockup resulting from rsync.
INFO: task usb-storage:285 blocked for more than 120 seconds.
Tainted: G W 4.15.0-197-generic #208-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
usb-storage D 0 285 2 0x80000000
Call Trace:
__schedule+0x24e/0x890
schedule+0x2c/0x80
schedule_timeout+0x1cf/0x370
Can anyone decipher that and suggest how I can avoid the lockup? Right now I am incapable of cloning a drive.
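For anyone wanting to catch these messages automatically, here is a minimal Python sketch that pulls the hung-task lines out of kernel log text. The function names are made up for illustration, and the regex assumes the message format shown in the trace above; other kernel versions may word it differently.

```python
import re
import subprocess

# Matches the kernel's hung-task warning, for example:
# "INFO: task usb-storage:285 blocked for more than 120 seconds."
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<name>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return a list of (task_name, pid, seconds) tuples found in log_text."""
    hits = []
    for line in log_text.splitlines():
        match = HUNG_TASK_RE.search(line)
        if match:
            hits.append(
                (match.group("name"),
                 int(match.group("pid")),
                 int(match.group("secs")))
            )
    return hits


def hung_tasks_from_dmesg():
    """Run dmesg and scan its output (hypothetical helper; may need root)."""
    result = subprocess.run(
        ["dmesg"],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    return find_hung_tasks(result.stdout)
```

Calling `hung_tasks_from_dmesg()` periodically during a copy would show which kernel task blocked first, which may help tell a stuck usb-storage thread apart from a stuck userspace process.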
EDIT3: Even ddrescue died after two hours with the same error. This happens with different drives and cables. Either my HP laptop has bad USB hardware or there is a bug in the Ubuntu USB driver (very unlikely). Unless someone has an idea of something else to try, I give up. I'm getting a different job.
EDIT4: After much pain I'm pretty sure I know all I can know. I improved my diagnostic skills with iotop, top, and dmesg to fully understand when it happens. It happens with cp and rsync equally.
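As a rough illustration of the kind of check top makes possible, the following Python sketch lists processes currently in the D state by reading /proc. The function names and the `proc_root` parameter are my own for illustration; the only assumption about the system is the standard Linux `/proc/<pid>/stat` layout of `pid (comm) state ...`.

```python
import os


def parse_stat(stat_text):
    """Split a /proc/<pid>/stat line into (pid, comm, state).

    comm is wrapped in parentheses and may itself contain spaces or ')',
    so the split is done on the LAST ')' rather than on whitespace.
    """
    open_idx = stat_text.index("(")
    close_idx = stat_text.rindex(")")
    pid = int(stat_text[:open_idx].strip())
    comm = stat_text[open_idx + 1:close_idx]
    state = stat_text[close_idx + 1:].split()[0]
    return pid, comm, state


def d_state_processes(proc_root="/proc"):
    """Return a sorted list of (pid, comm) for processes in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or its stat file was unreadable
        if state == "D":
            found.append((pid, comm))
    return sorted(found)
```

Running `d_state_processes()` in a loop alongside the copy would log the moment the copying process or usb-storage first gets stuck.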
I found the source drive has many mysterious "bad spots". I couldn't find these with fsck, badblocks, or ddrescue so they aren't normal "bad blocks". I am making progress by painfully letting it proceed until it fails and then rebooting. Luckily the drive I'm copying is a large collection of files and not something like a db.
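The copy-until-it-fails-then-reboot routine can be made less painful with a file-by-file copy that skips whatever already arrived. This is only a sketch under assumptions: the function name is hypothetical, a file counts as "done" when the destination exists with the same size (no checksum), and symlinks or special files that do not resolve to a regular file are skipped. It cannot prevent the hang itself.

```python
import os
import shutil


def resumable_copy(src_root, dst_root):
    """Copy src_root into dst_root file by file, skipping finished files.

    Returns (copied, skipped) counts.
    """
    copied = 0
    skipped = 0
    for dirpath, dirnames, filenames in os.walk(src_root):
        dirnames.sort()   # deterministic traversal order between runs
        filenames.sort()
        rel = os.path.relpath(dirpath, src_root)
        dst_dir = os.path.normpath(os.path.join(dst_root, rel))
        os.makedirs(dst_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dst_dir, name)
            if not os.path.isfile(src):
                continue  # broken symlink or special file: not handled here
            if os.path.exists(dst) and os.path.getsize(dst) == os.path.getsize(src):
                skipped += 1
                continue
            # Print before copying so the last line names the file in
            # progress if the copy hangs.
            print("copying", src, flush=True)
            tmp = dst + ".partial"
            shutil.copy2(src, tmp)   # copy data and timestamps to a temp name
            os.replace(tmp, dst)     # rename only once the copy completed
            copied += 1
    return copied, skipped
```

Because the output names each file before it is copied, redirecting it to a log on the internal disk should make the last line point at the file sitting on one of the "bad spots" after a reboot.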