Score:0

20.04 update causes network errors and file copy corruption for Windows clients

I have a customer with an Ubuntu 20.04 VM running inside a QNAP NAS server, providing information management functionality to Windows 10 clients. The VM is joined to their domain but the QNAP server is not. The NAS server is only used as a convenient place to run the VM. The VM provides much more functionality than the NAS device can for this client; the details are a bit involved and would require more space than this forum really allows. (I would be happy to discuss this with anyone on a side channel...)

This configuration had been running successfully for months without issues until last Friday. These NAS servers have been targeted by a ransomware gang, so QNAP decided to force an upgrade of their NAS OS. Probably not the best approach, but that's what happened. I then remoted into their system and updated the Ubuntu VM. Then the software running on top of it was updated as well. In hindsight, it was probably not a good idea to update everything at once, but the same approach had worked here in our lab. However, that's when the fun started...

The software that runs in the VM has Windows client software that is upgraded via a Samba share mapped to a drive letter on the Windows 10 systems. The .exe file that was downloaded produced some strange errors and would not run; the file appeared corrupted. I manually copied the file again from the CIFS share on the VM and it failed again, but with a different error. Each time I copied the file, the size was correct but it failed in a different manner. I then uploaded a copy directly to the Windows machine, bypassing the VM, and all was well. As you can see, there are a lot of places to point fingers, but here is what I've done in an attempt to isolate the problem:

  1. Did some copy testing with the following results.
  2. File uploads of just about any size (a few bytes to 1 GB) all reached the VM correctly.
  3. File downloads behaved differently depending on their size. Small files of a few bytes downloaded correctly. Files about the size of the executable (~700 KB) were corrupted. Larger files (a few MB) gave Windows error 0x8007003B, "Unexpected Network Error". Searching for that error suggests a Windows firewall issue (the firewall is turned off) or an antivirus getting in the way (it was removed). Still getting the error.
  4. Looked in the logs on the VM and the Windows machine and didn't see any issues.
  5. Ran a long ping session to see if there might be network congestion. No dropped packets after many hours of running. The problems persisted over the weekend when everyone was gone.
  6. Did a WinSCP transfer from the VM to the Windows machine, bypassing Samba. Still got the error. That would indicate it is not a Samba issue but a lower-level networking issue on the VM.
  7. Fired up a share on the NAS device directly. That works perfectly, which would seem to eliminate external networking problems such as bad cables or misconfigured networking devices. It does not eliminate the internal network bridge within the NAS device used by the VM, however. Looked for a configuration issue there but found none.
  8. Made sure jumbo frames were not enabled. In my experience, misconfigured jumbo frames cause dropped packets, not corruption or errors like this.
  9. Ran smbstatus on the VM to confirm everyone is running SMB3. They are. They are not using encryption but are using signing with AES-128-CMAC. (NOTE: 52 Windows machines have the share mapped to a drive letter.)
  10. Made sure Samba was listening on TCP port 445. It was.
  11. I have two QNAP devices here in my lab that are not experiencing these problems. About the only difference I can see is that their VMs are not on a domain. I have a hard time believing that the VM being on the domain could cause data corruption, but I think we have all seen weirder issues. Today's project is to get them configured on my test domain here and see what happens.
  12. The IT support person at this customer is exceptional. I've worked with him over the years with great success. There are no other reported networking issues at their site.
  13. He is willing to help here, including firing up Wireshark to see if we can tell what's happening.
  14. We are both stumped...
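
For anyone repeating the copy tests above, comparing checksums is more reliable than comparing file sizes, since the corrupted copies here had the correct size. The following is only a minimal sketch, not what was actually run on site: the function names `sha256_of` and `compare_copies` are made up for illustration, and it assumes the original files and their copies sit in two directories with matching file names.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_copies(source_dir, copy_dir):
    """Compare each file in source_dir with the same-named file in copy_dir.

    Returns a dict mapping file name to one of:
    "ok", "corrupt", "size-mismatch" or "missing".
    """
    results = {}
    for source in sorted(Path(source_dir).iterdir()):
        if not source.is_file():
            continue
        copy = Path(copy_dir) / source.name
        if not copy.is_file():
            results[source.name] = "missing"
        elif source.stat().st_size != copy.stat().st_size:
            results[source.name] = "size-mismatch"
        elif sha256_of(source) != sha256_of(copy):
            # Same size but different content: the symptom described above.
            results[source.name] = "corrupt"
        else:
            results[source.name] = "ok"
    return results
```

Populating the source directory with test files of a few bytes, ~700 KB, and a few MB, copying them through the share, and then calling `compare_copies(source_dir, copy_dir)` should show which size ranges come back as "corrupt".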

Here are a few things that I've considered trying but thought I would ask for advice since these are a bit time consuming...

  1. Get a VM connected to my test domain in my lab.
  2. Run Wireshark on their network. See if we can get a look at what is happening on the wire.
  3. See if we can find an additional Linux machine to talk to the VM in an attempt to eliminate the Windows client part of the configuration.
  4. Create an additional VM in the NAS device but not update it to the latest Ubuntu packages. See if that can help isolate some sort of upgrade issue.
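
One more way to take both Samba and the Windows client out of the picture is to push known bytes over a plain TCP socket and compare hashes on each end. This is a rough sketch under stated assumptions: the helper names (`serve_once`, `fetch`, `loopback_check`) are hypothetical, and the demo runs both ends on 127.0.0.1 only to show the mechanics. For a real test the sending half would run on the VM, the receiving half on another machine, and the payload would have to be a file both sides already hold so the hashes can be compared.

```python
import hashlib
import os
import socket
import threading


def serve_once(listener, payload):
    """Accept one connection, send the whole payload, then close cleanly."""
    conn, _ = listener.accept()
    with conn:
        conn.sendall(payload)


def fetch(host, port, timeout=30.0):
    """Read from host:port until the peer closes; return the bytes received.

    A reset connection surfaces here as ConnectionResetError from recv().
    """
    received = bytearray()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            chunk = sock.recv(65536)
            if not chunk:
                break
            received.extend(chunk)
    return bytes(received)


def loopback_check(size=5 * 1024 * 1024):
    """Send `size` random bytes over a local TCP connection and verify them."""
    payload = os.urandom(size)
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]
        server = threading.Thread(target=serve_once, args=(listener, payload))
        server.start()
        data = fetch("127.0.0.1", port)
        server.join()
    return hashlib.sha256(data).hexdigest() == hashlib.sha256(payload).hexdigest()
```

If a transfer like this also fails or is cut short between the VM and a client, the problem sits below the file-sharing layer.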

Can anyone suggest other things to look for or try?

Thanks in advance!

Bruce

Bruce: Oh yeah, to make things even more confusing, I use X2Go to connect to the Ubuntu VM. It works and does not exhibit any networking issues that I can see.
Bruce: Oops, never mind. X2Go is getting errors as well...
Bruce: UPDATE: Set up a test VM on my domain running in the QNAP server. No corruption seen. Ran Wireshark, and it appeared that the connection was being closed with an RST before the data transfer was complete. That caused data still in flight to be discarded, hence the corruption. Used a Linux machine to talk to the Ubuntu VM. Still got corrupted data on the transfer. Still no closer to figuring this out...
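
To see where a corrupted copy actually differs from the original (for example, whether the damage is one contiguous run or scattered blocks), the two files can be compared block by block. This is only an illustrative sketch: `mismatched_regions` is a made-up helper, the 4096-byte block size is an arbitrary choice, and it assumes both files are small enough to read into memory, which holds for a ~700 KB executable.

```python
def mismatched_regions(original, copy, block_size=4096):
    """Return (start, end) byte ranges, aligned to block_size, where the two
    byte strings differ. Adjacent differing blocks are merged into one range.
    """
    regions = []
    length = max(len(original), len(copy))
    for start in range(0, length, block_size):
        end = min(start + block_size, length)
        if original[start:end] != copy[start:end]:
            if regions and regions[-1][1] == start:
                # Extend the previous range instead of starting a new one.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions
```

The inputs can be loaded with `pathlib.Path(...).read_bytes()`. An empty list means the files are identical.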
Score:0

Well, after 4 days on the customer site and lots of different configurations attempted, it is without a doubt a QNAP NAS device problem. The system update that was automatically installed to defend against the ransomware attack apparently broke the networking on their device. Note that I can't say it affects every QNAP server, since I was able to build a setup in our lab as close to theirs as possible, albeit with a smaller model QNAP server, without any network data corruption. However, after restoring their system onto local hardware and bringing their QNAP server back to our lab, the problem still persists on that server.

Here is a link to the QNAP forum thread if anyone is interested.

https://forum.qnap.com/viewtopic.php?f=318&t=164927

Bottom line: it has nothing to do with Ubuntu.

David: What link? Rants also do not help.
Bruce: Wild. The link vanished...
Bruce: Editing my response showed the link. I removed it and re-added it, and it looks like it showed up. Probably operator error.