It seems to be the same problem, or at least a similar one, to what Dropbox described (https://dropbox.tech/infrastructure/boosting-dropbox-upload-speed).
As far as I understand (please correct me!), when the Linux gateway uses NIC multi-queue with WireGuard, a lot of packet reordering happens, and apparently Windows 10 doesn't handle that well.
The packet reordering somehow causes Windows 10 to throttle its sending rate: it waits for an ACK after almost every data packet it sends instead of sending multiple packets and accepting selective ACKs.
I sadly forgot to take screenshots of the Wireshark sessions I analysed, but it was clearly visible that when downloading, the Windows host usually received around 10-20 TCP data packets before sending an ACK. When uploading, however, I saw a TCP ACK for each data packet sent.
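Since I don't have the screenshots, here is a rough sketch of how the same observation could be reproduced from a capture. This is not what I ran at the time: the function name data_per_ack, the capture file name and the addresses are made up for illustration, and it assumes an IPv4 capture that contains only the one TCP connection of interest. It counts data packets from the sender against pure ACKs (zero-length segments) coming back from the other side.

```shell
# Read "ip.src<TAB>tcp.len" lines on stdin and print the average number
# of data packets sent by SENDER per pure ACK (tcp.len == 0) coming from
# any other address. SENDER is passed as the first argument.
data_per_ack() {
    awk -v sender="$1" '
        $1 == sender && $2 > 0  { data++ }   # data segment from the sender
        $1 != sender && $2 == 0 { acks++ }   # zero-length segment (ACK) from the peer
        END {
            if (acks > 0) printf "%.1f\n", data / acks
            else print "no ACKs seen"
        }
    '
}

# Hypothetical usage (upload.pcap and 192.0.2.10 are placeholders):
#   tshark -r upload.pcap -Y tcp -T fields -e ip.src -e tcp.len | data_per_ack 192.0.2.10
```

A value close to 1 would match what I saw on uploads, while a value around 10-20 would match the download direction. Handshake and teardown segments also have length zero, so the number is only approximate.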
The fix is to disable multi-queue on the Linux host by reducing each physical interface to a single combined channel:
ethtool -L PHYSICAL_LOCAL_INTERFACE combined 1
ethtool -L PHYSICAL_NETWORK_INTERFACE combined 1
To check whether the setting was applied, run:
ethtool -l INTERFACENAME
Channel parameters for INTERFACENAME:
Pre-set maximums:
RX: 0
TX: 0
Other: 1
Combined: 63
Current hardware settings:
RX: 0
TX: 0
Other: 1
Combined: 1
The last line (Combined under "Current hardware settings") should be 1. The ethtool -L command only changes the setting until the next reboot; to make it persistent, use your distribution's own network configuration tools.
For Debian it could be something like this:
cat /etc/network/interfaces
auto INTERFACE
iface INTERFACE inet static
address IPADDR
netmask NETMASK
gateway GATEWAY
# This is the relevant line
post-up ethtool -L INTERFACE combined 1
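To confirm after a reboot that the post-up line actually took effect, the check can be scripted. The following is only a minimal sketch: the function name current_combined is made up, and it assumes the ethtool -l output has the same layout as the sample shown earlier.

```shell
# Print the "Combined" value from the "Current hardware settings"
# section of `ethtool -l` output read on stdin.
current_combined() {
    awk '
        /^Current hardware settings:/ { in_current = 1; next }
        in_current && $1 == "Combined:" { print $2; exit }
    '
}

# Hypothetical usage (INTERFACE is a placeholder, as above):
#   if [ "$(ethtool -l INTERFACE | current_combined)" = "1" ]; then
#       echo "multi-queue is disabled"
#   fi
```

The function skips the "Pre-set maximums" section on purpose, since that one still reports the hardware maximum (63 in my case) even when only one channel is active.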
This may create a bottleneck if the gateway doesn't have a strong CPU, because all traffic on the interface is then handled by a single queue. We use AMD EPYC 7262 8-core processors and get the full 1 Gbit/s for both upload and download with ~70% usage of one core.