Wireguard connectivity between handshakes

I've been having a really weird issue. I'm running WireGuard (WG) on a VPS and on my MacBook; on the server it runs in the linuxserver container on a Debian host. The connection is great, the speed is good, and everything works really well. I've noticed, though, that every once in a while (roughly every 10-20 minutes) there is a handshake and my connection to the internet instantly drops. I can still access internal services, so I know my MacBook is still connected to the server. I can't access the internet for 16 seconds, until the next handshake, when the internet instantly comes back.

I've monitored the server during this. While downloading a torrent there are a lot of kworker/1:1-wg-crypt-wg0 processes running, and when the drop happens all of these processes are killed. It's almost as if the server or WG is being restarted, but I know that's not the case: I can still access internal containers, so the connection is there and WG didn't go down.

I know for a fact that I still have internet on other devices, so it has something to do with WireGuard.

This happens regardless of whether I'm doing intense networking, like downloading a torrent, or just browsing the web. It doesn't look like the VPS is losing its connection: I've been pinging Google from both machines, and when it happens the VPS keeps getting replies while my MacBook doesn't.

I'm looking for help on what it could be and what I should look into and how I can approach this... could it be something related to my router? Could it be a config issue?

Here are my configs:

Server

   [Interface]

   # Core settings
   PrivateKey = xxxxx
   Address = 10.6.0.0/24

   # Misc. settings (optional)
   ListenPort = 51820

   # Interface hooks (optional)
   PostUp = iptables -A FORWARD -i %i -j ACCEPT; iptables -A FORWARD -o %i -j ACCEPT; iptables -t nat -A POSTROUTING -o eth+ -j MASQUERADE
   PostDown = iptables -D FORWARD -i %i -j ACCEPT; iptables -D FORWARD -o %i -j ACCEPT; iptables -t nat -D POSTROUTING -o eth+ -j MASQUERADE

   MTU = 1400

   #
   # Peers
   #

   [Peer]
   PublicKey = xxxxx
   PresharedKey = xxxx
   AllowedIPs = 10.6.0.2/32
   PersistentKeepalive = 16

MacBook

   [Interface]

   # Core settings

   PrivateKey = xxxxx
   Address = 10.6.0.2/32

   # Misc. settings (optional)
   DNS = xxxxx

   MTU = 1400

   [Peer]
   PublicKey = xxxx
   Endpoint = xxxx:51820
   AllowedIPs = 10.6.0.1/32, 0.0.0.0/0
   PresharedKey = xxxx
   PersistentKeepalive = 16

UPDATE:

These are the logs I get on the server side with modprobe wireguard.

Mar 16 13:55:54 [  +2.156106] wireguard: wg0: Receiving handshake initiation from peer 315 (<my-client-ip>:16235)
Mar 16 13:55:54 [  +0.000003] wireguard: wg0: Sending handshake response to peer 315 (<my-client-ip>)
Mar 16 13:55:54 [  +0.000165] wireguard: wg0: Keypair 10604 destroyed for peer 315
Mar 16 13:55:54 [  +0.000002] wireguard: wg0: Keypair 10606 created for peer 315

Mar 16 13:56:10 [  +2.451426] wireguard: wg0: Receiving handshake initiation from peer 315 (<my-client-ip>)
Mar 16 13:56:10 [  +0.000003] wireguard: wg0: Sending handshake response to peer 315 (<my-client-ip>)
Mar 16 13:56:10 [  +0.000185] wireguard: wg0: Keypair 10605 destroyed for peer 315
Mar 16 13:56:10 [  +0.000001] wireguard: wg0: Keypair 10607 created for peer 315
Mar 16 13:56:10 [  +0.161195] wireguard: wg0: Receiving keepalive packet from peer 315 (<my-client-ip>)

And from the client side, pinging google.com to check the internet connection:

Mar 16 10:55:54 64 bytes from <google-ip>: icmp_seq=187 ttl=108 time=161.723 ms
Mar 16 10:55:56 Request timeout for icmp_seq 188
Mar 16 10:55:57 Request timeout for icmp_seq 189
Mar 16 10:55:58 Request timeout for icmp_seq 190
Mar 16 10:55:59 Request timeout for icmp_seq 191
Mar 16 10:56:00 Request timeout for icmp_seq 192
Mar 16 10:56:01 Request timeout for icmp_seq 193
Mar 16 10:56:02 Request timeout for icmp_seq 194
Mar 16 10:56:03 Request timeout for icmp_seq 195
Mar 16 10:56:04 Request timeout for icmp_seq 196
Mar 16 10:56:05 Request timeout for icmp_seq 197
Mar 16 10:56:06 Request timeout for icmp_seq 198
Mar 16 10:56:07 Request timeout for icmp_seq 199
Mar 16 10:56:08 Request timeout for icmp_seq 200
Mar 16 10:56:09 Request timeout for icmp_seq 201
Mar 16 10:56:10 Request timeout for icmp_seq 202
Mar 16 10:56:11 Request timeout for icmp_seq 203
Mar 16 10:56:11 64 bytes from <google-ip>: icmp_seq=204 ttl=108 time=161.172 ms

So there are two successful handshakes, but in between them I can't access the web.
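To measure each dropout from a saved ping log instead of counting lines by eye, something like the following could help. It is a minimal sketch that assumes the macOS ping output format shown above; the name find_outages is my own and not part of any tool.

```python
import re

# Patterns based on the ping output shown above (macOS ping format assumed).
TIMEOUT_RE = re.compile(r"Request timeout for icmp_seq (\d+)")
REPLY_RE = re.compile(r"icmp_seq=(\d+)")


def find_outages(lines):
    """Return (first_lost_seq, last_lost_seq) for each run of consecutive timeouts."""
    outages = []
    start = last = None
    for line in lines:
        timeout = TIMEOUT_RE.search(line)
        if timeout:
            seq = int(timeout.group(1))
            if start is None:
                start = seq
            last = seq
        elif REPLY_RE.search(line) and start is not None:
            # A reply ends the current run of timeouts.
            outages.append((start, last))
            start = last = None
    if start is not None:
        # The log ended while replies were still being lost.
        outages.append((start, last))
    return outages


# Rebuild the excerpt above: one reply, timeouts for seq 188-203, one reply.
sample = ["Mar 16 10:55:54 64 bytes from <google-ip>: icmp_seq=187 ttl=108 time=161.723 ms"]
sample += [f"Request timeout for icmp_seq {n}" for n in range(188, 204)]
sample.append("Mar 16 10:56:11 64 bytes from <google-ip>: icmp_seq=204 ttl=108 time=161.172 ms")

for first, last in find_outages(sample):
    print(f"lost icmp_seq {first}-{last} ({last - first + 1} probes)")
# prints: lost icmp_seq 188-203 (16 probes)
```

With ping sending one probe per second, 16 lost probes corresponds to the roughly 16 seconds without internet described above.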

UPDATE 2:

I managed to get some level of logging on the client side (MacBook) by running

sudo LOG_LEVEL=verbose wg show

With this I get logs showing when the MacBook initiates handshakes and when it receives handshake responses.

In this new example the server logs show:

~INTERNET GOES DOWN HERE~
Mar 16 16:47:09 [  +0.393175] wireguard: wg0: Receiving handshake initiation from peer 315 (<client-ip>)
Mar 16 16:47:09 [  +0.000003] wireguard: wg0: Sending handshake response to peer 315 (<client-ip>)
Mar 16 16:47:09 [  +0.000175] wireguard: wg0: Keypair 10790 destroyed for peer 315
Mar 16 16:47:09 [  +0.000001] wireguard: wg0: Keypair 10793 created for peer 315
Mar 16 16:47:09 [  +0.280476] wireguard: wg0: Receiving keepalive packet from peer 315 (<client-ip>)

~INTERNET GOES BACK UP RIGHT AFTER THE NEXT LINES~

Mar 16 16:47:25 [  +1.391045] wireguard: wg0: Receiving handshake initiation from peer 315 (<client-ip>)
Mar 16 16:47:25 [  +0.000003] wireguard: wg0: Sending handshake response to peer 315 (<client-ip>)
Mar 16 16:47:25 [  +0.000166] wireguard: wg0: Keypair 10792 destroyed for peer 315
Mar 16 16:47:25 [  +0.000002] wireguard: wg0: Keypair 10794 created for peer 315
Mar 16 16:47:25 [  +0.159758] wireguard: wg0: Receiving keepalive packet from peer 315 (<client-ip>)

On the client side I see:

DEBUG: (utun6) 2023/03/16 13:42:35 peer(xxxx) - Received handshake response
DEBUG: (utun6) 2023/03/16 13:42:35 peer(xxxx) - Sending keepalive packet
DEBUG: (utun6) 2023/03/16 13:44:35 peer(xxxx) - Sending handshake initiation
DEBUG: (utun6) 2023/03/16 13:44:35 peer(xxxx) - Received handshake response
DEBUG: (utun6) 2023/03/16 13:44:35 peer(xxxx) - Sending keepalive packet
DEBUG: (utun6) 2023/03/16 13:46:35 peer(xxxx) - Sending handshake initiation
DEBUG: (utun6) 2023/03/16 13:46:35 peer(xxxx) - Received handshake response
DEBUG: (utun6) 2023/03/16 13:46:35 peer(xxxx) - Sending keepalive packet
DEBUG: (utun6) 2023/03/16 13:47:24 peer(xxxx) - Retrying handshake because we stopped hearing back after 15 seconds
DEBUG: (utun6) 2023/03/16 13:47:24 peer(xxxx) - Sending handshake initiation
DEBUG: (utun6) 2023/03/16 13:47:24 peer(xxxx) - Received handshake response
DEBUG: (utun6) 2023/03/16 13:47:24 peer(xxxx) - Sending keepalive packet

I find the "Retrying handshake because we stopped hearing back after 15 seconds" line interesting, because that's when the internet goes down. So the handshake fails, the client retries, and then it works? But why?

UPDATE 3:

I see that something is going on with the source port of the faulty handshake.

I have watch wg show all running on the server, and I can see that the endpoint for my MacBook peer is <my-ip>:16918. Whenever a handshake fails and I lose connection, I see a log like this on the server side:

Mar 16 18:17:19 [  +1.217668] wireguard: wg0: Receiving handshake initiation from peer 318 (<client-ip>:16235)
Mar 16 18:17:19 [  +0.000004] wireguard: wg0: Sending handshake response to peer 318 (<client-ip>:16235)
Mar 16 18:17:19 [  +0.000191] wireguard: wg0: Keypair 10893 destroyed for peer 318
Mar 16 18:17:19 [  +0.000002] wireguard: wg0: Keypair 10896 created for peer 318
Mar 16 18:17:19 [  +0.275182] wireguard: wg0: Receiving keepalive packet from peer 318 (<client-ip>:16235)

Mar 16 18:17:34 [  +0.687921] wireguard: wg0: Receiving handshake initiation from peer 318 (<client-ip>:16918)
Mar 16 18:17:34 [  +0.000004] wireguard: wg0: Sending handshake response to peer 318 (<client-ip>:16918)
Mar 16 18:17:34 [  +0.000250] wireguard: wg0: Keypair 10894 destroyed for peer 318
Mar 16 18:17:34 [  +0.000002] wireguard: wg0: Keypair 10897 created for peer 318
Mar 16 18:17:34 [  +0.164268] wireguard: wg0: Receiving keepalive packet from peer 318 (<client-ip>:16918)

For some reason the failing handshake comes from a different source port (16235 instead of 16918). The server sends its response back to that port, but the client isn't listening on it.
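To check whether every dropout lines up with a source-port change, the server log can be scanned for handshake initiations whose port differs from the previous one for the same peer. This is a minimal sketch that assumes log lines shaped like the ones above; the name find_port_changes is my own, and lines that don't show a port (as in some of the earlier excerpts) are simply skipped.

```python
import re

# Matches lines such as:
# "... wg0: Receiving handshake initiation from peer 318 (<client-ip>:16235)"
HANDSHAKE_RE = re.compile(
    r"Receiving handshake initiation from peer (\d+) \((.+):(\d+)\)"
)


def find_port_changes(lines):
    """Return (peer, old_port, new_port) for each handshake initiation whose
    source port differs from the previous initiation of the same peer."""
    last_port = {}
    changes = []
    for line in lines:
        match = HANDSHAKE_RE.search(line)
        if match is None:
            continue  # not a handshake initiation, or no port in the line
        peer, port = match.group(1), int(match.group(3))
        previous = last_port.get(peer)
        if previous is not None and previous != port:
            changes.append((peer, previous, port))
        last_port[peer] = port
    return changes


# The two handshake initiations from the excerpt above.
sample = [
    "Mar 16 18:17:19 [  +1.217668] wireguard: wg0: Receiving handshake initiation from peer 318 (<client-ip>:16235)",
    "Mar 16 18:17:34 [  +0.687921] wireguard: wg0: Receiving handshake initiation from peer 318 (<client-ip>:16918)",
]

for peer, old, new in find_port_changes(sample):
    print(f"peer {peer}: source port changed {old} -> {new}")
# prints: peer 318: source port changed 16235 -> 16918
```

If the reported changes always coincide with the dropouts, that would point at whatever rewrites the source port between my MacBook and the VPS (the router's NAT, for example) rather than at the WireGuard configs themselves.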
