
IPSec VPN between StrongSwan and DrayTek Router fails on second rekeying


I want to establish an always-on IPsec VPN between a DrayTek Vigor2860 and an EdgeRouter X (which uses strongSwan). The DrayTek is behind NAT and dials into the ER-X. The VPN connects and works, but it disconnects at the second rekeying. It reconnects a few seconds later, but these disconnects are annoying.

The VPN is configured as IPsec in tunnel mode with IKEv2 key exchange. It uses ESP with AES-128 and SHA-1, plus DH group 14 with perfect forward secrecy enabled. Authentication uses a PSK. The DrayTek connects to strongSwan. strongSwan is set to rekey=no, so only the DrayTek initiates rekeying. (See below for the detailed config.)

I also tried IKEv1, but it had the same problem, as did rekey=yes.


What have I tried?

In the initial setup, the connection was lost at the first rekeying. This is probably because the rekey margin on the DrayTek seems to be 300s, which is shorter than the rekeymargin of 540s in the strongSwan config below. strongSwan would therefore be the first to try to rekey, which would fail. Setting charon.make_before_break = yes for strongSwan seemed to mitigate this.

To make debugging easier, I manually added rekey=no to the strongSwan config, so the DrayTek is now the only side initiating rekeying. Then the following happens:

  1. The connection is established.

  2. Shortly before the lifetime expires, the DrayTek initiates a rekeying, which succeeds (!).

  3. strongSwan now has two CHILD_SAs for a few seconds. The older one gets deleted. The connection works the whole time (I had a ping running).

  4. After another lifetime, the connection disconnects while rekeying.

Looking at the tunnels in step 3, there seems to be a slight mismatch in settings: note the MODP_2048 at the end of the second CHILD_SA's proposal. I suspect that I need to change my ESP settings for strongSwan slightly, but how? MODP_2048 corresponds to DH group 14, as set on the DrayTek.

# swanctl -l
peer-remote.example.com-tunnel-1: #2, ESTABLISHED, IKEv2, 2fa27291636d715a_i e14d5bf01207b22c_r*
  local  'xxx.xxx.xxx.xxx' @ xxx.xxx.xxx.xxx[4500]
  remote 'remote.example.com' @ rem.ote.ip.addr[61001]
  AES_CBC-128/HMAC_SHA1_96/PRF_HMAC_SHA1/MODP_2048
  established 478s ago
  peer-remote.example.com-tunnel-1: #2, reqid 1, INSTALLED, TUNNEL-in-UDP, ESP:AES_CBC-128/HMAC_SHA1_96
    installed 478s ago
    in  ca261c8b, 104209 bytes,   507 packets,     0s ago
    out ceeadfb4, 114045 bytes,   624 packets,     0s ago
    local  192.168.70.0/24
    remote 192.168.71.0/24
  peer-remote.example.com-tunnel-1: #3, reqid 1, INSTALLED, TUNNEL-in-UDP, ESP:AES_CBC-128/HMAC_SHA1_96/MODP_2048
    installed 26s ago
    in  c6b99851,   5799 bytes,    27 packets,     0s ago
    out ceeadfb5,   6334 bytes,    32 packets,     0s ago
    local  192.168.70.0/24
    remote 192.168.71.0/24
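
To compare the two CHILD_SA entries mechanically rather than by eye, a small Python sketch could pull the ESP proposal out of each CHILD_SA line. This is not a strongSwan tool: the function names are made up for illustration, the regular expression assumes exactly the line layout shown in the output above, and the DH group prefixes it checks for (MODP_, ECP_, CURVE_) are a simplifying assumption.

```python
import re

# Assumed layout of a CHILD_SA line in `swanctl -l` output, e.g.:
#   name: #3, reqid 1, INSTALLED, TUNNEL-in-UDP, ESP:AES_CBC-128/HMAC_SHA1_96/MODP_2048
# IKE_SA lines are not indented and have no "reqid", so they do not match.
CHILD_SA_RE = re.compile(
    r"^\s+(?P<name>\S+): #(?P<id>\d+), reqid \d+, (?P<state>\w+), "
    r"(?P<mode>\S+), ESP:(?P<proposal>\S+)\s*$"
)

def child_sa_proposals(status_text):
    """Return (id, state, [algorithms]) for every CHILD_SA line found."""
    found = []
    for line in status_text.splitlines():
        m = CHILD_SA_RE.match(line)
        if m:
            algs = m.group("proposal").split("/")
            found.append((int(m.group("id")), m.group("state"), algs))
    return found

def has_dh_group(algs):
    """True if any algorithm name looks like a DH group (assumed prefixes)."""
    return any(a.startswith(("MODP_", "ECP_", "CURVE_")) for a in algs)

SAMPLE = """\
peer-remote.example.com-tunnel-1: #2, ESTABLISHED, IKEv2, 2fa27291636d715a_i e14d5bf01207b22c_r*
  peer-remote.example.com-tunnel-1: #2, reqid 1, INSTALLED, TUNNEL-in-UDP, ESP:AES_CBC-128/HMAC_SHA1_96
  peer-remote.example.com-tunnel-1: #3, reqid 1, INSTALLED, TUNNEL-in-UDP, ESP:AES_CBC-128/HMAC_SHA1_96/MODP_2048
"""

for sa_id, state, algs in child_sa_proposals(SAMPLE):
    label = "DH group shown" if has_dh_group(algs) else "no DH group shown"
    print(f"CHILD_SA #{sa_id} ({state}): {'/'.join(algs)} -> {label}")
```

Run against the output above, this reports no DH group for CHILD_SA #2 and MODP_2048 for the rekeyed CHILD_SA #3, which is the difference in question.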

I also found this hint in the strongSwan wiki, in the esp = <cipher suites> section, which seems to point in the same direction:
https://wiki.strongswan.org/projects/strongswan/wiki/connsection

If dh-group is specified, CHILD_SA rekeying and initial negotiation include a separate Diffie-Hellman exchange (since 5.0.0 this also applies to IKEv1 Quick Mode). However, for IKEv2, the keys of the CHILD_SA created implicitly with the IKE_SA will always be derived from the IKE_SA's key material. So any DH group specified here will only apply when the CHILD_SA is later rekeyed or is created with a separate CREATE_CHILD_SA exchange. Therefore, a proposal mismatch might not immediately be noticed when the SA is established, but may later cause rekeying to fail.
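
As a rough illustration of the distinction the wiki draws, the following Python sketch checks whether an ipsec.conf-style ESP proposal string names a DH group, and so would ask for a separate DH exchange when the CHILD_SA is rekeyed. The helper is hypothetical and deliberately simplified: it assumes a single proposal per string, keywords separated by "-", and DH groups recognizable by the prefixes modp, ecp, or curve.

```python
# Hypothetical helper, not part of strongSwan. Assumes one proposal per
# string and DH groups identifiable by these keyword prefixes.
DH_PREFIXES = ("modp", "ecp", "curve")

def parse_proposal(value):
    """Split a proposal such as 'aes128-sha1-modp2048!' into its parts."""
    strict = value.endswith("!")  # trailing '!' marks the strict flag
    keywords = value.rstrip("!").split("-")
    dh_groups = [k for k in keywords if k.startswith(DH_PREFIXES)]
    return {"strict": strict, "keywords": keywords, "dh_groups": dh_groups}

for esp in ("aes128-sha1-modp2048!", "aes128-sha1!"):
    info = parse_proposal(esp)
    if info["dh_groups"]:
        note = "DH group present: " + ", ".join(info["dh_groups"])
    else:
        note = "no DH group: rekeyed keys derive from IKE_SA material"
    print(f"{esp:24} {note}")
```

The first string is the esp= value from the generated config further down, and the second is the same proposal with the DH group left out, shown only for contrast.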


If I set the lifetime to 86400s (the maximum for the DrayTek), the connection runs fine for hours, which means the underlying DSL connection is not causing the issues. If I change the lifetime to 600s (the minimum for the DrayTek), the connection fails about every 1000 seconds (2 x 600 - 300).


ER-X Config (anonymized):

# show vpn
 ipsec {
     allow-access-to-local-interface enable
     auto-firewall-nat-exclude enable
     esp-group FOO0 {
         compression disable
         lifetime 86400
         mode tunnel
         pfs enable
         proposal 1 {
             encryption aes128
             hash sha1
         }
     }
     global-config "charon.make_before_break := yes"
     ike-group FOO0 {
         ikev2-reauth no
         key-exchange ikev2
         lifetime 86400
         proposal 1 {
             dh-group 14
             encryption aes128
             hash sha1
         }
     }
     ipsec-interfaces {
         interface eth0
     }
     site-to-site {
         peer remote.example.com {
             authentication {
                 mode pre-shared-secret
                 pre-shared-secret "secret"
             }
             connection-type respond
             description remote
             ike-group FOO0
             ikev2-reauth inherit
             local-address xxx.xxx.xxx.xxx
             tunnel 1 {
                 allow-nat-networks disable
                 allow-public-networks disable
                 esp-group FOO0
                 local {
                     prefix 192.168.70.0/24
                 }
                 remote {
                     prefix 192.168.71.0/24
                 }
             }
         }
     }
 }

This results in the following ipsec.conf for strongSwan, with a manual edit to add rekey=no:

conn peer-remote-example.com-tunnel-1
    left=xxx.xxx.xxx.xxx
    right=remote.example.com
    rightid="%any"
    leftsubnet=192.168.70.0/24
    rightsubnet=192.168.71.0/24
    ike=aes128-sha1-modp2048!
    keyexchange=ikev2
    reauth=no
    ikelifetime=86400s
    esp=aes128-sha1-modp2048!
    keylife=86400s
    rekey=no
    rekeymargin=540s
    type=tunnel
    compress=no
    authby=secret
    auto=route
    keyingtries=1
What exactly does "After another lifetime the connection disconnects while rekeying" mean? Disconnects how? Is there an error logged? Is it deleted explicitly? Please add the logs that show the first and second rekeyings to compare. And you already have the correct ESP proposal configured to use DH/PFS during CHILD_SA rekeying (you can also read more about the difference in the status output [here](https://docs.strongswan.org/docs/5.9/config/rekeying.html#_ikev2)).