Score:1

Ipsec VPN to AWS: Can't ping AWS end inside tunnel

Summary: I think I'm missing some routes on my Ubuntu server connecting to an AWS VPN with Strongswan Ipsec. Any idea what routes I need on my server?

I'm trying to set up a BGP routed VPN from a server to AWS. I've had this working with a statically routed VPN, so I know my AWS side and my ipsec config are correct. However, BGP is proving tricky.

I'm using Strongswan on Ubuntu as the "customer" side. My ipsec tunnels are up (confirmed by ipsec status and in the AWS console). The problem appears to be that my BGP client (frr) can't connect to the BGP process in AWS (and vice-versa).

Taking one of the two tunnels, ip a looks like this:

5: tunnel1@NONE: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1419 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ipip 1.2.3.4 peer 2.3.4.5
    inet 169.254.10.146 peer 169.254.10.145/30 scope global tunnel1
       valid_lft forever preferred_lft forever

The interface is created with a script called by Strongswan, which looks like this:

ip tunnel add ${VTI_INTERFACE} local ${PLUTO_ME} remote ${PLUTO_PEER} mode vti key ${PLUTO_MARK_IN_ARR[0]}
sysctl -w net.ipv4.conf.${VTI_INTERFACE}.disable_policy=1
sysctl -w net.ipv4.conf.${VTI_INTERFACE}.rp_filter=2 || sysctl -w net.ipv4.conf.${VTI_INTERFACE}.rp_filter=0
ip addr add ${VTI_LOCALADDR} remote ${VTI_REMOTEADDR} dev ${VTI_INTERFACE}
ip link set ${VTI_INTERFACE} up mtu ${MTU}
iptables -t mangle -I FORWARD -o ${VTI_INTERFACE} -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu
iptables -t mangle -I INPUT -p esp -s ${PLUTO_PEER} -d ${PLUTO_ME} -j MARK --set-xmark ${PLUTO_MARK_IN}
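
The script indexes PLUTO_MARK_IN_ARR, which is not defined in the excerpt shown. A minimal sketch of the split it implies, assuming Strongswan hands the mark to the script as "value/mask" in PLUTO_MARK_IN. The sample mark and the names MARK_VALUE, MARK_MASK and MARK_HEX are hypothetical, chosen for illustration only:

```shell
# Assumed sample; in a real updown script Strongswan exports PLUTO_MARK_IN.
PLUTO_MARK_IN="${PLUTO_MARK_IN:-100/0xffffffff}"

# Part before the slash: the value used as the VTI key (element [0] in the script).
MARK_VALUE="${PLUTO_MARK_IN%%/*}"

# Part after the slash: the mask. With no slash present this equals the whole string.
MARK_MASK="${PLUTO_MARK_IN#*/}"

# The same value in hex, which is how ip xfrm prints marks.
MARK_HEX=$(printf '0x%x' "$MARK_VALUE")

echo "value=$MARK_VALUE mask=$MARK_MASK hex=$MARK_HEX"
```

This uses plain POSIX parameter expansion rather than a bash array, so it should behave the same under sh and bash.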

I have a route like this:

169.254.10.144/30 dev tunnel1 proto kernel scope link src 169.254.10.146

I can ping my local (customer) end (169.254.10.146), but pinging the remote end says:

PING 169.254.10.145 (169.254.10.145) 56(84) bytes of data.
From 169.254.10.146 icmp_seq=1 Destination Host Unreachable

Despite the ping reporting that the host is unreachable, I see a packet on the tunnel interface. It doesn't appear that this is actually sent down the tunnel to AWS, though (at least, the AWS CloudWatch stats don't show additional activity).

One of the BGP debug steps is to try to 'telnet' to the neighbour (169.254.10.145) on port 179 - but this fails with "no route to host". Curiously though, I can see AWS trying to connect to my BGP client in tcpdump on the tunnel interface:

12:49:45.860899 IP 169.254.10.145.43731 > 169.254.10.146.179: Flags [S], seq 857645259, win 26880, options [mss 1375,sackOK,TS val 3261400514 ecr 0,nop,wscale 7], length 0
12:49:45.860931 IP 169.254.10.146.179 > 169.254.10.145.43731: Flags [S.], seq 2284055933, ack 857645260, win 64249, options [mss 1379,sackOK,TS val 936428713 ecr 3261400514,nop,wscale 7], length 0

It looks like my server is returning a SYN-ACK packet (Flags [S.]), but the connection is never established. If I stop the BGP service, the server starts sending RST responses instead:

12:52:02.055473 IP 169.254.10.145.43679 > 169.254.10.146.179: Flags [S], seq 2280717367, win 26880, options [mss 1375,sackOK,TS val 3261536708 ecr 0,nop,wscale 7], length 0
12:52:02.055498 IP 169.254.10.146.179 > 169.254.10.145.43679: Flags [R.], seq 0, ack 1, win 0, length 0

From this, my guess is that the kernel's SYN-ACK packet and my ping packets are going onto the tunnel interface, but aren't actually getting into ipsec or going to AWS.

It seems then that I need some additional routes adding, but I can't figure out what might be required. It feels like I have everything in place already. I've looked at countless examples and instructions, but few show anything I don't have, and even fewer confirm I need the routes I think I need, or what routes are required for BGP to work. Any help much appreciated.

Score:0

It turns out the solution is an additional xfrm policy. The thinking behind it is that whilst packets are getting onto the interface, the traffic isn't being "scooped up" by ipsec. To resolve the problem, it seems we need to mark the packets (in the same way as we do in the ipsec config and iptables), and also to specifically "activate" xfrm by policy. In my case, it was this:

ip xfrm policy add dst 169.254.10.144/30 src 169.254.10.144/30 dir out tmpl src 1.2.3.4 dst 2.3.4.5 proto esp spi 0xc0f93fba reqid 1 mode tunnel mark 0x64

This makes a policy that looks like this:

src 169.254.10.144/30 dst 169.254.10.144/30
    dir out priority 0
    mark 0x64/0xffffffff
    tmpl src 1.2.3.4 dst 2.3.4.5
        proto esp spi 0xc0f93fba reqid 1 mode tunnel
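
The mark line in that policy means it only applies to packets carrying a matching mark. A minimal sketch of the comparison, assuming the usual value/mask semantics (the packet's mark is ANDed with the mask and compared with the policy value). The function name mark_matches is hypothetical:

```shell
# mark_matches PACKET_MARK POLICY_VALUE POLICY_MASK
# Succeeds when (packet mark AND mask) equals (policy value AND mask).
mark_matches() {
    [ $(( $1 & $3 )) -eq $(( $2 & $3 )) ]
}

# A packet marked 100 (0x64) is picked up by the policy above.
mark_matches 100 0x64 0xffffffff && echo "marked packet: policy applies"

# An unmarked packet is not, so it never reaches ipsec.
mark_matches 0 0x64 0xffffffff || echo "unmarked packet: policy skipped"
```

If this reading is right, it would explain why packets can appear on the tunnel interface and still never be encrypted.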

This contains some magic values, which it seems we have to get from the existing policy. The first is the SPI. This gets set by Strongswan and, as far as I can tell, is neither predictable nor available in the environment passed to ipsec-vti.sh. Instead, you'll need some grep and awk to pull it out of the ip xfrm policy output. Likewise the reqid: this is set to 1 for the first interface that gets created and 2 for the second. Sadly, tunnel1 isn't always the first, so you'll need to pull that number out too.
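
A rough sketch of that extraction, assuming the ip xfrm policy output has the same layout as the policy printed above (a tmpl line followed by a proto line carrying spi and reqid). The function name extract_spi_reqid and the sample text are for illustration only, and real output may differ between versions:

```shell
# extract_spi_reqid PEER_IP
# Reads "ip xfrm policy" style text on stdin and prints "SPI REQID" for the
# first template whose destination is PEER_IP.
extract_spi_reqid() {
    awk -v peer="$1" '
        # Remember whether the current template points at the peer.
        $1 == "tmpl" { want = ($4 == "dst" && $5 == peer); next }
        # The next proto line holds the values we are after.
        want && $1 == "proto" {
            for (i = 1; i < NF; i++) {
                if ($i == "spi")   spi = $(i + 1)
                if ($i == "reqid") reqid = $(i + 1)
            }
            print spi, reqid
            exit
        }
    '
}

# Sample input copied from the policy shown above.
sample='src 169.254.10.144/30 dst 169.254.10.144/30
    dir out priority 0
    mark 0x64/0xffffffff
    tmpl src 1.2.3.4 dst 2.3.4.5
        proto esp spi 0xc0f93fba reqid 1 mode tunnel'

printf '%s\n' "$sample" | extract_spi_reqid 2.3.4.5
```

In ipsec-vti.sh this would presumably be fed from the live table, for example ip xfrm policy | extract_spi_reqid "${PLUTO_PEER}", with the two values then substituted into the ip xfrm policy add command.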
