I'm using keepalived to provide availability between two Alma 8 Nginx servers (hosted on VMWare if that's of any relevance). When firewalld is enabled, despite a rich rule being set for VRRP, when I bring firewalld up both hosts start to respond on the virtual IP:
root@dca-nfs01:~# arping 172.31.5.233
60 bytes from 00:50:56:84:ac:d0 (172.31.5.233): index=39 time=1.960 usec
60 bytes from 00:50:56:84:ac:d0 (172.31.5.233): index=40 time=20.660 usec
60 bytes from 00:50:56:84:52:ed (172.31.5.233): index=41 time=24.930 usec
60 bytes from 00:50:56:84:ac:d0 (172.31.5.233): index=42 time=534.616 msec
60 bytes from 00:50:56:84:52:ed (172.31.5.233): index=43 time=534.646 msec
My keepalived config is taken from a standard tutorial template and looks as follows:
[root@dca-ngx01-al ~]# cat /etc/keepalived/keepalived.conf
global_defs {
# Keepalived process identifier
router_id nginx
}
# Script to check whether Nginx is running or not
vrrp_script check_nginx {
script "/sbin/pidof nginx"
interval 2
weight 50
}
# Virtual interface - The priority specifies the order in which the assigned interface to take over in a failover
vrrp_instance VI_01 {
state MASTER
interface ens192
virtual_router_id 151
priority 110
# The virtual ip address shared between the two NGINX Web Server which will float
virtual_ipaddress {
172.31.5.233
}
track_script {
check_nginx
}
authentication {
auth_type AH
auth_pass secret
}
}
Both boxes have a simple one zone firewall, and I have added a rich rule to allow VRRP communication between the two hosts:
[root@dca-ngx01-al ~]# firewall-cmd --list-all
public (active)
target: default
icmp-block-inversion: no
interfaces: ens192
sources:
services: dhcpv6-client http https ssh
ports: 10050/tcp
protocols:
forward: no
masquerade: no
forward-ports:
source-ports:
icmp-blocks:
rich rules:
rule protocol value="vrrp" accept
I have also set net.ipv4.ip_forward = 1
in /etc/sysctl.conf
.
When firewalld is stopped on both boxes, keepalived behaves correctly, but when enabled to appear that both sides lose touch with each other, and just send out repeated gratuitous ARP packets:
● keepalived.service - LVS and VRRP High Availability Monitor
Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2022-03-25 12:48:25 GMT; 2h 35min ago
Process: 7140 ExecReload=/bin/kill -HUP $MAINPID (code=exited, status=0/SUCCESS)
Process: 12966 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 12967 (keepalived)
Tasks: 2 (limit: 11406)
Memory: 1.8M
CGroup: /system.slice/keepalived.service
├─12967 /usr/sbin/keepalived -D
└─12968 /usr/sbin/keepalived -D
Mar 25 15:08:15 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233
Mar 25 15:08:15 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233
Mar 25 15:08:15 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233
Mar 25 15:08:15 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233
Mar 25 15:08:18 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: (VI_01) Sending/queueing gratuitous ARPs on ens192 for 1>
Mar 25 15:08:18 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233
Mar 25 15:08:18 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233
Mar 25 15:08:18 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233
Mar 25 15:08:18 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233
Mar 25 15:08:18 dca-ngx01-al.REDACTED.local Keepalived_vrrp[12968]: Sending gratuitous ARP on ens192 for 172.31.5.233
I can however see from using TCPDump that regular VRRP packets from the other host are at least hitting the network interface when firewalld is active:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens192, link-type EN10MB (Ethernet), capture size 262144 bytes
15:25:21.532300 IP dca-ngx02-al.REDACTED.local > vrrp.mcast.net: AH(spi=0xac1f05e5,seq=0x3160): VRRPv2, Advertisement, vrid 151, prio 150, authtype ah, intvl 1s, length 20
15:25:22.532419 IP dca-ngx02-al.REDACTED.local > vrrp.mcast.net: AH(spi=0xac1f05e5,seq=0x3161): VRRPv2, Advertisement, vrid 151, prio 150, authtype ah, intvl 1s, length 20
15:25:23.532476 IP dca-ngx02-al.REDACTED.local > vrrp.mcast.net: AH(spi=0xac1f05e5,seq=0x3162): VRRPv2, Advertisement, vrid 151, prio 150, authtype ah, intvl 1s, length 20
15:25:24.532544 IP dca-ngx02-al.REDACTED.local > vrrp.mcast.net: AH(spi=0xac1f05e5,seq=0x3163): VRRPv2, Advertisement, vrid 151, prio 150, authtype ah, intvl 1s, length 20
Does anybody have any ideas as to how I can further troubleshoot this issue?
Thanks in advance.