I'm currently struggling with my Juniper Switch Stack.
The topology looks like this: (image: Topology)
The client ports on the stack are configured as tagged-access with dot1x (multiple supplicant), and the VLAN is switched according to the RADIUS authentication. This works without a problem and the VLANs get assigned correctly.
The two pfSense firewalls provide one DHCP instance per VLAN in a failover configuration, with a CARP IP on the same subnet as the VLAN, so no DHCP relay is needed.
Windows clients can obtain an IP and work correctly, but Linux clients and PXE boot do not.
From tcpdump and Wireshark we see a DHCP Discover/Offer loop on the Linux clients. The offer reaches the client, but the client never sends a DHCP Request. We tried multiple Linux distributions and PXE implementations without any luck. We also compared the Wireshark captures from Windows and Linux and could not find any difference.
Any suggestions on how to track down the problem?
Thanks in advance.
Update:
Just to add more information.
The IP assignment flow is like this:
1. Client starts up (NIC connects to the switch stack)
2. Switch authenticates the client against the RADIUS server
3. RADIUS server answers with Accept and VLAN ID 940
4. Switch stack assigns VLAN 940 to the port the client is connected to, in multiple supplicant mode
5. Client sends out a DHCP Discover
6. DHCP servers (both pfSense) respond with an Offer
7. Client sends a DHCP Request
8. DHCP server sends a DHCP ACK
So steps 1-6 are obviously working. The client gets assigned to VLAN 940 through the RADIUS server and sends out a DHCP Discover; both pfSense firewalls have a DHCP instance configured for VLAN 940 (IP range 10.94.0.1-200/24) and they send an Offer.
Here is a tcpdump from one of the pfSense firewalls, in case it helps.
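To pin down where the exchange stalls, the DHCP message types seen in a capture can be compared with the expected Discover, Offer, Request, ACK order. This is only a minimal sketch; the names `DORA` and `next_missing_step` are made up for illustration, and the input list is typed in by hand from what the capture shows.

```python
# Expected order of DHCP message types for a fresh lease.
DORA = ["Discover", "Offer", "Request", "ACK"]

def next_missing_step(observed):
    """Return the first expected message type that never shows up in order,
    or None if the whole exchange completed."""
    idx = 0
    for msg in observed:
        if idx < len(DORA) and msg == DORA[idx]:
            idx += 1
    return DORA[idx] if idx < len(DORA) else None

# Message types as seen on the wire here: one Discover, two Offers, then nothing.
print(next_missing_step(["Discover", "Offer", "Offer"]))
```

For the sequence seen here the missing step is the Request, which matches the loop described above.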
18:55:25.538580 IP (tos 0x0, ttl 20, id 3, offset 0, flags [none], proto UDP (17), length 576)
0.0.0.0.bootpc > 255.255.255.255.bootps: [udp sum ok] BOOTP/DHCP, Request from 00:19:99:f7:3d:23 (oui Unknown), length 548, xid 0x99f73d23, secs 18, Flags [Broadcast] (0x8000)
Client-Ethernet-Address 00:19:99:f7:3d:23 (oui Unknown)
Vendor-rfc1048 Extensions
Magic Cookie 0x63825363
DHCP-Message Option 53, length 1: Discover
Parameter-Request Option 55, length 36:
Subnet-Mask, Time-Zone, Default-Gateway, Time-Server
IEN-Name-Server, Domain-Name-Server, RL, Hostname
BS, Domain-Name, SS, RP
EP, RSZ, TTL, BR
YD, YS, NTP, Vendor-Option
Requested-IP, Lease-Time, Server-ID, RN
RB, Vendor-Class, TFTP, BF
Option 128, Option 129, Option 130, Option 131
Option 132, Option 133, Option 134, Option 135
MSZ Option 57, length 2: 1260
GUID Option 97, length 17: 0.72.178.216.253.99.205.17.226.190.154.221.134.53.14.178.59
ARCH Option 93, length 2: 0
NDI Option 94, length 3: 1.2.1
Vendor-Class Option 60, length 32: "PXEClient:Arch:00000:UNDI:002001"
END Option 255, length 0
PAD Option 0, length 0, occurs 200
18:55:26.546900 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 334)
10.94.0.253.bootps > 255.255.255.255.bootpc: [udp sum ok] BOOTP/DHCP, Reply, length 306, xid 0x99f73d23, secs 18, Flags [Broadcast] (0x8000)
Your-IP 10.94.0.5
Server-IP 10.91.0.1
Client-Ethernet-Address 00:19:99:f7:3d:23 (oui Unknown)
file "pxelinux.0"
Vendor-rfc1048 Extensions
Magic Cookie 0x63825363
DHCP-Message Option 53, length 1: Offer
Server-ID Option 54, length 4: 10.94.0.253
Lease-Time Option 51, length 4: 600
Subnet-Mask Option 1, length 4: 255.255.255.0
Default-Gateway Option 3, length 4: 10.94.0.254
Domain-Name-Server Option 6, length 8: 10.0.2.1,10.0.2.2
Domain-Name Option 15, length 9: "domain.intra"
NTP Option 42, length 4: 10.94.0.254
TFTP Option 66, length 9: "10.91.0.1"
END Option 255, length 0
18:55:26.547180 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 334)
10.94.0.252.bootps > 255.255.255.255.bootpc: [udp sum ok] BOOTP/DHCP, Reply, length 306, xid 0x99f73d23, secs 18, Flags [Broadcast] (0x8000)
Your-IP 10.94.0.104
Server-IP 10.91.0.1
Client-Ethernet-Address 00:19:99:f7:3d:23 (oui Unknown)
file "pxelinux.0"
Vendor-rfc1048 Extensions
Magic Cookie 0x63825363
DHCP-Message Option 53, length 1: Offer
Server-ID Option 54, length 4: 10.94.0.252
Lease-Time Option 51, length 4: 600
Subnet-Mask Option 1, length 4: 255.255.255.0
Default-Gateway Option 3, length 4: 10.94.0.254
Domain-Name-Server Option 6, length 8: 10.0.2.1,10.0.2.2
Domain-Name Option 15, length 9: "domain.intra"
NTP Option 42, length 4: 10.94.0.254
TFTP Option 66, length 9: "10.91.0.1"
END Option 255, length 0
The client sees exactly the same packets but simply ignores them.
Does anything in them look wrong?
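As a quick sanity check of the two offers, the fields from the tcpdump above can be compared against the Discover. This is a rough sketch only: the dict layout and the `check_offer` helper are made up for illustration, and the values are copied by hand from the dump.

```python
import ipaddress

# Values copied from the tcpdump above; the dict layout is only illustrative.
discover = {"xid": 0x99F73D23, "mac": "00:19:99:f7:3d:23"}
offers = [
    {"xid": 0x99F73D23, "mac": "00:19:99:f7:3d:23", "your_ip": "10.94.0.5",
     "server_id": "10.94.0.253", "gateway": "10.94.0.254", "mask": "255.255.255.0"},
    {"xid": 0x99F73D23, "mac": "00:19:99:f7:3d:23", "your_ip": "10.94.0.104",
     "server_id": "10.94.0.252", "gateway": "10.94.0.254", "mask": "255.255.255.0"},
]

def check_offer(discover, offer):
    """Return a list of problems found in one offer (empty list: nothing found)."""
    problems = []
    if offer["xid"] != discover["xid"]:
        problems.append("xid mismatch")
    if offer["mac"] != discover["mac"]:
        problems.append("client MAC mismatch")
    # Derive the offered subnet from Your-IP and the Subnet-Mask option.
    net = ipaddress.ip_network(f'{offer["your_ip"]}/{offer["mask"]}', strict=False)
    for field in ("server_id", "gateway"):
        if ipaddress.ip_address(offer[field]) not in net:
            problems.append(f"{field} outside offered subnet")
    return problems

for offer in offers:
    print(offer["server_id"], check_offer(discover, offer) or "looks consistent")
```

Both offers pass these checks, which only says the DHCP payload is consistent. It says nothing about how the frame is delivered at layer 2.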
It just works if I do the same with a Linux VM on the server-side switches (where the RADIUS server is connected). So I'm pretty sure the problem is somewhere within the Juniper switch stack.
Update 2:
My assumption about a problem in the switch stack was right. It seems that "tagged-access" port mode does not behave as it should. Switching to "access" port mode solved the problem. This doesn't make much sense to me, though, as "access" mode shouldn't be able to handle multiple supplicants in different VLANs, but it obviously does.
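One way to check what "tagged-access" actually puts on the wire would be to look at the raw Ethernet header of an offer as the client receives it and see whether it carries an 802.1Q tag. That the offers arrive tagged is only an assumption here; nothing in the captures above confirms it, and a NIC or driver may strip the tag before the capture tool sees it. The `vlan_id` helper and the hand-built example frames are purely illustrative.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q VLAN tag

def vlan_id(frame: bytes):
    """Return the VLAN ID if the Ethernet frame is 802.1Q-tagged, else None."""
    if len(frame) < 14:
        return None  # too short for dst MAC + src MAC + EtherType
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != TPID_8021Q or len(frame) < 16:
        return None
    (tci,) = struct.unpack("!H", frame[14:16])
    return tci & 0x0FFF  # the lower 12 bits of the TCI are the VLAN ID

# Hand-built headers for illustration: broadcast destination, client MAC as source.
dst = b"\xff" * 6
src = bytes.fromhex("001999f73d23")
untagged = dst + src + struct.pack("!H", 0x0800)
tagged = dst + src + struct.pack("!HH", TPID_8021Q, 940) + struct.pack("!H", 0x0800)

print(vlan_id(untagged))
print(vlan_id(tagged))
```

If the frames reaching the client in "tagged-access" mode did turn out to be tagged while those in "access" mode are not, that would be one possible explanation for the different behaviour between the two modes.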