Score:1

Spanning tree causing packet loss between a Cisco C3560 switch and a Linux server running CentOS


I am working in a network environment where I have some Cisco WS-C3560X-48 switches and Linux servers running CentOS 7.7.

Each Linux server has three connections to my switches: one admin link, one production link, and one iLO link, because they run on HP hardware.

When I ping the servers on the admin LAN from my Cisco switch, I get the following result:

SWTCisco#ping 10.123.213.152 source 10.123.213.158 repeat 100

Type escape sequence to abort.
Sending 100, 100-byte ICMP Echos to 10.123.213.152, timeout is 2 seconds:
Packet sent with a source address of 10.123.213.158
!!!!!!.!!!!!!.!!!!!!.!!!!!!.!!!!!!.!!!!!!.!!!!!!.!!!!!!.!!!!!!.!!!!!!.
!!!!!!.!!!!!!.!!!!!!.!!!!!!.!!
Success rate is 86 percent (86/100), round-trip min/avg/max = 1/3/17 ms
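
The loss positions can be read straight from the result string. A minimal sketch, assuming the two result lines above are joined into one string ('!' is a reply, '.' is a timeout); the variable and function names are illustrative only:

```python
# Result marks rebuilt from the switch output above:
# 14 groups of "!!!!!!." followed by "!!" (100 pings in total).
ping_output = "!!!!!!." * 14 + "!!"


def lost_positions(marks):
    """Return the 1-based positions of the pings that timed out."""
    return [i for i, mark in enumerate(marks, start=1) if mark == "."]


lost = lost_positions(ping_output)
# Distance (in pings) between each pair of consecutive losses.
gaps = sorted({b - a for a, b in zip(lost, lost[1:])})

print(f"sent={len(ping_output)} lost={len(lost)}")
print(f"lost positions: {lost}")
print(f"distinct gaps between losses: {gaps}")
```

A single distinct gap value means the loss is strictly periodic rather than random.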

As you can see, there is a pattern: I always lose every 7th ping. On the server side, tcpdump shows that the ICMP request is received but no ICMP reply is sent. In the example below, I pinged the server 8 times, and we can see two requests following each other (seq 6 and seq 7).

root@CentOSserver:/etc/sysconfig/network-scripts# tcpdump -i eno1 host 10.123.213.158 -nn
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eno1, link-type EN10MB (Ethernet), capture size 262144 bytes
11:37:04.770292 IP 10.123.213.158 > 10.123.213.152: ICMP echo request, id 134, seq 0, length 80
11:37:04.770354 IP 10.123.213.152 > 10.123.213.158: ICMP echo reply, id 134, seq 0, length 80
11:37:04.772624 IP 10.123.213.158 > 10.123.213.152: ICMP echo request, id 134, seq 1, length 80
11:37:04.772644 IP 10.123.213.152 > 10.123.213.158: ICMP echo reply, id 134, seq 1, length 80
11:37:04.774394 IP 10.123.213.158 > 10.123.213.152: ICMP echo request, id 134, seq 2, length 80
11:37:04.774411 IP 10.123.213.152 > 10.123.213.158: ICMP echo reply, id 134, seq 2, length 80
11:37:04.776592 IP 10.123.213.158 > 10.123.213.152: ICMP echo request, id 134, seq 3, length 80
11:37:04.776606 IP 10.123.213.152 > 10.123.213.158: ICMP echo reply, id 134, seq 3, length 80
11:37:04.789083 IP 10.123.213.158 > 10.123.213.152: ICMP echo request, id 134, seq 4, length 80
11:37:04.789099 IP 10.123.213.152 > 10.123.213.158: ICMP echo reply, id 134, seq 4, length 80
11:37:04.791466 IP 10.123.213.158 > 10.123.213.152: ICMP echo request, id 134, seq 5, length 80
11:37:04.791483 IP 10.123.213.152 > 10.123.213.158: ICMP echo reply, id 134, seq 5, length 80
11:37:04.793669 IP 10.123.213.158 > 10.123.213.152: ICMP echo request, id 134, seq 6, length 80
11:37:04.822159 ARP, Request who-has 10.123.213.158 tell 10.123.213.144, length 46
11:37:06.793024 IP 10.123.213.158 > 10.123.213.152: ICMP echo request, id 134, seq 7, length 80
11:37:06.793068 IP 10.123.213.152 > 10.123.213.158: ICMP echo reply, id 134, seq 7, length 80

10.123.213.158 is the address of the VLAN interface on my Cisco switch.
10.123.213.152 is the address of eno1 on the Linux server.
10.123.213.144 is the iLO address of another server that sent an ARP request while my tcpdump was running.
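
To spot the unanswered request without reading the capture by eye, the tcpdump text can be matched request against reply. This is a minimal sketch, assuming the output format shown above; it uses a subset of the captured lines, and the names `CAPTURE`, `ICMP_RE` and `unanswered_requests` are made up for this example:

```python
import re

# Subset of the tcpdump lines shown above (seq 4 to seq 7, plus the ARP line).
CAPTURE = """\
11:37:04.789083 IP 10.123.213.158 > 10.123.213.152: ICMP echo request, id 134, seq 4, length 80
11:37:04.789099 IP 10.123.213.152 > 10.123.213.158: ICMP echo reply, id 134, seq 4, length 80
11:37:04.791466 IP 10.123.213.158 > 10.123.213.152: ICMP echo request, id 134, seq 5, length 80
11:37:04.791483 IP 10.123.213.152 > 10.123.213.158: ICMP echo reply, id 134, seq 5, length 80
11:37:04.793669 IP 10.123.213.158 > 10.123.213.152: ICMP echo request, id 134, seq 6, length 80
11:37:04.822159 ARP, Request who-has 10.123.213.158 tell 10.123.213.144, length 46
11:37:06.793024 IP 10.123.213.158 > 10.123.213.152: ICMP echo request, id 134, seq 7, length 80
11:37:06.793068 IP 10.123.213.152 > 10.123.213.158: ICMP echo reply, id 134, seq 7, length 80
"""

ICMP_RE = re.compile(
    r"^(?P<ts>\S+) IP (?P<src>\S+) > (?P<dst>[\d.]+): "
    r"ICMP echo (?P<kind>request|reply), id (?P<id>\d+), seq (?P<seq>\d+)"
)


def unanswered_requests(text):
    """Return (timestamp, seq) for each echo request that has no reply."""
    requests = {}    # (id, seq) -> timestamp of the request
    replied = set()  # (id, seq) pairs for which a reply was captured
    for line in text.splitlines():
        m = ICMP_RE.match(line)
        if not m:
            continue  # skip ARP and any other non-ICMP line
        key = (int(m["id"]), int(m["seq"]))
        if m["kind"] == "request":
            requests[key] = m["ts"]
        else:
            replied.add(key)
    return [(ts, key[1]) for key, ts in requests.items() if key not in replied]


print(unanswered_requests(CAPTURE))
```

On these lines it reports only the seq 6 request at 11:37:04.793669, which matches what is visible in the capture.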

After further investigation, I found that the problem is related to spanning tree. I uploaded a pcap of what I found: https://filebin.net/9x9ech3uude93sda

In the pcap, we can see an STP packet between the two ICMP requests. I tried several times, and each time an STP packet appears where I would have expected my reply.

To me, it's just a BPDU and should not have any impact on my interface GigabitEthernet0/27.

Nothing particularly alarming (to me) is visible in the spanning-tree configuration on the Cisco switch:

SWTCisco#sh spanning-tree vlan 28

VLAN0028
  Spanning tree enabled protocol ieee
  Root ID    Priority    32796
             Address     501c.bf45.1c00
             This bridge is the root
             Hello Time   2 sec  Max Age 20 sec  Forward Delay 15 sec

  Bridge ID  Priority    32796  (priority 32768 sys-id-ext 28)
             Address     501c.bf45.1c00
             Hello Time   2 sec  Max Age 20 sec  Forward Delay 15 sec
             Aging Time  300 sec

Interface           Role Sts Cost      Prio.Nbr Type
------------------- ---- --- --------- -------- --------------------------------
Gi0/11              Desg FWD 4         128.11   P2p
Gi0/18              Desg FWD 4         128.18   P2p
Gi0/19              Desg FWD 4         128.19   P2p
Gi0/20              Desg FWD 4         128.20   P2p
Gi0/21              Desg FWD 19        128.21   P2p
Gi0/22              Desg FWD 4         128.22   P2p
Gi0/23              Desg FWD 4         128.23   P2p
Gi0/24              Desg FWD 4         128.24   P2p
Gi0/25              Desg FWD 4         128.25   P2p
Gi0/26              Desg FWD 4         128.26   P2p
Gi0/27              Desg FWD 4         128.27   P2p
Gi0/31              Desg FWD 4         128.31   P2p
Gi0/32              Desg FWD 19        128.32   P2p
Gi0/33              Desg FWD 4         128.33   P2p
Gi0/34              Desg FWD 4         128.34   P2p
Gi0/35              Desg FWD 4         128.35   P2p
Gi0/36              Desg FWD 4         128.36   P2p
Gi0/37              Desg FWD 4         128.37   P2p
Gi0/38              Desg FWD 4         128.38   P2p
Gi0/39              Desg FWD 4         128.39   P2p
Gi0/40              Desg FWD 4         128.40   P2p
Gi0/47              Desg FWD 19        128.47   P2p
Gi1/3               Desg FWD 4         128.51   P2p

SWTCisco#sh run int gigabitEthernet 0/27
Building configuration...

Current configuration : 113 bytes
!
interface GigabitEthernet0/27
 switchport access vlan 28
 switchport mode access
end

SWTCisco#sh spanning-tree blockedports

Name                 Blocked Interfaces List
-------------------- ------------------------------------

Number of blocked ports (segments) in the system : 0

SWTCisco#sh spanning-tree summary
Switch is in pvst mode
Root bridge for: VLAN0028, VLAN0031, VLAN3715
EtherChannel misconfig guard is enabled
Extended system ID           is enabled
Portfast Default             is disabled
PortFast BPDU Guard Default  is disabled
Portfast BPDU Filter Default is disabled
Loopguard Default            is disabled
UplinkFast                   is disabled
BackboneFast                 is disabled
Configured Pathcost method used is short

Name                   Blocking Listening Learning Forwarding STP Active
---------------------- -------- --------- -------- ---------- ----------
VLAN0028                     0         0        0         23         23
VLAN0031                     0         0        0         12         12
VLAN0157                     0         0        0          1          1
VLAN3715                     0         0        0          1          1
---------------------- -------- --------- -------- ---------- ----------
4 vlans                      0         0        0         37         37
SWTCisco#sh version | in RELEASE
Cisco IOS Software, C3560E Software (C3560E-UNIVERSALK9-M), Version 12.2(55)SE5, RELEASE SOFTWARE (fc1)
BOOTLDR: C3560E Boot Loader (C3560X-HBOOT-M) Version 12.2(53r)SE1, RELEASE SOFTWARE (fc1)

I watched interface Gi0/27 while the ping was running, and it stayed in the FWD state.
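
The role and state of every port can also be checked from a saved copy of the sh spanning-tree vlan 28 output rather than by eye. A minimal sketch, assuming the table layout shown earlier; it uses a few of the rows from that table, and `TABLE`, `ROW_RE` and `non_forwarding` are hypothetical names for this example:

```python
import re

# A few rows copied from the interface table of "sh spanning-tree vlan 28".
TABLE = """\
Gi0/26              Desg FWD 4         128.26   P2p
Gi0/27              Desg FWD 4         128.27   P2p
Gi0/31              Desg FWD 4         128.31   P2p
Gi0/32              Desg FWD 19        128.32   P2p
"""

# Columns: interface, role, status, cost, priority.number, type.
ROW_RE = re.compile(
    r"^(?P<port>Gi\d+/\d+)\s+(?P<role>\w+)\s+(?P<sts>\w+)\s+\d+\s+\d+\.\d+\s+\S+"
)


def non_forwarding(text):
    """Return (port, role, status) for every port that is not Desg/FWD."""
    flagged = []
    for line in text.splitlines():
        m = ROW_RE.match(line.strip())
        if not m:
            continue  # skip headers, separators and blank lines
        if (m["role"], m["sts"]) != ("Desg", "FWD"):
            flagged.append((m["port"], m["role"], m["sts"]))
    return flagged


print(non_forwarding(TABLE))
```

An empty list means every parsed port is a designated port in the forwarding state, which is what the switch output above shows.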

Does anyone have any idea why I lose a packet while the switch is sending a BPDU frame? I have some trouble understanding some advanced STP functionality, so I may be missing something here.

Michael Hampton:
Why have you obfuscated your RFC1918 addresses? This gives you no benefit, it only makes the question harder to follow. What are these IP addresses actually?
Keftef:
Check the MTU size on your Linux servers (ip a | grep mtu), then check the MTU on your switch as well.
Doji:
@MichaelHampton: You are right, I removed the obfuscation. The .158 is the address of the VLAN interface on the Cisco switch side. The .152 is the admin IP set on my eno1. I also checked the MTU: it is set to 1500 on both interfaces, server side and Cisco switch side.
Michael Hampton:
OK, it's a bit clearer now. .152 is your Linux server's admin interface and .158 is the Cisco. So which is the .144?
Doji:
@MichaelHampton: .144 is an iLO interface of another server on the same LAN. I did not notice the ARP request in my paste; it is not related to my problem. I can remove it to add some clarity.
Michael Hampton:
It shouldn't be removed, only annotated so that people know what it is (and that it might not be relevant). We've had plenty of examples of people removing stuff that they thought was not relevant, and it turned out that it actually was.