Possible STP Issue with Cisco SG350 Switches

We are having a weird issue with a network of Cisco SG350 switches that I cannot figure out. We think it may be related to STP, but we have verified all the usual problem points (e.g. the proper ports are showing as Root, SmartPort is disabled, etc.).

Here is the network diagram: Cisco SG-250 Network Diagram

As you can see, we have 5 Cisco SG350 switches, all in a parallel daisy chain, except switch 243. All of them are connected on TRUNK ports, and all have 3 VLANs configured. The problem is that intermittently throughout the day, traffic drops on the 242 and/or 243 switches for about 30-60 seconds.

When we investigate the logs, we are able to verify that a) the switches have not rebooted, b) the connection was lost, c) it appears that certain ports were in STP blocking for a period of time (usually 30-60 seconds).

For example, earlier today (Oct 21, around 23:30 GMT), sw242 went offline for about 30 seconds. The logs on sw242 show only gi17 going up/down, which we believe is unrelated because gi17 is a CCTV camera. The logs on sw243 show nothing substantial, even though devices on that switch went down while the upstream sw241 switch stayed up. The logs on the upstream sw241 switch show ge24 STP Blocking and gi25 STP Blocking (these are the ports for the sw242/243 switches that went down).

It appears that for SOME reason sw241 is causing ge24 and ge25 (the two downstream switches) to STP block periodically but I cannot figure out why.
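For reference, the 30-60 second window is in the range that timer-based STP convergence would produce. The sketch below is only an arithmetic illustration, assuming the classic 802.1D default timers (forward delay 15 s, max age 20 s); I have not confirmed the timer values or STP mode configured on these switches, and the function name is hypothetical.

```python
# Classic 802.1D default timers in seconds.
# Assumption: the thread does not confirm the values configured on these switches.
FORWARD_DELAY = 15
MAX_AGE = 20


def convergence_window(forward_delay=FORWARD_DELAY, max_age=MAX_AGE):
    """Return (low, high) seconds a port may stay non-forwarding.

    low:  listening + learning, each lasting one forward delay.
    high: the same, plus waiting for the stored BPDU to age out first.
    """
    listening_plus_learning = 2 * forward_delay
    with_max_age_expiry = max_age + listening_plus_learning
    return listening_plus_learning, with_max_age_expiry


low, high = convergence_window()
print(f"timer-based outage window: {low}-{high} s")
```

If the observed outages line up with this window, that would be consistent with the ports re-running a timer-based transition rather than with a physical link flap, though it does not say what triggers the transition.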

I have posted a copy of the TSR/CONFIG for each switch, and I can provide logs if necessary. We have been troubleshooting this problem for several weeks and cannot seem to pinpoint it. Today, we upgraded the firmware on all the switches AND rebuilt the configuration for 244, 243, and 242 from the ground up. We did not rebuild 251 or 241 because they do not seem to be causing the problem (as far as we can tell), and since the business was open, it was not practical to take their entire network down.

Download Tech Support Logs

Any assistance is greatly appreciated!!

Ron Maupin avatar
us flag
"_we have 5 Cisco SG350 switches, all in a parallel daisy chain, except switch 243._" Don't do that. Build a tree. Have two root/secondary switches connected to each other, and have each of the remaining switches connect to both of those switches, but not to each other. Chaining or looping switches is suboptimal. In any case, the access interfaces should be set to portfast and BPDU guard, and STP will not block them. If someone connects a switch to them, then they will be disabled.
devGuru avatar
mg flag
@RonMaupin - in our situation, the switches are spread out based on proximity of the buildings/rooms they are in. So this is an outdoor venue - Switch 251 is in an office across the parking lot. Switch 241 has a fiber connection at the main bar, then from 241, the other switches (242, 243, 244) are all at other bars around the building. It's not super easy to move the switches and/or even add additional connections. Given those limitations, what would you recommend?
Ron Maupin avatar
us flag
You can connect fiber to fiber to reach back to the root. You are basically running a broken loop. Read the answers to [this question](https://networkengineering.stackexchange.com/q/80278/8499). Switched networking should not use loops or chaining. In any case, if you keep what you have, then use portfast and BPDU guard for the access interfaces to prevent STP problems on those interfaces.
devGuru avatar
mg flag
@RonMaupin - thanks for your help. I have enabled BPDU Guard and Edge Port (portfast) on the G24 and G25 interfaces of my 241 switch (the ports where the other switches are connected). Can you confirm all this looks correct? https://www.dropbox.com/s/wlsy62rtbgs52ib/STOT%20Switch%20241%20Settings.png?dl=0
Ron Maupin avatar
us flag
No, not where the switches are connected, but on the access interfaces (where end-devices connect). BPDU guard kills an interface if it sees BPDUs from another switch. You want STP to be working on the switch-to-switch links, but disable it on the access interfaces, and keep people from connecting rogue switches to the access interfaces that can cause STP problems.
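As a rough illustration of that rule, here is a minimal sketch that maps each port's role to the two settings. The port roles are assumptions drawn from the question (gi17 as a camera port, gi24/gi25 as the switch-to-switch links), and `stp_port_plan` is a hypothetical helper, not a Cisco tool or command.

```python
def stp_port_plan(ports):
    """Map each port to its edge-port (portfast) and BPDU guard settings.

    Access ports (end devices) get both enabled; switch-to-switch
    trunk ports get neither, so STP keeps running normally on them.
    """
    plan = {}
    for name, role in ports.items():
        is_access = role == "access"
        plan[name] = {"edge_port": is_access, "bpdu_guard": is_access}
    return plan


# Assumed roles, taken from the question: gi17 is a CCTV camera,
# gi24 and gi25 connect to the downstream switches.
ports = {"gi17": "access", "gi24": "trunk", "gi25": "trunk"}

for name, settings in stp_port_plan(ports).items():
    print(name, settings)
```

The point is that the two settings go together on end-device ports only; enabling BPDU guard on gi24/gi25 would shut those links down as soon as the neighboring switch sends a BPDU.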
devGuru avatar
mg flag
Hey Ron - following up, and thank you again for all your help. We made the changes to the switches; however, after we re-enabled STP on the trunk interfaces (with STP Edge Port on the access ports), the switch is still failing because those ports, G23 and G24, go into blocking about every 6 hours for 30 seconds. Any thoughts?