Score:0

New network not using updated TLS and failing to connect to certain websites


I have 2 networks that are configured just about identically. They both have the same router, a Mikrotik RB2011UiAS-RM, with a direct fiber link to the ISP. I am using the same ISP for both networks. My first network has been up and running with no significant issues for about 4 years now. The new network has been up for maybe 2 months. I patterned the second network after the first, so they are set up with the same VLANs, IP schemes, etc. Everything seems to be working fine, but for the last couple of weeks I've been getting complaints about certain websites failing to load.

The issue is consistent for the websites that don't load, but which websites are affected seems random. For example, Hulu.com will load but logging into Hulu fails. The biggest problem is that some of the company's vendor websites are not loading. These are the ones I've focused on, since they are the ones that need to work for the company.

Last week, I captured the connection with Wireshark on the second network to see what was failing on a site that they were telling me doesn't load. I got the following:

2097    81.935154   10.0.100.193    45.60.196.32    TCP 66  50793 → 443 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM
2098    81.936384   10.0.100.193    45.60.196.32    TCP 66  50794 → 443 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM
2111    81.976423   45.60.196.32    10.0.100.193    TCP 66  443 → 50793 [SYN, ACK] Seq=0 Ack=1 Win=64240 Len=0 MSS=1460 SACK_PERM WS=128
2112    81.976513   45.60.196.32    10.0.100.193    TCP 66  443 → 50794 [SYN, ACK] Seq=0 Ack=1 Win=64240 Len=0 MSS=1460 SACK_PERM WS=128
2113    81.976549   10.0.100.193    45.60.196.32    TCP 54  50793 → 443 [ACK] Seq=1 Ack=1 Win=262656 Len=0
2114    81.976616   10.0.100.193    45.60.196.32    TCP 54  50794 → 443 [ACK] Seq=1 Ack=1 Win=262656 Len=0
2115    81.977504   10.0.100.193    45.60.196.32    TLSv1   571 Client Hello
2116    81.978230   10.0.100.193    45.60.196.32    TLSv1   571 Client Hello
2124    82.017575   45.60.196.32    10.0.100.193    TCP 60  443 → 50793 [ACK] Seq=1 Ack=518 Win=64128 Len=0
2125    82.017984   45.60.196.32    10.0.100.193    TCP 60  443 → 50794 [ACK] Seq=1 Ack=518 Win=64128 Len=0
2126    82.018045   45.60.196.32    10.0.100.193    SSL 1230    [TCP Previous segment not captured] , Continuation Data
2127    82.018081   10.0.100.193    45.60.196.32    TCP 66  [TCP Dup ACK 2113#1] 50793 → 443 [ACK] Seq=518 Ack=1 Win=262656 Len=0 SLE=2921 SRE=4097
2128    82.018447   45.60.196.32    10.0.100.193    SSL 1230    [TCP Previous segment not captured] , Continuation Data
2129    82.018491   10.0.100.193    45.60.196.32    TCP 66  [TCP Dup ACK 2114#1] 50794 → 443 [ACK] Seq=518 Ack=1 Win=262656 Len=0 SLE=2921 SRE=4097
2130    82.018816   45.60.196.32    10.0.100.193    SSL 236 [TCP Previous segment not captured] , Continuation Data
2131    82.018853   10.0.100.193    45.60.196.32    TCP 74  [TCP Dup ACK 2113#2] 50793 → 443 [ACK] Seq=518 Ack=1 Win=262656 Len=0 SLE=5557 SRE=5739 SLE=2921 SRE=4097
2132    82.019221   45.60.196.32    10.0.100.193    SSL 236 [TCP Previous segment not captured] , Continuation Data
2133    82.019246   10.0.100.193    45.60.196.32    TCP 74  [TCP Dup ACK 2114#2] 50794 → 443 [ACK] Seq=518 Ack=1 Win=262656 Len=0 SLE=5557 SRE=5739 SLE=2921 SRE=4097
2414    91.975313   45.60.196.32    10.0.100.193    TCP 60  443 → 50793 [FIN, ACK] Seq=5739 Ack=518 Win=64128 Len=0
2415    91.975378   10.0.100.193    45.60.196.32    TCP 74  [TCP Dup ACK 2113#3] 50793 → 443 [ACK] Seq=518 Ack=1 Win=262656 Len=0 SLE=5557 SRE=5739 SLE=2921 SRE=4097
2416    91.980004   45.60.196.32    10.0.100.193    TCP 60  443 → 50794 [FIN, ACK] Seq=5739 Ack=518 Win=64128 Len=0
2417    91.980052   10.0.100.193    45.60.196.32    TCP 74  [TCP Dup ACK 2114#3] 50794 → 443 [ACK] Seq=518 Ack=1 Win=262656 Len=0 SLE=5557 SRE=5739 SLE=2921 SRE=4097
3135    111.978393  10.0.100.193    45.60.196.32    TCP 54  50793 → 443 [FIN, ACK] Seq=518 Ack=1 Win=262656 Len=0
3136    111.978658  10.0.100.193    45.60.196.32    TCP 54  50794 → 443 [FIN, ACK] Seq=518 Ack=1 Win=262656 Len=0
3139    112.280923  10.0.100.193    45.60.196.32    TCP 54  [TCP Retransmission] 50794 → 443 [FIN, ACK] Seq=518 Ack=1 Win=262656 Len=0
3140    112.280923  10.0.100.193    45.60.196.32    TCP 54  [TCP Retransmission] 50793 → 443 [FIN, ACK] Seq=518 Ack=1 Win=262656 Len=0
3150    112.882128  10.0.100.193    45.60.196.32    TCP 54  [TCP Retransmission] 50793 → 443 [FIN, ACK] Seq=518 Ack=1 Win=262656 Len=0
3151    112.882127  10.0.100.193    45.60.196.32    TCP 54  [TCP Retransmission] 50794 → 443 [FIN, ACK] Seq=518 Ack=1 Win=262656 Len=0
3163    114.097284  10.0.100.193    45.60.196.32    TCP 54  [TCP Retransmission] 50794 → 443 [FIN, ACK] Seq=518 Ack=1 Win=262656 Len=0
3164    114.097284  10.0.100.193    45.60.196.32    TCP 54  [TCP Retransmission] 50793 → 443 [FIN, ACK] Seq=518 Ack=1 Win=262656 Len=0
3193    116.514004  10.0.100.193    45.60.196.32    TCP 54  [TCP Retransmission] 50794 → 443 [FIN, ACK] Seq=518 Ack=1 Win=262656 Len=0
3194    116.514004  10.0.100.193    45.60.196.32    TCP 54  [TCP Retransmission] 50793 → 443 [FIN, ACK] Seq=518 Ack=1 Win=262656 Len=0
3387    121.329207  10.0.100.193    45.60.196.32    TCP 54  [TCP Retransmission] 50794 → 443 [FIN, ACK] Seq=518 Ack=1 Win=262656 Len=0
3388    121.329207  10.0.100.193    45.60.196.32    TCP 54  [TCP Retransmission] 50793 → 443 [FIN, ACK] Seq=518 Ack=1 Win=262656 Len=0
3727    130.944445  10.0.100.193    45.60.196.32    TCP 54  50794 → 443 [RST, ACK] Seq=519 Ack=1 Win=0 Len=0
3728    130.944445  10.0.100.193    45.60.196.32    TCP 54  50793 → 443 [RST, ACK] Seq=519 Ack=1 Win=0 Len=0

So when I saw this I knew there was a problem with the server not responding to the TLS Client Hello sent from my machine. It wasn't until I did another capture on the first network that I saw what was going on:

141 8.485975    10.0.100.193    45.60.196.32    TCP 66  49533 → 443 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM
143 8.495430    10.0.100.193    45.60.196.32    TCP 66  49534 → 443 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM
160 8.529277    45.60.196.32    10.0.100.193    TCP 66  443 → 49533 [SYN, ACK] Seq=0 Ack=1 Win=64240 Len=0 MSS=1340 SACK_PERM WS=128
161 8.529397    10.0.100.193    45.60.196.32    TCP 54  49533 → 443 [ACK] Seq=1 Ack=1 Win=262400 Len=0
162 8.530000    10.0.100.193    45.60.196.32    TLSv1.3 571 Client Hello
163 8.538789    45.60.196.32    10.0.100.193    TCP 66  443 → 49534 [SYN, ACK] Seq=0 Ack=1 Win=64240 Len=0 MSS=1340 SACK_PERM WS=128
164 8.538878    10.0.100.193    45.60.196.32    TCP 54  49534 → 443 [ACK] Seq=1 Ack=1 Win=262400 Len=0
165 8.539542    10.0.100.193    45.60.196.32    TLSv1.3 571 Client Hello
180 8.572428    45.60.196.32    10.0.100.193    TCP 60  443 → 49533 [ACK] Seq=1 Ack=518 Win=64128 Len=0
181 8.575808    45.60.196.32    10.0.100.193    TLSv1.3 1394    Server Hello, Change Cipher Spec, Application Data
182 8.575965    45.60.196.32    10.0.100.193    TCP 1394    443 → 49533 [PSH, ACK] Seq=1341 Ack=518 Win=64128 Len=1340 [TCP segment of a reassembled PDU]

For some reason, on my new network, my computer is not using TLSv1.3; it's using TLSv1, and I'm guessing the server isn't responding because it's not going to use the outdated protocol (which makes sense to me). So I understand what is happening, but what I can't figure out is why my computer is doing this.

Correct me if I'm wrong but my understanding is the TLS version is negotiated between Client and Server and is not dependent on the network used. I used the same laptop on both networks so it's not a matter of updates needed to the Client machine. Additionally, tracert shows that I have a link to the IP which isn't surprising because I'm definitely communicating with it but the TLS version is stopping the server from continuing to communicate.

I'm completely at a loss as to how to fix this or why I would even be seeing this problem. Definitely a first for me. Does anyone have some troubleshooting ideas or possibly ever had a similar issue?

Thanks in advance for all your help.

Update: I went back to the new network to do some exploring. I'm even more confused now. I ran a capture on just my IP, tried to do my normal work/browsing, and found many sites that won't load. Amazon works fine. ServerFault and StackOverflow won't load. So I filtered my capture by TLS protocol, and I am definitely seeing TLSv1.2 and TLSv1.3 work successfully on that network, but it seems to be selective. In all of the cases where the website fails or times out, my computer is trying to communicate over TLSv1. I just have no idea why it would try that when the website supports a higher protocol version.

Update #2: 2 new things that happened yesterday:

  1. I checked the system time on all of my switches and my router. The router had the correct time but my switches were still at default time of somewhere in the year 2000. So I set the time for all of my network equipment but this didn't make a difference in my issue.
  2. I ran traceroutes on both locations and got vastly different results. The new network (which fails to connect) had 15 hops while the other network had 9 hops. I have the same ISP at both locations and the first 2 hops after leaving the LAN were exactly the same and then things seemed to spiral out for the new network. I've sent these to my ISP and I'm waiting to hear back from them.

At this point, I'm thinking it's not something wrong with my local network, but that there are issues down the line.

Update #3: My ISP sent out a tech with a media converter who connected their laptop directly to the fiber network, and all of the sites worked perfectly. So something in my router is causing the communication problem. Additionally, I've downgraded my "non-working" router to the same version as the router on the network that doesn't have any issues, and I'm still having the same problem. It might be worth noting that my ISP has set up a static route for one of the websites that we are having issues with, and that website is no longer having an issue. So I would think that means the routing of the packets could play a part as well. I've identified some PPP settings that I can try to work with, but I'm not hopeful that they will actually be the problem. In the meantime, I've reached out to Mikrotik to see what insights they might have.

Score:0

I have discovered the answer. I had to dig deep into the PPPoE settings on my router, but the PPP profile was set to the encryption profile, which in turn had change-tcp-mss=no.

Once I corrected the profile to the nonencrypted profile with change-tcp-mss=yes everything worked as expected.
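
For readers wondering what change-tcp-mss actually changes, here is a rough sketch of the arithmetic behind MSS clamping. The function name `clamped_mss` and the header sizes are assumptions for illustration (IPv4 and TCP without options, plain PPPoE over a 1500-byte Ethernet link). The real MTU of the link in this thread is not known; the 1380 value is only inferred from the MSS=1340 seen in the working capture.

```python
IPV4_HEADER = 20     # bytes, assuming no IP options
TCP_HEADER = 20      # bytes, assuming no TCP options
PPPOE_OVERHEAD = 8   # 6-byte PPPoE header + 2-byte PPP protocol ID


def clamped_mss(advertised_mss, link_mtu):
    """MSS a router would write into a passing SYN so segments fit link_mtu."""
    return min(advertised_mss, link_mtu - IPV4_HEADER - TCP_HEADER)


ethernet_mtu = 1500
pppoe_mtu = ethernet_mtu - PPPOE_OVERHEAD   # 1492 on a plain PPPoE link

# The laptop advertises MSS=1460; with clamping the far end sees a smaller value.
print(clamped_mss(1460, pppoe_mtu))         # 1452

# The working capture's SYN-ACK carried MSS=1340, which would match a
# 1380-byte limit somewhere on the path if the same arithmetic applies.
print(clamped_mss(1460, 1380))              # 1340
```

Without clamping, both ends keep believing 1460-byte segments will fit, which is consistent with the failing capture, where the SYN-ACK still shows MSS=1460.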

Score:0

I suspect your only actual problem in the first trace is that your system is not receiving the first segment(s?) of the server response, containing the ServerHello plus some other things (for 1.3 probably CCS, EE and part of Cert), on either of the two connections. "Previous segment not captured" shows Wireshark didn't see at least one segment, and the subsequent outgoing "Dup Ack ... ack=1" confirms your OS stack also didn't see it/them.

The TLS protocol version cannot be determined from ClientHello alone. Only by combining ClientHello with ServerHello can Wireshark determine if TLS1.3 is used, and display it. If Wireshark only sees the ClientHello and not the ServerHello, the version it displays is arbitrary and usually wrong, which you should ignore.
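
As a rough illustration of why a lone ClientHello is ambiguous, the sketch below builds a minimal synthetic ClientHello shaped like one from a TLS 1.3 client and reads back the three places a version appears. The helper names are hypothetical and the hello is hand-built, not taken from the capture. RFC 8446 allows the record-layer version of an initial ClientHello to be 0x0301; that this field is what Wireshark falls back to when it labels the frame "TLSv1" is an assumption here.

```python
import struct

NAMES = {0x0301: "TLS 1.0", 0x0302: "TLS 1.1", 0x0303: "TLS 1.2", 0x0304: "TLS 1.3"}


def build_client_hello():
    """Build a minimal synthetic ClientHello (illustration only)."""
    versions = struct.pack("!HH", 0x0304, 0x0303)        # offered: 1.3, then 1.2
    ext_body = bytes([len(versions)]) + versions
    extensions = struct.pack("!HH", 0x002B, len(ext_body)) + ext_body
    body = (
        struct.pack("!H", 0x0303)                        # legacy_version
        + bytes(32)                                      # random (zeros here)
        + b"\x00"                                        # empty session id
        + struct.pack("!HH", 2, 0x1301)                  # one cipher suite
        + b"\x01\x00"                                    # null compression only
        + struct.pack("!H", len(extensions)) + extensions
    )
    handshake = b"\x01" + len(body).to_bytes(3, "big") + body
    # Record header: type 22 (handshake), record version 0x0301, length.
    return struct.pack("!BHH", 22, 0x0301, len(handshake)) + handshake


def client_hello_versions(record):
    """Return (record_version, legacy_version, offered_versions)."""
    _type, record_version, _length = struct.unpack_from("!BHH", record, 0)
    pos = 5 + 4                                          # record + handshake headers
    (legacy_version,) = struct.unpack_from("!H", record, pos)
    pos += 2 + 32                                        # legacy_version + random
    pos += 1 + record[pos]                               # session id
    pos += 2 + struct.unpack_from("!H", record, pos)[0]  # cipher suites
    pos += 1 + record[pos]                               # compression methods
    end = pos + 2 + struct.unpack_from("!H", record, pos)[0]
    pos += 2
    offered = []
    while pos < end:
        ext_type, ext_len = struct.unpack_from("!HH", record, pos)
        pos += 4
        if ext_type == 0x002B:                           # supported_versions
            count = record[pos] // 2
            offered = list(struct.unpack_from(f"!{count}H", record, pos + 1))
        pos += ext_len
    return record_version, legacy_version, offered


record_version, legacy_version, offered = client_hello_versions(build_client_hello())
print("record layer:  ", NAMES[record_version])          # TLS 1.0
print("legacy_version:", NAMES[legacy_version])          # TLS 1.2
print("offered:       ", [NAMES[v] for v in offered])    # ['TLS 1.3', 'TLS 1.2']
```

None of the three fields says which version the connection will use; that is only settled by the ServerHello, which never arrived in the failing capture.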

So why did your system not receive the server's first segment(s)? I can't be sure, but a common reason is that MTU is set too low on some network or link along the path, or (conversely) MSS is set too high at one of the peers, and fragmentation is disabled or blocked, so any TCP segment longer than the permitted datagram size is dropped. (On IPv6, fragmentation by routers no longer exists, so the last part of the condition becomes moot.) If you expand the first "Previous segment not captured" frame and look at the (relative) sequence number in the TCP header, it will show how much data is missing. That amount is either the size of the previous segment, if only one was lost, or the sum of the sizes of the previous segments; since non-pushed segments are usually all the same size (MSS), the sum is a small integer multiple of the individual segment size.
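
The same arithmetic can be done from the SACK blocks already visible in the question's failing capture. This is a small sketch with a hypothetical helper, `missing_ranges`, fed the Ack, SLE and SRE values shown in frame 2131; it assumes relative sequence numbers as Wireshark displays them.

```python
def missing_ranges(cumulative_ack, sack_blocks):
    """Return [start, end) byte ranges the receiver has not seen yet.

    cumulative_ack: next byte the receiver expects (the Ack= value).
    sack_blocks: (SLE, SRE) pairs from the SACK option, in any order.
    """
    gaps = []
    expected = cumulative_ack
    for left, right in sorted(sack_blocks):
        if left > expected:
            gaps.append((expected, left))
        expected = max(expected, right)
    return gaps


# Values from frame 2131 of the failing capture: Ack=1 and two SACK blocks.
MSS = 1460  # MSS advertised in both directions on the failing network
gaps = missing_ranges(1, [(5557, 5739), (2921, 4097)])

for start, end in gaps:
    size = end - start
    print(f"missing bytes {start}..{end - 1}: {size} bytes = {size / MSS:g} x MSS")
# missing bytes 1..2920: 2920 bytes = 2 x MSS
# missing bytes 4097..5556: 1460 bytes = 1 x MSS
```

Every missing range is a whole multiple of 1460 bytes, while the two segments that did arrive carried only 1176 and 182 bytes (frame lengths 1230 and 236 minus 54 bytes of Ethernet, IP and TCP headers). Full-size segments vanishing while short ones get through is the pattern an MTU/MSS mismatch would produce.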

If the problem were that your system was for some reason actually sending a TLSv1.0 hello, which should not happen for exactly the reasons you state, no sane server would fail to respond; if it doesn't accept 1.0 (and most good public servers today don't) it would respond with either an alert or a disconnect (either normal FIN or abnormal RST), and any of those would be visible (and clearly distinguished) in Wireshark.

fatdollar
Thanks for that great explanation. I wasn't aware that Wireshark would arbitrarily assign it a TLS version without a response. Thanks for the info and also confirming a few of my thought processes. I've updated my question again with more details from yesterday's investigations.