
Suddenly can't ping any hosts over WiFi


Ever since I started using OpenWrt on my router, something weird has been happening.

I usually have 4 devices (2 phones and 2 laptops) connected to the WiFi AP/router daily, but one of the laptops (namely an XPS 13 9365) has started to suddenly get "disconnected". I've quoted the word because, in theory, I'm still connected, but the network connection simply stops working.

It's weird because the issue simply doesn't show up on some days, while other days are a real nightmare, with the connection dropping every couple of minutes. And only for the XPS 13. Other devices work like a charm, even when I have ~10 devices connected at once.

This is what I get right after noticing that the network has stopped working:

$ sudo iw dev "wlp60s0" link
Connected to **:**:**:**:**:** (on wlp60s0)
    SSID: my_ap
    freq: 2447
    RX: 15583826 bytes (14173 packets)
    TX: 1550845 bytes (6382 packets)
    signal: -40 dBm
    rx bitrate: 144.4 MBit/s MCS 15 short GI
    tx bitrate: 144.4 MBit/s MCS 15 short GI

    bss flags:  short-preamble short-slot-time
    dtim period:    2
    beacon int: 100

And I still have an IP address etc.:

$ ip addr list
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
3: enx00e04c6810ec: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
    link/ether **:**:**:**:**:** brd ff:ff:ff:ff:ff:ff
5: wlp60s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether **:**:**:**:**:** brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.11/24 brd 10.0.0.255 scope global dynamic wlp60s0
       valid_lft 43060sec preferred_lft 43060sec
    inet6 fe80::fa63:3fff:fe2f:837/64 scope link 
       valid_lft forever preferred_lft forever

So, from the above, you can see I'm still connected to the AP and have a valid IP. But no matter which host I try to ping, I get 100% packet loss. Other ways of connecting (like ssh, browser, etc.) also don't work. See:

$ ping 10.0.0.1
PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
^C
--- 10.0.0.1 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1011ms
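
To pin down exactly when each drop starts and ends, a small probe loop like the one below could be left running in a terminal. This is only a sketch: the function names `report` and `watch_gateway` are made up for illustration, and the gateway address is the one pinged above.

```shell
#!/bin/sh
# Hypothetical helper: print one timestamped line per probe so that drops
# can be lined up with other logs afterwards.
GATEWAY="${GATEWAY:-10.0.0.1}"   # router address taken from the ping above

# Print "<timestamp> <host> ok|LOST" for a given ping exit status.
report() {
    host="$1"; status="$2"
    if [ "$status" -eq 0 ]; then state="ok"; else state="LOST"; fi
    printf '%s %s %s\n' "$(date '+%F %T')" "$host" "$state"
}

# Probe the gateway once per second until interrupted with Ctrl-C.
watch_gateway() {
    while :; do
        ping -c 1 -W 1 "$GATEWAY" >/dev/null 2>&1
        report "$GATEWAY" "$?"
        sleep 1
    done
}
```

Running `watch_gateway | tee -a wifi-drops.log` would keep a record that can later be compared against the NetworkManager journal.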

I also tried to check for any system messages. No luck:

$ dmesg
$

Note: I issued sudo dmesg -c right after boot to make it easier to identify issues and while the network was still usable.

I'm running Ubuntu 20.04.3:

$ cat /etc/issue
Ubuntu 20.04.3 LTS \n \l

My wireless device:

$ lspci | grep -i network
3c:00.0 Network controller: Intel Corporation Wireless 8265 / 8275 (rev 78)

As a temporary workaround, I developed a script to stop NetworkManager and reconnect via command line. Something like this:

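#!/bin/sh
iface="wlp60s0"
essid="my_ap"
tmpfile="/tmp/wpa.conf"
pass="my_pass"

# Stop NetworkManager so it doesn't fight over the interface
sudo systemctl stop NetworkManager.service
# Recreate the wireless interface from scratch (phy0 is the radio on this machine)
sudo iw dev "$iface" del
sudo iw phy phy0 interface add "$iface" type managed
sudo ip link set "$iface" up
# Generate a wpa_supplicant config and associate in the background
sudo wpa_passphrase "$essid" "$pass" > "$tmpfile"
sudo wpa_supplicant -i"$iface" -c"$tmpfile" -B
# Request a DHCP lease
sudo dhclient -v "$iface"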

This makes life a bit easier, but of course it's just temporary, rudimentary and far from ideal. It also doesn't help much, as I keep losing the connection from time to time anyway, in exactly the same way as when I use NetworkManager. It's just quicker than waiting for NetworkManager to restart...

What I've tried so far

  • Disabling wifi power_save with sudo iw dev wlp60s0 set power_save off.
  • Disabling wifi power save via NetworkManager by editing /etc/NetworkManager/conf.d/default-wifi-powersave-on.conf and changing wifi.powersave = 3 to wifi.powersave = 2 then restarting. (source: https://unix.stackexchange.com/a/315400/108418)
  • Changing wifi security on router (WPA -> WEP or other) (source: 20.04 can't connect to 5Ghz wifi after update)
  • Changing wifi mode from "N" to "Legacy". This one seemed to solve the problem, but maybe only because I didn't use it for long enough. Besides, the drop in network speed obviously makes this option impractical.
  • Enabling NetworkManager debug mode and trying to identify possible issues.

None of the above worked.
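
Because a reconnect may re-apply a different setting, it may be worth confirming that power save is really off at the moment of a drop. The sketch below assumes `iw` reports the state as a line of the form `Power save: off`; the function names are hypothetical.

```shell
#!/bin/sh
IFACE="${IFACE:-wlp60s0}"   # interface name from the outputs above

# Extract "on" or "off" from text shaped like "Power save: off" on stdin.
parse_power_save() {
    sed -n 's/^Power save: //p'
}

# Query the current state of the interface (requires iw to be installed).
current_power_save() {
    iw dev "$IFACE" get power_save | parse_power_save
}
```

Calling `current_power_save` right after a drop should print `off` if the setting survived.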

Other links I've visited

These were some of my tries, but either the symptoms are not exactly the same or the proposed solution didn't work for me...

https://www.reddit.com/r/linuxquestions/comments/ausg6k/arch_wifi_stays_connected_but_theres_no_internet/ehc3oph/

https://blog.stigok.com/2017/03/26/wifi-loses-connectivity-periodically-wpasupplicant-reason-4.html

So I'm posting all this here in the hope that someone has been through this already and can shed some light...

Thank you very much!

Update #1

I've found a way to reproduce the issue. Every time I visit this page and browse the photos (to make the browser load many photos at once, in parallel), the connection drops.

https://www.facebook.com/terraadentropelomundo/photos/

I wonder if there are any issues with the wireless driver in handling many connections at once.
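
A browser-independent way to test the "many connections at once" theory might be to start a batch of simultaneous downloads from the command line. This is a rough sketch, not a faithful reproduction of what the browser does: `parallel_fetch` is a hypothetical name, the default of 20 transfers is arbitrary, and the URL to fetch is left to the reader.

```shell
#!/bin/sh
# Command used for each transfer; curl discards the body and gives up after 30 s.
FETCH="${FETCH:-curl -s -o /dev/null --max-time 30}"

# Start N simultaneous downloads of the same URL and wait for all of them.
# Prints the number of transfers that failed.
parallel_fetch() {
    url="$1"; count="${2:-20}"; pids=""; failed=0
    i=0
    while [ "$i" -lt "$count" ]; do
        $FETCH "$url" &            # intentionally unquoted: command plus options
        pids="$pids $!"
        i=$((i + 1))
    done
    for pid in $pids; do
        wait "$pid" || failed=$((failed + 1))
    done
    echo "$failed"
}
```

For example, `parallel_fetch "<url of a large file>" 30` would print how many of the 30 transfers failed. If the link stalls under this load as well, that would point away from the browser and toward the driver or the AP.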

Update #2

After browsing other forums in the hope for a solution, I came across this:

It seems to have become better when I changed "Beacon Interval" from the default 100 ms to 50 on my AP. So far no disconnects in three days.

EDIT: Can confirm, the problem appears to be fixed after this change.

(source: https://bugs.archlinux.org/task/58457#comment185619)

It makes sense, considering I started facing this problem after moving to openwrt on my AP. So there is certainly something weird with the Intel driver/firmware, but changing beacon on my AP seems to solve the issue. I'll test for some more days and see if the issue is gone.

Update #3

Didn't work. Even using beacon 50ms in openwrt, I'm still being disconnected from time to time without any messages showing up in dmesg...

pe flag
Is your OpenWrt device filtering echo requests? Because the issue doesn't seem to be related to the PC's OS...
hu flag
Filtering ICMP packets, you mean? No. Note that every other device on the network works perfectly fine. Only this PC is having these disconnection issues, and they are intermittent. There are days when I use the laptop all day without any issues. Other days I have connection drops every couple of minutes. Just for this PC. All other devices (other notebooks, other smartphones, etc.) still work perfectly fine. Why do you think it's not related to the laptop if the issues only happen with this specific device?
cc flag
Maybe set ipv6 to ignore or disable if you are not really using it.
hu flag
It's off already, @ubfan1. I'm starting to believe it has something to do with the driver being overloaded by connections, or maybe too much data throughput... Have a look at the Update section. I added some info there.
chili555 avatar
cn flag
Have you tried my suggestions here? https://askubuntu.com/questions/1364239/tp-link-usb-wireless-adapter-keep-losing-data-every-several-minutes-without-disc/1364295#1364295
hu flag
Yes, @chili555. All three options. No luck. So far, what seems to be working is changing the beacon interval from 100 to 50 on the router (see "Update #2"). But I've tested it for barely 3 hours, and I've already had entire days without a single drop, so let's give it a couple more days and I'll come back here with more feedback. Thank you!
waltinator avatar
it flag
Look at the logs! `sudo journalctl -b 0 -u NetworkManager`. Read `man journalctl`. Look at `ip route show`. Read `man ip ip-route`.