
Debian server network connection keeps dropping


Background

I'm working on an internal server running Debian.

uname -a output:

Linux osteocalcine 5.10.0-13-amd64 #1 SMP Debian 5.10.106-1 (2022-03-17) x86_64 GNU/Linux

This server (planned for use as a shared document repository, among other things) receives a fixed IP address from the local DHCP server, based on its MAC address.

Here is the relevant section of /etc/network/interfaces:

auto lo
iface lo inet loopback


auto enp11s0f0
iface enp11s0f0 inet dhcp

The server is strictly internal, with no communication to it from outside (i.e. I can't connect to it unless I am on the same network, so no SSH from home, only from the office).

It was set up with a basic GUI (I envisage that in the future my colleagues may want to use it to run various analyses, and I'm sure they would be happier to log in remotely to a desktop rather than a command line).

Problem

The server will regularly drop off the network, and not auto reconnect.

The last time it did this, I grabbed the output of dmesg:

[Mon May  9 12:08:25 2022] audit: type=1400 audit(1652090911.500:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-senddoc" pid=585 comm="apparmor_parser"
[Mon May  9 12:08:25 2022] audit: type=1400 audit(1652090911.500:11): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-oopslash" pid=595 comm="apparmor_parser"
[Mon May  9 12:08:25 2022] pstore: Using crash dump compression: deflate
[Mon May  9 12:08:25 2022] pstore: Registered efi as persistent store backend
[Mon May  9 12:08:25 2022] bnx2 0000:0b:00.0: firmware: direct-loading firmware bnx2/bnx2-mips-09-6.2.1b.fw
[Mon May  9 12:08:25 2022] bnx2 0000:0b:00.0: firmware: direct-loading firmware bnx2/bnx2-rv2p-09-6.0.17.fw
[Mon May  9 12:08:25 2022] bnx2 0000:0b:00.0 enp11s0f0: using MSIX
[Mon May  9 12:08:27 2022] bnx2 0000:0b:00.0 enp11s0f0: NIC Copper Link is Up, 100 Mbps full duplex
[Mon May  9 12:08:27 2022] IPv6: ADDRCONF(NETDEV_CHANGE): enp11s0f0: link becomes ready
[Mon May  9 12:08:29 2022] bnx2 0000:0b:00.1 enp11s0f1: using MSIX
[Mon May  9 12:08:29 2022] bnx2 0000:15:00.0 ens2f0: using MSIX
[Mon May  9 12:08:29 2022] bnx2 0000:15:00.1 ens2f1: using MSIX
[Mon May  9 12:09:02 2022] kauditd_printk_skb: 10 callbacks suppressed
[Mon May  9 12:09:02 2022] audit: type=1400 audit(1652090942.775:22): apparmor="DENIED" operation="capable" profile="/usr/sbin/cupsd" pid=991 comm="cupsd" capability=12  capname="net_admin"
[Mon May  9 12:09:03 2022] audit: type=1400 audit(1652090943.315:23): apparmor="DENIED" operation="capable" profile="/usr/sbin/cups-browsed" pid=1081 comm="cups-browsed" capability=23  capname="sys_nice"
[Mon May  9 13:24:56 2022] perf: interrupt took too long (2519 > 2500), lowering kernel.perf_event_max_sample_rate to 79250
[Mon May  9 14:52:01 2022] perf: interrupt took too long (3161 > 3148), lowering kernel.perf_event_max_sample_rate to 63250
[Tue May 10 00:00:44 2022] audit: type=1400 audit(1652133644.687:24): apparmor="DENIED" operation="capable" profile="/usr/sbin/cupsd" pid=77551 comm="cupsd" capability=12  capname="net_admin"
[Tue May 10 00:00:44 2022] audit: type=1400 audit(1652133644.819:25): apparmor="DENIED" operation="capable" profile="/usr/sbin/cups-browsed" pid=77552 comm="cups-browsed" capability=23  capname="sys_nice"

At that point I rebooted the server (it seems to be the only way to restart the networking). Here are the lines from the new dmesg that seem relevant:

[Tue May 10 14:00:01 2022] bnx2 0000:0b:00.0 eth0: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found at mem 96000000, IRQ 24, node addr 5c:f3:fc:e4:6f:d8
[Tue May 10 14:00:01 2022] bnx2 0000:0b:00.1 eth1: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found at mem 98000000, IRQ 36, node addr 5c:f3:fc:e4:6f:da
[Tue May 10 14:00:01 2022] i801_smbus 0000:00:1f.3: enabling device (0140 -> 0143)
[Tue May 10 14:00:01 2022] i801_smbus 0000:00:1f.3: SMBus using PCI interrupt
[Tue May 10 14:00:01 2022] bnx2 0000:15:00.0 eth2: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found at mem 92000000, IRQ 28, node addr 00:10:18:fb:1f:20
[Tue May 10 14:00:01 2022] ACPI Warning: SystemIO range 0x00000000000005A8-0x00000000000005AF conflicts with OpRegion 0x00000000000005A8-0x00000000000005AF (\_SB.PCI0.LPC0.GPE0) (20200925/utaddress-204)
[Tue May 10 14:00:01 2022] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[Tue May 10 14:00:01 2022] ACPI Warning: SystemIO range 0x0000000000000430-0x000000000000043F conflicts with OpRegion 0x0000000000000439-0x0000000000000439 (\_SB.PCI0.RIL) (20200925/utaddress-204)
[Tue May 10 14:00:01 2022] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[Tue May 10 14:00:01 2022] ACPI Warning: SystemIO range 0x0000000000000400-0x000000000000042F conflicts with OpRegion 0x000000000000040E-0x000000000000040E (\_SB.PCI0.RIT) (20200925/utaddress-204)
[Tue May 10 14:00:01 2022] ACPI Warning: SystemIO range 0x0000000000000400-0x000000000000042F conflicts with OpRegion 0x000000000000040C-0x000000000000040C (\_SB.PCI0.RTY) (20200925/utaddress-204)
[Tue May 10 14:00:01 2022] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[Tue May 10 14:00:01 2022] lpc_ich: Resource conflict(s) found affecting gpio_ich
[Tue May 10 14:00:01 2022] bnx2 0000:15:00.1 eth3: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found at mem 94000000, IRQ 37, node addr 00:10:18:fb:1f:22
[Tue May 10 14:00:01 2022] bnx2 0000:0b:00.0 enp11s0f0: renamed from eth0
[Tue May 10 14:00:13 2022] bnx2 0000:0b:00.0: firmware: direct-loading firmware bnx2/bnx2-mips-09-6.2.1b.fw
[Tue May 10 14:00:13 2022] bnx2 0000:0b:00.0: firmware: direct-loading firmware bnx2/bnx2-rv2p-09-6.0.17.fw
[Tue May 10 14:00:13 2022] bnx2 0000:0b:00.0 enp11s0f0: using MSIX
[Tue May 10 14:00:14 2022] bnx2 0000:0b:00.0 enp11s0f0: NIC Copper Link is Up, 100 Mbps full duplex
[Tue May 10 14:00:14 2022] IPv6: ADDRCONF(NETDEV_CHANGE): enp11s0f0: link becomes ready
[Tue May 10 14:00:15 2022] bnx2 0000:0b:00.1 enp11s0f1: using MSIX
[Tue May 10 14:00:16 2022] bnx2 0000:15:00.0 ens2f0: using MSIX
[Tue May 10 14:00:16 2022] bnx2 0000:15:00.1 ens2f1: using MSIX

After this, the next lines are from Wednesday, so I guess they are not relevant (please tell me if I am wrong).

I'm not sure exactly what I need to look for in dmesg (or elsewhere) to determine what is causing the connection to drop.

I have noticed the following, however:

  • To correct the problem I must physically restart the server.
    • Restarting the networking using `systemctl restart networking` does nothing. (How do I extract error / trace messages from this?)
  • The problem seems to arise when I am logged in over SSH and my local terminal 'goes to sleep'. Could this somehow cause an issue on the server?
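As a starting point for pulling out those messages, here is a minimal sketch. The function names `net_unit_logs` and `link_events` are made up for illustration; the commands they call (`systemctl status`, `journalctl -u`, `grep -E`) are standard, but whether anything useful has been logged on this machine is an open question.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of any package.

# Show the state of the ifupdown unit and everything it logged since boot.
# For the boot before a forced restart, "journalctl -k -b -1" shows the
# previous boot's kernel messages, but only if the journal is persistent.
net_unit_logs() {
    systemctl status networking.service --no-pager
    journalctl -u networking.service -b --no-pager
}

# Filter dmesg-style text on stdin down to lines that mention the bnx2
# driver, the given interface, or a link-state change.
link_events() {
    iface="${1:?interface name required}"
    grep -E "bnx2|${iface}|Link is (Up|Down)|NETDEV"
}

# Usage on the server (run as root):
#   net_unit_logs
#   dmesg -T | link_events enp11s0f0
```

If the link really goes down at the hardware level, a matching line should show up in the filtered output; if nothing appears, the drop is probably happening higher up (DHCP lease, SSH session) rather than in the NIC driver.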

Is there a daemon that I need to configure that will keep testing the connection and 'bring it up' if it goes down? (Note, however, that the systemctl call doesn't seem to get the network to come back up, so it may be a moot point.)

Note: As mentioned above, I have installed a 'desktop' on the server, just in case one of my colleagues wants to log into it. I realised that NetworkManager is installed on the system:
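The kind of check described here could be sketched roughly as follows, to be run periodically from cron or a systemd timer. Everything named in it is an assumption: `watchdog_once`, `gateway_reachable` and `recover_link` are hypothetical names, and `192.0.2.1` is a placeholder for the real gateway address. Since restarting the networking has not helped so far, the main value of such a script may be the log entry it leaves behind rather than the recovery itself.

```shell
#!/bin/sh
# Hypothetical watchdog sketch; names and defaults are assumptions.
IFACE="${IFACE:-enp11s0f0}"
GATEWAY="${GATEWAY:-192.0.2.1}"   # placeholder, replace with the real gateway

# Succeeds if the gateway answers at least one of three pings.
gateway_reachable() {
    ping -c 3 -W 2 "$GATEWAY" >/dev/null 2>&1
}

# Record the interface state in syslog, then cycle it with ifupdown.
recover_link() {
    logger -t net-watchdog "gateway $GATEWAY unreachable, cycling $IFACE"
    ip -br addr show dev "$IFACE" | logger -t net-watchdog
    ifdown --force "$IFACE"
    ifup "$IFACE"
}

# One pass of the check: print the outcome and recover if needed.
watchdog_once() {
    if gateway_reachable; then
        echo "ok"
    else
        echo "recovering"
        recover_link
    fi
}

# Usage (as root, e.g. every five minutes from cron):
#   watchdog_once
```

Keeping the check and the recovery in separate functions makes it easy to swap the recovery step for something else, or to drop it and only log, once it is clearer what actually fails.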

$ apt list --installed |grep network

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

glib-networking-common/stable,now 2.66.0-2 all [installed,automatic]
glib-networking-services/stable,now 2.66.0-2 amd64 [installed,automatic]
glib-networking/stable,now 2.66.0-2 amd64 [installed,automatic]
libqt5network5/stable,now 5.15.2+dfsg-9 amd64 [installed,automatic]
network-manager-gnome/stable,now 1.20.0-3 amd64 [installed,automatic]
network-manager/stable,now 1.30.0-2 amd64 [installed,automatic]

Could this somehow be causing the problem?
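One way to narrow this down: as far as I understand, NetworkManager on Debian normally leaves alone any interface that is declared in /etc/network/interfaces (the `managed=false` setting in the `[ifupdown]` section of /etc/NetworkManager/NetworkManager.conf). The sketch below checks the first half of that; `declared_in_interfaces` is a hypothetical helper name, and it does not follow `source` lines into /etc/network/interfaces.d.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the interface has an "iface" stanza in
# the given file (default /etc/network/interfaces).
declared_in_interfaces() {
    iface="${1:?interface name required}"
    file="${2:-/etc/network/interfaces}"
    grep -Eq "^[[:space:]]*iface[[:space:]]+${iface}[[:space:]]" "$file"
}

# Usage on the server:
#   declared_in_interfaces enp11s0f0 && echo "handled by ifupdown"
#   nmcli device status    # NetworkManager should list it as "unmanaged"
```

If `nmcli device status` shows the interface as managed even though it is declared in the interfaces file, both tools may be trying to configure it, which would be worth ruling out.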

If you need more details, just ask and I will post an update.

As always, thanks in advance for your aid.

Edit 1:

I'm now leaning more towards the problem being with SSH. I was logged in today, and the connection 'froze'. I had to log in via another terminal and kill the first connection. I mention this because it happened at a time when I would normally have been having lunch / coffee with colleagues. What I need to do now is improve the monitoring of the whole system, but what should I add, and what do I need to look out for?
