Concurrent ssh connections on control server keep dropping

I'm facing an issue with the setup I use for occasional maintenance on a bunch of customer servers via remote SSH.

The setup is as follows:

One control server and an arbitrary number of customer servers, each set up so that a 'service' account connects to my control server via SSH.

I've set up the clients to automatically connect to the control server, which has a fixed IP, via the service account using autoSSH after bootup. This is my /etc/ssh/ssh_config on the customer machine:

# This is the ssh client system-wide configuration file.  See
# ssh_config(5) for more information.  This file provides defaults for
# users, and the values can be changed in per-user configuration files
# or on the command line.

# Configuration data is parsed as follows:
#  1. command line options
#  2. user-specific file
#  3. system-wide file
# Any configuration value is only changed the first time it is set.
# Thus, host-specific definitions should be at the beginning of the
# configuration file, and defaults at the end.

# Site-wide defaults for some commonly used options.  For a comprehensive
# list of available options, their meanings and defaults, please see the
# ssh_config(5) man page.

Host *
#   ForwardAgent no
#   ForwardX11 no
#   ForwardX11Trusted yes
#   PasswordAuthentication yes
#   HostbasedAuthentication no
#   GSSAPIAuthentication no
#   GSSAPIDelegateCredentials no
#   GSSAPIKeyExchange no
#   GSSAPITrustDNS no
#   BatchMode no
#   CheckHostIP yes
#   AddressFamily any
#   ConnectTimeout 0
#   StrictHostKeyChecking ask
#   IdentityFile ~/.ssh/id_rsa
#   IdentityFile ~/.ssh/id_dsa
#   IdentityFile ~/.ssh/id_ecdsa
#   IdentityFile ~/.ssh/id_ed25519
#   Port 22
#   Protocol 2
#   Ciphers aes128-ctr,aes192-ctr,aes256-ctr,aes128-cbc,3des-cbc
#   MACs hmac-md5,hmac-sha1,umac-64@openssh.com
#   EscapeChar ~
#   Tunnel no
#   TunnelDevice any:any
#   PermitLocalCommand no
#   VisualHostKey no
#   ProxyCommand ssh -q -W %h:%p gateway.example.com
#   RekeyLimit 1G 1h
    SendEnv LANG LC_*
    HashKnownHosts yes
    GSSAPIAuthentication yes
    ServerAliveInterval 300

On the control server I am using the following sshd_config:

#       $OpenBSD: sshd_config,v 1.103 2018/04/09 20:41:22 tj Exp $

# This is the sshd server system-wide configuration file.  See
# sshd_config(5) for more information.

# This sshd was compiled with PATH=/usr/bin:/bin:/usr/sbin:/sbin

# The strategy used for options in the default sshd_config shipped with
# OpenSSH is to specify options with their default value where
# possible, but leave them commented.  Uncommented options override the
# default value.

Port --hidden--
#AddressFamily any
#ListenAddress 0.0.0.0
#ListenAddress ::

#HostKey /etc/ssh/ssh_host_rsa_key
#HostKey /etc/ssh/ssh_host_ecdsa_key
#HostKey /etc/ssh/ssh_host_ed25519_key

# Ciphers and keying
#RekeyLimit default none

# Logging
#SyslogFacility AUTH
#LogLevel INFO

# Authentication:

#LoginGraceTime 2m
#StrictModes yes
#MaxAuthTries 6
#MaxSessions 10

#PubkeyAuthentication yes

# Expect .ssh/authorized_keys2 to be disregarded by default in future.
#AuthorizedKeysFile     .ssh/authorized_keys .ssh/authorized_keys2

#AuthorizedPrincipalsFile none

#AuthorizedKeysCommand none
#AuthorizedKeysCommandUser nobody

# For this to work you will also need host keys in /etc/ssh/ssh_known_hosts
#HostbasedAuthentication no
# Change to yes if you don't trust ~/.ssh/known_hosts for
# HostbasedAuthentication
#IgnoreUserKnownHosts no
# Don't read the user's ~/.rhosts and ~/.shosts files
#IgnoreRhosts yes

# To disable tunneled clear text passwords, change to no here!
#PermitEmptyPasswords no

# Change to yes to enable challenge-response passwords (beware issues with
# some PAM modules and threads)
ChallengeResponseAuthentication no

# Kerberos options
#KerberosAuthentication no
#KerberosOrLocalPasswd yes
#KerberosTicketCleanup yes
#KerberosGetAFSToken no

# GSSAPI options
#GSSAPIAuthentication no
#GSSAPICleanupCredentials yes
#GSSAPIStrictAcceptorCheck yes
#GSSAPIKeyExchange no

# Set this to 'yes' to enable PAM authentication, account processing,
# and session processing. If this is enabled, PAM authentication will
# be allowed through the ChallengeResponseAuthentication and
# PasswordAuthentication.  Depending on your PAM configuration,
# PAM authentication via ChallengeResponseAuthentication may bypass
# the setting of "PermitRootLogin without-password".
# If you just want the PAM account and session checks to run without
# PAM authentication, then enable this but set PasswordAuthentication
# and ChallengeResponseAuthentication to 'no'.
UsePAM yes

#AllowAgentForwarding yes
AllowTcpForwarding yes
GatewayPorts yes
X11Forwarding yes
#X11DisplayOffset 10
#X11UseLocalhost yes
#PermitTTY yes
PrintMotd no
#PrintLastLog yes
#TCPKeepAlive yes
#PermitUserEnvironment no
#Compression delayed
ClientAliveInterval 30
ClientAliveCountMax 99999
#UseDNS no
#PidFile /var/run/sshd.pid
#MaxStartups 10:30:100
#PermitTunnel no
#ChrootDirectory none
#VersionAddendum none

# no default banner path
#Banner none

# Allow client to pass locale environment variables
AcceptEnv LANG LC_*

# override default of no subsystems
Subsystem       sftp    /usr/lib/openssh/sftp-server

# Example of overriding settings on a per-user basis
#Match User anoncvs
#       X11Forwarding no
#       AllowTcpForwarding no
#       PermitTTY no
#       ForceCommand cvs server

PasswordAuthentication no
PermitRootLogin yes
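
As a rough check of what the keepalive settings in the two files above amount to, the sketch below multiplies each interval by its retry count. It assumes that ServerAliveCountMax is left at the OpenSSH default of 3 (it is not set in the ssh_config shown), that no per-user config or autossh command-line option overrides these values, and the helper name `dead_peer_window` is made up for illustration.

```python
from datetime import timedelta


def dead_peer_window(interval_s: int, count_max: int) -> timedelta:
    """Approximate silence tolerated before a side gives up: interval * count."""
    return timedelta(seconds=interval_s * count_max)


# Client side: ServerAliveInterval 300, ServerAliveCountMax assumed default (3).
client = dead_peer_window(300, 3)

# Server side: ClientAliveInterval 30, ClientAliveCountMax 99999 (from sshd_config).
server = dead_peer_window(30, 99999)

print("client gives up after:", client)  # 0:15:00
print("server gives up after:", server)  # 34 days, 17:19:30
```

Under these assumptions the client notices a dead link after about 15 minutes, while the server would keep an unresponsive session around for roughly 34 days, so the two sides can disagree for a long time about whether a connection is still alive.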

Basically, I would expect the connections to stay open, since both sides have keepalive intervals configured. However, the connections keep dropping at random. I've checked /var/log/syslog, and it looks like sshd drops one of the active connections whenever a new connection comes in, so I'm fairly sure I'm hitting some connection limit here:

Nov 26 18:38:38 v2202102140578142103 systemd[1]: session-115234.scope: Succeeded.
Nov 26 18:38:38 v2202102140578142103 systemd[1]: Started Session 115376 of user service.
Nov 26 18:38:47 v2202102140578142103 systemd[1]: session-115235.scope: Succeeded.
Nov 26 18:38:47 v2202102140578142103 systemd[1]: Started Session 115377 of user service.
Nov 26 18:38:52 v2202102140578142103 systemd[1]: session-115236.scope: Succeeded.
Nov 26 18:38:53 v2202102140578142103 systemd[1]: Started Session 115378 of user service.
Nov 26 18:39:08 v2202102140578142103 systemd[1]: session-115237.scope: Succeeded.
Nov 26 18:39:08 v2202102140578142103 systemd[1]: Started Session 115379 of user service.
Nov 26 18:39:08 v2202102140578142103 systemd[1]: session-115238.scope: Succeeded.
Nov 26 18:39:08 v2202102140578142103 systemd[1]: Started Session 115380 of user service.
Nov 26 18:39:09 v2202102140578142103 systemd[1]: session-115239.scope: Succeeded.
Nov 26 18:39:09 v2202102140578142103 systemd[1]: Started Session 115381 of user service.
Nov 26 18:39:14 v2202102140578142103 systemd[1]: session-115240.scope: Succeeded.
Nov 26 18:39:15 v2202102140578142103 systemd[1]: Started Session 115382 of user service.
Nov 26 18:39:31 v2202102140578142103 systemd[1]: session-115241.scope: Succeeded.
Nov 26 18:39:31 v2202102140578142103 systemd[1]: session-115242.scope: Succeeded.
Nov 26 18:39:31 v2202102140578142103 systemd[1]: Started Session 115383 of user service.
Nov 26 18:39:31 v2202102140578142103 systemd[1]: Started Session 115384 of user service.
Nov 26 18:39:32 v2202102140578142103 systemd[1]: session-115243.scope: Succeeded.
Nov 26 18:39:33 v2202102140578142103 systemd[1]: Started Session 115385 of user service.
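
To make the pattern in this excerpt easier to see over a longer stretch of log, a short script could tally closed and started sessions per minute. This is only a sketch: the two regular expressions are written against the systemd message shapes shown above, the function name `tally` is hypothetical, and the sample lines are copied from the excerpt.

```python
import re
from collections import Counter

# Patterns for the two systemd message shapes seen in the excerpt.
CLOSED = re.compile(r"session-(\d+)\.scope: Succeeded\.")
STARTED = re.compile(r"Started Session (\d+) of user (\S+)\.")


def tally(lines):
    """Count closed and started sessions per HH:MM minute."""
    per_minute = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # not a syslog line with a timestamp
        minute = fields[2][:5]  # "18:38:38" -> "18:38"
        if CLOSED.search(line):
            per_minute[(minute, "closed")] += 1
        elif STARTED.search(line):
            per_minute[(minute, "started")] += 1
    return per_minute


sample = [
    "Nov 26 18:38:38 v2202102140578142103 systemd[1]: session-115234.scope: Succeeded.",
    "Nov 26 18:38:38 v2202102140578142103 systemd[1]: Started Session 115376 of user service.",
    "Nov 26 18:39:08 v2202102140578142103 systemd[1]: session-115237.scope: Succeeded.",
    "Nov 26 18:39:08 v2202102140578142103 systemd[1]: Started Session 115379 of user service.",
]

counts = tally(sample)
for key in sorted(counts):
    print(key, counts[key])
```

To run it against the real log, the `sample` list could be replaced with the lines of /var/log/syslog. If closes and starts stay balanced minute after minute, that would be consistent with each new session coinciding with an old one ending, although it does not by itself show which side ended the connection.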

This is probably something simple to fix, but I'm not a Linux networking expert, and I wasn't able to find anything useful through my own research. Hopefully someone can point me to the limit I have to change to stop this behaviour.

Thanks in advance!

Not an answer, but I strongly suggest you consider using an actual VPN, in particular something like WireGuard, which is good at re-establishing connections after network issues.