I have built an SSH tunneling service that runs in an Alpine-based container, following the approach outlined here: https://github.com/cagataygurturk/docker-ssh-tunnel
The service authenticates with identity files and sets up multiple control sockets and tunnels.
I am testing this out against an Amazon Linux bastion, tunneling through to a PostgreSQL database.
SSH login and tunnel creation succeed, and the tunnels are usable, but there appears to be a timeout somewhere.
- If a tunnel (or perhaps the connection to the target SSH server as a whole) has been idle for 5 minutes, the next connection through it hangs for about 30 seconds before completing successfully.
- Connections made shortly after that first one are quick, under a second.
- Once the tunnel/server has been idle for another 5 minutes, the 30-second delay returns.
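A sketch of how the idle threshold could be narrowed down, rather than relying on the rough 5-minute figure. The helper names `elapsed_secs` and `idle_sweep` are made up for illustration, and the intervals in the usage comment are arbitrary guesses; the probe command is the same psql call used in the test further down.

```shell
# Run a command and print how many whole seconds it took.
# Plain POSIX sh, so it should also run under BusyBox ash on Alpine.
elapsed_secs() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1    # exit status is ignored; only the duration matters
    end=$(date +%s)
    echo $((end - start))
}

# For each idle interval (in seconds): sleep, then time one probe through
# the tunnel and print the result.
idle_sweep() {
    intervals=$1
    shift
    for idle in $intervals; do
        sleep "$idle"
        echo "idle=${idle}s connect=$(elapsed_secs "$@")s"
    done
}

# Usage (intervals are assumptions, adjust as needed):
#   idle_sweep "240 300 360" psql "host=localhost port=5430 dbname=xxx user=UUU password=X"
```

Each probe resets the idle clock, so the intervals are measured from the previous probe, not from the start of the sweep.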
Here is the evidence:
client ssh-config
Host my-bastion
    HostName 99.99.99.99
    User ec2-user
    IdentityFile ~/.ssh/key.pem
Host *
    ControlMaster auto
    ControlPath ~/.ssh/controlmasters/cp_%r_%h
    ControlPersist yes
    StrictHostKeyChecking no
    ServerAliveCountMax 60
    ServerAliveInterval 30
    TCPKeepAlive no
    ForkAfterAuthentication yes
    StdinNull yes
    ExitOnForwardFailure yes
    IPQoS 0x00
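To confirm which of these values the client actually applies to `my-bastion`, `ssh -G` prints the effective configuration after `Host` matching, and `ssh -O check` reports whether the control master is still running. A small sketch; `keepalive_opts` is a hypothetical helper that only filters the `ssh -G` output down to the keepalive and connection-sharing options:

```shell
# Filter `ssh -G <host>` output (read from stdin) down to the options that
# govern keepalives and connection sharing. `ssh -G` prints one
# "keyword value" pair per line with the keyword in lowercase.
keepalive_opts() {
    grep -iE '^(serveraliveinterval|serveralivecountmax|tcpkeepalive|controlmaster|controlpath|controlpersist|ipqos) '
}

# Usage, against the host alias from the config above:
#   ssh -G my-bastion | keepalive_opts    # effective values for this host
#   ssh -O check my-bastion               # is the control master still running?
```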
Test Workflow
Tunnel previously established using ControlSocket.
Testing with a psql request that fails authentication, but exercises the tunnel.
psql makes 2 connections through the tunnel during the test.
First access after at least 5 mins idle.
# date && time psql "host=localhost port=5430 dbname=xxx user=UUU password=X"
Tue Mar 8 12:10:57 PST 2022
psql: error: FATAL: password authentication failed for user "UUU"
FATAL: password authentication failed for user "UUU"
real 0m32.497s - slow!
SSH client log -vv
1st psql request
[2022-03-08 20:10:57] debug1: Connection to port 5430 forwarding to xxx.us-east-1.rds.amazonaws.com port 5432 requested.
[2022-03-08 20:10:57] debug1: channel 3: new [direct-tcpip]
30 sec Delay here
[2022-03-08 20:10:57] debug2: channel 3: open confirm rwindow 2097152 rmax 32768
[2022-03-08 20:11:29] debug2: channel 3: read<=0 rfd 7 len 0
2nd psql request
[2022-03-08 20:11:29] debug1: Connection to port 5430 forwarding to xxx.us-east-1.rds.amazonaws.com port 5432 requested.
[2022-03-08 20:11:29] debug1: channel 4: new [direct-tcpip]
subsecond response on channel 4
[2022-03-08 20:11:29] debug2: channel 4: open confirm rwindow 2097152 rmax 32768
[2022-03-08 20:11:29] debug2: channel 4: read<=0 rfd 8 len 0
Access immediately after 1st.
# date && time psql "host=localhost port=5430 dbname=xxx user=UUU password=X"
Tue Mar 8 12:11:41 PST 2022
psql: error: FATAL: password authentication failed for user "UUU"
FATAL: password authentication failed for user "UUU"
real 0m0.874s - fast!
user 0m0.021s
sys 0m0.016s
1st psql request
[2022-03-08 20:11:41] debug1: Connection to port 5430 forwarding to xxx.us-east-1.rds.amazonaws.com port 5432 requested.
[2022-03-08 20:11:41] debug2: fd 7 setting TCP_NODELAY
[2022-03-08 20:11:41] debug2: fd 7 setting O_NONBLOCK
[2022-03-08 20:11:41] debug1: channel 3: new [direct-tcpip]
Subsecond response to request
[2022-03-08 20:11:41] debug2: channel 3: open confirm rwindow 2097152 rmax 32768
[2022-03-08 20:11:42] debug2: channel 3: read<=0 rfd 7 len 0
...
2nd psql request
[2022-03-08 20:11:42] debug1: Connection to port 5430 forwarding to xxx.us-east-1.rds.amazonaws.com port 5432 requested.
[2022-03-08 20:11:42] debug1: channel 4: new [direct-tcpip]
[2022-03-08 20:11:42] debug2: channel 4: open confirm rwindow 2097152 rmax 32768
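To see where the time goes in logs like the ones above, the gaps between consecutive timestamps can be computed mechanically. A minimal sketch, assuming every log line starts with a `[YYYY-MM-DD HH:MM:SS]` prefix as in these excerpts and that the log does not cross midnight; `log_gaps` and the file name `ssh-client.log` are illustrative names, not part of the service:

```shell
# Read a timestamped log on stdin and report every line that arrives at
# least MIN_GAP seconds (first argument, default 5) after the previous
# timestamped line. Lines without a timestamp are skipped.
log_gaps() {
    awk -v min_gap="${1:-5}" '
        match($0, /[0-9][0-9]:[0-9][0-9]:[0-9][0-9]\]/) {
            # Convert HH:MM:SS to seconds since midnight.
            split(substr($0, RSTART, 8), t, ":")
            now = t[1] * 3600 + t[2] * 60 + t[3]
            if (seen && now - prev >= min_gap)
                printf "%ds gap before: %s\n", now - prev, $0
            prev = now
            seen = 1
        }
    '
}

# Usage:
#   log_gaps 5 < ssh-client.log
```

Fed the first excerpt, this reports a single 32-second gap, immediately before the `channel 3: read<=0` line.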
I have searched for others with this issue, but have not found it discussed anywhere. I have tried the advice from https://jrs-s.net/2017/07/01/slow-ssh-logins/ and set IPQoS 0x00 to work around any potential router issues.