We have a Google Cloud Platform (GCP) server that is open to the world on SSH port 22. (I know this is a bad idea, but fail2ban is also running on the server for exactly this reason.) Recently, the server has started to suddenly use a lot of CPU, and a few hours later it becomes unresponsive. Possibly GCP reduces the server's CPU credits to a very low value, and the server eventually comes to a standstill.
I tried to catch the problem by running something like nohup top -d 120 -b | grep --line-buffered "load average" -A 10 & and logging the output to a file, but the log shows nothing unusual up to the point where top stops writing, which happens before the server becomes unresponsive.
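For reference, a minimal sketch of that kind of logging, with a timestamp added to each snapshot. The function name log_snapshot and the log path in the usage note are hypothetical, and this is only one way to do it, not the exact command I ran.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append one timestamped snapshot of top to a log file.
log_snapshot() {
    local log=$1
    {
        # Timestamp header, so gaps between snapshots are visible afterwards.
        printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
        # Summary lines and the first processes of one batch-mode iteration.
        top -b -n 1 | head -n 15 || true
    } >> "$log" 2>&1
    # Flush buffered writes, so recent entries are more likely to be on disk
    # if the machine later has to be reset.
    sync || true
}
```

In a script started with nohup, it could be driven by a loop such as: while true; do log_snapshot /var/tmp/top.log; sleep 120; done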
However, today I noticed that when I try to connect to the server while it is in this state, using the debug switch (-vvv) of ssh, the connection sometimes hangs, but sometimes it is terminated with a message like this:
debug1: kex_exchange_identification: banner line 0: Exceeded MaxStartups
Trying telnet server_ip 22 gives the same behavior: sometimes the connection hangs, and sometimes I get the response Exceeded MaxStartups.
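The telnet check can be scripted roughly as follows, to repeat the probe and see how often each outcome occurs. This is a sketch under assumptions: it needs bash (for /dev/tcp) and the coreutils timeout command, and the names classify_banner and probe_ssh as well as the 5-second timeout are hypothetical choices, not anything prescribed by OpenSSH.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify the first line a server sends on the SSH port.
classify_banner() {
    case $1 in
        *"Exceeded MaxStartups"*) echo "throttled" ;;  # sshd refused the connection
        SSH-*)                    echo "ok" ;;         # normal SSH identification string
        "")                       echo "no-banner" ;;  # hang, timeout, or refused
        *)                        echo "other" ;;
    esac
}

# Hypothetical helper: connect to host:port, read one line, classify it.
probe_ssh() {
    local host=$1
    local port=${2:-22}
    local banner
    # Give up after 5 seconds so a hanging connection counts as "no-banner".
    banner=$(timeout 5 bash -c \
        'exec 3<>"/dev/tcp/$0/$1" && IFS= read -r line <&3 && printf "%s" "$line"' \
        "$host" "$port" 2>/dev/null)
    classify_banner "$banner"
}
```

Calling probe_ssh server_ip 22 a number of times in a row prints one of throttled, ok, no-banner, or other per attempt.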
Is this the result of some kind of SSH attack, or of something else? How can I track the problem down further?