Score:0

Remote Server seems to be dead, how to troubleshoot?


I have an Ubuntu server running remotely in another office. It has gone dead a few times and I can't figure out the cause. The server calls an external service via an API. By dead I mean it is still powered on but stops working. Its network also seems to be offline, and a LAN scan doesn't find it.

It's behind an office router and runs Ubuntu 18.04 with kernel 4.15.0-147-generic. No one onsite has an account on this server.

Here is what I have tried.

  1. last reboot result:
reboot system boot 4.15.0-151-gener Thu Jul 22 14:49  still running
reboot system boot 4.15.0-147-gener Wed Jul 21 15:48  still running
reboot system boot 4.15.0-147-gener Wed Jul 21 14:05 - 15:48 (01:43)
reboot system boot 4.15.0-147-gener Sat Jul 17 18:24 - 15:48 (3+21:24)
reboot system boot 4.15.0-147-gener Thu Jul 15 17:26 - 15:48 (5+22:22)

Jul 22 14:49 was a reboot that I asked onsite staff to do. There was a power outage on Jul 21.

  2. /var/log/syslog:
Jul 22 09:08:50 localhost service_start.sh[946]: INFO:launcher:myjob finish a output for 2.
Jul 22 09:08:50 localhost service_start.sh[946]: INFO:launcJul 22 14:50:05 localhost systemd[1]: Starting Flush Journal to Persistent Storage...
Jul 22 14:50:05 localhost systemd[1]: Started LVM2 metadata daemon.
Jul 22 14:50:05 localhost systemd[1]: Started Load/Save Random Seed.
Jul 22 14:50:05 localhost lvm[443]:   2 logical volume(s) in volume group "localhost-vg" monitored
Jul 22 14:50:05 localhost systemd[1]: Started Set the console keyboard layout.
Jul 22 14:50:05 localhost systemd-modules-load[436]: Inserted module 'iscsi_tcp'

The system went offline after Jul 22 09:08:50. Jul 22 14:50:05 was the reboot mentioned before.

It looks like the system was not rebooted or shut down; otherwise there would be log entries indicating that. There are no system errors in syslog either.
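To see whether earlier incidents left the same kind of silent gap, the backed-up syslog can be scanned for long pauses between consecutive entries. The following is only a rough sketch: `find_log_gaps` is a hypothetical helper name, and it assumes the traditional "Mon DD HH:MM:SS" syslog prefix and GNU `date` (which fills in the current year).

```shell
# Print each pair of consecutive syslog entries that are more than MAX_GAP
# seconds apart. Assumes the "Mon DD HH:MM:SS" prefix and GNU date.
find_log_gaps() {
    file=$1
    max_gap=${2:-600}          # default threshold: 10 minutes
    prev_epoch=""
    prev_stamp=""
    while read -r mon day clock rest; do
        # Skip blank or very short lines that carry no timestamp.
        [ -n "$clock" ] || continue
        stamp="$mon $day $clock"
        # Skip lines whose first three fields are not a parseable date.
        epoch=$(date -d "$stamp" +%s 2>/dev/null) || continue
        if [ -n "$prev_epoch" ] && [ $((epoch - prev_epoch)) -gt "$max_gap" ]; then
            echo "gap of $((epoch - prev_epoch))s between '$prev_stamp' and '$stamp'"
        fi
        prev_epoch=$epoch
        prev_stamp=$stamp
    done < "$file"
}
```

For example, `find_log_gaps /var/log/syslog 600` would list every stretch longer than ten minutes with no entries. It spawns one `date` process per line, so it is slow on large files, but it is enough for a one-off check.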

Two user cron jobs are set up to run every 5 and 10 minutes, and syslog shows cron entries around Jul 22 09:05:01, shortly before the system went dead around Jul 22 09:08:50.

There are no technical people onsite, and at the moment I can only reach the server via TeamViewer from another onsite computer.

I ran htop and the system load was light.

I am at a loss right now. What else should I check during my next TeamViewer session?

Score:0

There are quite a few variables in your description of the problem, primarily the networking infrastructure at the location where the server is hosted. If this were my server, a first step would be to ssh into it and run:

tail -f /var/log/syslog

This, or monitoring one of the other log files, could shed some light on what's causing the server to be unresponsive.

Since you say that the server is still running even though it's dead (I'm unclear on what that means), this implies a lost network connection, so that's what I'd focus my monitoring on.

You may find that the fastest way to resolve this is to troubleshoot this on site via the local LAN.

It feels dead because it appears to be offline when I connect via TeamViewer. I can't ping it or ssh into it. It came back online after a reboot. I am going to set up sar to monitor the system; that's one thing I am going to try. I wish I could connect a monitor to the server.
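For reference, a rough sketch of turning on sar data collection. It assumes the stock Ubuntu sysstat package, where collection is switched off by an `ENABLED="false"` line in /etc/default/sysstat; `enable_sysstat_collection` is a hypothetical helper name, and the file path and service name should be checked on the actual machine.

```shell
# Flip ENABLED="false" to ENABLED="true" in the sysstat defaults file.
# The path defaults to the usual Ubuntu location (an assumption; verify it).
enable_sysstat_collection() {
    conf=${1:-/etc/default/sysstat}
    sed -i 's/^ENABLED="false"/ENABLED="true"/' "$conf"
}

# Typical sequence on the server, run as root:
#   apt install sysstat
#   enable_sysstat_collection
#   systemctl restart sysstat
# After the next hang, review the day's data (DD = day of month), e.g.:
#   sar -q -f /var/log/sysstat/saDD       # load average and run queue
#   sar -r -f /var/log/sysstat/saDD       # memory usage
#   sar -n DEV -f /var/log/sysstat/saDD   # per-interface network traffic
```

With the package defaults, samples are taken every 10 minutes, so the last sample before a hang may be several minutes old.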
My answer was to do precisely that: ssh into the server from any location while the server is still operating correctly, and see if syslog catches anything that indicates problems. From your initial description, my take was that something was causing a loss of network connectivity. This may or may not be due to a server issue; it could just be something going on at the host site. If my answer helps you troubleshoot this I'd appreciate an up-vote.
Sorry, I was away from work for a week. I have the syslog backed up. Based on the logs recorded before and after the problem occurred, it looks like just a gap; nothing was logged in between. As mentioned before, I am going to implement a monitoring system to aid this troubleshooting.
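One lightweight form such monitoring could take is a heartbeat written from cron every minute, recording whether the default gateway still answers. This is a minimal sketch, not a tested setup: `log_heartbeat` and the log path are hypothetical, and it assumes the iproute2 `ip` command and the Linux iputils `ping` options `-c` and `-W`.

```shell
# Append one status line to LOGFILE: timestamp, default gateway, and whether
# that gateway answers a single ping.
log_heartbeat() {
    logfile=${1:-/var/log/net-heartbeat.log}   # hypothetical location
    # Third field of "default via <gateway> dev <iface>" is the gateway address.
    gw=$(ip route show default 2>/dev/null | awk '/^default/ {print $3; exit}')
    if [ -z "$gw" ]; then
        status="no-default-route"
    elif ping -c 1 -W 2 "$gw" >/dev/null 2>&1; then
        status="gateway-ok"
    else
        status="gateway-unreachable"
    fi
    echo "$(date '+%Y-%m-%d %H:%M:%S') gw=${gw:-none} $status" >> "$logfile"
}
```

If the heartbeat keeps being written while the gateway shows as unreachable, the machine is alive and the problem is on the network side. If the heartbeat itself stops at the same moment as syslog, the whole system froze or lost power.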