Score:1

GCP VM instance malfunctioning


I am running an ODK Aggregate server on a GCP VM instance, and I have been unable to access the server since Friday evening. I don't think the problem is linked to ODK itself but rather to the server. Here are the steps I followed:

  • Changed the internet connection and browser, and tried to access the server locally from my computer: no improvement.
  • Checked that the URL is still operational on the site where I created it (freedns.afraid). It is.
  • Checked my GCP VM instance menu and parameters (ubuntu-1804-bionic-v20210604, g1-small: 1 vCPU, 1.7 GB of memory, 10 GB of disk storage, Intel Haswell as processor platform; my own computer runs W10). I didn't identify anything that explains the problem, but the serial port output of the last few days was reporting errors:

    Aug 13 16:24:27 enquetes chronyd[2104]: Could not write to temporary driftfile /var/lib/chrony/chrony.drift.tmp
    Aug 13 16:39:16 enquetes systemd-networkd[19493]: ens4: Configured
    Aug 13 17:09:17 enquetes systemd-networkd[19493]: ens4: Configured
    [5034594.247692] systemd-journald[19543]: Failed to create new system journal: No space left on device

I think it's linked to the disk storage, which was indeed full. I doubled its capacity this afternoon (from 10 GB to 20 GB), but I still get the same kind of output afterwards. See for instance:

    Aug 15 18:50:55 enquetes systemd[1]: snapd.service: Start operation timed out. Terminating.
    Aug 15 18:52:25 enquetes systemd[1]: snapd.service: State 'stop-sigterm' timed out. Killing.
    Aug 15 18:52:25 enquetes systemd[1]: snapd.service: Killing process 29463 (snapd) with signal SIGKILL.
    Aug 15 18:52:25 enquetes systemd[1]: snapd.service: Main process exited, code=killed, status=9/KILL
    Aug 15 18:52:25 enquetes systemd[1]: snapd.service: Failed with result 'timeout'.
    Aug 15 18:52:25 enquetes systemd[1]: Failed to start Snap Daemon.
    Aug 15 18:52:25 enquetes systemd[1]: snapd.service: Service hold-off time over, scheduling restart.
    Aug 15 18:52:25 enquetes systemd[1]: snapd.service: Scheduled restart job, restart counter is at 949.
    Aug 15 18:52:25 enquetes systemd[1]: Stopped Snap Daemon.
    Aug 15 18:52:25 enquetes systemd[1]: Starting Snap Daemon...
    Aug 15 18:52:25 enquetes snapd[29509]: AppArmor status: apparmor is enabled and all features are available
    Aug 15 18:52:25 enquetes snapd[29509]: AppArmor status: apparmor is enabled and all features are available
    Aug 15 18:53:56 enquetes systemd[1]: snapd.service: Start operation timed out. Terminating.

  • Tried to stop the instance and restart it. No improvement.
  • Tried to reboot with the commands sudo reboot now / sudo reboot -f via gcloud and Cloud Shell, but it doesn't work (either "Failed to write reboot parameter file: No such file or directory", or I get disconnected from Cloud Shell just after entering the second command). I cannot connect over SSH, although the firewall and ports are OK.
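
Once a shell on the VM is available again (serial console or SSH), the disk-full diagnosis from the steps above could be confirmed with a few lines of Python. This is only a minimal sketch: the helper name `usage_report` is made up for illustration, and it assumes the root filesystem is mounted at `/`.

```python
import shutil


def usage_report(path="/"):
    """Return (total_gib, free_gib, percent_used) for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    gib = 1024 ** 3
    percent_used = 100.0 * usage.used / usage.total
    return usage.total / gib, usage.free / gib, percent_used


if __name__ == "__main__":
    # "/" is an assumption: the root filesystem of the VM.
    total, free, pct = usage_report("/")
    print(f"/: {total:.1f} GiB total, {free:.1f} GiB free, {pct:.0f}% used")
```

If the total is still around 10 GiB after the persistent disk was enlarged to 20 GB, the filesystem has probably not been grown to use the new space yet, which would be consistent with the "No space left on device" errors continuing.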

I don't know what would be best to do now, as I am not comfortable with the serial console and the command line. I have already created a persistent disk snapshot, and I would like to restore the data to a new disk and regain access to my current server (same external IP address, host name, etc.).

Do you have any idea how to fix the problem?

Thank you in advance for your help.

N.T.

John Hanley
1) You ran out of free disk space. Then you resized the disk. Not all GCP operating systems will automatically resize the disk partitions. Figure out your booting OS and then find a tutorial on resizing the root file system. 2) If you create another instance from the snapshot, you will have the same problem. 3) Edit your question with details on the OS version, what you did and what steps you have tried to resolve the problem.
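
A rough way to see whether the partition was left behind after a disk resize is to compare the sizes the kernel reports under `/sys/class/block`. The sketch below is illustrative only: the device names `sda` and `sda1` are assumptions about this VM, and the function names are hypothetical.

```python
from pathlib import Path

SECTOR_BYTES = 512  # sysfs reports block device sizes in 512-byte sectors


def unallocated_gib(disk_sectors, part_start, part_sectors):
    """GiB between the end of the partition and the end of the disk."""
    free_sectors = disk_sectors - (part_start + part_sectors)
    return free_sectors * SECTOR_BYTES / 1024 ** 3


def read_sysfs_int(path):
    """Read a single integer from a sysfs attribute file."""
    return int(Path(path).read_text().strip())


if __name__ == "__main__":
    disk, part = "sda", "sda1"  # assumed device names, adjust to the VM
    base = Path("/sys/class/block")
    if (base / part).exists():
        gap = unallocated_gib(
            read_sysfs_int(base / disk / "size"),
            read_sysfs_int(base / part / "start"),
            read_sysfs_int(base / part / "size"),
        )
        print(f"{gap:.1f} GiB of {disk} lies beyond the end of {part}")
    else:
        print(f"{part} not found; check the device names")
```

A gap of roughly 10 GiB on a disk grown from 10 GB to 20 GB would suggest that the partition, and therefore the root filesystem, still needs to be extended.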
Naej Teco
Hello John, thank you for your reply. I edited the question and just found this tutorial, for instance: https://cloud.google.com/compute/docs/disks/working-with-persistent-disks#resize_pd But I can't connect to my VM via SSH; after loading for a long time, it reports that it is unable to connect...
John Hanley
Ubuntu auto resizes on restart. Solve your serial port problem so that you can connect to the instance.
Naej Teco
Thanks. I haven't managed to fix it so far (I edited the question with today's actions). All the reboot attempts I tried fail, and I don't know what else to do. The errors in the serial port output concern snapd.service (Snap Daemon).
Alex G
You have to create or change the credentials before connecting to your serial console. Follow [this post](https://stackoverflow.com/questions/65997438/how-do-i-reset-a-google-cloud-linux-vm-ssh-password) to set that up first, then enable the serial console and connect using this [documentation](https://cloud.google.com/compute/docs/troubleshooting/troubleshooting-using-serial-console#console_1). Once done, you should be inside your system and able to investigate further using the logs it generates.