Rescuing GCE compute instance

September 17, 2016

I’m digging deep inside my soul
To bring myself out of this God-damned hole
I rid the demons from my heart
And found the truth was with me from the start
(Halford – Resurrection)

If you’re still managing servers as pets and not as cattle, the inevitable happens: a filesystem breaks, sshd gets killed, wrong iptables settings get applied, a wrong mount option halts the boot process, and you’re in deep. Your important server is inaccessible.
If your server is running on in-house managed VMware or XenServer, the .NET console will help you rescue it. If it’s running on bare metal, you can rely on iDRAC or something similar. But if you’re running in the cloud, you’re pretty much screwed.

If you’re running on GCE, there are a couple of options at your disposal in the time of dismay. First, there is a beta Virtual Serial Port option that you can connect to and see where the hell your instance halted and what messages it printed.

To use the Virtual Serial Port, you need to have gcloud (the command line tool) installed and authenticated to your project. So the first thing is to list the available instances:

% gcloud compute instances list
NAME   ZONE        MACHINE_TYPE   INTERNAL_IP  EXTERNAL_IP      STATUS
test   us-east1-b  g1-small       10.240.0.5   104.155.112.80   RUNNING
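
With the instance name and zone from that listing, interactive serial access may also have to be switched on in the instance metadata before you can connect. Here is a minimal sketch, assuming the metadata key is serial-port-enable (check the current docs for your gcloud version); the function name enable_serial_port is made up for this example:

```shell
# Sketch: switch on interactive serial console access for one instance.
# Assumption: the metadata key is "serial-port-enable".
enable_serial_port() {
    # $1 = instance name, $2 = zone (both as shown by "instances list")
    gcloud compute instances add-metadata "$1" \
        --zone "$2" \
        --metadata serial-port-enable=1
}

# Usage: enable_serial_port test us-east1-b
```

Wrapping the call in a function keeps the instance name and zone as explicit arguments, so nothing depends on a default zone being configured.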

Now, to be able to connect to the Virtual Serial Console, you need to set up SSH keys properly. You can generate a key pair and add its public half to your project metadata by running:

% ssh-keygen -f ~/.ssh/google_compute_engine
% gcloud compute project-info add-metadata \
   --metadata-from-file sshKeys=$HOME/.ssh/google_compute_engine.pub

If you already have a key, you will need to set up both ~/.ssh/google_compute_engine and ~/.ssh/google_compute_engine.pub to match the key from the project.
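
One detail worth checking: as far as I know, each line of the sshKeys metadata value is expected to look like USERNAME:PUBLIC-KEY, and setting sshKeys replaces whatever value was stored under that key before. Here is a small sketch that builds such a line from a public key file; the helper name sshkeys_entry, the user gceuser and the output file sshkeys.txt are only examples:

```shell
# Sketch: print one sshKeys metadata line ("USERNAME:<public key>").
# Assumption: the metadata value uses the USERNAME:KEY line format.
sshkeys_entry() {
    # $1 = login name, $2 = path to the public key file
    printf '%s:%s\n' "$1" "$(cat "$2")"
}

# Usage (writes a file you can pass to --metadata-from-file sshKeys=...):
#   sshkeys_entry gceuser ~/.ssh/google_compute_engine.pub > sshkeys.txt
```

If the project already holds other keys, append them to the same file first, otherwise they would be dropped.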

After the keys are set, you can finally connect:

% gcloud beta compute connect-to-serial-port gceuser@test

You should probably get a standard TTY login prompt.
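
If all you need is to read what the instance printed while booting, without logging in, the serial output can also be dumped non-interactively. This is a minimal sketch, assuming the get-serial-port-output subcommand is available in your gcloud version; the function name serial_output is made up here:

```shell
# Sketch: dump the boot messages of an instance without an interactive login.
serial_output() {
    # $1 = instance name, $2 = zone
    gcloud compute instances get-serial-port-output "$1" --zone "$2"
}

# Usage: serial_output test us-east1-b | tail -n 50
```

This needs no SSH keys at all, so it is a cheap first look before trying anything more invasive.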

If an attempt to fix the problem through the GCE Virtual Serial Console didn’t succeed, but you think the boot disk can be saved by attaching it to another instance, you will need to:

  • disable “auto delete boot disk”
  • destroy instance 😦
  • attach boot disk as additional disk to another VM
  • mount it, fix whatever is broken, umount it
  • detach disk from instance
  • create new instance, and choose this disk as boot disk

Using gcloud, it would look something like this:

% gcloud compute instances \
  set-disk-auto-delete test --no-auto-delete --device-name test
% gcloud compute instances delete test
% gcloud compute instance
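
The remaining steps from the list (attach, mount, fix, detach, recreate) could be sketched roughly as follows. This is not the exact sequence from the original session: "rescue" is a hypothetical helper VM in the same zone, "broken" is an arbitrary device name, the function names are made up, and the boot disk is assumed to carry the instance name "test":

```shell
# Sketch of the rescue steps. Assumptions: helper VM "rescue" exists in the
# same zone, and the orphaned boot disk is named "test" like the old instance.
ZONE="us-east1-b"

attach_to_rescue() {
    # Attach the orphaned boot disk as a secondary disk of the helper VM.
    gcloud compute instances attach-disk rescue \
        --disk test --device-name broken --zone "$ZONE"
}

# On the rescue VM itself (assuming the root filesystem is partition 1):
#   sudo mount /dev/disk/by-id/google-broken-part1 /mnt
#   ... fix whatever is broken under /mnt ...
#   sudo umount /mnt

detach_from_rescue() {
    # Release the repaired disk so it can become a boot disk again.
    gcloud compute instances detach-disk rescue --disk test --zone "$ZONE"
}

recreate_instance() {
    # Boot a new instance from the repaired disk, keeping auto-delete off.
    gcloud compute instances create test \
        --zone "$ZONE" --machine-type g1-small \
        --disk name=test,boot=yes,auto-delete=no
}
```

Run the three functions in order, doing the mount and repair on the rescue VM between the first two.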
