Anatomy of Proxmox File System Crashes – stuck on `random: crng init done` again?

Proxmox disks sometimes break. They’re not supposed to, but when it happens it can be devastating.

I only use defaults. Nothing fancy. Whatever the operating system install gave me. Yet on many of my Debian/Ubuntu boxes I’ve still had some crashes.

Fortunately we also use Proxmox Backup Server, so generally we can recover. Still, it’s a tense hole to find yourself in, and no fun at all when you are running VMs for clients.

Recently I had another one, the dreaded screen below:

Note the time delays. It took around 3 seconds to get to this:

Btrfs loaded, crc32c=crc32c-generic, zoned=yes, fsverity=yes

Then nothing happens. Minutes of waiting, apparently almost 3.5 minutes, then this cryptic message:

random: crng init done

The panic already starts setting in after 3 seconds because it’s blatantly obvious that the OS is stuck. The hopeful wait and wait, but after that long wait and that cryptic message, you’re pretty much screwed.
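If you can capture the boot output as text (for example from a serial console), comparing the kernel timestamps of consecutive lines shows exactly where the stall sits. Here is a minimal Python sketch of that idea. It assumes each line carries the usual `[seconds.micros]` prefix; the function name `largest_gap` and the sample timestamps are made up for illustration and are not taken from my screenshots.

```python
import re

# Matches the usual kernel log prefix, e.g. "[    3.012345] message"
TS_LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")


def largest_gap(lines):
    """Return (gap_seconds, message_before, message_after) for the longest
    pause between two consecutive timestamped lines, or None if fewer than
    two timestamped lines are present."""
    stamped = []
    for line in lines:
        match = TS_LINE.match(line)
        if match:
            stamped.append((float(match.group(1)), match.group(2)))

    best = None
    for (t0, msg0), (t1, msg1) in zip(stamped, stamped[1:]):
        gap = t1 - t0
        if best is None or gap > best[0]:
            best = (gap, msg0, msg1)
    return best


# Hypothetical sample: the timestamps are invented to mirror the stall
# described above (about 3 seconds in, then roughly 3.5 minutes of nothing).
sample = [
    "[    2.950000] Btrfs loaded, crc32c=crc32c-generic, zoned=yes, fsverity=yes",
    "[  212.400000] random: crng init done",
]

if __name__ == "__main__":
    gap, before, after = largest_gap(sample)
    print(f"{gap:.1f}s stall between {before!r} and {after!r}")
```

A gap of minutes between two lines that normally follow each other within milliseconds is a strong hint that the kernel is waiting on something, but the sketch only locates the stall. It does not tell you the cause.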

At this point you find yourself thinking things like:

  • What did I do to deserve this?
  • I pray for a recent backup
  • I pray for any backup
  • I hope yesterday’s backup is good enough
  • Help, Proxmox forums, help
  • Help, Google, help
  • Help, ChatGPT, help
  • Am I in the right job?
  • etc.

The point really is that there aren’t many people who can help you. Disk crashes are nasty and unpredictable, and the chances of recovery are slim to poor.

Unfortunately Proxmox doesn’t offer a quick “fsck” option for this. That would be the most obvious next step, right?

Just before we carry on moaning, here is another screenshot of a broken disk. You’ll note there is no Btrfs output this time; instead, the long hang happens at:

hid-generic 0003:0627:0001.0001: input,hidraw0: USB HID v0.01 Mouse [QEMU QEMU USB Tablet] on usb-0000:00:01.2-1/input0

After that I got the same `random: crng init done` but I didn’t hang around to produce screenshots as I had to get on with the recovery.

Miraculously I managed to fix the first problem by doing this:

  1. Download the latest Ubuntu 24.04 LTS live image
  2. Do not connect to the internet
  3. Choose “Try Ubuntu”
  4. Start a terminal
  5. Run `lsblk` to get oriented

I then tried `fsck /dev/sda`.

That failed because you don’t run fsck on a whole disk, but on its partitions. You’ll get `Superblock invalid` if you do this.

Then I tried `fsck /dev/sda1`.

That worked!
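To avoid pointing fsck at the whole disk by mistake, you can ask `lsblk` for JSON output and pick out only the unmounted partitions that carry a filesystem. This is a rough Python sketch of that, not what I ran that night. The helper names `fsck_candidates` and `read_lsblk` are my own, and the sample layout (a single ext4 partition on `/dev/sda`) is an assumption for illustration.

```python
import json
import subprocess


def fsck_candidates(lsblk_json):
    """Return [(device_path, fstype), ...] for unmounted partitions that
    have a filesystem. Whole disks (type "disk") are skipped on purpose."""
    found = []

    def walk(devices):
        for dev in devices:
            is_partition = dev.get("type") == "part"
            if is_partition and dev.get("fstype") and not dev.get("mountpoint"):
                found.append((dev["name"], dev["fstype"]))
            # Partitions are listed as children of their parent disk
            walk(dev.get("children") or [])

    walk(json.loads(lsblk_json)["blockdevices"])
    return found


def read_lsblk():
    """Run lsblk (from util-linux) and return its JSON output as text."""
    return subprocess.run(
        ["lsblk", "--json", "--paths", "--output", "NAME,TYPE,FSTYPE,MOUNTPOINT"],
        check=True, capture_output=True, text=True,
    ).stdout


# Hypothetical layout for illustration; the ext4 type is an assumption.
SAMPLE = """
{"blockdevices": [
  {"name": "/dev/sda", "type": "disk", "fstype": null, "mountpoint": null,
   "children": [
     {"name": "/dev/sda1", "type": "part", "fstype": "ext4", "mountpoint": null}
   ]}
]}
"""

if __name__ == "__main__":
    for path, fstype in fsck_candidates(SAMPLE):
        print(f"{path} ({fstype})")
```

In the live session you would call `fsck_candidates(read_lsblk())` instead of using the sample. Before a real run, `fsck -N /dev/sda1` shows what fsck would do without executing anything, which is a cheap sanity check that you picked a partition and not the disk.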

As for the second problem, after 5 tries Proxmox displayed the graphics and I was able to run fsck again.

Once I got going, I had to press Y about 10 times.

And boom!! VM running again.

Upgrade of hypervisor started: 1AM

Fixed final problem: 6AM

 
