Can I get a filesystem cleanup on aisle 6?

CmdrKeen@lemmy.today · 11 months ago

RustyNova@lemmy.world · 11 months ago

Can a linux/systemd nerd explain what the error is? I know it’s a shutdown sequence, but I’m curious on the fault

CameronDev@programming.dev · edit-2 11 months ago

It is actually a boot failure. Normally the bootloader loads the initrd and passes it to the kernel (thanks dan). The kernel reads some config from it, does a bunch of setup stuff, then mounts the actual root filesystem and switches to using that. In this case, the root filesystem has failed to mount.

Hardware failure is most likely the cause, but misconfiguration can also make this happen. Probably hardware though.

If it’s misconfiguration, an admin can retry mounting the root drive on /new_root, then press Ctrl-D to get the init system to try again.

ELI5: couldn’t open the C:/ drive

Edit: clarified what loads the initrd, as per dan’s comment.

dan@upvote.au · edit-2 11 months ago

Normally the kernel loads an initrd filesystem,

The bootloader (GRUB) loads the initrd, not the kernel. The kernel accesses stuff from the initrd, but it’s already loaded by that point.

CameronDev@programming.dev · 11 months ago

You are correct. I’ll add an edit. Thanks!

Synthead@lemmy.world · 11 months ago

The root filesystem mounted fine. That’s why the init is starting with all the services on the root disk.

neidu@feddit.nl · 11 months ago

Not necessarily. I’ve seen failures like this when the boot partition works but the root partition fails to mount. systemd then fails to proceed and shuts down the running services.

Synthead@lemmy.world · edit-2 11 months ago

systemd daemons are configured via /etc/systemd, and systemd itself lives in /usr/lib/systemd/systemd. How can systemd run or start the configured services without the root disk mounted? The initrd (from the boot partition) only contains enough of an environment to call the entrypoint for the init system; it does not contain the entirety of systemd (or the configured services).

damium@programming.dev · 11 months ago

The initrd contains the systemd binary and enough libraries, services, and kernel modules to get booted this far. The system failed at switch root, which is where the real root disk is mounted. The initrd can contain as much or as little as needed to get a working system, which can be a lot if you are using a network filesystem as root, for instance.

CameronDev@programming.dev · 11 months ago

Those are all hardware management services (as far as I can tell), and are configured before the root is mounted.

I have hit this exact error before; that is what failing to mount the root disk looks like. A bunch of services will start, and then you get dropped into a shell (with a login).

If you want to see it for yourself, change /etc/fstab so that the entry for / (the root filesystem) points to the wrong device, and then rebuild your initrd. When you reboot you’ll see exactly that output. To fix it, log in to the shell, mount your root on /new_root, and press Ctrl-D to continue the boot (from memory it has a message telling you to do that anyway). When your system boots you can fix fstab and rebuild the initrd. It’s reversible, but maybe test on a machine you don’t care about to be safe :)
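For anyone who wants to double-check which fstab line is the root entry before trying this, here is a minimal Python sketch. The helper name `find_root_entry` and the sample devices are made up for illustration; it only parses text in the usual whitespace-separated fstab layout and does not touch the real system or verify that the device exists.

```python
from typing import Optional, Tuple


def find_root_entry(fstab_text: str) -> Optional[Tuple[str, str]]:
    """Return (device_spec, fs_type) for the line whose mount point is "/".

    Hypothetical helper for illustration; it handles only the plain
    whitespace-separated fstab layout and skips comments and blank lines.
    """
    for raw_line in fstab_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # fstab fields: spec, mount point, type, options, dump, pass
        if len(fields) >= 3 and fields[1] == "/":
            return fields[0], fields[2]
    return None


# Sample content with made-up devices; a real check would read /etc/fstab.
SAMPLE = """\
# <spec>      <mount>  <type>  <options>  <dump> <pass>
/dev/sda2     /        ext4    defaults   0      1
/dev/sda1     /boot    vfat    defaults   0      2
"""

if __name__ == "__main__":
    print(find_root_entry(SAMPLE))
```

Note that a line mounted at /root (the root user’s home directory) is not matched; only the mount point / counts as the root filesystem.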

Synthead@lemmy.world · 11 months ago

Oh interesting! I suppose I have just been very careful with /etc/fstab and I haven’t seen systemd fail this way. TIL! Thanks for letting me know!

RustyNova@lemmy.world · 11 months ago

Thanks for that!

Switching to Linux and being able to see real-time logs made me curious how it all works, so that’s one gear of the machine demystified.

earthquake@lemm.ee · 11 months ago

These kinds of public errors are almost always a hard drive failure.

CmdrKeen@lemmy.today · 11 months ago

Using an actual hard drive for an embedded system like this would be a failure in and of itself.

Unless it literally has to store several hours’ worth of HD video content, no reason the entire system couldn’t fit on an SD card.

constantokra@lemmy.one · 11 months ago

It’s been my experience that SD cards are almost always what causes a failure on an SBC. Given the cost of the screens, I’d probably choose something that could boot off NVMe storage. Or at least tape a new, configured SD card to the case of the SBC for when this inevitably happens.

VieuxQueb@lemmy.ca · 11 months ago

An SD card is MUCH less reliable than a good HDD unless it’s read-only.

dublet@lemmy.world · 11 months ago

As someone who works on embedded devices: HDDs are used for media storage and can be easily replaced. Any NAND has a limited lifespan, and good embedded software will try very hard to minimise writes. Though in my particular area, there are additional security constraints on the OS, which preclude any removable flash storage from being used.

body_by_make@lemmy.dbzer0.com · 11 months ago

They probably expect the signage to change a lot and don’t want a hardware failure when they do it too much, or didn’t use an external drive in this case and the SD card failed because they wrote to it too much (which would happen eventually anyway).

corsicanguppy@lemmy.ca · 11 months ago

Using an actual hard drive for an embedded system like this would be a failure in and of itself.

You may be surprised to learn that these stores use machines that are occasionally more than a year old and also use inexpensive tech like enterprise spinny disk.

A spinny disk will work in this space, and you know they’ll be deciding based on cost.

glibg10b@lemmy.ml · 11 months ago

Even better: Three SD cards with a ZFS mirror and failure notifications

CmdrKeen@lemmy.today · 11 months ago

Bah humbug, just hook it up to the cloud, WCGW?

glibg10b@lemmy.ml · 11 months ago

You don’t need an internet connection for failure notifications

aard@kyu.de · 11 months ago

Systemd has a feature to shorten lines too long for the display, which is a pretty stupid idea, as you can see here.

The service failing here would be initrd-switch-root.service.

indepndnt@lemmy.world · 11 months ago

So the weird block character in the “see… for details” line is replacing “nitrd-switch-roo” just to shorten the line? That’s what I was trying to figure out.

aard@kyu.de · 11 months ago

Yeah, that’d be the Unicode ellipsis character (…) rendered on a system without a Unicode font on the terminal.
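A rough Python sketch of that effect, assuming middle truncation with a single ellipsis character. `ellipsize_middle` is a made-up helper, not systemd’s actual code, the width of 11 is arbitrary, and where exactly systemd splits the string may differ from the even split used here.

```python
def ellipsize_middle(text: str, width: int) -> str:
    """Shorten text to at most `width` characters by replacing the middle
    with one Unicode ellipsis (U+2026). Illustrative only."""
    if width < 1:
        raise ValueError("width must be at least 1")
    if len(text) <= width:
        return text
    keep = width - 1              # characters left over after the ellipsis
    head = (keep + 1) // 2        # give any extra character to the front
    tail = keep - head
    return text[:head] + "\u2026" + (text[-tail:] if tail else "")


if __name__ == "__main__":
    short = ellipsize_middle("initrd-switch-root.service", 11)
    print(short)
    # A terminal that cannot draw U+2026 shows a substitute glyph instead;
    # forcing ASCII here stands in for that with a "?".
    print(short.encode("ascii", "replace").decode("ascii"))
```

The second print is only a stand-in for the missing-font case: the name is still shortened correctly, it just shows up with a placeholder where the ellipsis should be.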