Score:10

How can I make Linux reboot instead of remounting the filesystem as read-only?

us flag

Linux systems sometimes remount the root file system as read-only, e.g. if there's an I/O error.

I have a machine that becomes useless when this happens, and I end up rebooting it manually.

Is there a way to make Linux just automatically reboot when this happens? A read-only mount is useless to me.

br flag
I'd also investigate the source of these I/O errors. The last time an ext2 filesystem went readonly for me was in 1994, and the cause could be traced to a broken CPU fan.
mx flag
You have an [XY problem](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem) here. The correct solution is not to make the system reboot on an IO error (the accepted answer explains how to do that, _but_ that’s actually rather risky for multiple reasons), it’s to _fix the root cause of the IO errors_, because then the filesystem will not randomly get mounted read-only. If it’s only intermittent and the storage device is good, you probably have suspect RAM or a flaky PSU, both of which can cause much bigger issues than a simple filesystem error.
us flag
@AustinHemmelgarn: I don't have an XY problem here. You're just making a lot of assumptions that don't hold true in the case(s) I'm asking about.
us flag
@SimonRichter: I indeed have tried looking into the cause, but thanks for the reminder, others should probably do that before rebooting.
us flag
I just realized somehow I posted this on ServerFault rather than Unix.SE as I had intended to! Glad it's still on-topic I guess, but feel free to migrate if needed.
cn flag
Rebooting rather than sorting out the reason for the R/O remount has a high likelihood of making the problem *worse* - especially if it fails to mount the system on reboot and you're now stuck with an entirely unresponsive system.
mx flag
@user541686 You have random IO errors. That _will_ cause other problems eventually (and trust me, they will be much more of a pain to fix than just rebooting the system), hence my assertion that this is an XY problem. The fact that you do not recognize the X as a problem does not make it any less of an XY problem.
us flag
@AustinHemmelgarn: I'm well aware of what's going on in my situation and why I resorted to this solution. Unfortunately you're not. The fact that you don't recognize you're still making unfounded assumptions about my situation doesn't make you more correct, but admittedly I can't stop you from lecturing.
us flag
@Shadur: I fully understand all that, believe it or not. Nobody is saying this solution should be used in every situation. I'm just telling you I have **a** situation where this solution makes sense. If you can't imagine why, that's fine. Just have faith in me that I'm not stupid and that I'm only asking this because there's information I have that you don't.
Mark avatar
tz flag
@user541686, if there's relevant information, provide it. Don't just say "trust me".
us flag
@Mark: No, I won't provide irrelevant info. It's quite literally nobody's business what situation I'm dealing with that gave rise to this question. If you would rather believe it's out of my stupidity, feel free to continue believing that; don't feel obligated to "trust me". It's not like I can force you.
Andrew Henle avatar
ph flag
@user541686 You're papering over I/O errors on the root filesystem with a reboot, on the ***HOPE*** that your system will return to operational status. You're coming across as someone who thinks they know everything but in reality is just smart enough to be dangerous. You may think you know why you're getting IO errors, but what happens **when** you get one that's not like you think? You get a dead system that you can't access. "I know what's going on!" doesn't provide any limits as to what **can** go on - the universe doesn't care about what you think you know.
marcelm avatar
ng flag
@Mark (and others) _"... if there's relevant information, provide it. Don't just say trust me."_ - I don't think it's worth barking up the XY tree here. First of all, the question as it stands (panic/reboot instead of remounting ro) is a perfectly valid and answerable question. Secondly, the OP seems well aware that I/O errors are, ahem, not ideal, and has now explicitly declared that area off-topic. Sadly, sometimes there's just nothing you can do to fix the root cause _right now_, and a workaround is needed. With that in mind, I don't think we're in a place to demand OP provide more context.
Score:23
ca flag

I deduce you are using ext3 or ext4 as the file system. If so, you can mount it with the `errors=panic` option and configure watchdog to reboot your system in case a panic happens.

While more complex than roelvanmeer's answer (which I upvoted), it has the added bonus of working for all panic-level kernel crashes.

As suggested by NikitaKipriyanov, setting the `panic=5` kernel boot option can be a simpler alternative to watchdog (which has more configuration options but is slightly more complex as a result).
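For reference, the two settings mentioned above could be applied roughly as follows. This is only a sketch, assuming an ext4 root filesystem, root privileges, and a placeholder device name (`/dev/sdXN` is hypothetical; substitute your own). The sysctl file name is the one suggested in the comments.

```shell
# Hypothetical root device; find the real one with: findmnt -no SOURCE /
ROOT_DEV=/dev/sdXN

# Option A: store the error behavior in the ext3/ext4 superblock,
# so it applies however the filesystem is mounted.
tune2fs -e panic "$ROOT_DEV"
tune2fs -l "$ROOT_DEV" | grep -i 'errors behavior'

# Option B: set it per mount in the options field of /etc/fstab, e.g.:
#   /dev/sdXN  /  ext4  defaults,errors=panic  0  1

# Reboot 5 seconds after any kernel panic, effective immediately...
sysctl -w kernel.panic=5

# ...and persistently across boots.
echo 'kernel.panic = 5' > /etc/sysctl.d/panic-reboot.conf
```

With both in place, a filesystem error should trigger a panic and the panic should trigger a reboot. As the comments warn, a reboot loop is possible if the error recurs at every boot.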

Nikita Kipriyanov avatar
za flag
An alternative to watchdog might be adding something like `kernel.panic = 5` to `/etc/sysctl.d/panic-reboot.conf`.
us flag
Thank you! I'll give this a shot. Hopefully it won't [fail to reboot](https://forums.debian.net//viewtopic.php?f=5&t=102033)!
shodanshok avatar
ca flag
@NikitaKipriyanov good suggestion, I'll edit my answer. Thanks.
joshudson avatar
cn flag
warning: probable reboot loop
us flag
@joshudson: Yeah I'm planning to watch out for that, that's definitely an important warning for anyone trying this.
Andrew Henle avatar
ph flag
@joshudson If it reboots at all. Relying on a system that knows its root filesystem might be corrupt and/or its root disk broken to reboot is based on wishful thinking and unicorns.
joshudson avatar
cn flag
@AndrewHenle: I've brought a lot of systems up with a trashed root filesystem. Usually I can take over the boot process and get fsck to run because the damage rarely hits `/sbin` or files that haven't changed in a while.
Andrew Henle avatar
ph flag
@joshudson You hope... ;-) My thoughts here are based on the idea that trying to soldier on when your root filesystem device is tossing IO errors is a misguided effort in the first place and throwing in a reboot only makes significant issues more likely - "My root device is going bad, so let's do something that ***really*** depends on the root device being fully functional and having proper access to the bulk of the filesystem!"
Score:14
ie flag

Maybe not a very pretty solution, but my first thought would be to run a command from cron every minute:

test -w / || reboot
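
As a sketch of how that could be scheduled, assuming a cron implementation that reads `/etc/cron.d` (where each entry carries a user field) and a `reboot` binary at `/sbin/reboot`. The file name `ro-reboot` is made up for illustration.

```shell
# /etc/cron.d/ro-reboot (hypothetical file name)
# Every minute, as root: reboot if / is not writable.
# The full path to reboot is used because cron's PATH may be minimal.
* * * * * root test -w / || /sbin/reboot
```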
us flag
+1 thanks, this'll be a great fallback if the other solution fails!
im flag
I think it is not guaranteed that `test -w` checks if the filesystem is read-write, though GNU `test` and the `test` built into `bash` seem to do that. --- Here you can see what a POSIX-compliant `test` should do: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/test.html#tag_20_128_05 As I understand it, `test` is only required to check the access rights of the file.
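One way to sidestep the question of what `test -w` really checks is to read the mount options the kernel reports. The sketch below is not from the thread: it defines a hypothetical helper, `is_mounted_readonly`, that looks for the `ro` option in `/proc/mounts`. It assumes a Linux system with `awk` available, and that the kernel lists the filesystem as `ro` once it has been remounted read-only after an error.

```shell
# Succeeds (exit 0) if mount point $1 carries the "ro" option in the
# mounts table $2 (defaults to /proc/mounts); fails otherwise,
# including when the mount point is not listed at all.
is_mounted_readonly() {
    mountpoint=$1
    table=${2:-/proc/mounts}
    # Fields: device, mount point, fs type, options, dump, pass.
    # Keep the last matching line, since later mounts shadow earlier ones.
    opts=$(awk -v mp="$mountpoint" '$2 == mp { o = $4 } END { print o }' "$table")
    case ",$opts," in
        *,ro,*) return 0 ;;
        *)      return 1 ;;
    esac
}
```

It could then replace the `test -w` check, for example `is_mounted_readonly / && reboot`. The second argument exists only so the function can be tried against a sample file.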
joshudson avatar
cn flag
In which case `tee -a /root/.bash_history < /dev/null || reboot` will work.
shodanshok avatar
ca flag
@joshudson Even simpler: `touch /writecheck || reboot`
br flag
@shodanshok That's great if you don't mind a file called `/writecheck` lying around at the root of your filesystem, since it'll be created when the filesystem _isn't_ read-only. The other proposed methods were attempting to avoid creating any spurious empty files. (Though if pabouk is right — which I'm 50/50 on, personally — actually-creating a file may be unavoidable, in order to fully determine the filesystem's read-only state.)
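If an actual write is needed, the leftover file can be avoided by creating a uniquely named probe and removing it straight away. This is a minimal sketch with a hypothetical helper name, `can_write_dir`, assuming a `mktemp` that accepts a template argument (GNU coreutils and the BSDs do).

```shell
# Succeeds if a file can actually be created in directory $1.
# The probe file is removed immediately, so nothing is left behind.
can_write_dir() {
    probe=$(mktemp "${1%/}/.writecheck.XXXXXX" 2>/dev/null) || return 1
    rm -f -- "$probe"
}
```

Usage would be `can_write_dir / || reboot`. Note that any reason `mktemp` fails, not just a read-only filesystem, would still trigger the reboot.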
im flag
Another question about the problem of testing write access: [How to non-invasively test for write access to a file?](https://unix.stackexchange.com/q/159557/19702)
cn flag
@shodanshok that could lead to unexpected reboots - or reboot loops - from error conditions unrelated to filesystem errors, eg temporary upsets of the libc installation, OOM conditions, anything that could make touch fail....
shodanshok avatar
ca flag
@rackandboneman sure - but *any* script with `|| reboot` is subject to these issues. Moreover, if `touch` fails on your system due to libc issues, you probably have worse problems than a reboot loop. Anyway, as stated in my answer, `watchdog` is the way to go for more advanced needs.