Score:12

How dangerous might it be - and what performance gains may be had - to turn vulnerability mitigations off on non-Internet-facing servers?


When a Linux virtual machine host is non-Internet-facing, is used exclusively on a LAN, and runs a relatively well-tested distribution like Proxmox, how dangerous would it be to turn off all vulnerability mitigations via the kernel argument mitigations=off?

Additionally, has anyone tested what kinds of performance gains might be seen by turning off all such mitigations?

This became a question for me when I saw the big hit that the retbleed mitigations create: https://www.phoronix.com/review/retbleed-benchmark

This line of thought led me to wonder about the ramifications - both good and ill - of removing all or some mitigations, either via the kernel argument above or by individually turning off high-impact mitigations.

My experience has been that most people haven't enabled these mitigations on existing systems anyway.
Adam Barnes
Have you heard of Stuxnet?
Peter Cordes
You can disable retbleed mitigation specifically; I think it's the most expensive.
Score:14

The Linux kernel flag mitigations=[on|off] is a single toggle to easily enable or disable all available kernel mitigations for the hardware vulnerabilities listed here: https://docs.kernel.org/admin-guide/hw-vuln/index.html
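On a Proxmox host (Debian-based, GRUB by default) the toggle goes on the kernel command line; a minimal sketch, assuming the stock GRUB setup (adjust for systemd-boot or other boot loaders):

```shell
# In /etc/default/grub, append mitigations=off to the default command line:
GRUB_CMDLINE_LINUX_DEFAULT="quiet mitigations=off"

# Regenerate the boot config and reboot for it to take effect:
update-grub
reboot

# After reboot, verify the flag is active:
grep -o 'mitigations=[a-z]*' /proc/cmdline
```

Individual mitigations can also be toggled one at a time (e.g. retbleed=off) if only specific ones are too expensive for your workload.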

The impact of that depends of course entirely on your CPU:

  • When your CPU isn’t vulnerable to any of the known vulnerabilities, then none of the mitigations are applicable and the impact should effectively be zero.

  • When your CPU is vulnerable to some (are there even CPUs vulnerable to all of them?), the impact depends on the specific vulnerabilities and on your workload.
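Which case applies to your CPU can be checked from the running kernel; a read-only sketch (the sysfs path is standard on x86 kernels since the Meltdown/Spectre disclosures):

```shell
# List each known hardware vulnerability and what the running kernel
# reports for it: "Not affected", "Vulnerable", or the active mitigation.
vulndir=/sys/devices/system/cpu/vulnerabilities
{
  if [ -d "$vulndir" ]; then
    for f in "$vulndir"/*; do
      printf '%-28s %s\n' "${f##*/}:" "$(cat "$f")"
    done
  else
    echo "this kernel does not export vulnerability status"
  fi
} | tee vuln-status.txt
```

Entries reading "Not affected" cost you nothing either way; only the mitigated ones are candidates for a performance win.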

As to a risk analysis, that also depends on your workload and user base.

On a virtualization host operated by a public VPS provider, the guests are trusted less and are much more likely to be malicious (or compromised) than I would expect on an internal virtualization host used exclusively by my colleagues.

For example on our virtualization hosts used for CI/CD pipelines and compute clusters, all the guests are short lived, get deployed from trusted images, run for up to a couple of hours and then get destroyed again. There we need all the performance we can get and disable the mitigations.

On a different shared cluster we host more classical server-consolidation workloads; guests deployed there can (and are more likely to) run “forever” rather than for hours. It has a mix of production and non-production workloads, and the guests are managed by DevOps teams that are not all as diligent in patching and updating their systems and applications.
There the risk of a malicious or compromised guest is much higher, the possibly reduced performance for specific workloads is an acceptable trade-off, and thus the mitigations do get enabled there and we limit which CPU flags get exposed to the guests.
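Limiting which CPU flags the guests see can be done per VM; a hedged sketch using Proxmox's qm CLI (VM ID 100 and the exact flag list are placeholders - consult `man qm` and your guests' needs before applying):

```shell
# Expose only selected mitigation-related CPU flags to guest 100, so the
# guest kernel can run its own mitigations without seeing everything else:
qm set 100 --cpu 'host,flags=+spec-ctrl;+md-clear;+pcid'
```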

Score:2

Is the server really impossible to connect to, even indirectly, from outside your network?

That's the first question you should ask yourself. If the LAN can be reached from a machine that has two network connections - one to that LAN and another to a network that is connected to the Internet (hopefully through several layers of firewalls) - then you have effectively connected that server to the Internet.

And remember that a machine doesn't need to be connected to the Internet to be vulnerable to attacks. A machine could be infected - via a USB stick, a floppy disk, or even commands typed at a terminal - with malware that is simply intent on doing damage to it.

In the end, all security, like all performance improvement, is a compromise: what is an acceptable level of risk for the rewards gained?

To find out your potential performance gains, run tests on the machine while it has no network connections at all and see for yourself. Most likely the gains won't be very large, but they might be just enough to squeeze out the few extra CPU cycles that help some old hardware survive long enough for you to convince upper management that you really, really should get the budget for a new server.
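As a starting point before reaching for a full benchmark suite, a crude A/B sketch (the file names and the dd-based workload are arbitrary choices; mitigations mostly tax kernel entry/exit, so a syscall-heavy loop amplifies any difference):

```shell
# Label this run by whether the current boot has mitigations disabled.
label=$(grep -q 'mitigations=off' /proc/cmdline 2>/dev/null && echo off || echo on)

# Time a burst of tiny read/write syscalls; repeat after rebooting with
# the opposite setting and compare the two recorded durations.
start=$(date +%s%N)
dd if=/dev/zero of=/dev/null bs=64 count=200000 status=none
end=$(date +%s%N)
echo "mitigations=$label: $(( (end - start) / 1000000 )) ms" | tee "bench-$label.txt"
```

Run your real workload the same way too: syscall-light, CPU-bound jobs will show far smaller deltas than this worst-case loop.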

For all known exploits of the vulnerabilities being discussed, the attacker needs to run arbitrary code on the server. There is a slim theoretical possibility that they can be exploited without running arbitrary code. Also, the effect of the vulnerabilities is that the attacker can read data from outside a sandbox. So which sandboxes is an attacker running arbitrary code on a server likely to be in? A cost-benefit analysis should consider these things. (A *security* analysis should say leave them on no matter what.)
Margaret Bloom
@user253751 Meltdown/Spectre lets you read kernel secrets and escalate your privileges. L1TF also impacts VMM root mode (i.e. hypervisors). These vulnerabilities are LPE vulnerabilities, so being able to run code on the machine is taken for granted. It's not a far-fetched hypothesis: getting a foot inside an organization's network is easier than it seems (e.g. via malspam), and once there you can keep luring and looking for more secrets until you eventually get the credentials of a machine and run your LPE. This is how a ransomware attack works.