Score:3

Hyper-V Clustering Resource Availability


I have a Hyper-V cluster with 4 hosts and 51 VMs (roles) spread across them. I would like to keep it so that I can still power down 1 host at a time for maintenance/patching, so I'm looking at my resources. In performance monitor I'm only using 7-8% of the CPU however when I look at the Nodes in Failover Cluster Manager it says 75% CPU Usage.

Each host has about 1 TB of RAM and 96 Intel Xeon E7-8890 @ 2.2 GHz processors for 192 cores. I find it hard to believe that 51 VMs would be eating up 75% of the processors across 4 hosts. RAM I still have 800-900GB available in each host.

I would like to migrate about 100-120 VMs over to this cluster, but with the CPU usage where it is, that makes me a little nervous.

Am I reading the CPU Usage correctly? Is there a Hyper-V Sizing tool that can assist?

EDIT: For comparison's sake, I looked at one of the existing hosts on an older cluster. That host has 60 Xeon E7-4890 @ 2.8 GHz for 120 cores, and it's 50% utilized with 32 VMs running on it. That makes me wonder if something is configured incorrectly.

EDIT 2: Digging into it some more, when I run the following I get:

Get-WmiObject win32_processor | Select-Object -ExpandProperty LoadPercentage
1
100
100
100

To me that says that there's 4 sockets and something has pegged the last 3 at 100%, thus the 75% utilization.
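
As a quick check on that arithmetic, here is a minimal sketch that averages the four LoadPercentage values shown above. It assumes the 75% figure in Failover Cluster Manager is a plain average over the sockets, which the thread does not confirm; the names socket_loads and average_load are made up for illustration.

```python
# Per-socket LoadPercentage values as returned by the Get-WmiObject call above.
socket_loads = [1, 100, 100, 100]


def average_load(loads):
    """Return the arithmetic mean of per-socket load percentages."""
    return sum(loads) / len(loads)


# Assumption: the cluster-level number is a simple mean across sockets.
overall = average_load(socket_loads)
print(f"Average across {len(socket_loads)} sockets: {overall:.2f}%")  # 75.25%
```

Three pegged sockets and one idle one would land at roughly 75%, which is at least consistent with what Failover Cluster Manager shows.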

Given the age of the hardware, it may no longer be fit for service, and a candidate for replacement. Probably the same situation for the operating system platform. Perfmon isn't very meaningful in a spot check. That needs to be assessed over days/weeks of measuring usage.
Score:3

In performance monitor I'm only using 7-8% of the CPU

That looks high. Note that, for compatibility reasons, Performance Monitor only measures the VM you are logged in to, which is the master VM (the parent partition). Yes, the hypervisor runs UNDER Windows; technically, your Windows instance is itself a VM.

however when I look at the Nodes in Failover Cluster Manager it says 75% CPU Usage.

That figure covers all the VMs, so you are at 75%.

To me that says that there's 4 sockets and something has pegged the last 3 at 100%, thus the 75% utilization.

If it had 4 sockets, it would be a very old or very rare machine. I have had trouble finding anything beyond dual-socket servers for years.

I checked, and the CPU you mention has 24 cores and 48 threads. It is also a 2016 model, which puts it 3-4 generations behind. It has in fact reached the end of servicing updates (June 30th, 2022).

I would not expect it to handle a lot: memory speed will be awful and the CPU cache will be thrashing.

I would like to migrate about 100-120 VMs over

Then you may want to get a server that is not end of life.

it's 50% utilized with 32 VMs running on it

Seriously not. 51 VMs can need a lot of performance or very little, depending on what they are doing; no two VMs are the same. And on hardware that old, with RAM standards that old, what do you expect?

No one can help you here; you need to narrow it down a little more. It could be a driver. NO ONE can make a sizing tool, because how long is a piece of string? What is a VM?

RAM I still have 800-900GB available in each host.

That indicates something is unusual. Like they are not servers doing nothing. Also, it does not mean the RAM is not overloaded speed-wise. Funny how someone doing this professionally looks only at the amount of RAM and ignores its speed. You have ONLY 85 GB/s of memory bandwidth on those things as per spec, and that assumes they are fully populated with the fastest RAM. If they are not, that may be a limiting factor that results in high CPU usage.

Man, seriously, if you want to consolidate, get a modern server. AMD just released the nice Genoa-X with a LOT of cache (up to 1 GB), which is exactly what virtualization wants. As for the box you have, you are getting pretty much what I would expect from it.

I will not disagree with your assessment that this is older equipment and needs to be replaced. I have been a VMware engineer since 2014 and previously ran WAY more VMs on way older hosts without issue. In my new team I am given a box, and I am trying my best to make it work with the equipment and funds that I have. My box says: run this ancient equipment with only Microsoft products. While I am trying to get the funding to purchase new servers, these will need to work for now. My question was what could be destroying CPU load when an even older PRD doesn't have this issue.
"I was previously a VMWare Engineer since 2014 and ran WAY more VM's on way older hosts without issue" As a VMware engineer you should know how ridiculously irresponsible that statement is. The number of VMs depends on the workload, and you give ZERO indication that the VMs here have the workload you had before. If you want to know what uses the CPU, start with baseline administrator work: find out which VMs use the CPU, then log into those VMs and find out which processes on them are responsible. Maybe something is wrong there, or maybe the VMs in that NEW COMPANY just need more CPU.