Score:0

Process with very high RAM usage takes very long to stop using force stop in Task Manager

br flag

We have a Windows Server 2016 machine with around 700GB of physical RAM. A colleague of mine ran a machine learning script in Matlab that loaded 25GB of data into RAM, and during training the RAM usage rose to 350GB (usual behavior for many AI algorithms during training). This led to a big drop in performance for many other people (including the colleague who did this). He tried stopping it by force-stopping the Matlab process tree (one node only) from the Task Manager.

The process is still "stopping" 2 hours later. We noticed that the RAM usage is gradually dropping, but only at around 200KB/s. Restarting the machine is currently not possible.
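To put that rate in perspective, here is a rough back-of-the-envelope sketch. It assumes the full 350GB has to be released at a constant 200KB/s and that the units are binary (GiB and KiB); both are assumptions for illustration, not measurements from the server.

```python
# Rough estimate: how long would it take to release the memory
# if the observed rate stayed constant? (Assumes binary units.)
GIB = 1024 ** 3
KIB = 1024


def drain_time_seconds(used_bytes: float, rate_bytes_per_s: float) -> float:
    """Time needed to release used_bytes at a constant rate."""
    return used_bytes / rate_bytes_per_s


seconds = drain_time_seconds(350 * GIB, 200 * KIB)
days = seconds / 86400  # 86400 seconds per day
print(f"{seconds:.0f} s, roughly {days:.1f} days")
```

Under these assumptions the release would take on the order of three weeks, so waiting for it to finish is not a practical option.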

Any idea what is going on here? Normally killing a process should go past gentle shutdown procedures. At least this is my experience.

Update: a day later, the Matlab process's RAM usage has increased to 357GB.

Ramhound avatar
cn flag
“Normally killing a process should go past gentle shutdown procedures.” - The system has no memory to perform that operation since all physical and virtual memory was used by a single process. Seems strange it was able to use more than 5x the amount of memory that the system has.
paladin avatar
id flag
It's Skynet emerging! Remove the power plug! ;-) It's probably a recursive problem. The machine learning process has created many sub-processes, which in turn create further sub-processes. Killing them all at once is not possible because the processes are unknown, so they are being killed one by one while new processes are generated at the same time. Next time use Matlab within a VM and kill that VM.
Mokubai avatar
cn flag
Unless the machine actually had 350GB of physical RAM it probably wrote all of that used memory out to the page file and ballooned the page file out to 350+ GB as well. Now it is having to read in every page of memory that got paged out, invalidate it and then release it from the page table stored in memory. Whoever administrates this server should set some kind of sensible upper limit to page file size.
Ginnungagap avatar
gu flag
The question doesn't seem to have been edited and it clearly states that the server actually has 700GB of RAM so the script consumed about half of that, no reason for the pagefile to be involved (from what is stated in the question).
rbaleksandar avatar
br flag
It's a server. It has a lot of memory (700GB physical). I will add the info to the question so that there is no confusion. Also one day later the memory usage is still 49-50%. :D
vidarlo avatar
ar flag
How was the process killed? `taskkill /f /im:foo.exe` will force it, and may be quicker. How much swap is used?
cn flag
The amount of memory is meaningless if it is fragmented. You would need a memory dump to analyze, but this is beyond your capability. `Restarting the machine is currently not possible`. Then use another server.
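For reference, the forced kill suggested in the comments can be scripted. The following is a minimal sketch, assuming the PID is already known (the PID in the usage comment is made up). It uses the documented `taskkill` switches `/F` (force), `/PID` and `/T` (include child processes); the helper names `build_taskkill_command` and `force_kill` are hypothetical. It is not guaranteed to be any quicker than Task Manager for a process that is already terminating.

```python
import subprocess
import sys
from typing import List


def build_taskkill_command(pid: int, tree: bool = True) -> List[str]:
    """Build a taskkill command line that force-terminates a process by PID."""
    cmd = ["taskkill", "/F", "/PID", str(pid)]
    if tree:
        cmd.append("/T")  # also terminate child processes
    return cmd


def force_kill(pid: int, tree: bool = True) -> subprocess.CompletedProcess:
    """Run taskkill; only meaningful on Windows."""
    if sys.platform != "win32":
        raise RuntimeError("taskkill is only available on Windows")
    return subprocess.run(
        build_taskkill_command(pid, tree), capture_output=True, text=True
    )


# Hypothetical usage on the server:
# result = force_kill(4321)
# print(result.returncode, result.stdout, result.stderr)
```

Capturing the output this way also records the exact error message if Windows refuses the request.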
Score:0
ca flag

You probably killed the main process without killing the entire process tree. I suggest using ProcessExplorer to discover which processes are using RAM the most and then killing the entire process tree.
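If Process Explorer is not at hand, a similar ranking can be approximated from the output of `tasklist /FO CSV`. This is only a sketch: it assumes the English column headers `Image Name`, `PID` and `Mem Usage` (localized Windows installations use different headers), the function name `top_memory_processes` is hypothetical, and the sample rows are invented for illustration.

```python
import csv
import io

# Invented sample in the shape of `tasklist /FO CSV` output (English locale).
SAMPLE = "\n".join([
    '"Image Name","PID","Session Name","Session#","Mem Usage"',
    '"System Idle Process","0","Services","0","8 K"',
    '"MATLAB.exe","4321","RDP-Tcp#3","2","367,001,600 K"',
    '"explorer.exe","2208","RDP-Tcp#3","2","95,412 K"',
])


def top_memory_processes(tasklist_csv: str, n: int = 5):
    """Return the n largest (image name, PID, memory in KB) tuples."""
    rows = []
    for row in csv.DictReader(io.StringIO(tasklist_csv)):
        # "367,001,600 K" -> 367001600; keep digits only
        digits = "".join(ch for ch in row["Mem Usage"] if ch.isdigit())
        if digits:
            rows.append((row["Image Name"], int(row["PID"]), int(digits)))
    rows.sort(key=lambda r: r[2], reverse=True)
    return rows[:n]


for name, pid, kb in top_memory_processes(SAMPLE, 2):
    print(f"{name} (PID {pid}): {kb / 1024 ** 2:.1f} GB")
```

On a live system the CSV text would come from running `tasklist /FO CSV` and capturing its output rather than from the sample string.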

rbaleksandar avatar
br flag
Actually I went for the whole process tree (as stated in the question) even though in the tree there was just a single node. I did also try ProcessExplorer. The process currently cannot be stopped. Windows refuses to do so with the respective error message, which usually occurs whenever you try to stop something that is already in the process of stopping.
shodanshok avatar
ca flag
Can you share the precise error message (or a screenshot)?