
Optimize system for rapidly loading large volumes of data


I'm running a complex Matlab program on my Ubuntu computer. It needs to load several .MAT files, each of which is several GB on disk and larger in RAM. I know it is possible to run this program - a coworker of mine who has a MacBook with 32 GB of RAM almost never sees it crash - but my tower, with 128 GB of RAM and Ubuntu 22.04, has not yet successfully run it all the way through, always crashing for lack of memory.

I am not worried about other programs hoarding significant amounts of memory. The memory is totally going to Matlab as it loads the data and as it processes the data.

One obvious measure is to increase the size of the swap file; I increased it from 2 GB to 32 GB. However, that seems not enough. I was perplexed to see the RAM fill up to 99.6% and hover around there while the swap space remained oblivious, staying at about 26% full and not changing much for a minute until the program finally crashed.
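To watch the RAM and swap figures described above while the program runs, a small script can compute them from `/proc/meminfo`. This is a minimal sketch, assuming a Linux system; the function names (`parse_meminfo`, `usage_percentages`, `read_usage`) are made up for illustration, and "used" RAM is taken here as `MemTotal - MemAvailable`, which may differ slightly from what a system monitor reports.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def usage_percentages(info):
    """Return (ram_used_pct, swap_used_pct) from parsed meminfo values."""
    ram_used = info["MemTotal"] - info["MemAvailable"]
    swap_total = info["SwapTotal"]
    swap_used = swap_total - info["SwapFree"]
    ram_pct = 100.0 * ram_used / info["MemTotal"]
    # Guard against systems with no swap configured.
    swap_pct = 100.0 * swap_used / swap_total if swap_total else 0.0
    return ram_pct, swap_pct


def read_usage(path="/proc/meminfo"):
    """Read the live figures from the kernel (Linux only)."""
    with open(path) as f:
        return usage_percentages(parse_meminfo(f.read()))
```

Calling `read_usage()` in a loop with a short sleep, while Matlab loads the files, would show whether swap use actually grows as RAM fills or stays flat until the crash.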

It might be that I decreased the swappiness too much - I saw recommendations of decreasing it from the default 60 to a much lower value, so I put it at 5. I've since increased the swappiness to 15, but maybe I should put it higher still? Alongside swappiness I also saw mentions of cache pressure, but I think I want to keep the cache pressure as high as possible, so the default of 100 seems good.
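For reference, both settings can be read back from `/proc/sys/vm` to confirm what the kernel is actually using. The sketch below assumes a Linux system; `read_vm_settings` is a hypothetical helper name. As background, a higher `vm.swappiness` makes the kernel more willing to swap out application memory, and a higher `vm.vfs_cache_pressure` makes it reclaim directory and inode caches more aggressively.

```python
from pathlib import Path


def read_vm_settings(base="/proc/sys/vm",
                     names=("swappiness", "vfs_cache_pressure")):
    """Return the current values of the named vm sysctls as integers.

    Settings whose file is missing under `base` are skipped.
    """
    settings = {}
    for name in names:
        path = Path(base) / name
        if path.is_file():
            settings[name] = int(path.read_text().strip())
    return settings
```

On the machine described here, `read_vm_settings()` would be expected to report the swappiness of 15 and the default cache pressure of 100. Changing a value needs root, for example `sudo sysctl vm.swappiness=60`, and does not persist across reboots unless it is also written to a sysctl configuration file.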

Anything else I'm missing?

One possibility is that one of the libraries Matlab uses is leaking memory. The old rule of thumb of keeping swap at double the size of RAM is (mostly) no longer relevant, but in your case it may be worth trying: your program might finish before it eats all the swap. Another alternative is to rework your program to load one (or two) files at a time, that is, to work through the datasets sequentially.
Marco
Matlab on Apple Silicon runs under "Rosetta 2", and as far as I know this uses memory compression.
Post169
@Marco - interesting, so on macOS, Matlab is not only nimble at using swap, but it also compresses the data before putting it in, with an algorithm that apparently saves more time than it takes. I don't suppose there's an easy way to make Matlab on Ubuntu put data through a compression algorithm before putting it into swap and after taking it back out?
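On the Linux side, compression before swap is a kernel feature rather than something Matlab does itself: zswap keeps a compressed cache of pages in RAM in front of the swap device, and zram provides a compressed block device in RAM. Whether either would help this particular workload is untested here. A minimal sketch for checking whether zswap is currently switched on, assuming the kernel exposes the usual `/sys/module/zswap/parameters/enabled` file (`zswap_enabled` is a hypothetical helper name):

```python
from pathlib import Path


def zswap_enabled(param_path="/sys/module/zswap/parameters/enabled"):
    """Return True/False for the zswap 'enabled' flag, or None if the
    parameter file does not exist (zswap not available in this kernel)."""
    path = Path(param_path)
    if not path.is_file():
        return None
    # The kernel reports this boolean parameter as "Y" or "N".
    return path.read_text().strip().upper() in ("Y", "1")
```

A result of `False` would mean pages go to the swap file uncompressed, which is one difference from the MacBook setup mentioned in the comments.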