When you say "we upped it from 128GB to 192GB and it hasn't solved the problem", what do you mean? The JVM heap? The RAM of the RHEL VM? Also, what do you mean by "our monitoring takes a reading"? Is your monitoring looking at Java heap memory or at system memory?
Is it possible to get an OOM with plenty of RAM available?
Sure. The most common cause is that "plenty of RAM is available", but not of the right kind. e.g. the server has free RAM, but the Java process isn't configured to use it. Or the Java heap has headroom, but the application needs stack memory instead of heap memory. Or permgen/metaspace memory. Or off-heap memory.
There are some other edge cases where you can get an OOM error even with the above, but those are pretty rare. Most likely it is that you are adding the wrong kind of memory.
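To make "the wrong kind of memory" concrete: each kind has its own limit flag on HotSpot, and exhausting any one of them raises an OOM even while the others have plenty of headroom. A sketch of a launch command (the values and `myapp.jar` are illustrative, not recommendations):

```shell
# Separate limits for separate kinds of memory:
#   -Xmx                      max Java heap
#   -Xss                      stack size per thread
#   -XX:MaxMetaspaceSize      class metadata (PermGen before JDK 8)
#   -XX:MaxDirectMemorySize   off-heap (direct ByteBuffer) memory
java -Xmx8g -Xss1m -XX:MaxMetaspaceSize=512m -XX:MaxDirectMemorySize=2g \
     -jar myapp.jar
```

Adding RAM to the server moves none of these limits; only the flags do.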
If I were debugging this, my first steps would be:
- Pinning down the exact OOM error text and where you are seeing it.
- Looking at the JVM startup flags (and potentially the config of the application, depending on what kind of application it is).
- Enabling GC logging in the application.
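For the GC-logging step, the flags depend on the JDK version (the log path here is illustrative):

```shell
# JDK 9 and later: unified logging
java -Xlog:gc*:file=gc.log -jar myapp.jar

# JDK 8 and earlier
java -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
     -Xloggc:gc.log -jar myapp.jar
```

If the heap really is exhausted, the log will show back-to-back full GCs that reclaim almost nothing just before the OOM; if the log looks healthy, the problem is one of the other kinds of memory.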
EDIT IN RESPONSE TO STACK TRACE:
Well, it looks like my "there are some other edge cases" comment was prophetic. I agree with Philipp Wendler's comment that this is a duplicate of https://stackoverflow.com/q/16789288/396730. You aren't actually running out of memory; you are running out of threads.
You can look at https://access.redhat.com/solutions/1420363 for how to increase the number of threads (short version: raise /proc/sys/kernel/threads-max). But as is discussed in the linked Stack Overflow post, you probably need to fix your application rather than just bump the limit. Any application using more than the default maximum number of threads is probably leaking threads (and if it isn't, it is certainly being wasteful of them), especially given that you say you aren't being flooded with requests.
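To confirm that threads are the resource being exhausted, you can watch the JVM's own counters via the standard `ThreadMXBean` (a minimal sketch; in practice you would log these periodically rather than print once). A live count that climbs without bound under steady load is the classic leak signature:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class ThreadWatch {
    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Live threads right now; if this climbs steadily under constant
        // load, the application is leaking threads.
        System.out.println("live threads:  " + mx.getThreadCount());
        // High-water mark since JVM start (or the last reset).
        System.out.println("peak threads:  " + mx.getPeakThreadCount());
        // Total ever started; a large gap between this and the live count
        // means threads are short-lived (wasteful, but not a leak).
        System.out.println("total started: " + mx.getTotalStartedThreadCount());
    }
}
```

You can cross-check against the OS view with `ls /proc/<pid>/task | wc -l` for the Java process.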