I have a Java application running on an EC2 instance. Nginx and MongoDB are also running on the instance. The application is accessed through an ELB, which forwards requests to the instance.
I have two copies of the exact same instance running, with one receiving slightly more traffic than the other (as it serves the assets for both apps). However, only the main, asset-serving instance falls over.
Most mornings the EC2 instance falls over, so the app stops running and I receive a text from SNS. It's often at 4:01am UTC (which doesn't seem like a coincidence), but there have been other failure times ranging from 1:26am to 5:21am.
This is odd, as the app is used during the day and not at night. I have confirmed this with both the nginx logs on the instance and the app logs.
The instance is a t2.micro, but before I increase its size I'd like to understand the cause of the issue. It seems to handle things fine during peak usage, so it doesn't make sense to me that it fails in the early morning.
At the point where the issue occurs there is a minor spike in CPU usage from ~2% to ~8%.
The suspicious statistic is a huge spike in read bandwidth on the EBS volume just before the crash, which seems to persist until the restart.
[Graph: spike in read bandwidth on the EBS volume]
The only activity I'm aware of on the volume is a mongo backup job, which dumps the database and uploads an archive to S3 at 2:40am.
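
As a quick sanity check on whether the backup timing lines up with the failures, here is a minimal sketch that computes how many minutes each failure time falls before or after the 2:40am backup start. The times are the ones quoted in this post; the names `BACKUP_START`, `FAILURE_TIMES` and `minutes_after_backup` are just illustrative, and it assumes each failure is compared against the backup on the same calendar day.

```python
from datetime import datetime

BACKUP_START = "02:40"  # backup job start (UTC), as stated in this post
FAILURE_TIMES = ["04:01", "01:26", "05:21"]  # failure times (UTC) quoted in this post


def minutes_after_backup(failure: str, backup: str = BACKUP_START) -> int:
    """Signed minutes from the backup start to a failure on the same day.

    Assumption: both times are on the same calendar day, so a negative
    value means the failure happened before that day's backup started.
    """
    fmt = "%H:%M"
    delta = datetime.strptime(failure, fmt) - datetime.strptime(backup, fmt)
    return int(delta.total_seconds() // 60)


offsets = {t: minutes_after_backup(t) for t in FAILURE_TIMES}
for t, off in offsets.items():
    print(f"{t} UTC: {off:+d} min relative to backup start")
```

By this measure the common 4:01am failure is 81 minutes after the backup starts, but the 1:26am failure is 74 minutes before it, so the backup alone probably cannot account for every crash.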
Can someone please give me some insight into what's causing this?
Apologies if this is not enough information.