AWS Elastic Inference not working after waking up from hibernation

I babysit a Python daemon on an EC2 instance that watches for incoming jobs and runs PyTorch inference on them using Elastic Inference.

When there are no jobs, I hibernate the instance. When there are jobs, the instance is woken up and the Python script continues its loop from where it was frozen.

When the script tries to run an inference after waking up from hibernation, it throws this error:

EI Error Code: [1, 4, 1]
EI Error Description: Internal error
EI Request ID: PT-1F304B24-DCB7-48A0-8ABB-0D30XXXXXXXX  --  EI Accelerator ID: eia-7646efb5xxxxxxxxxxxxxxxxxxxxxxxx
EI Client Version: 1.7.0

If I do not hibernate (that is, I either run continuously or do a full stop/start), everything works.

I prefer to hibernate because waking up resumes job processing much faster than a cold start.

How can I debug this issue?

I would imagine there is some process/memory association with the EI accelerator when the script is running, and that is lost on hibernation. Is there no way to make it persist?
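Because the script resumes mid-loop, it has no built-in signal that a hibernation happened. A minimal sketch of one way to notice a resume, assuming the loop polls at a regular interval and that the wall clock is corrected after wake-up (both are assumptions, and `ResumeDetector` is a hypothetical helper, not part of PyTorch or the EI client):

```python
import time


class ResumeDetector:
    """Heuristic: a large wall-clock gap between two loop iterations
    suggests the process was frozen (e.g. by hibernation) in between."""

    def __init__(self, max_gap_seconds=30.0, clock=time.time):
        # `clock` is injectable so the detector can be tested without sleeping.
        self._clock = clock
        self._max_gap = max_gap_seconds
        self._last = clock()

    def resumed(self):
        """Return True if more than max_gap_seconds passed since the last call."""
        now = self._clock()
        gap = now - self._last
        self._last = now
        return gap > self._max_gap
```

Calling `resumed()` once per loop iteration would let the daemon log the event or re-initialise its model before the next inference. The threshold has to be larger than the longest normal iteration, otherwise a slow job looks like a resume.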

Tim
What happens if you build in a delay of maybe 10 seconds before you try to do the job? This sounds like an ideal job for a serverless solution: https://aws.amazon.com/blogs/machine-learning/machine-learning-inference-at-scale-using-aws-serverless/
Greendrake
@Tim Re serverless, it's a TorchScript model that I need to run inferences on. The last time I checked (about 6 months ago) it was either not supported on Lambda or required Docker containers, which suffered from the same cold start problem.
Tim
Elastic Inference doesn't seem to be available in Lambda anyway. Hibernating an instance is probably going to have a longer cold start than a Lambda container, if it did work. Given that EI is a network-based attachment, I guess the attachment is lost when the instance is hibernated, so either you can't hibernate or you need to give it a chance to re-establish the interface when the instance comes up.
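The "delay, then give it a chance to re-establish" idea can be sketched as a retry wrapper that waits and reloads the model after a failed inference. This is only a sketch under assumptions: `infer_with_recovery`, `load_model` and `run_inference` are hypothetical names, the error is assumed to surface as a `RuntimeError`, and whether reloading the model actually restores the EI connection after hibernation is untested.

```python
import time


def infer_with_recovery(load_model, run_inference, job, model=None,
                        retries=3, delay_seconds=10.0, sleep=time.sleep):
    """Run inference; on failure wait, reload the model, and try again.

    Returns (result, model) so the caller can keep using the fresh model.
    """
    if model is None:
        model = load_model()
    for attempt in range(retries + 1):
        try:
            return run_inference(model, job), model
        except RuntimeError as err:  # assumed exception type; adjust as needed
            if attempt == retries:
                raise  # out of retries: surface the original error
            print(f"Inference failed (attempt {attempt + 1}): {err}")
            sleep(delay_seconds)   # give the attachment time to come back
            model = load_model()   # drop stale state and start fresh
```

In the daemon, `load_model` would wrap whatever the script already does to load the TorchScript model (for example a `torch.jit.load` call), and `run_inference` would wrap the existing forward pass. If the error persists after several reloads, that points toward the attachment itself being lost rather than stale state in the process.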