Could you share the error message from your MIG log? This could be an issue with the initial delay, so I suggest reviewing how the health check and autohealing policy are configured in your MIG. One setting you can adjust there is the MIG's --initial-delay. It keeps autohealing from prematurely recreating a VM that is still starting up, which could help with your startup script issue. A VM sometimes needs more time to finish its startup script. A longer delay also helps when the network is slow, because some startup script issues come from connectivity problems with the metadata server. To avoid this, you can increase the initial delay in the MIG's autohealing policy. You can inspect your health check with the following command:
gcloud compute health-checks describe <health check name>
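Since the startup script is the suspect, it may also help to look at what the script printed while the VM booted. The helper below is only a sketch: the function name instance_startup_log is made up for illustration, and it assumes the guest environment tags its lines with "startup-script" in the serial console output, which may differ on your image.

```shell
# Hypothetical helper: print the startup-script lines from a VM's serial console.
# Usage: instance_startup_log INSTANCE_NAME ZONE
instance_startup_log() {
  # get-serial-port-output prints the console log to stdout; its
  # informational messages go to stderr, so they are discarded here.
  gcloud compute instances get-serial-port-output "$1" --zone "$2" 2>/dev/null \
    | grep -i "startup-script"
}

# Example (replace with one of your instance names and its zone):
# instance_startup_log igm-with-hc-fvz6 <zone>
```

Comparing the timestamps of the first and last startup-script lines gives a rough idea of how long the script runs, which is a reasonable lower bound for the initial delay.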
You can set the health check and the initial delay on the MIG with the update command, as shown in the following example:
gcloud compute instance-groups managed update my-mig \
--health-check example-check \
--initial-delay 300 \
--zone us-east1-b
In this example, the initial delay is set to 300 seconds (5 minutes). For more information, see the Google Cloud documentation on setting up health checking and autohealing in a MIG.
You can also check the state of your instances at any time with this command:
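To confirm that the update took effect, you could read the autohealing policy back from the MIG. This is a minimal sketch: the function name show_autohealing_policy is hypothetical, and it assumes a zonal MIG like the my-mig example (a regional MIG would need --region instead of --zone).

```shell
# Hypothetical helper: print the autohealing policy of a zonal MIG.
# Usage: show_autohealing_policy MIG_NAME ZONE
show_autohealing_policy() {
  # The policy is reported under autoHealingPolicies and includes the
  # health check URL and the initial delay in seconds (initialDelaySec).
  gcloud compute instance-groups managed describe "$1" \
    --zone "$2" \
    --format="yaml(autoHealingPolicies)"
}

# Example with the names used in the update command:
# show_autohealing_policy my-mig us-east1-b
```

If the update was applied, the output should list example-check as the health check and an initialDelaySec of 300.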
gcloud compute instance-groups managed list-instances your-instance-group
NAME              ZONE          STATUS   HEALTH_STATE  ACTION  INSTANCE_TEMPLATE  VERSION_NAME  LAST_ERROR
igm-with-hc-fvz6  europe-west1  RUNNING  HEALTHY       NONE    my-template
igm-with-hc-gtz3  europe-west1  RUNNING  HEALTHY       NONE    my-template
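If instances keep getting recreated, the MIG's own error list is usually the quickest place to find the message. The two helpers below are a sketch under the same assumption of a zonal MIG; the function names mig_errors and mig_wait_stable are made up for illustration.

```shell
# Hypothetical helper: show recent errors the MIG hit while creating
# or recreating instances.
# Usage: mig_errors MIG_NAME ZONE
mig_errors() {
  gcloud compute instance-groups managed list-errors "$1" --zone "$2"
}

# Hypothetical helper: block until the MIG has no pending actions,
# for example after changing the initial delay.
# Usage: mig_wait_stable MIG_NAME ZONE
mig_wait_stable() {
  gcloud compute instance-groups managed wait-until "$1" --stable --zone "$2"
}

# Example with the names used in the update command:
# mig_errors my-mig us-east1-b
# mig_wait_stable my-mig us-east1-b
```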