I have an application hosted on Google Kubernetes Engine that uses a Horizontal Pod Autoscaler to scale automatically with current demand (e.g. a 30% CPU target). In general, this works very well and allows us to keep costs down. However, we occasionally see quite big spikes: when we send push notifications to our users, they open the mobile app and traffic rises dramatically, to about 30x the normal load, and the deployment scales to 90 pods instead of the usual 5-10. Most of the time this works quite well, but sometimes the scale-up is not fast enough.
What is the easiest and, above all, most reliable way to scale the system automatically in advance? We know at least 10 minutes ahead of time that push notifications will be sent. Done manually, I would simply raise the minimum number of pods to 50; that is the step I want to automate.
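To make the manual step concrete, here is a minimal sketch of the logic I would want automated. The baseline of 5, the 10-minute hold after the push, and the function names are my own assumptions for illustration; only `spec.minReplicas` is an actual HorizontalPodAutoscaler field.

```python
from datetime import datetime, timedelta

BASELINE_MIN = 5                 # assumed normal minReplicas (we usually run 5-10 pods)
BOOSTED_MIN = 50                 # the value I would set by hand before a push
LEAD = timedelta(minutes=10)     # we know about the push at least this early
HOLD = timedelta(minutes=10)     # assumed: keep the floor raised this long afterwards


def min_replicas_at(now: datetime, push_time: datetime) -> int:
    """Return the minReplicas floor that should be in effect at `now`."""
    if push_time - LEAD <= now <= push_time + HOLD:
        return BOOSTED_MIN
    return BASELINE_MIN


def hpa_patch(min_replicas: int) -> dict:
    """Build a merge-patch body that changes only spec.minReplicas of an HPA."""
    return {"spec": {"minReplicas": min_replicas}}


if __name__ == "__main__":
    push = datetime(2020, 1, 1, 12, 0)  # hypothetical push time
    for minute in (45, 55, 65, 75):
        now = datetime(2020, 1, 1, 11, 0) + timedelta(minutes=minute)
        print(now.time(), hpa_patch(min_replicas_at(now, push)))
```

The patch body could be applied with something like `kubectl patch hpa <name> --patch '{"spec":{"minReplicas":50}}'`, but something still has to run this reliably at the right times, which is exactly the part I would rather not build myself.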
I am sure there are several ways to do that, but I am looking for something that does not involve a lot of custom code, since custom code is just one more place where things can break, e.g. the code has a bug and does not scale back down properly, or starts too many pods.
What I am thinking of is using custom metrics. Maybe I could print a log statement and convert it into a Stackdriver metric. It would be enough for me if the minimum number of replicas were set to 50 for as long as the log statements are being printed, and the deployment scaled back down after 10 minutes.
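To sanity-check the custom-metric idea, this is a rough sketch of how I understand the autoscaler would combine the two metrics: it computes a desired replica count per metric and takes the largest. The metric value of 50, the target average value of 1, and the bounds of 5 and 100 are hypothetical numbers I picked; the sketch also ignores the autoscaler's tolerance and scale-down stabilization window.

```python
import math


def desired_from_cpu(current_replicas: int, current_util: int, target_util: int) -> int:
    """HPA rule for a utilization target: ceil(current * currentValue / targetValue)."""
    return math.ceil(current_replicas * current_util / target_util)


def desired_from_external(metric_value: float, target_average_value: float) -> int:
    """External metric with an average-value target: ceil(value / targetAverageValue)."""
    return math.ceil(metric_value / target_average_value)


def hpa_decision(cpu_desired: int, external_desired: int,
                 min_replicas: int, max_replicas: int) -> int:
    """Take the largest per-metric proposal, then clamp to [min, max]."""
    proposal = max(cpu_desired, external_desired)
    return max(min_replicas, min(max_replicas, proposal))


if __name__ == "__main__":
    cpu = desired_from_cpu(current_replicas=8, current_util=30, target_util=30)
    # While the log-based metric reports 50 (hypothetical), it dominates the CPU proposal.
    print(hpa_decision(cpu, desired_from_external(50, 1), 5, 100))
    # Once the metric drops back to 0, CPU alone decides again.
    print(hpa_decision(cpu, desired_from_external(0, 1), 5, 100))
```

If that understanding is right, the metric would act as a temporary floor of 50 without touching the minimum replica setting at all, and CPU-based scaling could still go above 50 during the spike.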
Does anybody know whether this might work, or have a better idea?