Score:0

K8S: limit the number of pods starting at the same time


I have a problem with Kubernetes. I run about 30 microservices (Java on Spring Boot). The services need a lot of CPU only at startup, because they load many libraries. When I deploy all of them at the same time, the CPU load on the nodes gets so high that the nodes are marked as unavailable. I need some way to limit how many pods start simultaneously to avoid this. Is there a way to do that?

SYN
You need to adjust resource limits/requests based on consumption.
Score:1

You can set CPU limits and requests.

Once you set these, even if the limits are generous, the kubelet and container runtime work together to enforce the CPU limits. Along with that, you can reserve resources for Kubernetes itself so that the workload doesn't put the overall node at risk.
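As a rough sketch of what that could look like, assuming a hypothetical Deployment called `orders-service` (the name, image and every number here are placeholders, so measure your own services before picking values), the CPU request is kept near steady-state usage and the limit is set higher so the JVM can burst while it starts:

```yaml
# Hypothetical Deployment; name, image and all numbers are illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-service
spec:
  replicas: 1
  selector:
    matchLabels:
      app: orders-service
  template:
    metadata:
      labels:
        app: orders-service
    spec:
      containers:
        - name: app
          image: registry.example.com/orders-service:1.0.0
          resources:
            requests:
              cpu: 250m      # amount the scheduler reserves on the node
              memory: 512Mi
            limits:
              cpu: "2"       # ceiling for startup bursts; usage above it is throttled
              memory: 512Mi
```

Reserving capacity for Kubernetes and the operating system is done in the kubelet configuration file rather than in a workload manifest. Again, the values are only an assumption:

```yaml
# Fragment of a kubelet configuration file; values are illustrative.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
kubeReserved:        # held back for Kubernetes components such as the kubelet
  cpu: 500m
  memory: 512Mi
systemReserved:      # held back for OS daemons (sshd, systemd, ...)
  cpu: 500m
  memory: 512Mi
```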

Once you define these, the Linux kernel becomes responsible for enforcing the limits and ensuring that the available resources are allocated fairly. If you have DaemonSets in your cluster, make sure that these too have requests and limits. You could consider running the DaemonSets with the Guaranteed QoS class, so that their resources are ring-fenced. See Configure Quality of Service for Pods.
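For illustration, a Pod gets the Guaranteed QoS class when every container in it sets CPU and memory requests equal to its limits. A minimal sketch, assuming a hypothetical DaemonSet named `node-log-agent` with placeholder image and values:

```yaml
# Hypothetical DaemonSet; name, image and numbers are illustrative.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-log-agent
spec:
  selector:
    matchLabels:
      app: node-log-agent
  template:
    metadata:
      labels:
        app: node-log-agent
    spec:
      containers:
        - name: agent
          image: registry.example.com/node-log-agent:1.0.0
          resources:
            # requests == limits for both CPU and memory => Guaranteed QoS
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 100m
              memory: 128Mi
```

You can check the result with `kubectl get pod <pod-name> -o jsonpath='{.status.qosClass}'`.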

Taken together, these measures should protect your nodes from the workload whilst still allowing the app Pods to burst into available CPU during startup.

If you find there are still issues, there is an extra step you can take: delay each startup by a random amount. You can do that without app changes, by running a custom init container prior to the main app startup. That random delay helps to avoid thundering herd issues where each JVM runs with the same resource access pattern at the same time.
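A minimal sketch of such an init container, assuming the `busybox` image (whose `shuf` applet supports `-i` and `-n`) and an arbitrary 0 to 59 second window; the container names and the app image are hypothetical. This fragment goes under `spec.template.spec` of the Deployment:

```yaml
# Pod template fragment; names, images and the delay window are illustrative.
initContainers:
  - name: startup-jitter
    image: busybox:1.36
    # Sleep for a random 0-59 seconds before the app container starts.
    # Backticks are used instead of $(...) to stay clear of Kubernetes'
    # own $(VAR_NAME) expansion in command strings.
    command: ["sh", "-c", "sleep `shuf -i 0-59 -n 1`"]
    resources:
      requests:
        cpu: 10m
        memory: 16Mi
      limits:
        cpu: 10m
        memory: 16Mi
containers:
  - name: app
    image: registry.example.com/orders-service:1.0.0
```

The Pod is still scheduled immediately; only the start of the main container is delayed, so widen the window if 30 JVMs still overlap too much.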

Score:0

If you're sure that you want to rate-limit the number of Pods being run during a rollout, you can extend your Kubernetes cluster by deploying Argo Rollouts.
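As a hedged sketch of what that might look like, assuming Argo Rollouts is already installed in the cluster and using a hypothetical service name, image, replica count and pause lengths, a canary strategy without traffic routing brings new Pods up in steps instead of all at once:

```yaml
# Hypothetical Rollout; name, image, replicas and durations are illustrative.
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: orders-service
spec:
  replicas: 4
  selector:
    matchLabels:
      app: orders-service
  template:
    metadata:
      labels:
        app: orders-service
    spec:
      containers:
        - name: app
          image: registry.example.com/orders-service:1.0.0
  strategy:
    canary:
      steps:
        - setWeight: 25          # start roughly a quarter of the new Pods
        - pause: {duration: 60s} # give those JVMs time to finish starting
        - setWeight: 50
        - pause: {duration: 60s}
        # after the last step the rollout proceeds to 100%
```

Note that this staggers Pods within one service only. It does not by itself coordinate 30 separate services that are deployed at the same moment.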

With effort, you could define your own deploy mechanism to limit the number of simultaneous deploys by setting ephemeral metadata and then making your own mechanism (either a validating admission webhook, or a scheduling plugin) to limit the rate at which pods with these labels are either created or started.

I don't recommend this: although Kubernetes makes it possible, it's a lot of work. My other answer is a simpler way to achieve a similar overall outcome.
