How to fix etcd within a Kubernetes cluster?

I have a bare-metal (kubeadm) Kubernetes cluster that's really unstable, and I traced the instability back to an etcd issue.

From the etcd pod's description I get:

Image: k8s.gcr.io/etcd:3.4.13-0
Liveness: ... #success=1 #failure=8
Startup:  ... #success=1 #failure=24

In the logs, the startup sequence seems fine (compared to another cluster), but then I get a lot of warnings:

etcdserver: [...] request ... took too long to execute

But I don't think it's hardware-related, because the 99th percentile of etcd_disk_backend_commit_duration_seconds is 16 ms, which is fine according to the FAQ.
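For anyone who wants to repeat that check without a Prometheus server, here is a minimal sketch that estimates the 99th percentile directly from the histogram buckets exposed on etcd's /metrics endpoint. The sample bucket values, the function names, and the 25 ms limit (the backend commit guideline I recall from the etcd FAQ) are assumptions for illustration, not data from this cluster. The interpolation is the usual linear one within the target bucket.

```python
import math
import re

# Matches lines such as:
#   etcd_disk_backend_commit_duration_seconds_bucket{le="0.016"} 99
BUCKET_RE = re.compile(r'^(\w+)_bucket\{.*?le="([^"]+)".*?\}\s+(\S+)$')

# Assumption: the FAQ guideline for backend commit p99, in seconds.
FAQ_P99_LIMIT_S = 0.025

# Hypothetical sample of /metrics output (not taken from the cluster above).
SAMPLE = """\
etcd_disk_backend_commit_duration_seconds_bucket{le="0.001"} 0
etcd_disk_backend_commit_duration_seconds_bucket{le="0.002"} 10
etcd_disk_backend_commit_duration_seconds_bucket{le="0.004"} 50
etcd_disk_backend_commit_duration_seconds_bucket{le="0.008"} 90
etcd_disk_backend_commit_duration_seconds_bucket{le="0.016"} 99
etcd_disk_backend_commit_duration_seconds_bucket{le="0.032"} 100
etcd_disk_backend_commit_duration_seconds_bucket{le="+Inf"} 100
"""


def parse_buckets(metrics_text, metric):
    """Return sorted (upper_bound, cumulative_count) pairs for one histogram."""
    buckets = []
    for line in metrics_text.splitlines():
        m = BUCKET_RE.match(line.strip())
        if m and m.group(1) == metric:
            le = math.inf if m.group(2) == "+Inf" else float(m.group(2))
            buckets.append((le, float(m.group(3))))
    return sorted(buckets)


def histogram_quantile(q, buckets):
    """Estimate quantile q by linear interpolation inside the target bucket."""
    if not buckets or buckets[-1][1] == 0:
        return math.nan
    rank = q * buckets[-1][1]
    prev_le, prev_count = 0.0, 0.0
    for le, count in buckets:
        if count >= rank:
            if math.isinf(le) or count == prev_count:
                # Cannot interpolate into +Inf or an empty bucket.
                return prev_le
            return prev_le + (le - prev_le) * (rank - prev_count) / (count - prev_count)
        prev_le, prev_count = le, count
    return prev_le


if __name__ == "__main__":
    metric = "etcd_disk_backend_commit_duration_seconds"
    p99 = histogram_quantile(0.99, parse_buckets(SAMPLE, metric))
    verdict = "within" if p99 < FAQ_P99_LIMIT_S else "above"
    print(f"p99 = {p99 * 1000:.1f} ms ({verdict} the assumed FAQ limit)")
```

The same two functions can be pointed at etcd_disk_wal_fsync_duration_seconds, which is the other disk latency histogram worth ruling out before blaming something other than hardware.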

Anyway, this goes on for a few minutes, and then I get the following, which I guess is what triggers the restart:

etcdserver/api/etcdhttp: /health error; QGET failed etcdserver: request timed out (status code 503)

Any idea what further steps I can take to diagnose the issue and fix etcd?

Mikołaj Głodziak
Did you see this [issue](https://github.com/etcd-io/etcd/issues/11809)? Is it similar to yours?
Antoine
Well, it has some similarities, but in the issue you mention the timeouts start just after startup, whereas in my case they start after a few minutes of uptime. Also, it isn't clear whether there is a crash in the other issue, whereas for me there definitely is. But I'll continue to look into disk performance until I get a better idea...
Mikołaj Głodziak
Which version of Kubernetes are you using? Can you describe exactly how you set up the cluster?
Wytrzymały Wiktor
Hello @Antoine. Any updates?
Antoine
Thanks, I was able to get help on GitHub and resolve the issue: https://github.com/etcd-io/etcd/issues/13373. I think at some point my node changed its private IP because of hardware issues, and that caused configuration problems when etcd was upgraded. The fix was to dump and restore the etcd data.
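The comment above doesn't spell out the exact commands, so the following is only a sketch of what a dump and restore could look like using etcdctl's snapshot save and snapshot restore subcommands. The endpoint, the certificate paths (the kubeadm defaults), the snapshot path, and the restore directory are all assumptions; the linked GitHub issue is the place to check the procedure that was actually used. The script only prints the commands so they can be reviewed before anything is run.

```python
import shlex

# Assumptions: local client endpoint and default kubeadm certificate paths.
ENDPOINT = "https://127.0.0.1:2379"
PKI = "/etc/kubernetes/pki/etcd"


def snapshot_save_cmd(snapshot_path):
    """Build the etcdctl command that dumps the keyspace to a snapshot file."""
    return [
        "etcdctl",
        f"--endpoints={ENDPOINT}",
        f"--cacert={PKI}/ca.crt",
        f"--cert={PKI}/server.crt",
        f"--key={PKI}/server.key",
        "snapshot", "save", snapshot_path,
    ]


def snapshot_restore_cmd(snapshot_path, data_dir):
    """Build the etcdctl command that restores a snapshot into a new data dir."""
    return [
        "etcdctl",
        "snapshot", "restore", snapshot_path,
        f"--data-dir={data_dir}",
    ]


if __name__ == "__main__":
    # Hypothetical paths, chosen only for illustration.
    snapshot = "/var/backups/etcd-snapshot.db"
    new_data_dir = "/var/lib/etcd-restored"
    for cmd in (snapshot_save_cmd(snapshot), snapshot_restore_cmd(snapshot, new_data_dir)):
        print(shlex.join(cmd))
```

Restoring into a fresh directory leaves the old data untouched. On a kubeadm node the etcd static pod manifest would then need to point at the new directory, and a node whose IP has changed may also need the restore command's cluster membership flags set to match the new address.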