I have installed a multi-master cluster, following the guide "Setting up k8 multi master cluster".
The setup details are as follows.
Load balancer: HAProxy LB
frontend kubernetes-frontend
    bind 192.168.1.11:6443
    mode tcp
    option tcplog
    default_backend kubernetes-backend

backend kubernetes-backend
    mode tcp
    option tcp-check
    balance roundrobin
    server master21.server 192.168.1.21:6443 check fall 3 rise 2
    server master22.server 192.168.1.22:6443 check fall 3 rise 2
Kubernetes version: v1.25.0
No. of masters: 2
No. of workers: 2
Docker version: 23.0.1
cri-dockerd: v3.0
Environment: VMware virtual servers, CentOS 8
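As a side note on the kubernetes-backend above: it only does a plain TCP check. A TLS-aware health-check variant that I have been considering (not applied yet, just a sketch assuming HAProxy 2.x) would probe the apiserver's /healthz over HTTPS instead:

backend kubernetes-backend
    mode tcp
    balance roundrobin
    # Probe /healthz over HTTPS instead of a bare TCP connect; the data path
    # stays in tcp passthrough mode, only the health check speaks TLS.
    option httpchk GET /healthz
    http-check expect status 200
    # 'verify none' skips CA validation for the probe only; if anonymous access
    # to /healthz is blocked (see the 403 further down), the expected status
    # would need adjusting or the probe would need client certificates.
    server master21.server 192.168.1.21:6443 check check-ssl verify none fall 3 rise 2
    server master22.server 192.168.1.22:6443 check check-ssl verify none fall 3 rise 2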
After the installation and cluster setup, everything was running fine, and I deployed a sample pod as well. Then I wanted to check the high availability of the cluster by shutting down one of the master servers, and that is where the problem started: once I shut down one of the masters, kubectl commands stopped working. I tried restarting and switching master nodes, but kubectl commands still do not work. When a command times out, it gives the following error (but not always):
error: Get "https://192.168.1.11:6443/api?timeout=32s": net/http: TLS handshake timeout - error from a previous attempt: EOF
I have tried curl against the load balancer with both https and http; this is the result:
[***@master21 ~]$ curl -v https://192.168.1.11:6443/api?timeout=32s
* Trying 192.168.1.11...
* TCP_NODELAY set
* Connected to 192.168.1.11 (192.168.1.11) port 6443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/pki/tls/certs/ca-bundle.crt
CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to 192.168.1.11:6443
curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to 192.168.1.11:6443
[***@master21 ~]$ curl -v http://192.168.1.11:6443/api?timeout=32s
* Trying 192.168.1.11...
* TCP_NODELAY set
* Connected to 192.168.1.11 (192.168.1.11) port 6443 (#0)
> GET /api?timeout=32s HTTP/1.1
> Host: 192.168.1.11:6443
> User-Agent: curl/7.61.1
> Accept: */*
>
* Empty reply from server
Can someone help me solve this issue? I believe TLS configuration is required on HAProxy, but I don't understand how to configure it to match the existing SSL setup in the Kubernetes cluster.
Output of curl -kv https://192.168.1.21:6443/healthz after bringing down one master (the whole master22.server VM):
[***@master21 ~]$ curl -kv https://192.168.1.21:6443/healthz
* Trying 192.168.1.21...
* TCP_NODELAY set
* Connected to 192.168.1.21 (192.168.1.21) port 6443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/pki/tls/certs/ca-bundle.crt
CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Request CERT (13):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, [no content] (0):
* TLSv1.3 (OUT), TLS handshake, Certificate (11):
* TLSv1.3 (OUT), TLS handshake, [no content] (0):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN, server accepted to use h2
* Server certificate:
* subject: CN=kube-apiserver
* start date: Mar 23 08:10:26 2023 GMT
* expire date: Mar 22 08:10:26 2024 GMT
* issuer: CN=kubernetes
* SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* TLSv1.3 (OUT), TLS app data, [no content] (0):
* TLSv1.3 (OUT), TLS app data, [no content] (0):
* TLSv1.3 (OUT), TLS app data, [no content] (0):
* Using Stream ID: 1 (easy handle 0x5605644bf690)
* TLSv1.3 (OUT), TLS app data, [no content] (0):
> GET /healthz HTTP/2
> Host: 192.168.1.21:6443
> User-Agent: curl/7.61.1
> Accept: */*
>
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS app data, [no content] (0):
* Connection state changed (MAX_CONCURRENT_STREAMS == 250)!
* TLSv1.3 (OUT), TLS app data, [no content] (0):
* TLSv1.3 (IN), TLS app data, [no content] (0):
* TLSv1.3 (IN), TLS app data, [no content] (0):
< HTTP/2 403
< audit-id: 930660ff-c7ee-4226-9b98-8fdaed13a251
< cache-control: no-cache, private
< content-type: application/json
< x-content-type-options: nosniff
< x-kubernetes-pf-flowschema-uid:
< x-kubernetes-pf-prioritylevel-uid:
< content-length: 224
< date: Fri, 24 Mar 2023 06:46:01 GMT
<
* TLSv1.3 (IN), TLS app data, [no content] (0):
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {},
"status": "Failure",
"message": "forbidden: User \"system:anonymous\" cannot get path \"/healthz\"",
"reason": "Forbidden",
"details": {},
"code": 403
* Connection #0 to host 192.168.1.21 left intact
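That 403 is just because the request is anonymous. To rule out RBAC noise, I assume the same health check could be repeated with the kubeadm-generated client certificates (the paths below are the usual kubeadm defaults on a control-plane node, so they may differ in other setups):

curl --cacert /etc/kubernetes/pki/ca.crt \
     --cert /etc/kubernetes/pki/apiserver-kubelet-client.crt \
     --key /etc/kubernetes/pki/apiserver-kubelet-client.key \
     https://192.168.1.21:6443/healthz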
On further checking, I have noticed that the issue only occurs when I bring down a master node completely (the whole VM). When I stop only the kubelet service, kubectl gives the following expected output:
[***@master22 ~]$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
master21.server Ready control-plane 22h v1.25.0
master22.server NotReady control-plane 22h v1.25.0
worker31.server Ready <none> 22h v1.25.0
worker32.server Ready <none> 22h v1.25.0
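For completeness, one more check I am planning to do is to point kubectl directly at a single apiserver, to take HAProxy out of the picture (as I understand it, kubectl's --server flag overrides the endpoint from the kubeconfig), for example:

# Query each apiserver directly, bypassing the 192.168.1.11 frontend
kubectl --server=https://192.168.1.21:6443 get nodes
kubectl --server=https://192.168.1.22:6443 get nodes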