I am configuring etcd to bootstrap using DNS discovery, but etcdctl reports that the cluster is misconfigured: it appears to be querying the wrong port, and the SRV records don't seem right.
Could you please review the details below and see my questions at the bottom of this post?
Specifications
root domain: etcd.ksone
server SRV record:
_etcd-server-ssl._tcp.etcd.ksone SRV Simple -
0 0 2380 etcd2.ksone
0 0 2380 etcd1.ksone
client SRV record:
_etcd-client-ssl._tcp.etcd.ksone SRV Simple -
0 0 2379 etcd2.ksone
0 0 2379 etcd1.ksone
using TLS: True
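For reference, etcd's DNS discovery derives the SRV names from the root domain using the standard _service._proto convention, so these are the lookups that should resolve against the records above:
# Peer endpoints used for server bootstrap (expected port 2380):
dig +short SRV _etcd-server-ssl._tcp.etcd.ksone
# Client endpoints used for etcdctl discovery (expected port 2379):
dig +short SRV _etcd-client-ssl._tcp.etcd.ksone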
OS:
[fedora@ip-10-0-0-245 ~]$ uname
Linux
[fedora@ip-10-0-0-245 ~]$ cat /etc/os-release
NAME=Fedora
VERSION="24 (Twenty Four)"
ID=fedora
VERSION_ID=24
PRETTY_NAME="Fedora 24 (Twenty Four)"
ANSI_COLOR="0;34"
CPE_NAME="cpe:/o:fedoraproject:fedora:24"
HOME_URL="https://fedoraproject.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=24
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=24
PRIVACY_POLICY_URL=https://fedoraproject.org/wiki/Legal:PrivacyPolicy
etcdctl version
[fedora@ip-10-0-0-245 ~]$ etcdctl --version
etcdctl version 2.2.5
List Members
I try to list members using the command below and get the following error:
bash-4.3# etcdctl --ca-file /etc/etcd/ca.pem --key-file /etc/etcd/kubernetes-key.pem --cert-file /etc/etcd/kubernetes.pem --discovery-srv etcd.ksone --debug member list
start to sync cluster using endpoints(https://etcd1.ksone.:2380,https://etcd2.ksone.:2380)
cURL Command: curl -X GET https://etcd1.ksone.:2380/v2/members
got endpoints() after sync
Cluster-Endpoints:
cURL Command: curl -X GET /v2/members
client: etcd cluster is unavailable or misconfigured
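To dig into why the sync ends with an empty endpoint list, I can replay etcdctl's sync request by hand (a sketch, reusing the cert paths from the command above); as the cluster-health section below shows, the peer port answers /v2/members with a 404, so the sync discovers nothing:
# Replay the sync request etcdctl issued against the discovered (peer) port;
# print only the HTTP status code:
curl -sS -o /dev/null -w '%{http_code}\n' \
  --cacert /etc/etcd/ca.pem \
  --cert /etc/etcd/kubernetes.pem --key /etc/etcd/kubernetes-key.pem \
  https://etcd1.ksone:2380/v2/members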
Cluster Health
Similarly, querying for cluster-health with the following command fails with the output below:
bash-4.3# etcdctl --ca-file /etc/etcd/ca.pem --key-file /etc/etcd/kubernetes-key.pem --cert-file /etc/etcd/kubernetes.pem --discovery-srv etcd.ksone --debug cluster-health
Cluster-Endpoints: https://etcd1.ksone.:2380, https://etcd2.ksone.:2380
===> NOTE: IT TRIES (INCORRECTLY?) ON PORT 2380 (THE SERVER/PEER PORT)
cURL Command: curl -X GET https://etcd1.ksone.:2380/v2/members
cluster may be unhealthy: failed to list members
Error: unexpected status code 404
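For comparison, the same request against the client port (2379) should return the member list as JSON (a sketch with the same cert paths), which would confirm that the cluster itself is reachable and only the discovered port is wrong:
# Same request, but against the client port - this is what etcdctl should
# have derived from the _etcd-client-ssl SRV record:
curl -sS --cacert /etc/etcd/ca.pem \
  --cert /etc/etcd/kubernetes.pem --key /etc/etcd/kubernetes-key.pem \
  https://etcd1.ksone:2379/v2/members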
SRV records
I have configured the SRV records as follows:
- list SRV records for the root domain, i.e. "etcd.ksone" (I expected this to show the full set of SRV records, but it returns nothing?):
dig +noall +answer SRV etcd.ksone
==> <the console shows no output - empty!>
- list SRV explicitly for server:
# dig +noall +answer SRV _etcd-server-ssl._tcp.etcd.ksone
_etcd-server-ssl._tcp.etcd.ksone. 33 IN SRV 0 0 2380 etcd2.ksone.
_etcd-server-ssl._tcp.etcd.ksone. 33 IN SRV 0 0 2380 etcd1.ksone.
- list SRV explicitly for client:
/ # dig +noall +answer SRV _etcd-client-ssl._tcp.etcd.ksone
_etcd-client-ssl._tcp.etcd.ksone. 300 IN SRV 0 0 2379 etcd1.ksone.
_etcd-client-ssl._tcp.etcd.ksone. 300 IN SRV 0 0 2379 etcd2.ksone.
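As an extra sanity check, the SRV targets themselves must resolve; assuming etcd1.ksone and etcd2.ksone are plain A records, this verifies them:
# Each SRV target must resolve to an address for discovery to be usable:
dig +short A etcd1.ksone
dig +short A etcd2.ksone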
Try the client endpoint explicitly (SUCCESS, but this is not really using DNS discovery!)
bash-4.3# etcdctl --ca-file /etc/etcd/ca.pem --key-file /etc/etcd/kubernetes-key.pem --cert-file /etc/etcd/kubernetes.pem --debug --endpoint https://etcd1.ksone:2379 cluster-health
Cluster-Endpoints: https://etcd1.ksone:2379
cURL Command: curl -X GET https://etcd1.ksone:2379/v2/members
member 499073e22ac73562 is healthy: got healthy result from https://etcd1.ksone:2379
member b98d4fc780a787fe is healthy: got healthy result from https://etcd2.ksone:2379
cluster is healthy
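As a workaround sketch (untested): the etcdctl --endpoint switch should accept a comma-delimited list, so both client endpoints can be supplied without relying on SRV discovery:
# Hypothetical variant: pass both client endpoints explicitly,
# sidestepping SRV discovery entirely:
etcdctl --ca-file /etc/etcd/ca.pem --key-file /etc/etcd/kubernetes-key.pem \
  --cert-file /etc/etcd/kubernetes.pem \
  --endpoint https://etcd1.ksone:2379,https://etcd2.ksone:2379 cluster-health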
etcd Service Setup
systemctl status etcd
● etcd.service - etcd
Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: disabled)
Active: active (running) since Sun 2021-08-01 07:17:39 UTC; 1h 23min ago
Docs: https://github.com/coreos
Main PID: 2363 (etcd)
Tasks: 7 (limit: 512)
CGroup: /system.slice/etcd.service
└─2363 /usr/bin/etcd --name etcd1.ksone --discovery-srv=etcd.ksone --initial-advertise-peer-urls https://etcd1.ksone:2380 --initial-cluster-token etcd-cluster-0 --initial-cluster-state new --advertise-client-urls https://etcd1.ksone:2379 --listen-client-urls https://etcd1.ksone:2379,http://127.0.0.1:2379 --listen-peer-urls https://etcd1.ksone:2380 --data-dir=/var/lib/etcd/data --cert-file=/etc/etcd/kubernetes.pem --key-file=/etc/etcd/kubernetes-key.pem --peer-cert-file=/etc/etcd/kubernetes.pem --peer-key-file=/etc/etcd/kubernetes-key.pem --trusted-ca-file=/etc/etcd/ca.pem --peer-trusted-ca-file=/etc/etcd/ca.pem
Aug 01 07:18:32 ip-10-0-0-245.eu-west-1.compute.internal etcd[2363]: 499073e22ac73562 became candidate at term 41
Aug 01 07:18:32 ip-10-0-0-245.eu-west-1.compute.internal etcd[2363]: 499073e22ac73562 received vote from 499073e22ac73562 at term 41
Aug 01 07:18:32 ip-10-0-0-245.eu-west-1.compute.internal etcd[2363]: 499073e22ac73562 [logterm: 1, index: 2] sent vote request to b98d4fc780a787fe at term 41
Aug 01 07:18:32 ip-10-0-0-245.eu-west-1.compute.internal etcd[2363]: 499073e22ac73562 received vote from b98d4fc780a787fe at term 41
Aug 01 07:18:32 ip-10-0-0-245.eu-west-1.compute.internal etcd[2363]: 499073e22ac73562 [q:2] has received 2 votes and 0 vote rejections
Aug 01 07:18:32 ip-10-0-0-245.eu-west-1.compute.internal etcd[2363]: 499073e22ac73562 became leader at term 41
Aug 01 07:18:32 ip-10-0-0-245.eu-west-1.compute.internal etcd[2363]: raft.node: 499073e22ac73562 elected leader 499073e22ac73562 at term 41
Aug 01 07:18:32 ip-10-0-0-245.eu-west-1.compute.internal etcd[2363]: published {Name:etcd1.ksone ClientURLs:[https://etcd1.ksone:2379]} to cluster 1c370848b4697ef2
Aug 01 07:18:32 ip-10-0-0-245.eu-west-1.compute.internal etcd[2363]: setting up the initial cluster version to 2.2
Aug 01 07:18:32 ip-10-0-0-245.eu-west-1.compute.internal etcd[2363]: set the initial cluster version to 2.2
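For readability, here is the same ExecStart command from the unit above, re-wrapped with one flag per line (identical flags, nothing changed):
/usr/bin/etcd \
  --name etcd1.ksone \
  --discovery-srv=etcd.ksone \
  --initial-advertise-peer-urls https://etcd1.ksone:2380 \
  --initial-cluster-token etcd-cluster-0 \
  --initial-cluster-state new \
  --advertise-client-urls https://etcd1.ksone:2379 \
  --listen-client-urls https://etcd1.ksone:2379,http://127.0.0.1:2379 \
  --listen-peer-urls https://etcd1.ksone:2380 \
  --data-dir=/var/lib/etcd/data \
  --cert-file=/etc/etcd/kubernetes.pem \
  --key-file=/etc/etcd/kubernetes-key.pem \
  --peer-cert-file=/etc/etcd/kubernetes.pem \
  --peer-key-file=/etc/etcd/kubernetes-key.pem \
  --trusted-ca-file=/etc/etcd/ca.pem \
  --peer-trusted-ca-file=/etc/etcd/ca.pem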
Summary of Observations
- The discovery mechanism on the etcd client does not appear to be working, as evidenced by the errors above, i.e. `etcd cluster is unavailable or misconfigured` and `Error: unexpected status code 404`.
- The debug logs indicate that etcdctl is trying to connect to the peer port (2380) instead of the client port (2379).
- I can only get it to work by explicitly setting the `--endpoint` switch to port 2379.
- The SRV query on the root domain does not appear to be working correctly, i.e. it returns a blank result (no output).
- The output of `systemctl status etcd` seems to indicate that the endpoints have been configured correctly in the etcd startup command.
Questions
- How do I query the records correctly, and what might be the problems (if any) with the DNS SRV configuration?
- Why is the etcdctl `--discovery-srv` switch not working? I expect it to discover the correct port, i.e. 2379, and not to report any errors.
- Is etcd supposed to be load balanced? Is there a single endpoint that I can query? [Why] is it up to the client to choose an endpoint? Should I configure a load balancer on top of my etcd rig?
Many Thanks!