I already have an EFK stack on a Kubernetes cluster in AWS EKS, and I want to add S3 logging functionality for the Elasticsearch pods in the es StatefulSet. For this, I built a custom Elasticsearch image with the repository-s3 plugin installed, and then added an init container to the es StatefulSet YAML file to store the AWS credentials. But after applying the changes, the Elasticsearch pods that belong to the es StatefulSet fail with 'Error'.
The StatefulSet config is as follows:
apiVersion: apps/v1 # API version of Kubernetes in which `StatefulSet` is available. For Kubernetes 1.8.7 it's apps/v1beta1
kind: StatefulSet # Type of resource that we are creating
metadata: # Holds metadata for this resource
  name: es # Name of this resource
  namespace: kube-logging
  labels: # Extra metadata goes inside labels. It is for the stateful resource
    component: elasticsearch # Just metadata we are adding
spec: # Holds specification of this resource
  replicas: 5 # Responsible for maintaining the given number of replicas
  selector:
    matchLabels:
      component: elasticsearch
  serviceName: elasticsearch # Name of service, required by statefulset
  template: # Template holds the spec of the pod that will be created and maintained by statefulset
    metadata: # Holds metadata for the pod
      labels: # Extra metadata goes inside labels. It is for the pod
        component: elasticsearch # Just metadata for the pod
    spec: # Holds the spec of the pod
      initContainers: # will always initialize before other containers in the pod
      - name: init-sysctl # Name of the init-container
        image: busybox # Image that will be deployed in this container
        imagePullPolicy: IfNotPresent # Sets the policy that only pull image from registry if it is not available locally
        command: ["sysctl", "-w", "vm.max_map_count=262144"] # Sets the system variable in the container, this value is required by ES
        securityContext: # Security context holds any special permission given to this container
          privileged: true # This container gets the right to run in privileged mode
      - name: add-aws-keys
        image: xxxxxxxxx.dkr.ecr.ap-southeast-1.amazonaws.com/myes:6.4.2
        env:
        - name: AWS_ACCESS_KEY_ID
          valueFrom:
            secretKeyRef:
              name: aws-s3-keys
              key: access-key-id
        - name: AWS_SECRET_ACCESS_KEY
          valueFrom:
            secretKeyRef:
              name: aws-s3-keys
              key: access-secret-key
        command:
        - sh
        - -c
        - |
          echo $AWS_ACCESS_KEY_ID | bin/elasticsearch-keystore add --stdin --force s3.client.default.access_key
          echo $AWS_SECRET_ACCESS_KEY | bin/elasticsearch-keystore add --stdin --force s3.client.default.secret_key
      containers: # Holds the list and configs of normal containers in the pod
      - name: es # Name of the first container
        securityContext: # Security context holds any special permission given to this container
          capabilities: # Container will have the capability to IPC Lock, can lock on memory so that it is not swapped out.
            add:
            - IPC_LOCK
        image: xxxxxxxxx.dkr.ecr.ap-southeast-1.amazonaws.com/myes:6.4.2
        env: # Array of environment variables with values that are passed to this image
        - name: KUBERNETES_CA_CERTIFICATE_FILE
          value: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: "CLUSTER_NAME"
          value: "myesdb1"
        - name: "DISCOVERY_SERVICE"
          value: "elasticsearch"
        - name: NETWORK_HOST
          value: "_eth0_"
        - name: ES_JAVA_OPTS # Specify the heap size
          value: -Xms1536m -Xmx1536m
        ports: # Ports that this pod will open
        - containerPort: 9200
          name: http
          protocol: TCP
        - containerPort: 9300
          name: transport
          protocol: TCP
        volumeMounts: # The path where volume will be mounted.
        - mountPath: /data
          name: storage # Name given to this mount
  updateStrategy:
    type: RollingUpdate
  volumeClaimTemplates: # It provides stable storage using PersistentVolumes provisioned by a PersistentVolume Provisioner
  - metadata: # Metadata given to this resource (Persistent Volume Claim)
      name: storage # Name of this resource
    spec: # Specification of this PVC (Persistent Volume Claim)
      storageClassName: gp2 # Storage class used to provision this PVC
      accessModes: [ ReadWriteOnce ] # Access mode of the volume
      resources: # Holds the list of resources
        requests: # Requests sent to the storage class
          storage: 100Gi
Describing the pod gives the following:
$ kubectl describe pod/es-0 -n kube-logging
Name: es-0
Namespace: kube-logging
Priority: 0
Node: ip-xxxxxxxxxxxx.ap-southeast-1.compute.internal/xxxxxxxxx
Start Time: Thu, 19 Aug 2021 16:23:44 +0000
Labels: component=elasticsearch
controller-revision-hash=es-7dc4b7477c
statefulset.kubernetes.io/pod-name=es-0
Annotations: kubernetes.io/psp: eks.privileged
Status: Running
IP: xxxxxxxxxxxx
IPs:
IP: xxxxxxxxxxxxx
Controlled By: StatefulSet/es
Init Containers:
init-sysctl:
Container ID: docker://291ab715d302d7f505925168685955ad20c529e4db3371c3385e911614d60179
Image: busybox
Image ID: docker-pullable://busybox@sha256:0f354ec1728d9ff32edcd7d1b8bbdfc798277ad36120dc3dc683be44524c8b60
Port: <none>
Host Port: <none>
Command:
sysctl
-w
vm.max_map_count=262144
State: Terminated
Reason: Completed
Exit Code: 0
Started: Thu, 19 Aug 2021 16:23:50 +0000
Finished: Thu, 19 Aug 2021 16:23:50 +0000
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-9nv96 (ro)
add-aws-keys:
Container ID: docker://23e3335b99c8a5bf144f2f2a52c1bbd357a667205ce335ead5e0627ecdf403e9
Image: xxxxxxxxxxxx.dkr.ecr.ap-southeast-1.amazonaws.com/myes:6.4.2
Image ID: docker-pullable://xxxxxxxxxxx.dkr.ecr.ap-southeast-1.amazonaws.com/myes@sha256:c09d586a4bdc7149c41ff74783bae0138f68c4779f03315b197e7dda1c4332c6
Port: <none>
Host Port: <none>
Command:
sh
-c
echo $AWS_ACCESS_KEY_ID | bin/elasticsearch-keystore add --stdin --force s3.client.default.access_key
echo $AWS_SECRET_ACCESS_KEY | bin/elasticsearch-keystore add --stdin --force s3.client.default.secret_key
State: Terminated
Reason: Completed
Exit Code: 0
Started: Thu, 19 Aug 2021 16:23:50 +0000
Finished: Thu, 19 Aug 2021 16:23:53 +0000
Ready: True
Restart Count: 0
Environment:
AWS_ACCESS_KEY_ID: <set to the key 'access-key-id' in secret 'aws-s3-keys'> Optional: false
AWS_SECRET_ACCESS_KEY: <set to the key 'access-secret-key' in secret 'aws-s3-keys'> Optional: false
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-9nv96 (ro)
Containers:
es:
Container ID: docker://a26a4cf4ed98f0f6dd535069e8a5c54dbf39746a04e8a14fbacb0c51ff4684ec
Image: xxxxxxxxxx.dkr.ecr.ap-southeast-1.amazonaws.com/myes:6.4.2
Image ID: docker-pullable://xxxxxxxxx.dkr.ecr.ap-southeast-1.amazonaws.com/myes@sha256:c09d586a4bdc7149c41ff74783bae0138f68c4779f03315b197e7dda1c4332c6
Ports: 9200/TCP, 9300/TCP
Host Ports: 0/TCP, 0/TCP
State: Waiting
Reason: CrashLoopBackOff <<<<<<<<<<<
Last State: Terminated <<<<<<<<<<<<<<<<<
Reason: Error
Exit Code: 1
Started: Thu, 19 Aug 2021 16:50:16 +0000
Finished: Thu, 19 Aug 2021 16:50:17 +0000
Ready: False
Restart Count: 10
Environment:
KUBERNETES_CA_CERTIFICATE_FILE: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
NAMESPACE: kube-logging (v1:metadata.namespace)
CLUSTER_NAME: myesdb1
DISCOVERY_SERVICE: elasticsearch
NETWORK_HOST: _eth0_
ES_JAVA_OPTS: -Xms1536m -Xmx1536m
Mounts:
/data from storage (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-9nv96 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
storage:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: storage-es-0
ReadOnly: false
default-token-9nv96:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-9nv96
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                  Age                    From                     Message
  ----     ------                  ----                   ----                     -------
  Normal   Scheduled               29m                    default-scheduler        Successfully assigned kube-logging/es-0 to ip-xxxxxxx.ap-southeast-1.compute.internal
  Normal   SuccessfulAttachVolume  29m                    attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-2abce33a-7193-4070-b4da-74486e72d568"
  Normal   Pulled                  29m                    kubelet                  Container image "busybox" already present on machine
  Normal   Created                 29m                    kubelet                  Created container init-sysctl
  Normal   Started                 29m                    kubelet                  Started container init-sysctl
  Normal   Pulled                  29m                    kubelet                  Container image "xxxxxxxx.dkr.ecr.ap-southeast-1.amazonaws.com/myes:6.4.2" already present on machine
  Normal   Created                 29m                    kubelet                  Created container add-aws-keys
  Normal   Started                 29m                    kubelet                  Started container add-aws-keys
  Normal   Created                 29m (x4 over 29m)      kubelet                  Created container es
  Normal   Started                 29m (x4 over 29m)      kubelet                  Started container es
  Normal   Pulled                  28m (x5 over 29m)      kubelet                  Container image "xxxxxxxxx.dkr.ecr.ap-southeast-1.amazonaws.com/myes:6.4.2" already present on machine
  Warning  BackOff                 4m46s (x115 over 29m)  kubelet                  Back-off restarting failed container
The pod logs show the following AccessDeniedException for /elasticsearch/config/elasticsearch.keystore:
$ kubectl logs pod/es-0 -n kube-logging
Starting Elasticsearch 6.4.2
Exception in thread "main" org.elasticsearch.bootstrap.BootstrapException: java.nio.file.AccessDeniedException: /elasticsearch/config/elasticsearch.keystore
Likely root cause: java.nio.file.AccessDeniedException: /elasticsearch/config/elasticsearch.keystore <<<<<<<<<<<<
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:84)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
at java.nio.file.Files.newByteChannel(Files.java:361)
at java.nio.file.Files.newByteChannel(Files.java:407)
at org.apache.lucene.store.SimpleFSDirectory.openInput(SimpleFSDirectory.java:77)
at org.elasticsearch.common.settings.KeyStoreWrapper.load(KeyStoreWrapper.java:215)
at org.elasticsearch.bootstrap.Bootstrap.loadSecureSettings(Bootstrap.java:226)
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:291)
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:136)
at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:127)
at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86)
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124)
at org.elasticsearch.cli.Command.main(Command.java:90)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:93)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:86)
Refer to the log for complete error details.
Researching this further, I found some links about this error that point to a problem with the keystore location in the Elasticsearch container and to key formatting on GCP: https://github.com/elastic/cloud-on-k8s/issues/4124 https://discuss.elastic.co/t/access-denied-error-on-keystore-when-using-eck/276297
However, I am not sure how to resolve this on AWS. Moreover, when I tried adding the credentials directly while building the custom image, they were added successfully without any error (tested by starting a container from the custom image built using the Dockerfile):
...
Step 6/8 : RUN echo $AWS_ACCESS_KEY_ID | bin/elasticsearch-keystore add --stdin --force s3.client.default.access_key
---> Running in 42f5231573a7
Created elasticsearch keystore in /elasticsearch/config
Removing intermediate container 42f5231573a7
---> 3b202f355a8b
Step 7/8 : RUN echo $AWS_SECRET_ACCESS_KEY | bin/elasticsearch-keystore add --stdin --force s3.client.default.secret_key
---> Running in 644e555f2362
Removing intermediate container 644e555f2362
---> d08abcc761ce
Step 8/8 : RUN bin/elasticsearch-plugin install --batch repository-s3
---> Running in d21d7d4fbb7b
-> Downloading repository-s3 from elastic
[=================================================] 100%
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: plugin requires additional permissions @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
* java.lang.RuntimePermission accessDeclaredMembers
* java.lang.RuntimePermission getClassLoader
* java.lang.reflect.ReflectPermission suppressAccessChecks
* java.net.SocketPermission * connect,resolve
* java.util.PropertyPermission es.allow_insecure_settings read,write
See http://docs.oracle.com/javase/8/docs/technotes/guides/security/permissions.html
for descriptions of what these permissions allow and the associated risks.
-> Installed repository-s3
Removing intermediate container d21d7d4fbb7b
---> 19f4588f1f7f
Successfully built 19f4588f1f7f
Successfully tagged xxxxxxx.dkr.ecr.ap-southeast-1.amazonaws.com/myes:6.4.2
Also, note that I tried changing the permissions of /elasticsearch/config/elasticsearch.keystore to 777, but that did not help either.
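For reference, here is a minimal sketch of how the mode and ownership of the keystore could be compared with the user the process runs as, since a 777 mode on the file alone says nothing about its owner or about the parent directory. The helper names `describe_file` and `can_read` are hypothetical, the default path is taken from the error message above, and the `stat -c` format syntax assumes GNU coreutils or BusyBox:

```shell
#!/bin/sh
# Print "<octal mode> <owner uid> <group gid> <path>" for a file or directory.
# Assumes a stat that supports -c (GNU coreutils or BusyBox).
describe_file() {
    stat -c '%a %u %g %n' "$1"
}

# Succeed (exit status 0) if the current user can read the given path.
can_read() {
    [ -r "$1" ]
}

# Path from the AccessDeniedException; override with KEYSTORE=... if needed.
KEYSTORE="${KEYSTORE:-/elasticsearch/config/elasticsearch.keystore}"

if [ -e "$KEYSTORE" ]; then
    echo "current uid: $(id -u)"
    describe_file "$KEYSTORE"
    # The config directory matters too: it must be traversable by this user.
    describe_file "$(dirname "$KEYSTORE")"
    if can_read "$KEYSTORE"; then
        echo "keystore is readable by the current user"
    else
        echo "keystore is NOT readable by the current user"
    fi
else
    echo "keystore not found at $KEYSTORE"
fi
```

This would need to be run as the same user that starts Elasticsearch, in a container started from the same image, for the result to say anything about the failing pod.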
I would appreciate some help on this issue.