Within my deployment, the following livenessProbe is defined:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend-deployment
  labels:
    name: backend-deployment
    app: fc-test
spec:
  replicas: 1
  selector:
    matchLabels:
      name: fc-backend-pod
      app: fc-test
  template:
    metadata:
      name: fc-backend-pod
      labels:
        name: fc-backend-pod
        app: fc-test
    spec:
      containers:
      - name: fc-backend
        image: localhost:5000/backend:1.3
        ports:
        - containerPort: 4042
        env:
        - name: NODE_ENV
          value: "int"
        livenessProbe:
          exec:
            command:
            - RESULT=$(curl -X GET $BACKEND_SERVICE_HOST:$BACKEND_SERVICE_PORT/api/v2/makes | wc | awk '{print $3}');
            - if [[ $RESULT -lt 150 ]]; then exit 1; else exit 0; fi
          initialDelaySeconds: 20
          failureThreshold: 8
          periodSeconds: 10
Since the API connection is sometimes unreliable, I decided to add a check that verifies whether the complete set of requested data is fetched from the API. If it is, the response is around 400 KB in size. If it isn't, only a short message is returned and the response is smaller than 120 B. This is where the second command of the probe comes in: it checks whether the value of the RESULT variable is low; if it is, the response didn't contain all the desired data and the probe exits with an error code.
Both commands were tested by running them from inside the running container, so both cases are covered: a) correct data fetched: exit 0, and b) only an error message fetched: exit 1.
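To make the intended check concrete, here is a minimal sketch of the size test on its own, without curl or Kubernetes involved. The function name response_is_complete and the two sample payloads are hypothetical stand-ins, not the real API output, and the sketch uses wc -c instead of the wc | awk '{print $3}' pipeline from the probe; both yield the byte count.

```shell
#!/bin/sh
# Threshold taken from the probe above: anything below 150 bytes counts as a failure.
MIN_BYTES=150

# Hypothetical helper: reads a payload on stdin and succeeds only if it is large enough.
response_is_complete() {
    # wc -c prints the byte count; tr strips the padding some wc implementations add
    size=$(wc -c | tr -d '[:space:]')
    [ "$size" -ge "$MIN_BYTES" ]
}

# Assumed stand-ins for the two kinds of response (not real API output)
short_body='{"error":"upstream unavailable"}'
long_body=$(printf '%0400d' 0)   # 400 bytes of filler

if printf '%s' "$short_body" | response_is_complete; then short_status=0; else short_status=1; fi
if printf '%s' "$long_body" | response_is_complete; then long_status=0; else long_status=1; fi

echo "short response -> exit $short_status"
echo "long response  -> exit $long_status"
```

Run in a single shell like this, the short payload maps to exit 1 and the long one to exit 0, which matches what I saw when testing manually inside the container.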
Without the probe, the application ran correctly for at least 3-4 hours; then the connection problems appeared. They eventually resolved themselves, but they choked the app a bit, which was pretty undesirable.
After the probe was implemented, the first instability issues appeared within minutes of deployment. Every couple of minutes the pods were restarted, and the restart count kept increasing steadily.
This is what I found when describing the deployment:
Pod Template:
  Labels:  app=fc-test
           name=fc-backend-pod
  Containers:
   nsc-backend:
    Image:        localhost:5000/backend:1.3
    Port:         4042/TCP
    Host Port:    0/TCP
    Liveness:     exec [RESULT=$(curl -X GET $BACKEND_SERVICE_HOST:$BACKEND_SERVICE_PORT/api/v2/makes | wc | awk '{print $3}'); if [[ $RESULT -lt 150 ]]; then exit 1; else exit 0; fi] delay=20s timeout=1s period=10s #success=1 #failure=8
It looks reasonable, but when I entered the running container with the exec command, I found that echo $RESULT gives no output (just an empty line).
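One thing that can be checked independently of Kubernetes is whether a variable assigned in one shell process is ever visible in another, separately started one. The following is only a minimal sketch under that assumption; the value 12345 is made up, and the two sh -c calls merely stand in for two unrelated processes, such as a probe run and an interactive exec session.

```shell
#!/bin/sh
# Make sure RESULT is not inherited from the environment this script runs in
unset RESULT

# A variable assigned inside one shell process is visible there...
first=$(sh -c 'RESULT=12345; echo "RESULT=${RESULT:-unset}"')
echo "first shell:  $first"

# ...but a second, separately started shell does not see it
second=$(sh -c 'echo "RESULT=${RESULT:-unset}"')
echo "second shell: $second"
```

If this holds inside the container as well, an empty echo $RESULT in my exec session would not say anything about whether the probe itself ran successfully.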
Does this mean that only the first call of the probe was processed successfully and all subsequent ones were not? How should I configure the probe to make it work as intended?