Why does "Cassandra" use a "StatefulSet" instead of a "Deployment" file for "Kubernetes"?

best_of_man

12/29/23, 6:48 PM

I am trying to deploy Cassandra on my local Kind cluster running on my Ubuntu 22.04 machine. The only instructions I found are these, which use a StatefulSet. I am wondering: isn't a Deployment something newer? Why didn't they use a Deployment instead of a StatefulSet? If a Deployment is the better choice, can anybody help me convert this StatefulSet to a Deployment?

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
  labels:
    app: cassandra
spec:
  serviceName: cassandra
  replicas: 3
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      terminationGracePeriodSeconds: 1800
      containers:
      - name: cassandra
        image: gcr.io/google-samples/cassandra:v13
        imagePullPolicy: Always
        ports:
        - containerPort: 7000
          name: intra-node
        - containerPort: 7001
          name: tls-intra-node
        - containerPort: 7199
          name: jmx
        - containerPort: 9042
          name: cql
        resources:
          limits:
            cpu: "500m"
            memory: 1Gi
          requests:
            cpu: "500m"
            memory: 1Gi
        securityContext:
          capabilities:
            add:
              - IPC_LOCK
        lifecycle:
          preStop:
            exec:
              command:
              - /bin/sh
              - -c
              - nodetool drain
        env:
          - name: MAX_HEAP_SIZE
            value: 512M
          - name: HEAP_NEWSIZE
            value: 100M
          - name: CASSANDRA_SEEDS
            value: "cassandra-0.cassandra.default.svc.cluster.local"
          - name: CASSANDRA_CLUSTER_NAME
            value: "K8Demo"
          - name: CASSANDRA_DC
            value: "DC1-K8Demo"
          - name: CASSANDRA_RACK
            value: "Rack1-K8Demo"
          - name: POD_IP
            valueFrom:
              fieldRef:
                fieldPath: status.podIP
        readinessProbe:
          exec:
            command:
            - /bin/bash
            - -c
            - /ready-probe.sh
          initialDelaySeconds: 15
          timeoutSeconds: 5
        # These volume mounts are persistent. They are like inline claims,
        # but not exactly because the names need to match exactly one of
        # the stateful pod volumes.
        volumeMounts:
        - name: cassandra-data
          mountPath: /cassandra_data

  # These are converted to volume claims by the controller
  # and mounted at the paths mentioned above.
  # do not use these in production until ssd GCEPersistentDisk or other
  # ssd pd
  volumeClaimTemplates:
  - metadata:
      name: cassandra-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: fast
      resources:
        requests:
          storage: 1Gi
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: fast
provisioner: k8s.io/minikube-hostpath
parameters:
  type: pd-ssd
```

deployment

cassandra

kubernetes

Score:1

Server

larsks

12/29/23, 10:22 PM

A StatefulSet is different from a Deployment. From the documentation:

Like a Deployment, a StatefulSet manages Pods that are based on an identical container spec. Unlike a Deployment, a StatefulSet maintains a sticky identity for each of their Pods. These pods are created from the same spec, but are not interchangeable: each has a persistent identifier that it maintains across any rescheduling.

You use StatefulSets when your pods need to maintain some sort of unique state -- for example, the volumeClaimTemplates section of the manifest means that each pod gets a unique PersistentVolumeClaim. This isn't possible using a Deployment.
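The sticky identity covers network names as well as storage. The manifest in the question sets `serviceName: cassandra` and points `CASSANDRA_SEEDS` at `cassandra-0.cassandra.default.svc.cluster.local`; that per-pod DNS name only resolves if a headless Service named `cassandra` exists in the `default` namespace. The question does not show one, so the following is a minimal sketch of what it would probably look like, assuming only the CQL port needs to be listed:

```yaml
# Sketch only: a headless Service matching serviceName in the StatefulSet.
# The port list is an assumption; the question does not include a Service.
apiVersion: v1
kind: Service
metadata:
  name: cassandra
  labels:
    app: cassandra
spec:
  clusterIP: None        # "None" makes the Service headless: DNS returns pod addresses
  ports:
  - port: 9042
    name: cql
  selector:
    app: cassandra       # same label as the StatefulSet's pod template
```

With three replicas, the pods are named `cassandra-0`, `cassandra-1`, and `cassandra-2`, and the claims created from the `cassandra-data` template are named `cassandra-data-cassandra-0`, `cassandra-data-cassandra-1`, and `cassandra-data-cassandra-2`. A rescheduled pod keeps its name and reattaches to the same claim.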

In general you cannot convert a StatefulSet into a Deployment unless you only plan on having a single replica.
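For the single-replica case, a conversion would look roughly like the sketch below. It is untested and makes several assumptions: the names `cassandra-single` and `cassandra-single-data` are hypothetical, the claim relies on the cluster's default StorageClass, and the environment variables, probes, and resource limits from the question's manifest are left out for brevity and may need to be carried over.

```yaml
# Sketch only: one Cassandra pod managed by a Deployment.
# The volume is a separately declared claim instead of a volumeClaimTemplate.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cassandra-single-data    # hypothetical name
spec:
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cassandra-single         # hypothetical name
spec:
  replicas: 1                    # more replicas would all share the one claim
  strategy:
    type: Recreate               # stop the old pod before starting its replacement
  selector:
    matchLabels:
      app: cassandra-single
  template:
    metadata:
      labels:
        app: cassandra-single
    spec:
      containers:
      - name: cassandra
        image: gcr.io/google-samples/cassandra:v13
        ports:
        - containerPort: 9042
          name: cql
        volumeMounts:
        - name: data
          mountPath: /cassandra_data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: cassandra-single-data
```

Every replica of a Deployment references the same `claimName`, which is why this pattern does not extend past one replica: there is no way to give each pod its own volume.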

+ 7

best_of_man

12/30/23, 12:02 AM

Thank you so much for clarifying. But `mysql`, for example, also needs a persistent volume, and they still use a `deployment` file for it, as you can see here: https://kubernetes.io/docs/tasks/run-application/run-single-instance-stateful-application/ What is the difference?

larsks

12/30/23, 12:47 AM

The title of that page is "Run a Single-Instance Stateful Application". That reflects what I said in my answer: you can only use a Deployment to run a stateful application if you have a single replica.

best_of_man

12/30/23, 12:51 AM

So this is not specific to `cassandra`: even for `mysql` we need a `StatefulSet` instead of a `deployment` file if we want more than one `mysql` instance.

larsks

12/30/23, 1:14 AM

That's correct (because each mysql instance will e.g. need its own volume for data, which you can do with a statefulset but not with a deployment).

best_of_man

12/30/23, 3:37 AM

I also tried using a `StatefulSet` to deploy `mysql` on multiple nodes, and I think I did it successfully, but why do I see `READY 2/2` when I run `kubectl get pods`? For `cassandra` clusters I see `READY 1/1`. This is the first time I have seen `2/2`. What does it mean?

larsks

12/30/23, 7:38 AM

You might want to open a new question. Include both the output of `kubectl get pods` and the YAML manifest used to create the pods.

best_of_man

12/30/23, 8:01 PM

I opened a new question and also asked my previous question there: https://serverfault.com/questions/1119154/crashloopbackoff-while-deploying-mysql-on-multi-node-cluster
