User is experiencing scheduling conflicts between pod node-affinity rules and Longhorn volume node affinity, and suggests improvements to how Longhorn handles these cases so pods don't get stuck in Pending.
Hi, I think I may have a misunderstanding of how Longhorn works, but this is my scenario. Based on prior advice, I have created 3 "storage" nodes in Kubernetes which manage my Longhorn replicas. These have large disks, and replication is working well. I have separate dedicated worker nodes and an LLM node, and there may be more than 3 worker nodes over time.

If I create a test pod without any affinity rules, the pod picks a node (e.g. a worker), happily creates a PVC, and Longhorn manages it correctly. The moment I add an affinity rule (e.g. run ollama on the LLM node, or create a pod that needs a PVC on the worker nodes only), the pod gets stuck in the "Pending" state and refuses to start because of:

```
0/8 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 3 node(s) had volume node affinity conflict, 4 node(s) didn't match Pod's node affinity/selector. preemption: 0/8 nodes are available: 8 Preemption is not helpful for scheduling.
```

The obvious answer seems to be to delete the storage nodes and let *every* node, workers and LLM alike, use Longhorn, but this means that if I have 5 worker nodes and an LLM node, I end up with 6 replicas, and my storage costs would explode. I only need the 3 replicas, hence the 3 storage nodes. Am I missing something?

This is an example apply YAML. If I remove the affinity from the spec, it works fine, even if it schedules on a worker node and not a storage node.
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.io/role
                operator: In
                values:
                  - worker
  containers:
    - name: my-container
      image: nginx:latest
      volumeMounts:
        - mountPath: /data
          name: my-volume
  volumes:
    - name: my-volume
      persistentVolumeClaim:
        claimName: my-claim
```

I'm using Helm to install Longhorn, as follows, and Longhorn is my default storage class.

```shell
helm install longhorn longhorn/longhorn \
  --namespace longhorn-system \
  --create-namespace \
  --set defaultSettings.createDefaultDiskLabeledNodes=true \
  --version 1.11.0 \
  --set service.ui.type=LoadBalancer
```
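In case it's relevant: because I set `createDefaultDiskLabeledNodes=true`, my understanding from the Longhorn docs is that disks are only created on nodes carrying the `node.longhorn.io/create-default-disk=true` label, so I labeled my three storage nodes like this (the node names below are placeholders for my actual node names):

```shell
# Label only the dedicated storage nodes so Longhorn creates its
# default disks there and nowhere else.
kubectl label node storage-node-1 node.longhorn.io/create-default-disk=true
kubectl label node storage-node-2 node.longhorn.io/create-default-disk=true
kubectl label node storage-node-3 node.longhorn.io/create-default-disk=true
```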