Installing on Kubernetes
You can install GridGain 9 and run a GridGain cluster on a Kubernetes cluster. This section describes all the necessary steps and provides the configurations and manifests that you can copy and paste into your environment.
Prerequisites
Recommended Kubernetes Version
GridGain is tested on Kubernetes version 1.26.
Installation Steps
Create ConfigMaps
- Create the GridGain configuration file and get a license. The minimum node configuration is as follows:

  gridgain-config.conf:

  ```
  ignite: {
    network: {
      # GridGain 9 node port
      port = 3344
      nodeFinder = {
        netClusterNodes = [
          # Kubernetes service to access the GridGain 9 cluster on the Kubernetes network
          "gridgain-svc-headless:3344"
        ]
      }
    }
    storage: {
      profiles = [
        {
          engine = "aipersist"
          name = "default"
          replacementMode = "CLOCK"
          # Explicit storage size configuration
          sizeBytes = 2147483648
        }
      ]
    }
  }
  ```
- Place your license content in the `license.conf` file.
- Create the ConfigMap object for the GridGain configuration:

  ```shell
  kubectl create configmap gridgain-config -n <namespace> --from-file=gridgain-config.conf
  ```
- Create the ConfigMap object for the GridGain license:

  ```shell
  kubectl create configmap gridgain-license -n <namespace> --from-file=license.conf
  ```
Replace `<namespace>` with the name of the namespace where you want to deploy GridGain.

To update the GridGain node configuration, modify the existing ConfigMap and restart all GridGain pods:
- Modify the previously created ConfigMap object:

  ```shell
  kubectl edit configmap gridgain-config -n <namespace>
  ```
- Restart the GridGain pods, repeating the command for every pod:

  ```shell
  kubectl delete pod <GridGain pod name> -n <namespace>
  ```
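Before moving on, it can be worth confirming that both ConfigMaps exist and contain the expected data. A quick check, assuming the resource names used above:

```shell
# Confirm both ConfigMaps exist in the target namespace
kubectl get configmap gridgain-config gridgain-license -n <namespace>

# Inspect the rendered node configuration to catch copy/paste errors early
# (the dot in the key name must be escaped in the jsonpath expression)
kubectl get configmap gridgain-config -n <namespace> -o jsonpath='{.data.gridgain-config\.conf}'
```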
Create and Deploy the Service
Depending on your requirements, define and deploy a Kubernetes service. GridGain 9 uses two types of services: one for internal cluster discovery, and one for external client access.
- First, choose the type of service you need and prepare the `service.yaml` file.
  - For communication inside the Kubernetes cluster, use a headless service by setting the `clusterIP` parameter to `None`. This exposes each pod's IP, enabling GridGain to be partition-aware: clients discover every node's address, determine which partition resides on which node, and send requests directly where the data is located.

    service.yaml:

    ```yaml
    apiVersion: v1
    kind: Service
    metadata:
      # The name must be equal to netClusterNodes.
      name: gridgain-svc-headless
      # Place your namespace name here.
      namespace: <namespace>
    spec:
      clusterIP: None
      internalTrafficPolicy: Cluster
      ipFamilies:
        - IPv4
      ipFamilyPolicy: SingleStack
      ports:
        - name: management
          port: 10300
          protocol: TCP
          targetPort: 10300
        - name: rest
          port: 10800
          protocol: TCP
          targetPort: 10800
        - name: cluster
          port: 3344
          protocol: TCP
          targetPort: 3344
      selector:
        # Must be equal to the label set for pods.
        app: gridgain
      # Include not-yet-ready nodes.
      publishNotReadyAddresses: true
      sessionAffinity: None
      type: ClusterIP
    ```
  - Use a `LoadBalancer` service to allow external clients to connect. Keep in mind that with this option you give up partition awareness. If your environment does not support `LoadBalancer`, you can use `type: NodePort` instead. Refer to the Kubernetes documentation for details.

    service.yaml:

    ```yaml
    apiVersion: v1
    kind: Service
    metadata:
      name: gridgain-loadbalancer
      labels:
        app: gridgain
    spec:
      type: LoadBalancer
      selector:
        app: gridgain
      ports:
        - name: rest
          protocol: TCP
          port: 10800
          targetPort: 10800
        - name: client
          port: 10300
          protocol: TCP
          targetPort: 10300
    ```
- Then apply the `service.yaml` file to set up the service:

  ```shell
  kubectl apply -f service.yaml
  ```
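Once the headless service is applied, you can confirm that it was created and check its endpoints. Note that before the StatefulSet is deployed, the endpoints list is expected to be empty:

```shell
# Confirm the service exists (CLUSTER-IP should show "None" for the headless service)
kubectl get svc gridgain-svc-headless -n <namespace>

# List the pod IPs behind the service; empty until GridGain pods are running
kubectl get endpoints gridgain-svc-headless -n <namespace>
```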
Deploy the StatefulSet
- Prepare the `statefulset.yaml` file for the StatefulSet deployment:

  statefulset.yaml:

  ```yaml
  apiVersion: apps/v1
  kind: StatefulSet
  metadata:
    # The cluster name.
    name: gridgain-cluster
    # Place your namespace name here.
    namespace: <namespace>
  spec:
    # The initial number of pods to be started by Kubernetes.
    replicas: 2
    # Kubernetes service to access the GridGain 9 cluster on the Kubernetes network.
    serviceName: gridgain-svc-headless
    selector:
      matchLabels:
        app: gridgain
    template:
      metadata:
        labels:
          app: gridgain
      spec:
        terminationGracePeriodSeconds: 60000
        containers:
          # Custom pod name.
          - name: gridgain-node
            # Limits and requests for the GridGain container.
            resources:
              limits:
                cpu: "4"
                memory: 4Gi
              requests:
                cpu: "4"
                memory: 4Gi
            env:
              # Must be specified to ensure that GridGain 9 cluster replicas are visible to each other.
              - name: GRIDGAIN_NODE_NAME
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.name
              # GridGain 9 working directory.
              - name: GRIDGAIN_WORK_DIR
                value: /gg9-work
            # The GridGain Docker image and its version.
            image: gridgain/gridgain9:9.1.8
            ports:
              - containerPort: 10300
              - containerPort: 10800
              - containerPort: 3344
            volumeMounts:
              # The config will be placed at this path in the container.
              - mountPath: /opt/gridgain/etc/gridgain-config.conf
                name: config-vol
                subPath: gridgain-config.conf
              # The license will be placed at this path in the container.
              - mountPath: /opt/gridgain/etc/license.conf
                name: license-vol
                subPath: license.conf
              # GridGain 9 working directory.
              - mountPath: /gg9-work
                name: persistence
        volumes:
          - name: config-vol
            configMap:
              name: gridgain-config
          - name: license-vol
            configMap:
              name: gridgain-license
    volumeClaimTemplates:
      - apiVersion: v1
        kind: PersistentVolumeClaim
        metadata:
          name: persistence
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi # Provide enough space for your application data.
          volumeMode: Filesystem
  ```
- Apply the `statefulset.yaml` file to deploy the main components of GridGain 9:

  ```shell
  kubectl apply -f statefulset.yaml
  ```
Wait for Pods to Start
- Monitor the status of the pods:

  ```shell
  kubectl get pods -n <namespace> -w
  ```
- Ensure that all pods' `STATUS` is `Running` before proceeding.
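Instead of watching the pod list manually, you can block until all GridGain pods report Ready. This sketch assumes the `app: gridgain` label set in the StatefulSet:

```shell
# Wait for every pod carrying the app=gridgain label to become Ready
kubectl wait --for=condition=Ready pod -l app=gridgain -n <namespace> --timeout=300s
```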
Deploy the Job
- Prepare the `job.yaml` file for deploying the job:

  job.yaml:

  ```yaml
  apiVersion: batch/v1
  kind: Job
  metadata:
    name: cluster-init
    # Place your namespace name here.
    namespace: <namespace>
  spec:
    template:
      spec:
        containers:
          # Command to initialize the cluster. The URL host must be the name of the service
          # you created before; 10300 is the management port.
          - args:
              - -ec
              - |
                apt update && apt-get install -y bind9-host
                GG_NODES=$(host -tsrv _cluster._tcp.gridgain-svc-headless | grep 'SRV record' | awk '{print $8}' | awk -F. '{print $1}' | paste -sd ',')
                /opt/gridgain9cli/bin/gridgain9 cluster init --name=gridgain --url=http://gridgain-svc-headless:10300 --license=/opt/gridgain/etc/license.conf
            command:
              - /bin/sh
            # Specify the Docker image with the GridGain 9 CLI and its version.
            image: gridgain/gridgain9:9.1.8
            imagePullPolicy: IfNotPresent
            name: cluster-init
            resources: {}
            volumeMounts:
              # The license must be mounted to the cluster-init job.
              - mountPath: /opt/gridgain/etc/license.conf
                name: license-vol
                subPath: license.conf
        restartPolicy: Never
        terminationGracePeriodSeconds: 120
        volumes:
          - name: license-vol
            configMap:
              name: gridgain-license
  ```
- Apply the `job.yaml` file to complete the installation:

  ```shell
  kubectl apply -f job.yaml
  ```
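You can wait for the job to finish and review its output before verifying the installation; assuming the `cluster-init` job name used above:

```shell
# Block until the initialization job reports the Complete condition
kubectl wait --for=condition=complete job/cluster-init -n <namespace> --timeout=300s

# Review the job output to confirm the cluster was initialized without errors
kubectl logs job/cluster-init -n <namespace>
```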
Installation Verification
- Check the status of all resources in your namespace:

  ```shell
  kubectl get all -n <namespace>
  ```
- Ensure that all components are running as expected, without errors, and that the initialization job is in the `Completed` status.
- Verify that your cluster is initialized and running. Open a shell in one of the pods and run the CLI:

  ```shell
  kubectl exec -it gridgain-cluster-0 -n <namespace> -- bash
  /opt/gridgain9cli/bin/gridgain9 cluster status
  ```
The command output must include the name of your cluster and the number of nodes. The status must be `ACTIVE`.
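As an additional check that does not require a shell inside a pod, you can query the cluster state over the management port. This sketch assumes port-forwarding through the headless service and the `/management/v1/cluster/state` REST path of the GridGain 9 management API:

```shell
# Forward the management port to the local machine (runs in the background)
kubectl port-forward svc/gridgain-svc-headless 10300:10300 -n <namespace> &

# Query the cluster state; the response should report the cluster as ACTIVE
curl http://localhost:10300/management/v1/cluster/state
```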
Optional: KEDA Configuration
You can configure KEDA to automatically scale the cluster based on your needs, ensuring optimal resource utilization and performance. This implementation uses Prometheus to monitor cluster load.
To enable KEDA scaling for your cluster:
- Add the necessary Helm repositories, then install KEDA and Prometheus:

  ```shell
  helm install keda kedacore/keda --namespace keda --create-namespace
  helm install prometheus prometheus-community/prometheus --namespace keda -f prometheus-values.yaml
  ```
- Deploy the KEDA configurations:
  - The `keda-scaled-object.yaml` configuration defines the scaling rules for the GridGain cluster:

    keda-scaled-object.yaml:

    ```yaml
    apiVersion: keda.sh/v1alpha1
    kind: ScaledObject
    metadata:
      name: gridgain-autoscale
      namespace: gridgain
    spec:
      scaleTargetRef:
        kind: StatefulSet
        name: gridgain-cluster
      pollingInterval: 30
      cooldownPeriod: 120
      # Set the initial number of replicas.
      minReplicaCount: 2
      maxReplicaCount: 5
      advanced:
        horizontalPodAutoscalerConfig:
          behavior:
            scaleDown:
              selectPolicy: Disabled
            scaleUp:
              stabilizationWindowSeconds: 60 # Increase if needed.
              selectPolicy: Max
              policies:
                - type: Pods
                  value: 1
                  periodSeconds: 120
      triggers:
        # Uncomment and modify the following section to enable CPU usage based autoscaling.
        # - type: prometheus
        #   metadata:
        #     name: cpu-usage
        #     serverAddress: http://prometheus-server.keda.svc.cluster.local:80
        #     query: sum(os_system_load_average{job="gridgain"})
        #     threshold: "0.8"
        #     activationThreshold: "0.6"
        - type: prometheus
          name: heap-memory-usage
          metadata:
            serverAddress: http://prometheus-server.keda.svc.cluster.local:80
            query: |
              sum(jvm_memory_committed_bytes{area="heap", job="gridgain"} / jvm_memory_max_bytes{area="heap", job="gridgain"})
            threshold: "0.7"
        - type: prometheus
          name: nonheap-memory-usage
          metadata:
            serverAddress: http://prometheus-server.keda.svc.cluster.local:80
            query: |
              sum(jvm_memory_committed_bytes{area="nonheap", job="gridgain"} / jvm_memory_max_bytes{area="nonheap", job="gridgain"})
            threshold: "0.7"
    ```
  - The `keda-recovery-scaled-job.yaml` configuration handles rebuilding CMG nodes in the GridGain cluster:

    keda-recovery-scaled-job.yaml:

    ```yaml
    apiVersion: keda.sh/v1alpha1
    kind: ScaledJob
    metadata:
      name: gridgain-recovery
      namespace: gridgain
    spec:
      jobTargetRef:
        template:
          spec:
            # securityContext:
            #   runAsUser: 0
            #   runAsGroup: 0
            #   fsGroup: 0
            containers:
              - name: recovery
                image: gridgain/gridgain9:9.1.1
                command: ["/bin/bash", "/scripts/recovery.sh"]
                volumeMounts:
                  - name: script-vol
                    mountPath: /scripts
            restartPolicy: Never
            volumes:
              - name: script-vol
                configMap:
                  name: gridgain-recovery-script
                  defaultMode: 0777
        backoffLimit: 1
      pollingInterval: 30
      successfulJobsHistoryLimit: 2
      failedJobsHistoryLimit: 3
      maxReplicaCount: 1
      triggers:
        # - type: prometheus
        #   metadata:
        #     serverAddress: http://prometheus-server.keda.svc.cluster.local:80
        #     query: avg(os_system_load_average{job="gridgain"})
        #     threshold: "0.8"
        #     activationThreshold: "0.6"
        - type: prometheus
          name: heap-memory-usage
          metadata:
            serverAddress: http://prometheus-server.keda.svc.cluster.local:80
            query: |
              avg(jvm_memory_committed_bytes{area="heap", job="gridgain"} / jvm_memory_max_bytes{area="heap", job="gridgain"})
            threshold: "0.7"
        - type: prometheus
          name: nonheap-memory-usage
          metadata:
            serverAddress: http://prometheus-server.keda.svc.cluster.local:80
            query: |
              avg(jvm_memory_committed_bytes{area="nonheap", job="gridgain"} / jvm_memory_max_bytes{area="nonheap", job="gridgain"})
            threshold: "0.7"
    ```
- You can deploy the above configurations with the following commands:

  ```shell
  kubectl apply -n gridgain -f keda-scaled-object.yaml
  kubectl apply -n gridgain -f keda-recovery-scaled-job.yaml
  ```
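To confirm that KEDA picked up the configurations, you can list the created resources. KEDA drives scaling through a generated HorizontalPodAutoscaler, so one should appear alongside the ScaledObject:

```shell
# Check that the ScaledObject and ScaledJob were created and are marked READY
kubectl get scaledobject,scaledjob -n gridgain

# KEDA manages scaling through a generated HPA; verify it exists and shows current metrics
kubectl get hpa -n gridgain
```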
Installation Troubleshooting
If any issues occur during the installation:
- Check the logs of specific pods:

  ```shell
  kubectl logs <pod-name> -n <namespace>
  ```
- Review events in the namespace:

  ```shell
  kubectl get events -n <namespace>
  ```
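For pods stuck in `Pending` or `CrashLoopBackOff`, `kubectl describe` usually surfaces the cause (failed scheduling, unbound PersistentVolumeClaims, image pull errors) faster than scanning the full event list:

```shell
# Show the pod's events, volume bindings, and container state transitions
kubectl describe pod <pod-name> -n <namespace>
```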
Limitations and Considerations
When running GridGain 9 in a Kubernetes environment, the node configuration becomes read-only and cannot be modified with the `gridgain9 node config update` CLI command. This is by design, as node configuration is managed via Kubernetes resources. To change your configuration:
- Manually update the corresponding ConfigMap;
- Restart all cluster pods by executing `kubectl delete pod` for each replica.
The updated configuration will take effect after the pods are recreated by the Kubernetes controller.
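As an alternative to deleting pods one by one, a rolling restart of the StatefulSet recreates the pods sequentially. This sketch assumes the `gridgain-cluster` StatefulSet name used earlier in this section:

```shell
# Recreate all pods one at a time so the updated ConfigMap is picked up
kubectl rollout restart statefulset/gridgain-cluster -n <namespace>

# Watch until the restart completes
kubectl rollout status statefulset/gridgain-cluster -n <namespace>
```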
© 2025 GridGain Systems, Inc. All Rights Reserved. GridGain® is a registered trademark of GridGain Systems, Inc.
Apache, Apache Ignite, the Apache feather and the Apache Ignite logo are either registered trademarks or trademarks of The Apache Software Foundation.