
Rolling Upgrades


Rolling Upgrades is an Enterprise and Ultimate Edition feature that allows nodes with different versions of GridGain to co-exist in one cluster while you roll out new versions. This prevents any downtime when performing software upgrades.

With Rolling Upgrades enabled, a node running a newer version of GridGain can join the cluster. For example, if you have ten nodes running on ten physical servers, you can start a new node with the new version of GridGain on any one of the servers. The data is rebalanced to accommodate this new member of the cluster. Once rebalancing is complete, you can shut down the old node. Repeat this process for all server nodes. This way, the cluster is upgraded one node at a time, without any downtime.

Guidelines and Incompatible Versions

Before you start to upgrade to a newer version of GridGain, please note the following:

Upgrades are supported for minor and maintenance versions of the same major series only. The feature is not supported for upgrades between major versions because of changes that might not be backward-compatible. For instance, it is possible to upgrade from 8.5.6 to 8.7.5 without downtime, but Rolling Upgrades will not work from version 7.9.1 to 8.7.5. For major version upgrades, contact GridGain for a possible zero-downtime solution.

There is always a way to upgrade to the latest GridGain version from any version of the same major series without downtime. In most cases, it is a direct one-step upgrade. For instance, you can upgrade from 8.5.6 to 8.7.5 with Rolling Upgrades, avoiding production outages.

If there are any breaking changes between minor or maintenance versions of the same series, the upgrade requires a transitioning version: you move from version A to version B via version C, still without production outages. For instance, if you need to upgrade from 8.1.3 to 8.8.1, then 8.5.3 has to be used as a transitioning version: 8.1.3 → 8.5.3 → 8.8.1.
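The first guideline can be checked mechanically before an upgrade is attempted. The following is a minimal sketch, assuming versions are written as dot-separated numbers such as 8.7.5. The class VersionSeries and its methods are hypothetical helpers for illustration and are not part of the GridGain API. Passing this check is necessary but not sufficient: it does not tell you whether a transitioning version is required.

```java
/** Hypothetical helper for illustration only; not part of the GridGain API. */
final class VersionSeries {
    private VersionSeries() {
        // Static helpers only.
    }

    /** Returns the major component of a dot-separated version such as "8.7.5". */
    static int major(String version) {
        String[] parts = version.trim().split("\\.");
        return Integer.parseInt(parts[0]);
    }

    /** Returns true if both versions belong to the same major series. */
    static boolean sameMajorSeries(String from, String to) {
        return major(from) == major(to);
    }
}
```

With the versions used above, sameMajorSeries("8.5.6", "8.7.5") holds, while sameMajorSeries("7.9.1", "8.7.5") does not.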

Enabling Rolling Upgrades

Rolling Upgrades are disabled by default for performance reasons. If you plan to upgrade GridGain without stopping the whole cluster, you need to enable the feature. The following configuration examples enable rolling upgrades:

XML:

    <bean class="org.apache.ignite.configuration.IgniteConfiguration">
        <property name="pluginConfigurations">
            <bean class="org.gridgain.grid.configuration.GridGainConfiguration">
                <property name="rollingUpdatesEnabled" value="true"/>
            </bean>
        </property>
    </bean>

Java:

    GridGainConfiguration pluginCfg = new GridGainConfiguration();
    pluginCfg.setRollingUpdatesEnabled(true);

    IgniteConfiguration cfg = new IgniteConfiguration();
    // Register the GridGain plugin configuration with the node configuration.
    cfg.setPluginConfigurations(pluginCfg);

    Ignite ignite = Ignition.start(cfg);

C#/.NET:

    var cfg = new GridGainPluginConfiguration
    {
        RollingUpdatesEnabled = true
    };

C++:

This API is not presently available for C++. You can use XML configuration.

Upgrading to a New GridGain Version

If you have a multi-node cluster and you want to upgrade the cluster to a new GridGain version without stopping the whole cluster, use the following steps:

  1. Download the new GridGain version.

  2. If you are using persistence, configure the storagePath, walPath, and walArchivePath properties. If these properties were explicitly set in your current GridGain configuration, use the same values in the configuration of the new GridGain version. If they were not set, specify the default locations that the current version uses.

  3. Stop one running node.

  4. Restart the node with the new version. At this point the cluster will start to rebalance.

  5. Wait for rebalancing to complete. You can monitor the logs for messages similar to the following:

    Rebalancing scheduled [order=[Cache1, Cache2, Cache3], top=AffinityTopologyVersion [topVer=59, minorTopVer=1], force=false, evt=DISCOVERY_CUSTOM_EVT, node=79794392-5518-4609-ac13]
    Completed (final) rebalancing [fromNode=node=79794392-5518-4609-ac13, cacheOrGroup=Cache2, topology=AffinityTopologyVersion [topVer=59, minorTopVer=1], time=370550 ms]

    After rebalancing finishes, a partition map exchange takes place for the new minor topology version, which is caused by Late Affinity Assignment. So, instead of checking that rebalancing has completed for every cache, you can look for the message about the exchange:

    Finish exchange future [startVer=AffinityTopologyVersion [topVer=42, minorTopVer=1], resVer=AffinityTopologyVersion [topVer=42, minorTopVer=1], err=null]

    Alternatively, if rebalancing is not needed, it will be skipped:

    Skipping rebalancing (no affinity changes) [top=AffinityTopologyVersion [topVer=42, minorTopVer=1], rebTopVer=AffinityTopologyVersion [topVer=42, minorTopVer=0], evt=DISCOVERY_CUSTOM_EVT, evtNode=38640608-5518-4609-ac13-0b35eea1d6cb, client=false]

  6. After rebalancing is complete, repeat steps 1-5 for each remaining node.
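The log check in step 5 can be automated. The following is a minimal sketch that classifies a single log line using only the message formats shown above. The class RebalanceLogCheck is a hypothetical helper, not a GridGain API. It assumes that the exchange triggered by Late Affinity Assignment reports a non-zero minorTopVer in resVer, as in the sample message; other custom discovery events can also raise minorTopVer, so treat a match as a hint rather than proof.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of the GridGain API. */
final class RebalanceLogCheck {
    // Captures topVer and minorTopVer of resVer in a "Finish exchange future" line.
    private static final Pattern FINISH_EXCHANGE = Pattern.compile(
        "Finish exchange future \\[.*resVer=AffinityTopologyVersion "
            + "\\[topVer=(\\d+), minorTopVer=(\\d+)\\]");

    private RebalanceLogCheck() {
        // Static helpers only.
    }

    /** Returns true if the line suggests rebalancing has finished or was skipped. */
    static boolean indicatesRebalanceDone(String line) {
        if (line.contains("Skipping rebalancing (no affinity changes)")) {
            return true;
        }
        Matcher m = FINISH_EXCHANGE.matcher(line);
        // Assumption: the late-affinity exchange has a non-zero minor version.
        return m.find() && Integer.parseInt(m.group(2)) > 0;
    }
}
```

A script that tails the node log could call indicatesRebalanceDone on each new line and move on to the next node once it returns true.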

Performance Considerations

The duration of the rolling upgrade procedure is largely affected by the speed of the rebalancing process. By default, rebalancing is performed in one thread. However, you can increase the number of threads dedicated to rebalancing to enhance performance. The rebalancing thread pool size is controlled by the IgniteConfiguration.rebalanceThreadPoolSize property. See Configuring Rebalance Thread Pool for more information.

Starting with GridGain 8.7.4, you can change the thread pool size in a manner similar to the rolling upgrade procedure. If you have already upgraded to version 8.7.x, where x ≥ 4, you can stop an individual node, set the parameter, and start the node again, repeating this for each node in the cluster.

Monitoring Rolling Upgrades

GridGain Control Center allows you to monitor the process of Rolling Upgrades as you move to a newer version of GridGain.

The Rebalance widget in Control Center displays the cluster nodes and their versions. As you migrate node by node to a newer version, watch the rebalancing progress and proceed to the next node once rebalancing has finished.

Performing a Rolling Restart

If you do not want to upgrade to a new version of GridGain but would like to make some changes to the server nodes instead — for example, modifying the configuration file — you can do a rolling restart.

To do a rolling restart, use the following steps:

  1. Stop one node in the cluster.

  2. Restart the node with the new configuration file. At this point, data rebalancing will start to accommodate this node into the cluster.

  3. Wait until data rebalancing is complete.

  4. Repeat steps 1-3 until all the nodes have been restarted with the new configuration.
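The steps above can be sketched as a simple loop that handles one node at a time and never moves on before rebalancing has completed. This is only an outline: the NodeOps interface and the RollingRestart class are hypothetical, and how a node is actually stopped, started, and checked for rebalancing (scripts, a service manager, log inspection) is left to the implementation you supply.

```java
import java.util.List;

/** Hypothetical abstraction over how nodes are managed in your environment. */
interface NodeOps {
    /** Stops the given node. */
    void stop(String node);

    /** Starts the given node with the new configuration file. */
    void startWithNewConfig(String node);

    /** Blocks until data rebalancing triggered by the restart is complete. */
    void awaitRebalance(String node);
}

/** Hypothetical driver for illustration only; not part of the GridGain API. */
final class RollingRestart {
    private RollingRestart() {
        // Static helpers only.
    }

    /** Restarts the nodes strictly one after another, in the given order. */
    static void run(List<String> nodes, NodeOps ops) {
        for (String node : nodes) {
            ops.stop(node);               // step 1
            ops.startWithNewConfig(node); // step 2
            ops.awaitRebalance(node);     // step 3, before touching the next node
        }
    }
}
```

Keeping the loop sequential is the point of the procedure: at most one node is out of the cluster at any time.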