Data Center Replication from GridGain 8
In certain environments, the downtime associated with migrating data between GridGain versions is unacceptable. For these scenarios, GridGain provides a one-way data migration tool based on GridGain 8 Data Center Replication. Data can be copied to your GridGain 9 cluster as if it were a GridGain 8 cluster running in a different data center, and afterwards the load can be seamlessly transferred to the GridGain 9 cluster.
Prerequisites
Data Center Replication is only available in the GridGain 8 Enterprise and Ultimate editions, so you need the corresponding license to start the replication process.
Before starting replication, you need to create an empty table that the data will be written to. You can manually configure which cache fields correspond to which table columns. If a mapping is not specified, fields are matched to columns with the same name.
Additionally, the target GridGain 9 table must contain a nullable drVersion column of the VARBINARY data type. This column is used to store information about the data center replication process and can be safely deleted once replication is no longer required. You cannot start replication without this column.
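For example, assuming a cache whose entries have an integer key and two string value fields, the target table could be created as follows (the table and column names here are illustrative, not part of the connector's requirements):

```sql
-- Illustrative target table for replication into GridGain 9.
-- The nullable drVersion VARBINARY column is required by the replication
-- process and can be dropped once replication is no longer needed.
CREATE TABLE Person (
    id        INT PRIMARY KEY,
    name      VARCHAR,
    city      VARCHAR,
    drVersion VARBINARY
);
```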
Limitations
The replication process between GridGain 8 and GridGain 9 has the following limitations:
- There are differences in GridGain 9 and GridGain 8 support for data types. Make sure data is mapped to columns of equivalent types.
- Only one-way replication from the master GridGain 8 cluster to a replica GridGain 9 cluster is supported. Consequently, only active-passive replication is supported for this scenario.
- You cannot transfer data to the GridGain 9 cluster by using full state transfer over a snapshot.
- Data can only be transferred from a single GridGain 8 cluster at a time.
- If the GridGain 9 cluster is already under load, some of the transferred data may be overwritten by new data. It is recommended to only transmit data to clusters that are not under load.
Conflict Resolution
In most scenarios, there should be no conflicts, as data should be copied to a fresh cluster. However, in some scenarios conflicts may still occur. These are resolved as follows:
- Any data written by the GridGain 9 cluster itself has priority and is not overwritten.
- If the same row already exists in the table as a result of a previous transfer, newly transferred data always overwrites the old data.
Failure Scenarios
If replication is stopped for any reason, then when it is restarted, a full state transfer is performed first, followed by incremental updates once the data is fully synchronized.
Configuring Replication
Configuring replication to a GridGain 9 cluster on the GridGain 8 side is no different from configuring replication to a GridGain 8 cluster. On the GridGain 9 side, you need to start a connector tool that converts data from GridGain 8 and transfers it to GridGain 9.
The tombstone time-to-live, configured by using the tombstoneTtl configuration property, must be the same or greater on the receiving side than on the sending side. By default, GridGain 8 and GridGain 9 use the same value (30 minutes).
Make sure to note the ID of the data center the master cluster is running in. This ID will be used in connector configuration to filter incoming connections.
The following rules are used for mapping values between GridGain 8 caches and GridGain 9 tables:
- Cache key fields are mapped to primary key columns of the target table.
- Cache value fields are mapped to non-primary key columns of the target table.
- If a cache field is not present in the configuration, it is mapped to a target table column with the same name.
- By default, a non-BinaryObject key is mapped to a column named _key in the target table.
- By default, a non-BinaryObject value is mapped to a column named _value in the target table.
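As an illustration of the last two rules, a cache with a primitive (non-BinaryObject) key exposes it as the implicit _key field, which can be mapped to the table's primary key column explicitly. The cache, table, and column names below are illustrative:

```hocon
# Illustrative cacheMapping entry: a primitive cache key is exposed
# as the implicit _key field and mapped to the primary key column.
{
    cache = "personCache"
    table = "PUBLIC.PERSON"
    fields = [
        { field = "_key", column = "ID" }   # primitive key -> primary key column
        { field = "name", column = "NAME" }
        { field = "city", column = "CITY" }
    ]
}
```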
Connector Configuration
The connector configuration file uses the HOCON format. Here is an example configuration:
dr-service-config = {
    cacheMapping = [
        # Only one mapping at a time is supported.
        # Mapping for key and value object fields to distinct columns.
        {
            cache = "cacheName"
            table = "schemaName.tableName"
            keyFields = [
                { field = "key1", column = "C1" }
                { field = "key2", column = "C2" }
                { field = "key3", column = "C3" }
            ]
            valueFields = [
                { field = "value1", column = "C4" }
                { field = "value2", column = "ignored_column", ignore = true }
            ]
        }
        # # Alternative mapping.
        # # The fields option maps fields to columns regardless of whether they are key or value fields.
        # {
        #     cache = "cacheName"
        #     table = "schemaName.tableName"
        #     fields = [
        #         { field = "_key", column = "ID" }
        #         { field = "field1", column = "C1" }
        #         { field = "field2", column = "C2" }
        #     ]
        # }
    ]
    clientConfiguration = {
        serverEndpoints = [ "127.0.0.1:10800" ]
        port = 10800
        connectTimeout = 0
        metricsEnabled = false
        heartbeatInterval = 30000
        heartbeatTimeout = 5000
        backgroundReconnectInterval = 30000
        ssl = {
            enabled = false
            ciphers = ""
            keyStore = {
                password = ""
                path = ""
                type = "PKCS12"
            }
            trustStore = {
                password = ""
                path = ""
                type = "PKCS12"
            }
        }
        authenticator = {
            type = "BASIC"
            identity = "<user>"
            secret = "<password>"
        }
    }
    drReceiverConfiguration = {
        dataCenterId = 1
        inboundHost = ""
        inboundPort = 9090
        drReceiverMetricsEnabled = true
        idleTimeout = 60000
        ignoreMissingTable = true
        selectorCnt = 4
        socketReceiveBufferSize = 0
        socketSendBufferSize = 0
        tcpNodelay = true
        tombstoneTtl = 1800000
        workerThreads = 10
        writeTimeout = 60000
        ssl = {
            enabled = false
            clientAuth = "none"
            ciphers = ""
            keyStore = {
                password = ""
                path = ""
                type = "PKCS12"
            }
            trustStore = {
                password = ""
                path = ""
                type = "PKCS12"
            }
        }
        timeZone = "Europe/Berlin"
    }
}
The following fields are available:
cacheMapping properties:

Property | Default | Description |
---|---|---|
cache | | The name of the GridGain 8 cache on the master cluster. The cache name is case-sensitive and should not include the schema. |
table | | The name of the GridGain 9 table on the replica cluster. The table name is case-insensitive by default, so case-sensitive identifiers must be quoted. This name can include the schema as a prefix, in the schemaName.tableName format. |
keyFields | | The array denoting the mapping of cache key fields to table columns. Not compatible with fields. |
keyFields.field | | The case-sensitive name of the source cache field to get data from. |
keyFields.column | | The case-sensitive name of the target column to write data to. |
valueFields | | The array denoting the mapping of cache value fields to table columns. Not compatible with fields. |
valueFields.field | | The case-sensitive name of the source cache field to get data from. |
valueFields.column | | The case-sensitive name of the target column to write data to. |
ignore | false | If set to true, the field is not written to the target table. |
fields | | Alternative configuration. The array denoting the mapping of cache fields to table columns, regardless of whether they are key or value fields. |
fields.field | | Alternative configuration. The case-sensitive name of the source cache field to get data from. |
fields.column | | Alternative configuration. The case-sensitive name of the target column to write data to. |

clientConfiguration properties:

Property | Default | Description |
---|---|---|
serverEndpoints | | The list of addresses of GridGain 9 nodes with configured client connectors. |
port | | The port of the connector. |
connectTimeout | | The connection timeout, in milliseconds. |
metricsEnabled | | If the metrics are enabled for the connection. |
heartbeatInterval | | Heartbeat message interval, in milliseconds. |
heartbeatTimeout | | Heartbeat message timeout, in milliseconds. |
backgroundReconnectInterval | | The period of time after which the connector tries to reestablish a lost connection, in milliseconds. |
ssl.enabled | false | If SSL is enabled for the connection. |
ssl.ciphers | | The list of ciphers to enable, separated by commas. |
ssl.keyStore.type | PKCS12 | Keystore type. |
ssl.keyStore.password | | Keystore password. |
ssl.keyStore.path | | Path to the keystore. |
ssl.trustStore.type | PKCS12 | Truststore type. |
ssl.trustStore.password | | Truststore password. |
ssl.trustStore.path | | Path to the truststore. |
authenticator.type | | The type of authentication used to connect to the GridGain 9 cluster (for example, BASIC). |
authenticator.identity | | The identity (user name) used for authentication. |
authenticator.secret | | The secret (password) used for authentication. |

drReceiverConfiguration properties:

Property | Default | Description |
---|---|---|
dataCenterId | 1 | The ID of the master GridGain 8 cluster's data center. Only connections from this data center are allowed. |
inboundHost | | Local host name of the connector. |
inboundPort | 10800 | The port used for data replication. |
drReceiverMetricsEnabled | true | If metrics are enabled. |
idleTimeout | 60000 | How long the connector can be idle before the connection is dropped, in milliseconds. |
ignoreMissingTable | true | If set to true, data for caches without a matching table is ignored instead of failing the replication. |
selectorCnt | lowest of 4 and the number of available cores | The number of threads handling connections. |
socketReceiveBufferSize | 0 | Socket receive buffer size, in bytes. |
socketSendBufferSize | 0 | Socket send buffer size, in bytes. |
tcpNodelay | true | If the TCP_NODELAY option is enabled for the connection. |
tombstoneTtl | 1800000 | Tombstone expiration timeout, in milliseconds. |
workerThreads | All available threads | The number of worker threads for batch processing. |
writeTimeout | 60000 | Write timeout for the TCP server connection, in milliseconds. |
ssl.enabled | false | If SSL is enabled for the connection. |
ssl.clientAuth | none | SSL client authentication mode. |
ssl.ciphers | | The list of ciphers to enable, separated by commas. |
ssl.keyStore.type | PKCS12 | Keystore type. |
ssl.keyStore.password | | Keystore password. |
ssl.keyStore.path | | Path to the keystore. |
ssl.trustStore.type | PKCS12 | Truststore type. |
ssl.trustStore.password | | Truststore password. |
ssl.trustStore.path | | Path to the truststore. |
timeZone | | Specifies the time zone of the data migrated from GridGain 8 to GridGain 9. Set as "Area/City" (for example, "Europe/Berlin") or as a fixed offset like GMT+5 to ensure correct conversion of date and time values. |
Sample Cache Mapping
The configuration below maps fields K1, K2, K3 of cache key objects to columns COL_1, COL_2, COL_3 of the target table, maps fields V1, V2, V3 of cache value objects to columns COL_4, COL_5, COL_6 of the target table, and ignores field V4 from the cache value objects.
dr-service-config = {
    cacheMapping = [
        {
            cache = "cacheName"
            table = "schemaName.tableName"
            keyFields = [
                { field = "K1", column = "COL_1" }
                { field = "K2", column = "COL_2" }
                { field = "K3", column = "COL_3" }
            ]
            valueFields = [
                { field = "V1", column = "COL_4" }
                { field = "V2", column = "COL_5" }
                { field = "V3", column = "COL_6" }
                { field = "V4", column = "ignored_column", ignore = true }
            ]
        }
    ]
}
Running Connector Tool
Running the Tool Locally
To start the connector tool, use the start.sh script. Provide the configuration by using the --config-path parameter to specify the path to the connector configuration file:
start.sh --config-path etc/example.conf
Running the Tool in Docker
The connector tool is available on DockerHub.
When running the connector tool in Docker, keep in mind the following:
- You need to mount the configuration into your container.
- Open the port for connections from your GridGain 8 cluster. This port is specified in the dr-service-config.drReceiverConfiguration.inboundPort connector configuration parameter.
- If GridGain 9 runs in a different network, open the port for the connection to that network. The GridGain 9 server endpoint is specified in the dr-service-config.clientConfiguration.serverEndpoints connector configuration parameter.
To start the connector tool, start the container, specifying all required parameters, for example:
docker run -p {host_port}:{inbound_port} -v /host_config_path:/opt/gridgain-dr-connector/etc/custom gridgain/gridgain-dr-connector:9.1.8 --config-path /opt/gridgain-dr-connector/etc/custom/config.conf
You can also use Docker Compose. Here is an example of the above configuration in Docker Compose format:
services:
  dr-connector:
    container_name: dr-connector
    image: gridgain/gridgain-dr-connector:9.1.8
    tty: true
    volumes:
      - /host_config_path:/opt/gridgain-dr-connector/etc/custom
    ports:
      - "{host_port}:{inbound_port}"
    command: ["--config-path", "/opt/gridgain-dr-connector/etc/custom/config.conf"]
Then you can start the container with Docker Compose:
docker compose up -d
Starting Replication
Once the client connector is started, it behaves in the same way as a replica GridGain 8 cluster. Start replication as described in the GridGain 8 documentation.
Stopping Connector Tool
Once data replication is no longer needed (for example, the load has shifted to the GridGain 9 cluster), stop the replication process on the GridGain 8 master cluster, and then use the stop.sh script to stop the connector:
stop.sh