Data Center Replication from GridGain 8
In certain environments, downtime associated with migrating data between GridGain versions cannot be afforded. For these scenarios, GridGain provides a one-way data migration tool based on GridGain 8 Data Center Replication. Data can be copied to your GridGain 9 cluster as if it was a GridGain 8 cluster running in a different data center.
After replication is complete, a brief pause is required to stop writes, flush the replication backlog, and switch traffic to GridGain 9. As such, this is a "near-zero downtime" migration, not a seamless one. A brief maintenance window is required during the final cutover to ensure data consistency.
Prerequisites
As Data Center Replication is only available in the GridGain 8 Enterprise and Ultimate editions, you need the corresponding license to start the replication process.
Before starting replication:
- Create an empty table the data will be written to. By default, fields are matched to columns with the same name; you can also manually configure which cache fields correspond to which table columns. Make sure that:
  - For caches with a single key field, you use the `_key` field name to specify the key field, as shown in the mapping examples below.
  - For caches with a single value field, you use the `_val` field name to specify the value field.
- Create a nullable `drVersion` column of the `VARBINARY` data type in the target GridGain 9 table. This column is used to store information about the data center replication process and can be safely deleted once migration is complete. You cannot start replication without this column.
- Recreate indexes from GridGain 8 in the target GridGain 9 tables. Indexes are not transferred automatically during DCR migration and must be created manually to maintain query performance. See Recreating Indexes for detailed instructions.
The following considerations are also important for successful migration:
- Target GridGain 9 tables must be empty before starting replication. Existing data may cause conflicts or be overwritten.
- Indexes are not automatically transferred from GridGain 8. All indexes must be recreated manually in GridGain 9 before starting replication.
- The DR connector uses case-sensitive field names. Column names in GridGain 9 must either be quoted to preserve case or capitalized in the connector configuration.
- Replication is one-way only (GridGain 8 to GridGain 9). Once you switch to GridGain 9, changes cannot be replicated back to GridGain 8.
- Full state transfer over snapshot is not supported. Initial data transfer must use standard full state transfer.
Creating Equivalent Tables
Here’s an example of migrating a GridGain 8 cache schema to GridGain 9:
- Original GridGain 8 cache:

      // GridGain 8 cache with composite key
      public class PersonKey {
          private Long id;
          private String region;
      }

      public class Person {
          private String name;
          private Integer age;
          private BigDecimal salary;
      }

- Creating the equivalent GridGain 9 table:

      CREATE TABLE IF NOT EXISTS PUBLIC.PERSON (
          ID BIGINT NOT NULL,
          REGION VARCHAR NOT NULL,
          NAME VARCHAR,
          AGE INT,
          SALARY DECIMAL(10,2),
          "drVersion" VARBINARY, -- Required for replication
          PRIMARY KEY (ID, REGION)
      );
Recreating Indexes
Since indexes are not automatically transferred during DCR migration, you need to identify existing indexes in GridGain 8 and recreate them in GridGain 9.
To identify indexes in GridGain 8, query the system views:
-- List all indexes for a specific table
SELECT * FROM SYS.INDEXES
WHERE SCHEMA_NAME = 'PUBLIC' AND TABLE_NAME = 'PERSON';
Alternatively, you can use the control script to examine cache configuration:
./bin/control.sh --cache idle_verify --dump --cache-filter PERSON
After identifying the indexes, create them in GridGain 9 before starting replication:
-- Example: recreating indexes for the PERSON table
CREATE INDEX idx_person_age ON PERSON(AGE);
CREATE INDEX idx_person_name ON PERSON(NAME);
CREATE INDEX idx_person_salary_region ON PERSON(SALARY, REGION);
The indexes should be created after table creation but before starting the replication process for optimal performance during data transfer.
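The index recreation step can be scripted. The sketch below is a hypothetical helper, not part of GridGain tooling: it builds CREATE INDEX statements from index metadata you would extract with the SYS.INDEXES query above (index names and columns here are the examples from this page).

```java
import java.util.List;
import java.util.Map;

public class IndexDdlGenerator {
    /** Builds a CREATE INDEX statement from index metadata. */
    public static String createIndexDdl(String indexName, String table, List<String> columns) {
        return "CREATE INDEX " + indexName + " ON " + table
                + "(" + String.join(", ", columns) + ");";
    }

    public static void main(String[] args) {
        // Example metadata, mirroring the PERSON indexes shown above.
        Map<String, List<String>> indexes = Map.of(
                "idx_person_age", List.of("AGE"),
                "idx_person_salary_region", List.of("SALARY", "REGION"));
        indexes.forEach((name, cols) ->
                System.out.println(createIndexDdl(name, "PERSON", cols)));
    }
}
```

Running the generated statements against GridGain 9 before starting replication completes the prerequisite.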
Limitations
The replication process between GridGain 8 and GridGain 9 has the following limitations:
- Only one-way replication from the master GridGain 8 cluster to a replica GridGain 9 cluster is supported. Consequently, only active-passive replication is supported for this scenario.
- You cannot transfer data to the GridGain 9 cluster by using full state transfer over snapshot.
- Data can only be transferred from a single GridGain 8 cluster at a time.
- If the GridGain 9 cluster is already under load, some of the transferred data may be overwritten. It is recommended to only transmit data to clusters that are not under load.
- GridGain 8 and GridGain 9 differ in which data types they support. Make sure data is mapped to columns of equivalent types. The table below provides a reference for commonly used types:
| GridGain 8 Type | GridGain 9 Type | Notes |
|---|---|---|
| BinaryObject | VARBINARY | For complex nested objects |
| java.lang.Integer | INT or INTEGER | |
| java.lang.Long | BIGINT | |
| java.lang.String | VARCHAR | Specify length if needed |
| java.math.BigDecimal | DECIMAL(p,s) | Specify precision and scale |
| java.util.Date | TIMESTAMP | Consider timezone settings |
| byte[] | VARBINARY | |
| java.lang.Boolean | BOOLEAN | |
| java.lang.Double | DOUBLE | |
| java.lang.Short | SMALLINT | |
| java.lang.Float | REAL | |
| java.lang.Byte | TINYINT | |
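When designing target tables, it can help to encode the mapping table above as a lookup. This is an illustrative sketch only, not part of the DR connector; the DECIMAL precision and scale shown are placeholders that must be chosen per field.

```java
import java.math.BigDecimal;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Map;

public class TypeMapping {
    // Commonly used GridGain 8 Java types and their GridGain 9 SQL equivalents.
    private static final Map<Class<?>, String> SQL_TYPES = new LinkedHashMap<>();
    static {
        SQL_TYPES.put(Integer.class, "INT");
        SQL_TYPES.put(Long.class, "BIGINT");
        SQL_TYPES.put(String.class, "VARCHAR");
        SQL_TYPES.put(BigDecimal.class, "DECIMAL(10,2)"); // precision/scale are per-field decisions
        SQL_TYPES.put(Date.class, "TIMESTAMP");
        SQL_TYPES.put(byte[].class, "VARBINARY");
        SQL_TYPES.put(Boolean.class, "BOOLEAN");
        SQL_TYPES.put(Double.class, "DOUBLE");
        SQL_TYPES.put(Short.class, "SMALLINT");
        SQL_TYPES.put(Float.class, "REAL");
        SQL_TYPES.put(Byte.class, "TINYINT");
    }

    /** Returns the GridGain 9 column type for a GridGain 8 field type. */
    public static String sqlTypeOf(Class<?> javaType) {
        String sqlType = SQL_TYPES.get(javaType);
        if (sqlType == null) {
            throw new IllegalArgumentException("No direct mapping for " + javaType);
        }
        return sqlType;
    }
}
```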
Conflict Resolution
In most scenarios, there should be no conflicts, as data should be copied to a fresh cluster. If a conflict does occur, the data transferred from GridGain 8 takes priority: if the same row already exists in the GridGain 9 table, newly transferred data always overwrites the existing data.
Failure Scenarios
If the replication is stopped for any reason, when it is restarted, full state transfer will be performed first, followed by incremental updates once data is fully synchronized.
Configuring Replication
Configuring replication to a GridGain 9 cluster on the GridGain 8 side is no different from configuring replication to a GridGain 8 cluster. On the GridGain 9 side, you need to start a connector tool that converts data from GridGain 8 and transfers it to GridGain 9.
The tombstone time-to-live, configured with the tombstoneTtl configuration property, must be the same as or greater on the receiving side than on the sending side. By default, the same value (30 minutes) is used in both GridGain 8 and GridGain 9.
Make sure to note the ID of the data center the master cluster is running in. This ID is used in the connector configuration to filter incoming connections.
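The tombstone TTL requirement above is easy to validate when wiring up the configuration. A trivial sketch (the property is tombstoneTtl on both sides; the helper itself is hypothetical):

```java
public class TombstoneTtlCheck {
    /** The receiver's tombstoneTtl must be the same as or greater than the sender's. */
    public static boolean isValid(long senderTtlMs, long receiverTtlMs) {
        return receiverTtlMs >= senderTtlMs;
    }

    public static void main(String[] args) {
        long defaultTtl = 30L * 60 * 1000; // 30 minutes, the default on both sides
        System.out.println(isValid(defaultTtl, defaultTtl)); // true
    }
}
```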
The following rules are used for mapping values between GridGain 8 caches and GridGain 9 tables:
- Cache key fields are mapped to primary key columns of a target table.
- Cache value fields are mapped to non-primary key columns of a target table.
- If a cache field is not present in the configuration, it is mapped to a target table column with the same name.
- By default, a non-BinaryObject key is mapped to a column named `_key` in a target table.
- By default, a non-BinaryObject value is mapped to a column named `_value` in a target table.
Connector Configuration
The connector configuration files use the HOCON format. The default port used by the DR connector is set by the drReceiverConfiguration.inboundPort property described below.
Here is an example configuration:
dr-service-config = {
cacheMapping = [
# Only one mapping at a time is supported.
# Mapping for key and value object fields to distinct columns.
{
cache = "cacheName"
table = "schemaName.tableName"
keyFields = [
{ field = "key1", column = "C1" }
{ field = "key2", column = "C2" }
{ field = "key3", column = "C3" }
{ field = "key4", column = "ignored_column", ignore = true }
]
valueFields = [
{ field = "value1", column = "C4" }
{ field = "value2", column = "ignored_column", ignore = true }
]
}
# # Alternative mapping.
  # # The fields option maps fields to columns regardless of whether they are key or value fields.
# {
# cache = "cacheName"
# table = "schemaName.tableName"
# fields = [
# { field = "_key", column = "ID" }
# { field = "field1", column = "C1" }
# { field = "field2", column = "C2" }
# { field = "field3", column = "ignored_column", ignore = true }
# ]
# }
]
clientConfiguration = {
serverEndpoints = [ "127.0.0.1:10800" ]
port = 10800
connectTimeout = 0
metricsEnabled = false
name = "drClient"
heartbeatInterval = 30000
heartbeatTimeout = 5000
backgroundReconnectInterval = 30000
ssl = {
enabled = false
ciphers = ""
keyStore = {
password = ""
path = ""
type = "PKCS12"
}
trustStore = {
password = ""
path = ""
type = "PKCS12"
}
}
authenticator = {
type = "BASIC"
identity = "<user>"
secret = "<password>"
}
}
drReceiverConfiguration = {
dataCenterId = 1
inboundHost = ""
inboundPort = 10800
drReceiverMetricsEnabled = true
idleTimeout = 60000
ignoreMissingTable = true
selectorCnt = 4
socketReceiveBufferSize = 0
socketSendBufferSize = 0
tcpNodelay = true
tombstoneTtl = 1800000
workerThreads = 10
writeTimeout = 60000
ssl = {
enabled = false
clientAuth = "none"
ciphers = ""
keyStore = {
password = ""
path = ""
type = "PKCS12"
}
trustStore = {
password = ""
path = ""
type = "PKCS12"
}
}
}
timeZone = "Europe/Berlin"
}
The following fields are available:
| Property | Default | Description |
|---|---|---|
| `cacheMapping.cache` | | The name of the GridGain 8 cache on the master cluster. The cache name is case-sensitive and should not include the schema. |
| `cacheMapping.table` | | The name of the GridGain 9 table on the replica cluster. The table name is case-insensitive by default, so case-sensitive identifiers must be quoted. This name can include the schema as a prefix, in the `schemaName.tableName` format. |
| `cacheMapping.keyFields` | | The array denoting the mapping of cache key fields to table columns. Not compatible with the `fields` option. |
| `keyFields.field` | | The case-sensitive name of the source cache field to get data from. |
| `keyFields.column` | | The case-sensitive name of the target column to write data to. |
| `cacheMapping.valueFields` | | The array denoting the mapping of cache value fields to table columns. Not compatible with the `fields` option. |
| `valueFields.field` | | The case-sensitive name of the source cache field to get data from. |
| `valueFields.column` | | The case-sensitive name of the target column to write data to. |
| `ignore` | false | If set to `true`, the field is not mapped to any column and is skipped during replication. |
| `cacheMapping.fields` | | Alternative configuration. The array denoting the mapping of cache fields to table columns, regardless of whether they are key or value fields. |
| `fields.field` | | Alternative configuration. The case-sensitive name of the source cache field to get data from. |
| `fields.column` | | Alternative configuration. The case-sensitive name of the target column to write data to. |
| `clientConfiguration.serverEndpoints` | | The list of addresses of GridGain 9 nodes with configured client connectors. |
| `clientConfiguration.port` | | The port of the connector. |
| `clientConfiguration.connectTimeout` | | The connection timeout, in milliseconds. |
| `clientConfiguration.metricsEnabled` | | Whether metrics are enabled for the connection. |
| `clientConfiguration.name` | | The unique client name. If not specified, the name is generated automatically based on the client number. |
| `clientConfiguration.heartbeatInterval` | | Heartbeat message interval, in milliseconds. |
| `clientConfiguration.heartbeatTimeout` | | Heartbeat message timeout, in milliseconds. |
| `clientConfiguration.backgroundReconnectInterval` | | The period of time after which the connector tries to reestablish a lost connection, in milliseconds. |
| `clientConfiguration.ssl.enabled` | false | Whether SSL is enabled for the connection. |
| `clientConfiguration.ssl.ciphers` | | The list of ciphers to enable, separated by commas. |
| `clientConfiguration.ssl.keyStore.type` | PKCS12 | Keystore type. |
| `clientConfiguration.ssl.keyStore.password` | | Keystore password. |
| `clientConfiguration.ssl.keyStore.path` | | Path to the keystore. |
| `clientConfiguration.ssl.trustStore.type` | PKCS12 | Truststore type. |
| `clientConfiguration.ssl.trustStore.password` | | Truststore password. |
| `clientConfiguration.ssl.trustStore.path` | | Path to the truststore. |
| `clientConfiguration.authenticator.type` | | The type of authentication to use (for example, `BASIC`). |
| `clientConfiguration.authenticator.identity` | | The username or identity for authentication. |
| `clientConfiguration.authenticator.secret` | | The password or secret for authentication. |
| `drReceiverConfiguration.dataCenterId` | 1 | The ID of the master GridGain 8 cluster's data center. Only connections from this data center are allowed. |
| `drReceiverConfiguration.inboundHost` | | Local host name of the connector. |
| `drReceiverConfiguration.inboundPort` | 10800 | The port used for data replication. |
| `drReceiverConfiguration.drReceiverMetricsEnabled` | true | Whether metrics are enabled. |
| `drReceiverConfiguration.idleTimeout` | 60000 | How long the connector can be idle before the connection is dropped, in milliseconds. |
| `drReceiverConfiguration.ignoreMissingTable` | true | Whether entries for missing tables are ignored. If set to `false`, missing tables fail the replication. |
| `drReceiverConfiguration.selectorCnt` | The lower of 4 and the number of available cores | The number of threads handling connections. |
| `drReceiverConfiguration.socketReceiveBufferSize` | 0 | Socket receive buffer size, in bytes. |
| `drReceiverConfiguration.socketSendBufferSize` | 0 | Socket send buffer size, in bytes. |
| `drReceiverConfiguration.tcpNodelay` | true | Whether the TCP_NODELAY option is enabled. |
| `drReceiverConfiguration.tombstoneTtl` | 1800000 | Tombstone expiration timeout, in milliseconds. |
| `drReceiverConfiguration.workerThreads` | All available threads | The number of worker threads handling batch processing. |
| `drReceiverConfiguration.writeTimeout` | 60000 | Write timeout for the TCP server connection, in milliseconds. |
| `drReceiverConfiguration.ssl.enabled` | false | Whether SSL is enabled for the connection. |
| `drReceiverConfiguration.ssl.clientAuth` | none | SSL client authentication mode. |
| `drReceiverConfiguration.ssl.ciphers` | | The list of ciphers to enable, separated by commas. |
| `drReceiverConfiguration.ssl.keyStore.type` | PKCS12 | Keystore type. |
| `drReceiverConfiguration.ssl.keyStore.password` | | Keystore password. |
| `drReceiverConfiguration.ssl.keyStore.path` | | Path to the keystore. |
| `drReceiverConfiguration.ssl.trustStore.type` | PKCS12 | Truststore type. |
| `drReceiverConfiguration.ssl.trustStore.password` | | Truststore password. |
| `drReceiverConfiguration.ssl.trustStore.path` | | Path to the truststore. |
| `timeZone` | | The time zone of the data migrated from GridGain 8 to GridGain 9. Set as "Area/City" (for example, "Europe/Berlin") or as a fixed offset like GMT+5 to ensure correct conversion of date and time values. |
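The timeZone setting matters because a GridGain 8 java.util.Date carries an instant, while the stored wall-clock TIMESTAMP value depends on the zone used for conversion. A short java.time illustration (not connector code) of how the configured zone changes the resulting value:

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

public class TimeZoneEffect {
    /** Converts an instant to the wall-clock time produced for a given zone. */
    public static LocalDateTime wallClock(Instant instant, String zone) {
        return LocalDateTime.ofInstant(instant, ZoneId.of(zone));
    }

    public static void main(String[] args) {
        Instant instant = Instant.parse("2024-06-01T12:00:00Z");
        // The same instant yields different wall-clock values depending on the zone.
        System.out.println(wallClock(instant, "Europe/Berlin")); // 2024-06-01T14:00 (CEST, UTC+2)
        System.out.println(wallClock(instant, "GMT+5"));         // 2024-06-01T17:00
    }
}
```

An incorrect timeZone therefore shifts every migrated timestamp by the offset difference, which is why it should match the zone the GridGain 8 data was written in.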
Sample Cache Mappings
Mapping a Composite Key Cache
The configuration below maps fields K1, K2, K3 of cache key objects to columns COL_1, COL_2, COL_3 of the target table, maps fields V1, V2, V3 of cache value objects to columns COL_4, COL_5, COL_6, and ignores fields K4 and V4.
dr-service-config = {
cacheMapping = [
{
cache = "cacheName"
table = "schemaName.tableName"
keyFields = [
{ field = "K1", column = "COL_1" }
{ field = "K2", column = "COL_2" }
{ field = "K3", column = "COL_3" }
{ field = "K4", column = "ignored_column", ignore = true }
]
valueFields = [
{ field = "V1", column = "COL_4" }
{ field = "V2", column = "COL_5" }
{ field = "V3", column = "COL_6" }
{ field = "V4", column = "ignored_column", ignore = true }
]
}
]
}
Mapping a Single Key Cache
When mapping a cache with a single key field, you must use the `_key` field name to specify the key field.
The example below shows how you can map a single key:
dr-service-config = {
cacheMapping = [
{
cache = "cacheName1"
table = "cacheName1"
keyFields = [
# This example is valid.
{ field = "_key", column = "ID" }
# The example below would not work.
#{ field = "id", column = "ID" }
]
valueFields = [
{ field = "name", column = "NAME" }
{ field = "age", column = "AGE" }
]
}
]
}
Mapping a Single Value Cache
When mapping a cache with a single value field, you must use the `_val` field name to specify the value field.
The example below shows how you can map a single value:
dr-service-config = {
cacheMapping = [
{
cache = "cacheName1"
table = "cacheName1"
keyFields = [
{ field = "USERID", column = "USERID" }
{ field = "CITYID", column = "CITYID" }
]
valueFields = [
# This example is valid.
{ field = "_val", column = "NAME" }
# This example would not work
# { field = "AGE", column = "AGE" }
]
}
]
}
Mapping a Cache With Keys Duplicated in Values
If both the key and one of the values have the same name, you have to ignore one of the fields for the migration to work correctly.
The example below shows how you can ignore the duplicate:
dr-service-config = {
cacheMapping = [
{
cache = "cacheName2"
table = "cacheName2"
keyFields = [
{ field = "_key", column = "ID" }
]
valueFields = [
{ field = "name", column = "NAME" }
{ field = "age", column = "AGE" }
# Ignore the duplicate.
{ field = "id", column = "", ignore = true }
]
}
]
}
Custom Data Transformations
For complex data migration scenarios that cannot be handled by simple field-to-column mapping, the DR connector supports a transformer Java API that can be used to implement custom transformation logic. To use it, create a transformer class, package it into a JAR, and add it to the DR connector's classpath.
Implementing a Transformer Factory
The transformer factory class must be public and contain a static `EntryTransformer create()` method. This method must return an instance of EntryTransformer that implements your transformation logic.
Below is a basic example that converts all field values to strings:
public class StringifyTransformerFactory {
// The create() method must exist with this specific name and signature.
public static EntryTransformer create() {
return (keyReader, valueReader) -> {
// Read key fields and convert to strings
Tuple key = Tuple.create();
for (String fieldName : keyReader.fieldNames()) {
Object fieldValue = keyReader.readField(fieldName);
key.set(fieldName, String.valueOf(fieldValue));
}
// Handle deletions - valueReader is null for removed entries
if (valueReader == null) {
return Result.remove(key);
}
// Read value fields and convert to strings
Tuple value = Tuple.create();
for (String fieldName : valueReader.fieldNames()) {
Object fieldValue = valueReader.readField(fieldName);
value.set(fieldName, String.valueOf(fieldValue));
}
return Result.upsert(key, value);
};
}
}
Handling Results
The Result class supports three operation types:
| Type | Description |
|---|---|
| `Result.upsert(key, value)` | Insert or update an entry. Both key and value must be non-null. |
| `Result.remove(key)` | Delete an entry. Only the key is required; the value is null. |
| `Result.skip()` | Skip the entry (no operation). Use this to filter out entries. |
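Building on the table above, the skip operation lets a transformer filter entries out of the replication stream. The sketch below is self-contained for illustration: the Op enum and decide method are hypothetical stand-ins for the connector's Result types, and the filter predicate represents an arbitrary business rule.

```java
import java.util.function.Predicate;

public class FilteringExample {
    /** Stand-ins for the connector's Result operation types (illustrative only). */
    enum Op { UPSERT, REMOVE, SKIP }

    /**
     * Decides which Result operation a filtering transformer would return:
     * deletions (null value) pass through as removals, entries failing the
     * filter are skipped, and everything else is upserted.
     */
    public static <V> Op decide(V value, Predicate<V> filter) {
        if (value == null) {
            return Op.REMOVE; // a null value means the entry was deleted in GridGain 8
        }
        return filter.test(value) ? Op.UPSERT : Op.SKIP;
    }
}
```

In a real transformer, the same branching would return Result.remove(key), Result.skip(), or Result.upsert(key, value) respectively.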
Handling Binary Objects
The BinaryObjectReader interface provides methods to access fields from GridGain 8 binary objects:
- `typeName()` - Returns the type name of the binary object
- `fieldNames()` - Returns a list of all field names in the object
- `readField(String name)` - Reads and returns the value of the specified field
Field values are returned as Java objects and can be null either when the field doesn’t exist or when the field value is null.
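Because a null return can mean either a missing field or a stored null, transformer code should read fields defensively. A minimal sketch using a map-backed stand-in for BinaryObjectReader (the real interface is provided by the connector):

```java
import java.util.Map;

public class DefensiveRead {
    /**
     * Reads a field from a map-backed stand-in for BinaryObjectReader,
     * falling back to a default when the field is absent or its value is null.
     */
    public static Object readOrDefault(Map<String, Object> fields, String name, Object fallback) {
        Object value = fields.get(name); // null for both "missing" and "stored null"
        return value != null ? value : fallback;
    }
}
```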
Adding Custom Data Transformer
To use a custom transformer in your replication, make it available to the connector and set the configuration to use the factory class.
- Add a JAR containing the transformer factory class to the DR connector classpath (the `/lib` directory in the DR connector by default). The transformer factory must be implemented as described above.
- Specify the factory class name in your cache mapping configuration using the `transformerFactoryClassName` property:
dr-service-config = {
cacheMapping = [
{
cache = "cacheName"
table = "tableName"
transformerFactoryClassName = "com.example.MyTransformerFactory"
}
]
}
Running Connector Tool
Deployment Recommendations
Only one connector instance is supported per replication stream. Do not run multiple connectors for the same cache mapping.
Running the Tool Locally
To start the connector tool, use the start.sh script. You can provide the configuration for the script by using the --config-path parameter and specifying the path to the Connector Configuration file.
start.sh --config-path etc/example.conf
Running the Tool in Docker
The connector tool is available on DockerHub.
When running the connector tool in docker, keep in mind the following:
- You need to mount the configuration in your container.
- Open the port for connections from your GridGain 8 cluster. This port is specified in the `dr-service-config.drReceiverConfiguration.inboundPort` Connector Configuration parameter.
- If you run GridGain 9 in a different network, open the port for the connection to that network. The GridGain 9 server endpoints are specified in the `dr-service-config.clientConfiguration.serverEndpoints` Connector Configuration parameter.
- If you are using a Custom Data Transformer, mount the directory with the JAR file, and then use the `EXTRA_CLASSPATH` environment variable to load it.
To start the connector tool, start the container, specifying all required parameters, for example:
docker run -p {host_port}:{inbound_port} \
-v /host_config_path:/opt/gridgain-dr-connector/etc/custom \
-v /host_libs_directory:/opt/custom-libs \
-e EXTRA_CLASSPATH=/opt/custom-libs \
gridgain/gridgain-dr-connector:9.1.18 \
--config-path /opt/gridgain-dr-connector/etc/custom/config.conf
The example above assumes that the /host_config_path and /host_libs_directory directories are mounted and contain the configuration and the custom libraries, respectively.
You can also use Docker Compose. Here is an example of the above configuration in Docker Compose format:
services:
dr-connector:
container_name: dr-connector
image: gridgain/gridgain-dr-connector:9.1.18
tty: true
volumes:
- /host_config_path:/opt/gridgain-dr-connector/etc/custom
- /host_libs_directory:/opt/custom-libs
ports:
- "{host_port}:{inbound_port}"
environment:
- EXTRA_CLASSPATH=/opt/custom-libs
command: ["--config-path", "/opt/gridgain-dr-connector/etc/custom/config.conf"]
Then you can start the docker image with docker compose:
docker compose up -d
Starting Replication
Step 1: Start the Connector
Start the connector tool on GridGain 9 side (see Running Connector Tool above).
Step 2: Verify Connector is Running
Check the connector logs to ensure it’s listening on the configured port. Look for the "Listening port" line in the log, signifying that the connector is running and ready to receive data on the specified port.
Step 3: Initiate Replication from GridGain 8
Start data center replication on the GridGain 8 cluster:
# Via control script
./bin/control.sh --dr full-state-transfer start --caches cacheName --sender-group grp1 --data-centers 2 --yes
The drReceiverConfiguration.dataCenterId parameter in the GridGain 9 DR connector must match the dataCenterId parameter of the data center that sends data. The dataCenterId parameter in DrSenderConnectionConfiguration is ignored.
The example below shows where in GridGain 8 configuration you can find data center ID.
<bean class="org.gridgain.grid.configuration.GridGainConfiguration">
<!-- This parameter must match the drReceiverConfiguration.datacenterId parameter. -->
<property name="dataCenterId" value="1"/>
...
</bean>
Once replication is started, you can find the connection information in the node’s logs:
A sender hub has successfully connected to a remote receiver
Step 4: Trigger Full State Transfer
Start full state transfer from your GridGain 8 cluster:
./bin/control.sh --dr cache cacheName --action full-state-transfer --yes
This will transfer all information from the caches to GridGain 9.
Stopping Connector Tool
Once data replication is no longer needed (for example, the load was shifted to GridGain 9 cluster), stop the replication process on the GridGain 8 master cluster and then use the stop.sh script to stop the connector.
stop.sh