Getting Started with GridGain 9 Persistent Storage
Introduction
About this Guide
This guide will walk you through the basics of setting up and using GridGain 9’s RocksDB-based persistent storage with the Chinook database in a Docker-based environment.
Prerequisites
-
Up-to-date versions of Docker and Docker Compose;
-
Terminal or command line access;
-
Basic SQL knowledge;
-
At least 8GB of free RAM for the GridGain cluster.
Understanding Persistence in GridGain
Persistence Architecture
GridGain persistence is designed to provide quick and responsive persistent storage. When using persistent storage:
-
GridGain stores all data in it on disk;
-
It loads as much data as possible into RAM for processing;
-
Data is split into multiple partitions, with each partition stored in a separate file on disk;
-
In addition to data partitions, GridGain stores indexes and metadata on disk.
This architecture combines the performance benefits of in-memory computing with the durability of disk-based storage.
Storage Engine Types
Persistent Storage Options
-
AIPersist Engine - Default persistent storage engine with checkpointing;
-
RocksDB Engine - LSM-tree based persistent storage optimized for write-heavy workloads.
Volatile Storage Options
-
AIMem Engine - In-memory storage with no persistence.
Storage Profiles
In GridGain 9, persistence is configured by using storage profiles. A storage profile defines how data is stored, cached, and managed by the storage engine.
Each storage profile has specific properties depending on the engine type, but all profiles must specify the following properties:
-
name - A unique identifier for the profile
-
engine - The storage engine to use
Distribution Zones
Distribution zones control how data is distributed across the cluster and which storage profiles to use. They allow you to:
-
Control the number of data replicas;
-
Specify which nodes can store data;
-
Define how data is partitioned;
-
Assign storage profiles to determine persistence type.
Setting Up a Persistent Cluster
Docker Environment Configuration
We will use Docker Compose to create a multi-node GridGain cluster with persistent storage.
Creating the Docker Compose File
Create a docker-compose.yml
file in your working directory:
name: gridgain9
x-gridgain-def: &gridgain-def
image: gridgain/gridgain:latest
environment:
JVM_MAX_MEM: "4g"
JVM_MIN_MEM: "4g"
configs:
- source: node_config
target: /opt/gridgain/etc/ignite-config.conf
services:
node1:
<<: *gridgain-def
command: --node-name node1
ports:
- "10300:10300"
- "10800:10800"
volumes:
- ./data/node1:/opt/gridgain/work
node2:
<<: *gridgain-def
command: --node-name node2
ports:
- "10301:10300"
- "10801:10800"
volumes:
- ./data/node2:/opt/gridgain/work
node3:
<<: *gridgain-def
command: --node-name node3
ports:
- "10302:10300"
- "10802:10800"
volumes:
- ./data/node3:/opt/gridgain/work
configs:
node_config:
content: |
ignite {
network {
port: 3344
nodeFinder.netClusterNodes = ["node1:3344", "node2:3344", "node3:3344"]
}
"storage": {
"profiles": [
{
name: "rocksDbProfile"
engine: "rocksdb"
}
]
}
}
The node_config
configuration in the Docker Compose file:
-
Adds a storage profile named
rocksDbProfile
that uses the RocksDB engine; -
Sets the storage size to 256MB (268435456 bytes) by default;
-
Stores persistent data in the
data
directory where docker was run.
Starting the Cluster
Run the following command to start your cluster:
docker-compose up -d
Verifying Cluster Deployment
Check that all nodes are running:
docker compose ps
You should see output similar to:
NAME IMAGE COMMAND SERVICE CREATED STATUS PORTS gridgain9-node1-1 gridgain/gridgain9:latest "docker-entrypoint.s…" node1 44 seconds ago Up 6 seconds 0.0.0.0:10300->10300/tcp, 3344/tcp, 0.0.0.0:10800->10800/tcp gridgain9-node2-1 gridgain/gridgain9:latest "docker-entrypoint.s…" node2 44 seconds ago Up 6 seconds 3344/tcp, 0.0.0.0:10301->10300/tcp, 0.0.0.0:10801->10800/tcp gridgain9-node3-1 gridgain/gridgain9:latest "docker-entrypoint.s…" node3 44 seconds ago Up 6 seconds 3344/tcp, 0.0.0.0:10302->10300/tcp, 0.0.0.0:10802->10800/tcp
Verify the Docker network:
docker network ls
Configuring Persistent Storage
Connecting to the Cluster
Connect to the GridGain CLI:
docker run --rm -it --network=host -v /opt/etc/license.json:/opt/gridgain/etc/license.json -e LANG=C.UTF-8 -e LC_ALL=C.UTF-8 gridgain/gridgain9:latest cli
When the CLI tool offers to connect to default node, confirm the connection. If you ever get disconnected, you can connect again by typing the following command:
connect http://localhost:10300
Initializing the Cluster
Before using the cluster, initialize it:
cluster init --name=GridGain --license=/opt/gridgain/etc/license.json
You should see the message "Cluster was initialized successfully".
Examining Storage Profiles
Verify the configured storage profiles:
node config show ignite.storage
You should see output showing the rocksDbProfile
configuration along with the default profiles.
Creating Distribution Zones for Persistence
Enter the interactive SQL CLI:
sql
Create a distribution zone that uses our RocksDB storage profile:
CREATE ZONE ChinookRocksDB WITH replicas=2, storage_profiles='rocksDbProfile';
Building the Chinook Database with Persistence
About the Chinook Database
The Chinook database represents a digital media store with tables for artists, albums, tracks, and more. It’s commonly used as a sample database for demonstrating database features.
Creating Persistent Database Tables
Create the necessary tables for the Chinook database using our RocksDB persistent zone:
-- Create Artist table
CREATE TABLE Artist (
ArtistId INT NOT NULL,
Name VARCHAR(120),
PRIMARY KEY (ArtistId)
) ZONE ChinookRocksDB;
-- Create Album table
CREATE TABLE Album (
AlbumId INT NOT NULL,
Title VARCHAR(160) NOT NULL,
ArtistId INT NOT NULL,
PRIMARY KEY (AlbumId, ArtistId)
) COLOCATE BY (ArtistId) ZONE ChinookRocksDB;
-- Create Genre table
CREATE TABLE Genre (
GenreId INT NOT NULL,
Name VARCHAR(120),
PRIMARY KEY (GenreId)
) ZONE ChinookRocksDB;
-- Create MediaType table
CREATE TABLE MediaType (
MediaTypeId INT NOT NULL,
Name VARCHAR(120),
PRIMARY KEY (MediaTypeId)
) ZONE ChinookRocksDB;
-- Create Track table
CREATE TABLE Track (
TrackId INT NOT NULL,
Name VARCHAR(200) NOT NULL,
AlbumId INT,
MediaTypeId INT NOT NULL,
GenreId INT,
Composer VARCHAR(220),
Milliseconds INT NOT NULL,
Bytes INT,
UnitPrice NUMERIC(10,2) NOT NULL,
PRIMARY KEY (TrackId, AlbumId)
) COLOCATE BY (AlbumId) ZONE ChinookRocksDB;
Loading Sample Data
Insert sample data into the tables:
-- Insert data into MediaType table
INSERT INTO MediaType (MediaTypeId, Name) VALUES
(1, 'MPEG audio file'),
(2, 'Protected AAC audio file');
-- Insert data into Artist table
INSERT INTO Artist (ArtistId, Name) VALUES
(1, 'AC/DC'),
(2, 'Accept'),
(3, 'Aerosmith'),
(4, 'Alanis Morissette'),
(5, 'Alice In Chains');
-- Insert data into Album table
INSERT INTO Album (AlbumId, Title, ArtistId) VALUES
(1, 'For Those About To Rock We Salute You', 1),
(2, 'Balls to the Wall', 2),
(3, 'Restless and Wild', 2),
(4, 'Let There Be Rock', 1),
(5, 'Big Ones', 3);
-- Insert data into Genre table
INSERT INTO Genre (GenreId, Name) VALUES
(1, 'Rock'),
(2, 'Jazz'),
(3, 'Metal'),
(4, 'Alternative & Punk'),
(5, 'Rock And Roll');
-- Insert data into Track table
INSERT INTO Track (TrackId, Name, AlbumId, MediaTypeId, GenreId, Composer, Milliseconds, Bytes, UnitPrice) VALUES
(1, 'For Those About To Rock (We Salute You)', 1, 1, 1, 'Angus Young, Malcolm Young, Brian Johnson', 343719, 11170334, 0.99),
(2, 'Balls to the Wall', 2, 2, 1, 'U. Dirkschneider, W. Hoffmann, H. Frank, P. Baltes, S. Kaufmann, G. Hoffmann', 342562, 5510424, 0.99),
(3, 'Fast As a Shark', 3, 2, 1, 'F. Baltes, S. Kaufman, U. Dirkscneider & W. Hoffman', 230619, 3990994, 0.99),
(4, 'Restless and Wild', 3, 2, 1, 'F. Baltes, R.A. Smith-Diesel, S. Kaufman, U. Dirkscneider & W. Hoffman', 252051, 4331779, 0.99),
(5, 'Princess of the Dawn', 3, 2, 1, 'Deaffy & R.A. Smith-Diesel', 375418, 6290521, 0.99);
Querying the Database
Test that your data was inserted correctly:
SELECT a.Name AS Artist, al.Title AS Album, t.Name AS Track
FROM Track t
JOIN Album al ON t.AlbumId = al.AlbumId
JOIN Artist a ON al.ArtistId = a.ArtistId
WHERE t.AlbumId = 1;
Testing Persistence Capabilities
Verifying Data Before Restart
Perform additional queries to ensure your data is properly stored:
-- Count tracks by genre
SELECT g.Name AS Genre, COUNT(t.TrackId) AS TrackCount
FROM Track t
JOIN Genre g ON t.GenreId = g.GenreId
GROUP BY g.Name;
-- Check all albums by artist
SELECT a.Name AS Artist, COUNT(al.AlbumId) AS AlbumCount
FROM Album al
JOIN Artist a ON al.ArtistId = a.ArtistId
GROUP BY a.Name;
Restarting the Cluster
To restart the cluster, you need to first exit the CLI tool.
-
Exit the SQL CLI with the
exit;
command, -
Then exit the main CLI with the
exit
command.
Restart the Docker containers:
docker-compose down
docker-compose up -d
Verifying Data Persistence After Restart
Reconnect to the CLI:
docker run -it --rm --net host gridgain/gridgain9 cli
The cluster is already initialized, so you can go directly to the SQL CLI:
sql
Run the same query to verify the data persisted through the restart:
SELECT a.Name AS Artist, al.Title AS Album, t.Name AS Track
FROM Track t
JOIN Album al ON t.AlbumId = al.AlbumId
JOIN Artist a ON al.ArtistId = a.ArtistId
WHERE t.AlbumId = 1;
Wrap Up
Summary
GridGain 9 with RocksDB persistent storage provides a powerful way to maintain data durability while leveraging in-memory computing performance. RocksDB is particularly well-suited for write-intensive workloads, making it an excellent choice for many production environments.
Additional Resources
© 2025 GridGain Systems, Inc. All Rights Reserved. Privacy Policy | Legal Notices. GridGain® is a registered trademark of GridGain Systems, Inc.
Apache, Apache Ignite, the Apache feather and the Apache Ignite logo are either registered trademarks or trademarks of The Apache Software Foundation.