GridGain 8.8: a Technical Deep-Dive into Database-Engine Advancements

It’s been a while since we published a major release of the GridGain In-Memory Computing Platform. There is a reason for that: we’ve been advancing our multi-tier database engine, powered by Apache Ignite. With GridGain 8.8, we are rolling out the first set of advancements (yep, more to come) that enable you to leverage the disk tier of the database to query larger datasets, reduce the total cost of ownership, and secure sensitive and personal data at rest.

Let’s do a technical deep-dive into the advancements. Also, you can join the “New Advances in GridGain's Multi-Tier Database Engine” webinar to learn about the details and to watch demos that illustrate the key improvements that GridGain 8.8 offers.

Advanced Disk Defragmentation

When you add data into GridGain, the database engine allocates data pages. The pages are always stored in memory and, if you enable Ignite persistence, on disk. When records are deleted from a data page, the page is linked into a free list so that the page can be used for new records. Because the data page can reside in any on-disk-file location, over time, after numerous inserts and deletes, on-disk files become fragmented.
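To make the free-list behavior concrete, here is a minimal sketch of a paged file. It is a toy model, not GridGain's actual page layout: the class name PagedFile, the four-slot page size, and the fill_ratio helper are all assumptions made for illustration. It shows that deletes return pages to the free list for reuse but never shrink the file.

```python
class PagedFile:
    """Toy model of an on-disk file made of fixed-size pages (hypothetical, for illustration)."""

    def __init__(self, slots_per_page=4):
        self.slots_per_page = slots_per_page
        self.pages = []       # each page holds up to slots_per_page record keys
        self.free_list = []   # indexes of pages that still have room

    def insert(self, key):
        # Reuse a page from the free list before growing the file.
        if self.free_list:
            idx = self.free_list[-1]
        else:
            self.pages.append([])
            idx = len(self.pages) - 1
            self.free_list.append(idx)
        self.pages[idx].append(key)
        if len(self.pages[idx]) == self.slots_per_page:
            self.free_list.remove(idx)   # page is full again
        return idx

    def delete(self, key):
        for idx, page in enumerate(self.pages):
            if key in page:
                page.remove(key)
                if idx not in self.free_list:
                    self.free_list.append(idx)   # page can take new records
                return
        raise KeyError(key)

    def fill_ratio(self):
        used = sum(len(page) for page in self.pages)
        return used / (len(self.pages) * self.slots_per_page)


f = PagedFile()
for k in range(16):
    f.insert(k)            # fills four pages completely

for k in range(16):
    if k % 4 != 0:
        f.delete(k)        # leaves one record on each page

# The file still has four pages, but only a quarter of the slots hold data.
print(len(f.pages), f.fill_ratio())
```

In this model the file stays at four pages while only four of the sixteen slots are in use. New inserts fill the gaps, but the space is never handed back to the file system, which is the situation that defragmentation addresses.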

Releases of GridGain and Ignite before GridGain 8.8 do not provide automatic defragmentation, the process that compacts used space and makes more space available for on-disk records. As a workaround, developers tend to defragment in a rolling-upgrade fashion: they stop each cluster node (one at a time), remove significantly fragmented files, and let the cluster rebalance the data back to the node after it restarts and rejoins the cluster.

With GridGain 8.8, the database engine handles the defragmentation mechanics. You can use the operations that we added to our tooling to shrink data files and reclaim disk space without interrupting operations. For additional details, review the defragmentation section in the GridGain documentation.

Data Compression

Disk defragmentation enables you to reclaim unused disk space. Data compression enables you to reduce the amount of memory and disk space that existing records consume. Compression can significantly reduce infrastructure costs. It is especially effective when the data contains a high number of duplicates. If the disk tier is enabled, data compression can even improve performance for I/O-intensive workloads, because applications can keep more data in memory.

Compression works at the record level, using a dictionary-based algorithm from the Zstandard library. A pre-trained dictionary can allow compression ratios of up to 60% in real-world scenarios.
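Why does a dictionary matter for record-level compression? Individual records are small, so a compressor sees too little repetition inside any one of them. The sketch below illustrates the idea with a swapped-in technique: it uses the preset-dictionary support in Python's standard zlib module (DEFLATE), not Zstandard, and the sample records and the choice of dictionary are made up for the example. The numbers it produces say nothing about GridGain's actual ratios.

```python
import json
import zlib

# Hypothetical sample records: small rows that share field names and common values.
records = [
    json.dumps({"customer_id": i, "country": "US", "status": "ACTIVE", "tier": "GOLD"}).encode()
    for i in range(1000)
]

# Assumption: one representative record stands in for a "pre-trained" dictionary.
# Zstandard trains its dictionaries from many samples; zlib simply accepts raw bytes.
dictionary = records[0]


def compress(record, zdict=None):
    """Compress a single record, optionally priming the compressor with a dictionary."""
    c = zlib.compressobj(level=6, zdict=zdict) if zdict else zlib.compressobj(level=6)
    return c.compress(record) + c.flush()


def decompress(blob, zdict=None):
    """Reverse of compress(); the same dictionary must be supplied."""
    d = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return d.decompress(blob) + d.flush()


raw_bytes = sum(len(r) for r in records)
plain_bytes = sum(len(compress(r)) for r in records)
dict_bytes = sum(len(compress(r, dictionary)) for r in records)

print("raw:", raw_bytes, "no dictionary:", plain_bytes, "with dictionary:", dict_bytes)
```

Each record is compressed on its own, as in record-level compression. Without a dictionary the compressor has almost nothing to work with; with one, the shared field names and values are replaced by short back-references.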

You can watch the data compression demo video to learn more about how to set up the feature.

Transparent Data Encryption

Since its introduction in earlier versions of Ignite, transparent data encryption has been significantly improved. It made its way to GridGain 8.8 after being thoroughly tested and certified for production.

New key-maintenance procedures have been added to GridGain 8.8, so GridGain can now manage and rotate the master key and cache-specific encryption keys. The master key is used to encrypt cache encryption keys in order to secure the cache keys while they are stored at rest or being copied to another cluster. With the latest improvements, you can change an encryption key on the fly. When a key is changed, data is automatically re-encrypted with no downtime.
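The two-level key hierarchy described above can be sketched in a few lines. This is a conceptual model only, with loud caveats: the KeyStore class and its method names are hypothetical, and toy_cipher is a deliberately simple XOR keystream built from SHA-256 so that the example runs on the Python standard library. It is not secure and it is not what GridGain uses; a real system would rely on a vetted cipher such as AES.

```python
import hashlib
import os


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream. TOY ONLY: a stand-in for a real cipher.

    XOR is its own inverse, so the same call both encrypts and decrypts.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class KeyStore:
    """Hypothetical key store: cache keys are kept only in encrypted (wrapped) form."""

    def __init__(self):
        self.master_key = os.urandom(32)
        self.wrapped_cache_keys = {}   # cache name -> cache key encrypted with the master key

    def create_cache_key(self, cache):
        self.wrapped_cache_keys[cache] = toy_cipher(self.master_key, os.urandom(32))

    def cache_key(self, cache):
        return toy_cipher(self.master_key, self.wrapped_cache_keys[cache])

    def rotate_master_key(self):
        # Only the small wrapped keys are re-encrypted; the data pages are untouched.
        new_master = os.urandom(32)
        self.wrapped_cache_keys = {
            name: toy_cipher(new_master, toy_cipher(self.master_key, wrapped))
            for name, wrapped in self.wrapped_cache_keys.items()
        }
        self.master_key = new_master

    def rotate_cache_key(self, cache, pages):
        # Changing a cache key means every page of that cache must be re-encrypted.
        old_key = self.cache_key(cache)
        new_key = os.urandom(32)
        self.wrapped_cache_keys[cache] = toy_cipher(self.master_key, new_key)
        return [toy_cipher(new_key, toy_cipher(old_key, page)) for page in pages]


store = KeyStore()
store.create_cache_key("accounts")
pages = [toy_cipher(store.cache_key("accounts"), b"alice:100")]

store.rotate_master_key()                            # pages stay as they are
pages = store.rotate_cache_key("accounts", pages)    # pages are re-encrypted
plaintext = toy_cipher(store.cache_key("accounts"), pages[0])
```

The design point is that the two rotations have very different costs in this model. Rotating the master key rewrites a handful of wrapped keys, while rotating a cache key rewrites the cache's data, which is why doing the latter in the background without downtime is the harder problem.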

To learn more about transparent data encryption, view our video demo “Transparent Data Encryption for GridGain and Ignite.”

Disk Tier Usage for Query Result Sets via SQL Quotas

SQL memory quotas avoid out-of-memory issues and use the disk tier when SQL queries that require a lot of memory space are running.

With the quotas, you can allocate a specified amount of memory on a per-node and per-query level. If a SQL result set doesn’t fit in the allocated space, the query is either terminated or continues to execute by offloading the result set to disk. The offload option enables you to run greedy or complex analytical queries that can pull and process gigabytes of data.
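The terminate-or-offload choice can be modeled with a small result-set buffer. The sketch below is an illustration of the general technique, not GridGain's implementation or configuration API: the ResultSet class, the QuotaExceeded exception, and the use of pickle sizes as a memory estimate are all assumptions.

```python
import pickle
import tempfile


class QuotaExceeded(Exception):
    """Raised when a result set outgrows its quota and offloading is disabled."""


class ResultSet:
    """Hypothetical per-query buffer with a memory quota and optional spill to disk."""

    def __init__(self, quota_bytes, offload):
        self.quota_bytes = quota_bytes
        self.offload = offload
        self.rows = []              # rows currently held in memory
        self.in_memory_bytes = 0
        self.spill_file = None
        self.spilled_rows = 0

    def add(self, row):
        size = len(pickle.dumps(row))   # rough stand-in for the row's memory footprint
        if self.in_memory_bytes + size > self.quota_bytes:
            if not self.offload:
                raise QuotaExceeded(f"result set exceeds {self.quota_bytes} bytes")
            self._spill()
        self.rows.append(row)
        self.in_memory_bytes += size

    def _spill(self):
        # Move everything buffered so far to a temporary file and free the memory.
        if self.spill_file is None:
            self.spill_file = tempfile.TemporaryFile()
        self.spill_file.seek(0, 2)      # always append at the end of the file
        for row in self.rows:
            pickle.dump(row, self.spill_file)
        self.spilled_rows += len(self.rows)
        self.rows = []
        self.in_memory_bytes = 0

    def __iter__(self):
        # Spilled rows were produced first, so read them back before the in-memory rows.
        if self.spill_file is not None:
            self.spill_file.seek(0)
            for _ in range(self.spilled_rows):
                yield pickle.load(self.spill_file)
        yield from self.rows


rows = [(i, "x" * 10) for i in range(100)]

offloading = ResultSet(quota_bytes=200, offload=True)
for row in rows:
    offloading.add(row)         # completes, spilling to disk along the way

strict = ResultSet(quota_bytes=200, offload=False)
try:
    for row in rows:
        strict.add(row)
    terminated = False
except QuotaExceeded:
    terminated = True           # the "query" is stopped instead of exhausting memory
```

With the same quota, the offloading buffer returns all 100 rows in order while keeping only a few in memory at a time, and the strict buffer stops as soon as the quota is reached.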

Optional Memory Warm-Ups on Restarts

The beauty of the GridGain database engine is that it doesn’t require you to warm memory up from disk on restarts. Basically, as soon as a cluster is formed, an application can query and compute on it.

Many of the application developers who use Ignite and GridGain championed the memory-warm-up-on-restart feature for ultra-low-latency applications. For this type of application, developers are ready to sacrifice cluster availability time for their business-driven SLAs and wait while the data is loaded into memory on restarts.

Incremental State Transfer Between Data Centers

The incremental state transfer feature extends data-center replication. In earlier versions of GridGain, you use the full-state transfer feature if you need to copy data sets between two or more clusters. With this approach, all the data is transferred on an entry-by-entry basis over the network.

GridGain 8.8 offers an incremental-state transfer option. You can transfer a delta between data centers. Also, you can restore remote data centers from a snapshot, and then you can use the incremental-state transfer feature to transfer the deltas of the updates that the snapshot did not have.
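The snapshot-plus-delta flow can be sketched as follows. The versioning scheme here, a single counter that stamps every update, is an assumption chosen to keep the example short, and the Cluster class is hypothetical. The sketch does not describe how GridGain tracks changes internally, and it leaves out deletes, which would need tombstones.

```python
class Cluster:
    """Hypothetical cluster model: every update is stamped with an increasing version."""

    def __init__(self):
        self.data = {}      # key -> (value, version)
        self.version = 0

    def put(self, key, value):
        self.version += 1
        self.data[key] = (value, self.version)

    def snapshot(self):
        # A snapshot captures the data together with the version it was taken at.
        return dict(self.data), self.version

    def restore(self, snap):
        data, version = snap
        self.data = dict(data)
        self.version = version

    def delta_since(self, version):
        # Only entries updated after the given version need to cross the network.
        return {k: v for k, v in self.data.items() if v[1] > version}

    def apply(self, delta):
        for key, (value, ver) in delta.items():
            if key not in self.data or self.data[key][1] < ver:
                self.data[key] = (value, ver)
        self.version = max([self.version] + [ver for _, ver in delta.values()])


primary = Cluster()
for i in range(1000):
    primary.put(f"key-{i}", i)

snap = primary.snapshot()

for i in range(10):
    primary.put(f"key-{i}", -i)     # updates that the snapshot does not contain

replica = Cluster()
replica.restore(snap)                           # bulk restore, no entry-by-entry transfer
delta = primary.delta_since(replica.version)    # just the 10 changed entries
replica.apply(delta)
```

In this model the replica catches up by receiving 10 entries instead of 1,000, which is the saving that an incremental transfer offers over a full-state transfer.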

What’s Next

All right, that’s a sneak peek into the most notable changes. Explore our resources to learn more about the advancements that the recently released GridGain database engine offers: