GridGain Software Documentation

Ignite Persistence

Overview

Native Persistence is a set of features designed to provide persistent storage in GridGain. When it is enabled, GridGain always stores all the data on disk, and loads as much data as it can into RAM for processing. For example, if there are 100 entries and RAM has the capacity to store only 20, then all 100 will be stored on disk and only 20 will be cached in RAM for better performance.

When Ignite persistence is turned off and no external storage is used, GridGain behaves as a pure in-memory store.

When persistence is enabled, every individual cluster node persists a subset of the data that only includes the partitions that are assigned to that node (including backup partitions if backups are enabled).

GridGain Native Persistence is based on the following features:

  • Storing data partitions on disk

  • Write-ahead logging

  • Checkpointing

  • Usage of OS swap

When persistence is enabled (for a region), GridGain will store each partition in a separate file on disk. The data format of the partition files is the same as that of the data when it is kept in memory. If backup partitions are enabled, they are also saved on disk.

In addition to data partitions, GridGain stores indexes and metadata.

[Figure: persistent store structure]

You can change the default location of data files in the configuration.
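As a rough illustration of changing the data file location from Java, the sketch below sets the storagePath property on DataStorageConfiguration. The directory shown is a placeholder chosen for this example, not a recommended value.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class StoragePathExample {
    public static void main(String[] args) {
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        // Placeholder directory (an assumption for this example).
        storageCfg.setStoragePath("/opt/storage/data");

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDataStorageConfiguration(storageCfg);

        Ignite ignite = Ignition.start(cfg);
    }
}
```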

Baseline Topology and Cluster Activation

The baseline topology is a set of nodes meant to persist data on disk. The concept of baseline topology was introduced to give you the ability to control when you want to rebalance the data in the cluster.

In an in-memory only cluster, whenever a node joins or leaves the cluster (including occasional network failures), data partitions are redistributed among the new set of nodes. When persistence is enabled, your data set may be significantly larger and it will be stored on disk. In this scenario, data rebalancing may take a long time. To avoid unnecessary data transfer, you can decide when you want to start rebalancing by changing the baseline topology manually.

Because persistence is configured per data region, there is a difference between in-memory data regions and regions with persistence with respect to data rebalancing.

  • In-memory data region: When a node joins or leaves the cluster, partition map exchange (PME) is triggered and followed by data rebalancing.

  • Data region with persistence: When a node joins or leaves the cluster, PME is performed. Data rebalancing is triggered only when the baseline topology is changed.
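A manual baseline change of the kind described above might be sketched in Java as follows. The example assumes a node with persistence enabled on the default data region. It activates the cluster, then makes the current set of server nodes the new baseline, which is the step that lets rebalancing start.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class BaselineTopologyExample {
    public static void main(String[] args) {
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDataStorageConfiguration(storageCfg);

        Ignite ignite = Ignition.start(cfg);

        // A cluster with persistence starts inactive, so activate it first.
        ignite.cluster().active(true);

        // Later, once nodes have joined or left: adopt the current topology
        // as the new baseline so that data is rebalanced across it.
        ignite.cluster().setBaselineTopology(ignite.cluster().topologyVersion());
    }
}
```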

Write-Ahead Log

The write-ahead log (WAL) is a log of all data-modifying operations (including deletes) that happen on a node. When a page is updated in RAM, the update is not directly written to the partition file but is appended to the tail of the WAL.

The purpose of the write-ahead log is to provide a recovery mechanism for scenarios where a single node or the whole cluster goes down. In case of a crash or restart, the cluster can always be recovered to the latest successfully committed transaction by relying on the content of the WAL.

The WAL consists of several files (called active segments) and an archive. The active segments are filled sequentially and are overwritten in a cyclical order. Once the first segment is full, its content is copied to the WAL archive (see the WAL Archive section below). While the first segment is being copied, the second segment is treated as the active WAL file and accepts all the updates coming from the application side. By default, there are 10 active segments.
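The cyclical use of active segments can be pictured with a small toy model. This is only a sketch of the idea, not GridGain's implementation: the class name, the in-memory lists, and the "records per segment" capacity are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of cyclic WAL segments. Not GridGain's implementation. */
class WalRingSketch {
    private final int segmentCount;
    private final int recordsPerSegment;
    private final List<List<String>> segments = new ArrayList<>();
    private final List<List<String>> archive = new ArrayList<>();
    private int active = 0;

    WalRingSketch(int segmentCount, int recordsPerSegment) {
        this.segmentCount = segmentCount;
        this.recordsPerSegment = recordsPerSegment;
        for (int i = 0; i < segmentCount; i++)
            segments.add(new ArrayList<>());
    }

    /** Appends a record to the active segment and rotates when it is full. */
    void append(String record) {
        List<String> seg = segments.get(active);
        seg.add(record);

        if (seg.size() == recordsPerSegment) {
            // Copy the full segment to the archive.
            archive.add(new ArrayList<>(seg));
            // Switch to the next segment in cyclical order.
            active = (active + 1) % segmentCount;
            // The next segment was archived earlier, so it can be overwritten.
            segments.get(active).clear();
        }
    }

    int activeSegment() { return active; }

    int archivedSegments() { return archive.size(); }
}
```

With 10 segments holding 2 records each, appending 25 records fills 12 segments (all of them archived) and leaves the writer on segment index 2, having wrapped around once.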

WAL Modes

The WAL can operate in one of three modes, or it can be disabled entirely with NONE. The modes differ in how they affect performance and in the consistency guarantees they provide.

FSYNC

  • Description: The changes are guaranteed to be persisted to disk for every atomic write or transactional commit.

  • Consistency guarantees: Data updates are never lost; they survive any OS or process crash and any power failure.

LOG_ONLY (the default mode)

  • Description: The changes are guaranteed to be flushed to either the OS buffer cache or a memory-mapped file for every atomic write or transactional commit. The memory-mapped file approach is used by default and can be switched off by setting the IGNITE_WAL_MMAP system property to false.

  • Consistency guarantees: Data updates survive a process crash.

BACKGROUND

  • Description: When the IGNITE_WAL_MMAP property is enabled (default), this mode behaves like the LOG_ONLY mode. If the memory-mapped file approach is disabled, the changes stay in the node's internal buffer and are periodically flushed to disk. The flush frequency is specified via the walFlushFrequency parameter.

  • Consistency guarantees: When the IGNITE_WAL_MMAP property is enabled (default), the mode provides the same guarantees as the LOG_ONLY mode. Otherwise, recent data updates may be lost in case of a process crash or other outage.

NONE

  • Description: WAL is disabled. The changes are persisted only if you shut down the node gracefully. Use Ignite.active(false) to deactivate the cluster and shut down the node.

  • Consistency guarantees: Data loss might occur. If a node is terminated abruptly during update operations, it is very likely that the data stored on disk will be out of sync or corrupted.
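Selecting a WAL mode from Java might look like the sketch below. It assumes persistence is enabled on the default data region and picks FSYNC purely as an example; any other WALMode constant can be passed the same way.

```java
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.configuration.WALMode;

public class WalModeExample {
    public static IgniteConfiguration configure() {
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        // Strongest guarantees: changes are persisted to disk on every commit.
        storageCfg.setWalMode(WALMode.FSYNC);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDataStorageConfiguration(storageCfg);

        return cfg;
    }
}
```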

WAL Archive

The WAL archive is used to store WAL segments that may be needed to recover the node after a crash. The number of segments kept in the archive is such that the total size of all segments does not exceed the specified size of the WAL archive.

By default, the maximum size of the WAL archive is defined as 4 times the size of the checkpointing buffer. You can change that value in the configuration.
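To get a feel for how the size limit translates into a number of retained segments, the helper below does the arithmetic. It is a simplified sketch that assumes every archived segment is uncompressed and has the same fixed size; the class and method names are made up for this example.

```java
/** Back-of-the-envelope sizing helper. Not part of the GridGain API. */
class WalArchiveSizing {
    /** Number of whole segments that fit in the archive without exceeding its size limit. */
    static long maxArchivedSegments(long maxArchiveSizeBytes, long segmentSizeBytes) {
        if (segmentSizeBytes <= 0)
            throw new IllegalArgumentException("Segment size must be positive");

        return maxArchiveSizeBytes / segmentSizeBytes;
    }
}
```

For instance, with a 1 GB archive limit (an arbitrary figure) and 64 MB segments, the archive would hold up to 16 segments under these assumptions.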

Changing WAL Segment Size

The default WAL segment size (64 MB) may be inefficient in high load scenarios because it causes WAL to switch between segments too frequently and switching/rotation is a costly operation. A larger size of WAL segments can help increase performance under high loads at the cost of increasing the total size of the WAL files and WAL archive.

You can change the size of the WAL segment files in the data storage configuration. The value must be between 512 KB and 2 GB.

XML:

<bean class="org.apache.ignite.configuration.IgniteConfiguration" id="ignite.cfg">

    <property name="dataStorageConfiguration">
        <bean class="org.apache.ignite.configuration.DataStorageConfiguration">

            <!-- Set the size of WAL segments to 128 MB. -->
            <property name="walSegmentSize" value="#{128 * 1024 * 1024}"/>

            <property name="defaultDataRegionConfiguration">
                <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                    <property name="persistenceEnabled" value="true"/>
                </bean>
            </property>

        </bean>
    </property>
</bean>

Java:

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

IgniteConfiguration cfg = new IgniteConfiguration();
DataStorageConfiguration storageCfg = new DataStorageConfiguration();
storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

// Set the size of WAL segments to 128 MB.
storageCfg.setWalSegmentSize(128 * 1024 * 1024);

cfg.setDataStorageConfiguration(storageCfg);

Ignite ignite = Ignition.start(cfg);

Disabling WAL

There are situations when it is reasonable to have the WAL disabled to get better performance. For instance, it is useful to disable WAL during initial data loading and enable it after the pre-loading is complete.

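Java:

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

IgniteConfiguration cfg = new IgniteConfiguration();
DataStorageConfiguration storageCfg = new DataStorageConfiguration();
storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

cfg.setDataStorageConfiguration(storageCfg);

Ignite ignite = Ignition.start(cfg);

// Activate the cluster before working with caches.
ignite.cluster().active(true);

String cacheName = "myCache";

ignite.getOrCreateCache(cacheName);

// Turn the WAL off for this cache before the bulk load.
ignite.cluster().disableWal(cacheName);

// load data

// Turn the WAL back on once the load is complete.
ignite.cluster().enableWal(cacheName);

C#/.NET:

using Apache.Ignite.Core;

var cacheName = "myCache";
var ignite = Ignition.Start();

// Turn the WAL off for this cache before the bulk load.
ignite.GetCluster().DisableWal(cacheName);

// load data

// Turn the WAL back on once the load is complete.
ignite.GetCluster().EnableWal(cacheName);

SQL:

ALTER TABLE Person NOLOGGING;

-- load data

ALTER TABLE Person LOGGING;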

WAL Archive Compaction

You can enable WAL archive compaction to reduce the space occupied by the WAL archive. By default, the WAL archive contains segments for the last 20 checkpoints (this number is configurable). If compaction is enabled, all archived segments that are one checkpoint old are compressed in ZIP format. If the segments are needed (for example, to rebalance data between nodes), they are uncompressed to RAW format.

See the Configuration section below to learn how to enable WAL archive compaction.
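For reference, enabling compaction from Java might look like the following sketch. It uses the walCompactionEnabled and walCompactionLevel properties listed in the Configuration section; the level shown is simply the documented default.

```java
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class WalCompactionExample {
    public static IgniteConfiguration configure() {
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        // Compress archived WAL segments.
        storageCfg.setWalCompactionEnabled(true);

        // 1 is the fastest level, 9 gives the best compression.
        storageCfg.setWalCompactionLevel(1);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDataStorageConfiguration(storageCfg);

        return cfg;
    }
}
```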

Disabling WAL Archive

In some cases, you may want to disable WAL archiving, for example, to reduce the overhead associated with copying of WAL segments to the archive. There can be a situation where GridGain writes data to WAL segments faster than the segments are copied to the archive. This may create an I/O bottleneck that can freeze the operation of the node. If you experience such problems, try disabling WAL archiving.

It is safe to disable WAL archiving because a cluster without the WAL archive provides the same data retention guarantees as a cluster with a WAL archive. Moreover, disabling WAL archiving can provide better performance.

To disable archiving, set the WAL path and the WAL archive path to the same value. In this case, segments will not be copied to the archive; instead, active segments will be overwritten in a cyclical order.
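In Java, pointing both paths at the same directory could be sketched as follows. The directory is a placeholder chosen for this example.

```java
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class DisableWalArchiveExample {
    public static IgniteConfiguration configure() {
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        // Placeholder directory (an assumption for this example).
        String walDir = "/opt/storage/wal";

        // Same value for both paths: segments are not copied to an archive.
        storageCfg.setWalPath(walDir);
        storageCfg.setWalArchivePath(walDir);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDataStorageConfiguration(storageCfg);

        return cfg;
    }
}
```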

Checkpointing

Checkpointing is the process of copying dirty pages from RAM to partition files on disk. A dirty page is a page that was updated in RAM but was not written to the respective partition file (the update, however, was appended to the WAL).

After a checkpoint is created, all changes are persisted to disk and will be available if the node crashes and is restarted.

Checkpointing and write-ahead logging are designed to ensure durability of data and recovery in case of a node failure.

[Figure: checkpointing persistence]

This process helps to use disk space frugally by keeping pages in the most up-to-date state on disk. After a checkpoint completes, the WAL segments that were created before that point in time can be deleted.
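The interplay between dirty pages, the WAL, and checkpoints can be illustrated with a toy model. This is a deliberately simplified sketch, not GridGain's implementation: pages are plain strings, the "partition file" and WAL are in-memory collections, and all names are invented for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of WAL plus checkpointing. Not GridGain's implementation. */
class CheckpointSketch {
    /** One logged update: which page changed and its new content. */
    static final class WalRecord {
        final int pageId;
        final String value;

        WalRecord(int pageId, String value) {
            this.pageId = pageId;
            this.value = value;
        }
    }

    private final Map<Integer, String> ram = new HashMap<>();
    private final Map<Integer, String> partitionFile = new HashMap<>();
    private final Set<Integer> dirtyPages = new HashSet<>();
    private final List<WalRecord> wal = new ArrayList<>();

    /** Updates a page in RAM; the change is appended to the WAL, not the partition file. */
    void update(int pageId, String value) {
        wal.add(new WalRecord(pageId, value));
        ram.put(pageId, value);
        dirtyPages.add(pageId);
    }

    /** Copies dirty pages to the partition file; earlier WAL records are then dropped. */
    void checkpoint() {
        for (int pageId : dirtyPages)
            partitionFile.put(pageId, ram.get(pageId));

        dirtyPages.clear();
        wal.clear();
    }

    /** Simulates a crash: RAM is lost, state is rebuilt from the partition file and the WAL. */
    Map<Integer, String> recoverAfterCrash() {
        Map<Integer, String> recovered = new HashMap<>(partitionFile);

        for (WalRecord rec : wal)
            recovered.put(rec.pageId, rec.value);

        return recovered;
    }

    int dirtyPageCount() { return dirtyPages.size(); }

    int walSize() { return wal.size(); }
}
```

In this model, an update made after the last checkpoint exists only in RAM and in the WAL, yet recovery still returns it because the WAL is replayed on top of the partition file.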

Configuration

Native Persistence is configured per data region. You can have in-memory data regions and data regions with persistence at the same time.

XML:

<bean class="org.apache.ignite.configuration.IgniteConfiguration">
  <!-- Enabling native persistence. -->
  <property name="dataStorageConfiguration">
    <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
      <property name="defaultDataRegionConfiguration">
        <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
          <property name="persistenceEnabled" value="true"/>
        </bean>
      </property>
    </bean>
  </property>
</bean>

Java:

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

IgniteConfiguration cfg = new IgniteConfiguration();
// Ignite persistence configuration.
DataStorageConfiguration storageCfg = new DataStorageConfiguration();

// Enable persistence for the default data region.
storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

cfg.setDataStorageConfiguration(storageCfg);

Ignite ignite = Ignition.start(cfg);

C#/.NET:

using Apache.Ignite.Core;
using Apache.Ignite.Core.Configuration;

var cfg = new IgniteConfiguration
{
    DataStorageConfiguration = new DataStorageConfiguration
    {
        // Enable persistence for the default data region.
        DefaultDataRegionConfiguration = new DataRegionConfiguration
        {
            Name = "Default_Region",
            PersistenceEnabled = true
        }
    }
};

Ignition.Start(cfg);
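Since persistence is a per-region setting, a persistent region and an in-memory region can be combined in one node. The sketch below shows one way this might be configured in Java; the region name "InMemory_Region" and the cache name "transientCache" are hypothetical names used only for illustration.

```java
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class MixedRegionsExample {
    public static IgniteConfiguration configure() {
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();

        // Default region: persisted to disk.
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        // Additional region: in-memory only (hypothetical name).
        DataRegionConfiguration inMemoryRegion = new DataRegionConfiguration();
        inMemoryRegion.setName("InMemory_Region");
        inMemoryRegion.setPersistenceEnabled(false);
        storageCfg.setDataRegionConfigurations(inMemoryRegion);

        // A cache is placed in a region by referring to the region's name.
        CacheConfiguration<Integer, String> cacheCfg = new CacheConfiguration<>("transientCache");
        cacheCfg.setDataRegionName("InMemory_Region");

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDataStorageConfiguration(storageCfg);
        cfg.setCacheConfiguration(cacheCfg);

        return cfg;
    }
}
```

Caches that do not name a region end up in the default region, so in this sketch they would be persisted, while "transientCache" would live in memory only.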

The following table describes some properties of DataStorageConfiguration.

persistenceEnabled (set on DataRegionConfiguration)

  • Description: Set this property to true to enable Native Persistence.

  • Default value: false

storagePath

  • Description: The path where data is stored.

  • Default value: ${IGNITE_HOME}/work/db/node{IDX}-{UUID}

walPath

  • Description: The path to the directory where active WAL segments are stored.

  • Default value: ${IGNITE_HOME}/work/db/wal/

walArchivePath

  • Description: The path to the WAL archive.

  • Default value: ${IGNITE_HOME}/work/db/wal/archive/

walCompactionEnabled

  • Description: Set to true to enable WAL archive compaction.

  • Default value: false

walSegmentSize

  • Description: The size of a WAL segment file in bytes.

  • Default value: 64 MB

walMode

  • Description: Write-ahead logging mode.

  • Default value: LOG_ONLY

walCompactionLevel

  • Description: WAL archive compression level. 1 indicates the fastest speed, and 9 indicates the best compression.

  • Default value: 1

maxWalArchiveSize

  • Description: The maximum size the WAL archive can occupy on the file system.

  • Default value: Four times the size of the checkpointing buffer.