GridGain Developers Hub

Ignite Persistence

Overview

Ignite Persistence, or Native Persistence, is a set of features designed to provide persistent storage. When native persistence is enabled, Ignite stores all the data on disk and loads as much data as it can into RAM for processing. For example, if there are 100 entries and RAM has the capacity to store only 20, all 100 are stored on disk and only 20 are cached in RAM for better performance.

When persistence is disabled, and no external storage is used, GridGain behaves as a pure in-memory store.

When persistence is enabled, every server node persists a subset of the data that only includes the partitions assigned to that node.

The native persistence functionality is based on the following features:

  • Storing data partitions on disk

  • Checkpointing

  • Write-ahead logging (WAL)

GridGain stores each partition in a separate file on disk. The data format of the partition files is the same as that of the data when it is kept in memory. If partition backups are enabled, they are also saved on disk. In addition to data partitions, GridGain stores indexes and metadata. It stores all indexes defined for a cache in a single index file.


Enabling Persistent Storage

Native persistence is configured per data region. To enable persistent storage, set the persistenceEnabled property to true in the data region configuration. You can have in-memory data regions and data regions with persistence at the same time.

The following example shows how to enable persistent storage for the default data region.

<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="dataStorageConfiguration">
        <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
            <property name="defaultDataRegionConfiguration">
                <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                    <property name="persistenceEnabled" value="true"/>
                </bean>
            </property>
            <property name="walSegmentSize" value="#{128 * 1024 * 1024}"/>
        </bean>
    </property>
</bean>
IgniteConfiguration cfg = new IgniteConfiguration();
//data storage configuration
DataStorageConfiguration storageCfg = new DataStorageConfiguration();

storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);


cfg.setDataStorageConfiguration(storageCfg);

Ignite ignite = Ignition.start(cfg);
var cfg = new IgniteConfiguration
{
    DataStorageConfiguration = new DataStorageConfiguration
    {
        DefaultDataRegionConfiguration = new DataRegionConfiguration
        {
            Name = "Default_Region",
            PersistenceEnabled = true
        }
    }
};

Ignition.Start(cfg);
This API is not presently available for C++. You can use XML configuration.

Configuring Persistent Storage Directory

By default, each node stores user data, indexes, and WAL files in the {IGNITE_WORK_DIR}/db directory (a.k.a. the storage directory). The following sub-directories are included in the storage directory:

  • {WORK_DIR}/db/{nodeId}: cache data and indexes

  • {WORK_DIR}/db/wal/{nodeId}: WAL files

  • {WORK_DIR}/db/wal/archive/{nodeId}: WAL archive files

The nodeId part is either the consistent node ID (if it is defined in the node configuration) or an auto-generated node ID. It ensures that each node's directories are unique. If multiple nodes share the same work directory, each node uses its own sub-directory.

If the work directory contains persistence files for multiple nodes (there are multiple {nodeId} subdirectories with different nodeIds), the node picks up the first subdirectory that is not being used. A temporary lock file ensures that no subdirectory is claimed by multiple nodes.

To make sure a node always uses a specific subdirectory and, thus, specific data partitions even after a restart, set IgniteConfiguration.setConsistentId to a cluster-wide unique value in the node configuration.
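For example, a consistent ID can be set programmatically; the value "node1" below is an arbitrary example, and any cluster-wide unique value works:

```java
IgniteConfiguration cfg = new IgniteConfiguration();

// Example value; must be unique across the cluster so that the node
// always claims the same persistence subdirectory after a restart.
cfg.setConsistentId("node1");

Ignite ignite = Ignition.start(cfg);
```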

You can change the location of data files by modifying the storagePath property (see Configuration Properties).

<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="dataStorageConfiguration">
        <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
            <property name="defaultDataRegionConfiguration">
                <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                    <property name="persistenceEnabled" value="true"/>
                </bean>
            </property>
            <property name="storagePath" value="/opt/storage"/>
            <property name="walSegmentSize" value="#{128 * 1024 * 1024}"/>
        </bean>
    </property>
</bean>
IgniteConfiguration cfg = new IgniteConfiguration();
//data storage configuration
DataStorageConfiguration storageCfg = new DataStorageConfiguration();

storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

storageCfg.setStoragePath("/opt/storage");

cfg.setDataStorageConfiguration(storageCfg);

Ignite ignite = Ignition.start(cfg);
var cfg = new IgniteConfiguration
{
    DataStorageConfiguration = new DataStorageConfiguration
    {
        StoragePath = "/ssd/storage",

        DefaultDataRegionConfiguration = new DataRegionConfiguration
        {
            Name = "Default_Region",
            PersistenceEnabled = true
        }
    }
};

Ignition.Start(cfg);
This API is not presently available for C++. You can use XML configuration.

You can change the Write-Ahead Log (WAL) and WAL Archive paths to point to directories outside of the storage directory.
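For example, a sketch of moving the WAL and its archive to a separate disk; the /wal paths below are example locations:

```java
IgniteConfiguration cfg = new IgniteConfiguration();

DataStorageConfiguration storageCfg = new DataStorageConfiguration();
storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

// Example paths; placing the WAL on a separate fast disk can reduce I/O contention.
storageCfg.setWalPath("/wal");
storageCfg.setWalArchivePath("/wal/archive");

cfg.setDataStorageConfiguration(storageCfg);
```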

Checkpointing

Checkpointing is designed to ensure durability of data and recovery in case of a node failure. This process synchronizes dirty pages between RAM and the partition files on disk. A dirty page is a page that was updated in RAM but was not written to the respective partition file (the update, however, was appended to the WAL).

Once the checkpoint process completes, all the changes are persisted to disk and remain available if the node crashes and is restarted.


This process helps use disk space frugally by keeping pages on disk in their most up-to-date state. After a checkpoint completes, you can delete the WAL segments that were created before that point in time.

The checkpointing frequency is set at node startup via DataStorageConfiguration.setCheckpointFrequency. You can change the initial value in the configuration object (for example, in the XML config file), but applying such a change requires a node restart. To override the configured frequency without restarting the node, use the dynamic property available through the control script:

control.sh --property set --name checkpoint.frequency --val <value in milliseconds>
control.bat --property set --name checkpoint.frequency --val <value in milliseconds>
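The initial frequency can be sketched in the configuration as follows (180000 ms, i.e. 3 minutes, is the default value):

```java
DataStorageConfiguration storageCfg = new DataStorageConfiguration();

// Checkpoint every 3 minutes; lower values shorten recovery time
// at the cost of more frequent disk writes.
storageCfg.setCheckpointFrequency(180000);

IgniteConfiguration cfg = new IgniteConfiguration();
cfg.setDataStorageConfiguration(storageCfg);
```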

You can also configure the checkpointing buffer, throttling, and other checkpointing parameters.

Write-Ahead Log (WAL)

The write-ahead log (WAL) is a log of all data modifying operations (including deletes) that happen on a node. When a page is updated in RAM, the update is not directly written to the partition file but is appended to the tail of the WAL.

The purpose of the WAL is to ensure data durability and to provide a recovery mechanism for scenarios where a single node or the whole cluster goes down. In case of a crash or restart, the cluster can always be recovered to the latest successfully committed transaction by relying on the content of the WAL. The WAL is enabled by default. You can disable it - see Disabling WAL.

The WAL consists of several files (called active segments) and an archive. The active segments are filled sequentially and are overwritten in a cyclical order. If the WAL archive is enabled, once the first segment is full, its content is copied to the archive. While the first segment is being copied, the second segment becomes the active WAL file and accepts all the updates coming from the application side. By default, there are 10 active segments. You can change this value using walSegments - see Configuration Properties.

By default, WAL archive is enabled. You can disable it - see Disabling WAL Archive.

Disabling WAL

There are situations when it is reasonable to have the WAL disabled to get better performance. For instance, it is useful to disable WAL during initial data loading and enable it after the pre-loading is complete.

IgniteConfiguration cfg = new IgniteConfiguration();
DataStorageConfiguration storageCfg = new DataStorageConfiguration();
storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

cfg.setDataStorageConfiguration(storageCfg);

Ignite ignite = Ignition.start(cfg);

ignite.cluster().state(ClusterState.ACTIVE);

String cacheName = "myCache";

ignite.getOrCreateCache(cacheName);

ignite.cluster().disableWal(cacheName);

//load data
ignite.cluster().enableWal(cacheName);
var cacheName = "myCache";
var ignite = Ignition.Start();
ignite.GetCluster().DisableWal(cacheName);

//load data

ignite.GetCluster().EnableWal(cacheName);
ALTER TABLE Person NOLOGGING

-- ...

ALTER TABLE Person LOGGING
This API is not presently available for C++.

WAL Archive

The WAL archive is used to store WAL segments that may be needed to recover a node after crash, as well as for PITR (point-in-time recovery). The number of segments kept in the archive is such that the total size of all segments does not exceed the specified size of the WAL archive.

By default, the maximum size of the WAL archive (the total space it occupies on disk) is 1 GB. You can change this value using maxWalArchiveSize - see Configuration Properties.

The minWalArchiveSize property defines the size at which the WAL archive begins self-cleaning to prevent uncontrolled growth. By default, the value of this property is half of maxWalArchiveSize, i.e., 500 MB - for details, see Configuration Properties.

Here is how you can define the WAL archive in the configuration:

<bean class="org.apache.ignite.configuration.DataStorageConfiguration">
  <property name="walSegments" value="20"/>
  <property name="walSegmentSize" value="2000000"/>
  <property name="walPath" value="db/wal_path"/>
  <property name="maxWalArchiveSize" value="2000000000"/>
  <property name="minWalArchiveSize" value="1000000000"/>
  <property name="walArchivePath" value="db/wal_path/archive"/>
</bean>
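The same settings can be sketched in Java:

```java
DataStorageConfiguration storageCfg = new DataStorageConfiguration();

storageCfg.setWalSegments(20);
storageCfg.setWalSegmentSize(2_000_000);
storageCfg.setWalPath("db/wal_path");
storageCfg.setMaxWalArchiveSize(2_000_000_000L);
storageCfg.setMinWalArchiveSize(1_000_000_000L);
storageCfg.setWalArchivePath("db/wal_path/archive");
```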

WAL Modes

There are three WAL modes. Each mode differs in how it affects performance and provides different consistency guarantees.

  • FSYNC: The changes are guaranteed to be persisted to disk for every atomic write or transactional commit. Data updates are never lost; they survive any OS or process crash, or power failure.

  • LOG_ONLY (the default mode): The changes are guaranteed to be flushed to either the OS buffer cache or a memory-mapped file for every atomic write or transactional commit. The memory-mapped file approach is used by default and can be switched off by setting the IGNITE_WAL_MMAP system property to false. Data updates survive a process crash.

  • BACKGROUND: When the IGNITE_WAL_MMAP property is enabled (the default), this mode behaves like LOG_ONLY and provides the same guarantees. If the memory-mapped file approach is disabled, the changes stay in the node's internal buffer and are periodically flushed to disk at the interval set by the walFlushFrequency parameter; in this case, recent data updates may be lost if the process crashes or another outage occurs.

  • NONE: WAL is disabled. The changes are persisted only if you shut down the node gracefully; use Ignite.active(false) to deactivate the cluster and shut down the node. Data loss might occur: if a node is terminated abruptly during update operations, the data stored on disk is likely to become out-of-sync or corrupted.
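The WAL mode is set via the data storage configuration; FSYNC is chosen below only to illustrate:

```java
DataStorageConfiguration storageCfg = new DataStorageConfiguration();

// FSYNC gives the strongest durability guarantees at the highest latency cost.
storageCfg.setWalMode(WALMode.FSYNC);

IgniteConfiguration cfg = new IgniteConfiguration();
cfg.setDataStorageConfiguration(storageCfg);
```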

Changing WAL Segment Size

The default WAL segment size (64 MB) may be inefficient in high load scenarios because it causes WAL to switch between segments too frequently and switching/rotation is a costly operation. A larger size of WAL segments can help increase performance under high loads at the cost of increasing the total size of the WAL files and WAL archive.

You can change the size of the WAL segment files in the data storage configuration. The value must be between 512 KB and 2 GB. You can apply this change by stopping and restarting the nodes one at a time; restarting the entire cluster is not required.

<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="dataStorageConfiguration">
        <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
            <property name="defaultDataRegionConfiguration">
                <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                    <property name="persistenceEnabled" value="true"/>
                </bean>
            </property>
            <property name="storagePath" value="/opt/storage"/>
            <property name="walSegmentSize" value="#{128 * 1024 * 1024}"/>
        </bean>
    </property>
</bean>
IgniteConfiguration cfg = new IgniteConfiguration();
DataStorageConfiguration storageCfg = new DataStorageConfiguration();
storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

storageCfg.setWalSegmentSize(128 * 1024 * 1024);

cfg.setDataStorageConfiguration(storageCfg);

Ignite ignite = Ignition.start(cfg);
This API is not presently available for C#/.NET. You can use XML configuration.
This API is not presently available for C++. You can use XML configuration.

Node Recovery Without WAL

If a node crashes while WAL is disabled, you may need to do some cleanup to restart the node.

  • If the node has crashed outside of the checkpointing process, the node will just restart. It will be missing the updates made since the last checkpoint, but they will be replicated from other nodes.

  • If the node has failed during checkpointing, it will restart in Maintenance Mode. This node will not connect to the cluster or receive any requests from applications. While in this mode, the caches with disabled WAL need to be cleaned up first by removing all data.

Back up the corrupted data files (optional), then clean up persistence as described in the troubleshooting section.

WAL Archive Compaction

You can enable WAL archive compaction to reduce the space the archive occupies. If compaction is enabled, all archived segments older than the last checkpoint are:

  • Compacted - the physical records are removed and only the logical records are left; for details, see this wiki page

  • Compressed to the ZIP format

Compaction can reduce the archive size by more than an order of magnitude.

If the previously compressed segments are needed (for example, to re-balance data between nodes), they are uncompressed to the RAW format.

See the Configuration Properties section below to learn how to enable WAL archive compaction.

<bean class="org.apache.ignite.configuration.DataStorageConfiguration">
  <property name="maxWalArchiveSize" value="2000000000"/>
  <property name="walArchivePath" value="db/wal_path/archive"/>
  <property name="walCompactionEnabled" value="true"/>
</bean>
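The equivalent Java configuration can be sketched as follows (the compression level is optional; 1 is the default):

```java
DataStorageConfiguration storageCfg = new DataStorageConfiguration();

storageCfg.setWalCompactionEnabled(true);
// Optional: 1 is the fastest speed, 9 is the best compression.
storageCfg.setWalCompactionLevel(1);

IgniteConfiguration cfg = new IgniteConfiguration();
cfg.setDataStorageConfiguration(storageCfg);
```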

Disabling WAL Archive

In some cases, you may want to disable WAL archiving - for example, to reduce the overhead associated with copying of WAL segments to the archive. There can be a situation where GridGain writes data to WAL segments faster than the segments are copied to the archive. This may create an I/O bottleneck that can freeze the operation of the node. If you experience such problems, try disabling WAL archiving.

To disable archiving, set walPath and walArchivePath to the same value. This will prevent GridGain from copying segments to the archive. Instead, it will create new segments in the WAL folder. Old segments will be deleted as the WAL grows beyond the maxWalArchiveSize value.
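In Java, this can be sketched as follows (the /wal path is an example location):

```java
DataStorageConfiguration storageCfg = new DataStorageConfiguration();

// The same value for both paths disables WAL archiving:
// segments are created and deleted in place, never copied.
storageCfg.setWalPath("/wal");
storageCfg.setWalArchivePath("/wal");

IgniteConfiguration cfg = new IgniteConfiguration();
cfg.setDataStorageConfiguration(storageCfg);
```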

Configuration Properties

The following table describes native persistence properties.

  • persistenceEnabled: Set this property to true to enable Native Persistence. Default: false

  • storagePath: The path where data is stored. Default: ${IGNITE_HOME}/work/db/node{IDX}-{UUID}

  • walPath: The path to the directory where active WAL segments are stored. Default: ${IGNITE_HOME}/work/db/wal/

  • walArchivePath: The path to the WAL archive. Default: ${IGNITE_HOME}/work/db/wal/archive/

  • walCompactionEnabled: Set to true to enable WAL archive compaction. Default: false

  • walSegmentSize: The size of a WAL segment file in bytes. Default: 64 MB

  • walSegments: The number of active segments in the WAL. Default: 10

  • walMode: Write-ahead logging mode. Default: LOG_ONLY

  • walCompactionLevel: WAL archive compression level. 1 indicates the fastest speed, and 9 indicates the best compression. Default: 1

  • maxWalArchiveSize: The maximum size (in bytes) the WAL archive can occupy on the file system. The limit is observed as long as it does not prevent completion of a checkpoint; if a specific checkpoint causes the archive to grow beyond the maximum size, self-cleanup starts as soon as that checkpoint is completed. A value of -1 means there is no archive size limit. Default: 1 GB

  • minWalArchiveSize: The size (in bytes) at which the WAL archive begins self-cleanup. Default: half the value of maxWalArchiveSize (initially, 500 MB)