
Monitoring Memory

GridGain provides metrics for Data Regions and Persistence.

Data Region Metrics

GridGain’s memory-centric storage can be monitored through parameters exposed by the DataRegionMetrics interface and the related JMX bean. Data region metrics help you track overall memory utilization, measure memory performance, and decide which optimizations to apply.

The DataRegionMetrics interface is the main entry point. It provides memory-related metrics of a specific Ignite node. Since there can be several regions configured on a node, metrics for every region are collected and obtained individually.

Disabling Data Region Metrics

To disable data region metrics, call DataRegionConfiguration.setMetricsEnabled(false) for every region whose metrics you do not want to collect.

<bean class="org.apache.ignite.configuration.IgniteConfiguration">
  <property name="dataStorageConfiguration">
    <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
      <property name="dataRegionConfigurations">
        <list>
          <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
            <!-- Custom region name. -->
            <property name="name" value="myDataRegion"/>

            <!-- Disable metrics for this data region  -->
            <property name="metricsEnabled" value="false"/>

            <!-- Other configurations -->
            ...
          </bean>
        </list>
      </property>
    </bean>
  </property>

  <!-- Other Ignite configurations -->
  ...
</bean>

// Ignite configuration.
IgniteConfiguration cfg = new IgniteConfiguration();

// Durable Memory configuration.
DataStorageConfiguration storageCfg = new DataStorageConfiguration();

// Create a new data region.
DataRegionConfiguration regionCfg = new DataRegionConfiguration();

// Region name.
regionCfg.setName("myDataRegion");

// Disable metrics for this region.
regionCfg.setMetricsEnabled(false);

// Set the data region configuration.
storageCfg.setDataRegionConfigurations(regionCfg);

// Other configurations
...

// Apply the new configuration.
cfg.setDataStorageConfiguration(storageCfg);

Using JMX Bean

All DataRegionMetrics of a local node are also exposed through JMX by DataRegionMetricsMXBean. You can connect to the bean from any JMX-compliant tool or API.

Call DataRegionMetricsMXBean.enableMetrics() to start collecting data region metrics.

The JMX beans expose the same set of metrics that DataRegionMetrics has, as well as a few additional ones. See the DataRegionMetricsMXBean JavaDoc for more details.
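As a rough illustration of reading the bean from a JMX API, the sketch below uses only the standard javax.management classes to read one attribute from every MBean that matches a name pattern. The object name pattern org.apache:group=DataRegionMetrics,* and the attribute name TotalAllocatedPages are assumptions about how the beans are registered; check the actual names in a JMX console before relying on them. The class and method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.util.Map;
import java.util.TreeMap;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

public class JmxAttributeReader {
    /**
     * Reads one attribute from every MBean whose object name matches the pattern.
     * Returns a map from the bean's canonical object name to the attribute value.
     */
    public static Map<String, Object> readAttribute(
            MBeanServerConnection server, String namePattern, String attribute) throws Exception {
        Map<String, Object> values = new TreeMap<>();
        for (ObjectName name : server.queryNames(new ObjectName(namePattern), null)) {
            values.put(name.getCanonicalName(), server.getAttribute(name, attribute));
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        // Assumed pattern: matches all data region beans registered in this JVM.
        String pattern = "org.apache:group=DataRegionMetrics,*";

        // Prints nothing when no matching bean is registered (for example, outside a node's JVM).
        Map<String, Object> pages = readAttribute(
            ManagementFactory.getPlatformMBeanServer(), pattern, "TotalAllocatedPages");
        pages.forEach((bean, value) -> System.out.println(bean + " -> " + value));
    }
}
```

For a remote node, the same method accepts an MBeanServerConnection obtained from a JMX connector instead of the platform MBean server.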

Getting Metrics

Use DataRegionMetricsMXBean or the Ignite.dataRegionMetrics() interface method to get the latest metrics snapshot and iterate over it, as shown in the example below:

// Get the metrics of all the data regions configured on a node.
Collection<DataRegionMetrics> regionsMetrics = ignite.dataRegionMetrics();

// Print out some of the metrics.
for (DataRegionMetrics metrics : regionsMetrics) {
    System.out.println(">>> Memory Region Name: " + metrics.getName());
    System.out.println(">>> Allocation Rate: " + metrics.getAllocationRate());
    System.out.println(">>> Fill Factor: " + metrics.getPagesFillFactor());
    System.out.println(">>> Allocated Size: " + metrics.getTotalAllocatedSize());
    System.out.println(">>> Physical Memory Size: " + metrics.getPhysicalMemorySize());
}

Available Data Region Metrics

The following metrics are available for a data region:

Method Name Description

getName()

Returns the name of a data region the metrics belong to.

getTotalAllocatedPages()

Gets the total number of allocated pages related to the data region. When Ignite persistence is disabled, this metric shows the total number of pages in RAM. When Ignite persistence is enabled, this metric shows the total number of pages in memory and on disk.

getAllocationRate()

Gets pages allocation rate of this region.

getEvictionRate()

Gets pages eviction rate of this region.

getLargeEntriesPagesPercentage()

Gets the percentage of pages that are fully occupied by large entries that exceed the page size. Large entries are split into fragments so that each fragment fits into a single page.

getPagesFillFactor()

Gets the percentage of used space.

getDirtyPages()

Gets the number of dirty pages (pages whose content differs from the content of the same pages on disk). This metric is used only when Ignite persistence is enabled.

getPagesReplaceRate()

Gets the rate (pages per second) at which pages that are in RAM are replaced with other pages from disk. The metric effectively represents the rate at which pages get 'evicted' from RAM in favor of bringing other pages from disk. This metric is used only when Ignite persistence is enabled.

getPhysicalMemoryPages()

Gets the number of pages currently loaded in RAM. When Ignite persistence is disabled, this metric is the same as getTotalAllocatedPages().

getTotalAllocatedSize()

Gets the total size (in bytes) of memory allocated to the data region. When Ignite persistence is disabled, this metric shows the total size of pages in RAM. When Ignite persistence is enabled, this metric shows the total size of pages in memory and on disk.

getPhysicalMemorySize()

Gets the total size (in bytes) of pages loaded in RAM. When Ignite persistence is disabled, this metric is the same as getTotalAllocatedSize().

getCheckpointBufferPages()

Gets checkpoint buffer size in pages.

getCheckpointBufferSize()

Gets checkpoint buffer size in bytes.

getPageSize()

Gets memory page size.
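Some of the values above are easier to interpret when combined. The following is a minimal sketch, not part of the GridGain API, that derives two figures from numbers you would read from getPhysicalMemoryPages(), getTotalAllocatedPages(), and getPageSize(): the share of allocated pages currently held in RAM, and an approximate byte size for a page count. The class name, method names, and sample values are hypothetical, and the byte conversion assumes every page occupies exactly the reported page size.

```java
public class DataRegionMath {
    /** Share of allocated pages currently held in RAM, in the range 0.0 to 1.0. */
    public static double residentRatio(long physicalMemoryPages, long totalAllocatedPages) {
        if (totalAllocatedPages <= 0) {
            return 0.0; // Nothing allocated yet, so there is no meaningful ratio.
        }
        return (double) physicalMemoryPages / totalAllocatedPages;
    }

    /** Approximate size in bytes of a page count, assuming each page takes pageSize bytes. */
    public static long pagesToBytes(long pages, int pageSize) {
        return Math.multiplyExact(pages, (long) pageSize);
    }

    public static void main(String[] args) {
        // Hypothetical sample values standing in for metrics read from a node.
        long totalAllocatedPages = 262_144L;
        long physicalMemoryPages = 65_536L;
        int pageSize = 4096;

        System.out.println("Resident ratio: " + residentRatio(physicalMemoryPages, totalAllocatedPages));
        System.out.println("Bytes in RAM (approx.): " + pagesToBytes(physicalMemoryPages, pageSize));
    }
}
```

With persistence disabled, the ratio is expected to stay at 1.0, because all allocated pages are in RAM. With persistence enabled, a low ratio combined with a high getPagesReplaceRate() may indicate that the region is too small for the working set.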

Persistent Data Storage

When persistence is enabled for a data region, GridGain provides several sets of metrics for persistent data: Data Storage, Data Volume, and Page Replacement.

Disabling Data Storage Metrics (Java)

To disable Data Storage metrics for Ignite persistence, call DataStorageConfiguration.setMetricsEnabled(false), as shown below:

<bean class="org.apache.ignite.configuration.IgniteConfiguration">
  <property name="dataStorageConfiguration">
    <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
      <!-- Disable metrics for Ignite persistence  -->
      <property name="metricsEnabled" value="false"/>

      <!-- Other configurations -->
      ...
    </bean>
  </property>

  <!-- Other Ignite configurations -->
  ...
</bean>

// Ignite configuration.
IgniteConfiguration cfg = new IgniteConfiguration();

// Durable Memory configuration.
DataStorageConfiguration storageCfg = new DataStorageConfiguration();

// Disable metrics for Ignite persistence.
storageCfg.setMetricsEnabled(false);

// Other configurations
...

// Apply the new configuration.
cfg.setDataStorageConfiguration(storageCfg);

// Separate example: storage metrics and region metrics are toggled independently.
DataStorageConfiguration storageCfg = new DataStorageConfiguration();
DataRegionConfiguration regionCfg = new DataRegionConfiguration();
regionCfg.setName("myDataRegion");

// Disable metrics for data storage (persistence).
storageCfg.setMetricsEnabled(false);

// Keep metrics enabled for this particular data region.
regionCfg.setMetricsEnabled(true);

storageCfg.setDataRegionConfigurations(regionCfg);

Using JMX Bean

Data Storage metrics for Ignite persistence are exposed through JMX by DataStorageMetricsMXBean. You can connect to the bean from any JMX-compliant tool or API.

Call DataStorageMetricsMXBean.enableMetrics() to start collecting persistence-related metrics.

The JMX bean exposes the same set of metrics that DataStorageMetrics has, as well as a few additional ones. See the DataStorageMetricsMXBean JavaDoc for more details.
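The enableMetrics() operation can also be invoked programmatically through the standard JMX API. The sketch below invokes a no-argument operation on every MBean that matches a name pattern. The pattern org.apache:name=DataStorageMetrics,* is an assumption about how the bean is registered, and the class and method names are hypothetical; verify the real object name in a JMX console first.

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

public class JmxOperationInvoker {
    /**
     * Invokes a no-argument operation on every MBean whose object name matches the pattern.
     * Returns the number of beans the operation was invoked on.
     */
    public static int invokeOnMatching(
            MBeanServerConnection server, String namePattern, String operation) throws Exception {
        Set<ObjectName> names = server.queryNames(new ObjectName(namePattern), null);
        for (ObjectName name : names) {
            // Empty parameter and signature arrays describe an operation without arguments.
            server.invoke(name, operation, new Object[0], new String[0]);
        }
        return names.size();
    }

    public static void main(String[] args) throws Exception {
        // Assumed pattern for the data storage metrics bean of a node running in this JVM.
        String pattern = "org.apache:name=DataStorageMetrics,*";

        int invoked = invokeOnMatching(
            ManagementFactory.getPlatformMBeanServer(), pattern, "enableMetrics");
        System.out.println("enableMetrics invoked on " + invoked + " bean(s)");
    }
}
```

A result of 0 means no bean matched the pattern, which usually points to a different object name or to running outside the node's JVM.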

Using Java

Call Ignite.dataStorageMetrics() to get the latest persistence metrics snapshot, as shown in the example below:

// Getting metrics.
DataStorageMetrics pm = ignite.dataStorageMetrics();

System.out.println("Fsync duration: " + pm.getLastCheckpointFsyncDuration());

System.out.println("Data pages: " + pm.getLastCheckpointDataPagesNumber());

System.out.println("Checkpoint duration: " + pm.getLastCheckpointDuration());

Useful Data Storage Beans and Metrics

Data Storage Beans

  • DataStorageMetricsMXBean:

    • DirtyPages (total number of dirty pages for the next checkpoint)

    • CheckpointTotalTime (total checkpoint time since the last restart)

    • LastCheckpointDuration (the duration of the last checkpoint in milliseconds)

    • UsedCheckpointBufferSize (used checkpoint buffer size in bytes)

    • LastCheckpointPagesWriteDuration (the duration of the pages write phase of the last checkpoint in milliseconds)

    • LastCheckpointMarkDuration (the duration of the mark phase of the last checkpoint in milliseconds)

    • LastCheckpointTotalPagesNumber (the total number of pages written during the last checkpoint)

    • etc.
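The checkpoint attributes listed above can be combined into simple derived figures. The following sketch, which is not part of the GridGain API, estimates the page write throughput of the last checkpoint from LastCheckpointTotalPagesNumber and LastCheckpointPagesWriteDuration, and the share of the total checkpoint time spent in one phase. The class name, method names, and sample values are hypothetical, and the throughput is only an approximation because it assumes all pages were written during the write phase.

```java
public class CheckpointStats {
    /** Approximate pages written per second during the pages write phase. */
    public static double pagesPerSecond(long totalPages, long writeDurationMs) {
        if (writeDurationMs <= 0) {
            return 0.0; // No measurable duration, so no rate can be computed.
        }
        return totalPages * 1000.0 / writeDurationMs;
    }

    /** Share of the total checkpoint duration spent in one phase, in the range 0.0 to 1.0. */
    public static double phaseShare(long phaseDurationMs, long checkpointDurationMs) {
        if (checkpointDurationMs <= 0) {
            return 0.0;
        }
        return (double) phaseDurationMs / checkpointDurationMs;
    }

    public static void main(String[] args) {
        // Hypothetical sample values standing in for attributes read from the bean.
        long lastCheckpointTotalPagesNumber = 5_000L;
        long lastCheckpointPagesWriteDuration = 2_000L; // milliseconds
        long lastCheckpointMarkDuration = 50L;          // milliseconds
        long lastCheckpointDuration = 2_500L;           // milliseconds

        System.out.println("Pages per second: "
            + pagesPerSecond(lastCheckpointTotalPagesNumber, lastCheckpointPagesWriteDuration));
        System.out.println("Mark phase share: "
            + phaseShare(lastCheckpointMarkDuration, lastCheckpointDuration));
    }
}
```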

Data Volume Beans

  • DataStorageMetricsMXBean:

    • WalTotalSize (total size in bytes for storage Write Ahead Log files)

    • TotalAllocatedSize (total size of memory allocated in bytes)

    • OffheapUsedSize (total used offheap size in bytes)

    • etc.

  • DataRegionMetricsMXBean:

    • TotalAllocatedPages (total number of allocated pages related to the data region)

    • AllocationRate (page allocation rate of a memory region)

    • PagesFillFactor (the percentage of the available space currently used)

    • etc.

Page Replacement Beans

  • DataRegionMetricsMXBean:

    • PagesReplaceRate (rate (pages per second) at which pages get replaced with other pages from persistent storage)

    • PagesReplaceAge (average age, in milliseconds, for the pages being replaced from the disk storage)

    • PagesReplaced (number of pages replaced since the last restart)

Memory Usage Calculation

You can also obtain metrics for caches associated with a particular CacheGroup. Currently, these metrics are available only through JMX, via CacheGroupMetricsMXBean. Refer to the CacheGroupMetricsMXBean JavaDoc for a complete list of metrics available.

Single Node Memory Usage

The following examples show how to calculate:

  • current node size - the total size of data on a node, in MB or GB

  • current cache size - the size of data in a cache on that node, in MB or GB

  1. The current node size is DataStorageMetricsMXBean.getTotalAllocatedSize().

  2. The current size of a specific cache on a node is CacheGroupMetricsMXBean.getTotalAllocatedSize(). This value describes a single cache only if the cache group contains exactly one cache, which is the default behavior.

Cluster-Wide Memory Usage

  1. To calculate the total cluster size, sum the values of DataStorageMetricsMXBean.getTotalAllocatedSize() from all nodes.

  2. The current cache size is the sum of the CacheGroupMetricsMXBean.getTotalAllocatedSize() values from all nodes. This value describes a single cache only if the cache group contains exactly one cache, which is the default behavior.
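The summation described above can be sketched in a few lines. The example assumes that you have already collected the TotalAllocatedSize value (in bytes) from each node, for instance through JMX; how the values are gathered is outside the scope of the sketch. The class name, method names, node names, and sample values are hypothetical.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

public class ClusterMemoryUsage {
    /** Sums per-node sizes in bytes, failing loudly on arithmetic overflow. */
    public static long totalBytes(Collection<Long> perNodeBytes) {
        long total = 0L;
        for (long bytes : perNodeBytes) {
            total = Math.addExact(total, bytes);
        }
        return total;
    }

    /** Converts bytes to megabytes (1 MB = 1024 * 1024 bytes). */
    public static double toMegabytes(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    /** Converts bytes to gigabytes (1 GB = 1024 * 1024 * 1024 bytes). */
    public static double toGigabytes(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        // Hypothetical TotalAllocatedSize values, in bytes, collected from two nodes.
        Map<String, Long> allocatedByNode = new LinkedHashMap<>();
        allocatedByNode.put("node-a", 1_073_741_824L);
        allocatedByNode.put("node-b", 2_147_483_648L);

        long total = totalBytes(allocatedByNode.values());
        System.out.println("Cluster size: " + total + " bytes = "
            + toMegabytes(total) + " MB = " + toGigabytes(total) + " GB");
    }
}
```

The same helper applies to the per-cache calculation: pass the CacheGroupMetricsMXBean values for one cache group from every node instead of the data storage values.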