Distributed Caching with GridGain

Enterprises increasingly use in-memory computing solutions that leverage distributed caching to improve application speed and performance. In-memory caching removes the major delays that occur when an application built on a disk-based database must retrieve data from disk before processing it. Combined with distributed computing, distributed caching allows users to meet the real-time decision-making demands of the modern digital enterprise. By deploying a distributed caching solution with a distributed computing solution, organizations realize:

  • Up to a 1,000x improvement in application performance
  • The ability to scale applications to petabytes of in-memory data
  • Real-time analytics across data lake and operational data
  • Cost effectiveness versus hardware-based in-memory computing solutions
  • A common data access layer that gives applications access to data from multiple application datastores

What is Distributed Caching?

Caching solutions such as Memcached are typically deployed on a single node, where they cannot scale to meet today’s big data demands because of the physical limit on RAM in a single server. Distributed caching allows multiple nodes to work together to hold massive amounts of cached data. The large data set is sharded and distributed across all of the nodes in the in-memory computing cluster, so the amount of data the cluster can hold is determined by the number of nodes in the cluster rather than by the maximum RAM that can be deployed on an individual node. However, latency and performance problems can arise if each piece of data is not colocated with the compute jobs that use it. If the data is not local, it must be moved over the network to the node where the computation runs, and that movement introduces delays before processing can begin.
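The sharding idea described above can be illustrated with a small, self-contained sketch. It is not GridGain code: the partition count, the modulo placement rule, and every name in it (`ShardedCache`, `owner_of`, `partition_of`) are assumptions made for illustration, and each "node" is simply a Python dict standing in for one server's RAM.

```python
import zlib

PARTITIONS = 1024  # hypothetical fixed partition count


def partition_of(key):
    """Map a key to a partition with a stable hash (CRC32) so every node agrees."""
    return zlib.crc32(key.encode("utf-8")) % PARTITIONS


def owner_of(key, nodes):
    """Pick the node that owns the key's partition (simple modulo placement)."""
    return nodes[partition_of(key) % len(nodes)]


class ShardedCache:
    """Toy cluster: one in-process dict per node stands in for that node's RAM."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.storage = {node: {} for node in self.nodes}

    def put(self, key, value):
        self.storage[owner_of(key, self.nodes)][key] = value

    def get(self, key):
        return self.storage[owner_of(key, self.nodes)].get(key)


cache = ShardedCache(["node-a", "node-b", "node-c"])
for i in range(9000):
    cache.put(f"order:{i}", {"id": i})

# Each node holds only a share of the data set, not all of it.
sizes = {node: len(entries) for node, entries in cache.storage.items()}
```

Because each node stores only its own share of the keys, adding nodes raises the total capacity, which is the property the paragraph above describes.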

Distributed Caching in the GridGain Platform

The GridGain® in-memory computing platform, built on Apache® Ignite™, sits between the application and data layers to provide in-memory speed and massive scalability to applications built on disk-based databases. GridGain works with existing application and data layers, including all popular RDBMS, NoSQL and Hadoop databases. The Unified API allows you to integrate with your existing applications using a variety of common languages and protocols including SQL, Java, C++, .NET, and many more. ANSI SQL-99 support, which includes DDL and DML, allows you to interact with the system using standard SQL commands.

GridGain In-Memory Compute Grid

The GridGain in-memory computing platform includes an in-memory compute grid that distributes compute across the system’s cluster nodes. By localizing and distributing the compute, the system can perform massively parallel processing (MPP), which reduces or eliminates the need to move data between node caches prior to processing.
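To make the benefit of localized compute concrete, the following sketch compares two ways of processing cached values: pulling each value to the caller, and sending the job to the node that already holds the value. It is a toy model under stated assumptions, not the GridGain API: the `Node` and `Cluster` classes, the `affinity_run` and `fetch_then_compute` methods, and the `values_moved` counter are all hypothetical names.

```python
import zlib


class Node:
    """Toy cluster node: holds a local slice of the cache and runs jobs on it."""

    def __init__(self, name):
        self.name = name
        self.local = {}

    def run(self, job, key):
        # The job executes here, next to the data; only the result leaves the node.
        return job(self.local[key])


class Cluster:
    def __init__(self, names):
        self.nodes = [Node(n) for n in names]
        self.values_moved = 0  # counts cached values shipped to the caller

    def owner(self, key):
        return self.nodes[zlib.crc32(key.encode("utf-8")) % len(self.nodes)]

    def put(self, key, value):
        self.owner(key).local[key] = value

    def fetch_then_compute(self, key, job):
        """Non-colocated: pull the value to the caller, then process it there."""
        value = self.owner(key).local[key]
        self.values_moved += 1
        return job(value)

    def affinity_run(self, key, job):
        """Colocated: ship the job to the owning node; the value never moves."""
        return self.owner(key).run(job, key)


cluster = Cluster(["node-a", "node-b", "node-c"])
for i in range(100):
    cluster.put(f"readings:{i}", list(range(1000)))

total = sum(cluster.affinity_run(f"readings:{i}", sum) for i in range(100))
moved_colocated = cluster.values_moved

total_fetched = sum(cluster.fetch_then_compute(f"readings:{i}", sum) for i in range(100))
moved_fetched = cluster.values_moved - moved_colocated
```

Both approaches produce the same result, but only the fetch-based one moves cached values across the (simulated) network, once per key.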

GridGain In-Memory Data Grid

The GridGain in-memory data grid leverages the GridGain distributed caching capability and the compute grid to improve the performance of applications built on disk-based databases by up to 1,000x. Inserted between the application and data layers, GridGain maintains a copy of disk-based data from RDBMS, NoSQL or Hadoop databases in RAM and keeps the distributed cache and the underlying databases synchronized as new transactions are received and processed.

The GridGain in-memory data grid is horizontally scalable and supports adding nodes in real time. It can scale out linearly to thousands of nodes, with strong semantics for data locality and affinity-based data routing that reduce redundant data movement between nodes.
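One common way to keep an in-memory copy and an underlying database synchronized is a read-through and write-through cache. The sketch below shows that pattern in miniature. It is an assumption about the general technique rather than a description of GridGain internals, and `SlowStore` and `ReadWriteThroughCache` are hypothetical stand-ins for a disk-based database and the in-memory layer.

```python
class SlowStore:
    """Stand-in for a disk-based database; counts how often it is touched."""

    def __init__(self, rows=None):
        self.rows = dict(rows or {})
        self.reads = 0
        self.writes = 0

    def load(self, key):
        self.reads += 1
        return self.rows.get(key)

    def save(self, key, value):
        self.writes += 1
        self.rows[key] = value


class ReadWriteThroughCache:
    """In-memory layer that sits between the application and the store."""

    def __init__(self, store):
        self.store = store
        self.memory = {}

    def get(self, key):
        if key in self.memory:          # served from RAM, no store access
            return self.memory[key]
        value = self.store.load(key)    # read-through on a cache miss
        if value is not None:
            self.memory[key] = value
        return value

    def put(self, key, value):
        self.store.save(key, value)     # write-through keeps the store in sync
        self.memory[key] = value


store = SlowStore({"acct:1": 100})
cache = ReadWriteThroughCache(store)

first = cache.get("acct:1")    # miss: loaded from the store
second = cache.get("acct:1")   # hit: served from memory
cache.put("acct:1", 250)       # updates the store and the cache together
```

After the first read, repeated reads of the same key no longer touch the store, while every write reaches both the cache and the store.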

Speed and Scale with In-Memory Computing

The system accelerates compute times by moving data from disk into RAM and by using the GridGain compute grid to apply all of the cluster’s compute power to massively parallel processing with minimal or no data movement. New nodes can be added to the GridGain cluster at any time, so the system scales on demand. The automatic rebalancing feature redistributes the in-memory data between nodes when new nodes are added, while maintaining redundant copies of the data across cluster nodes to prevent downtime if a node fails.
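Rebalancing with redundant copies can be sketched as follows. The example uses rendezvous (highest-random-weight) hashing as an illustrative placement rule. The partition count, the number of copies, and the helper names `weight` and `assign` are assumptions made for the sketch and do not describe GridGain configuration.

```python
import hashlib

PARTITIONS = 256   # hypothetical partition count
COPIES = 2         # one primary plus one backup per partition


def weight(partition, node):
    """Stable pseudo-random score for a (partition, node) pair."""
    digest = hashlib.sha256(f"{partition}:{node}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def assign(nodes):
    """Rendezvous hashing: the top-scoring nodes hold each partition's copies."""
    table = {}
    for p in range(PARTITIONS):
        ranked = sorted(nodes, key=lambda n: weight(p, n), reverse=True)
        table[p] = ranked[:COPIES]   # [primary, backup]
    return table


before = assign(["node-a", "node-b", "node-c"])
after = assign(["node-a", "node-b", "node-c", "node-d"])

# Partitions whose primary changed when node-d joined the cluster.
moved = [p for p in range(PARTITIONS) if before[p][0] != after[p][0]]

# Every moved primary went to the new node; existing nodes did not trade data.
only_to_new_node = all(after[p][0] == "node-d" for p in moved)

# Simulate losing node-a: each partition still has a copy on a surviving node.
survivors = {p: [n for n in owners if n != "node-a"] for p, owners in after.items()}
no_data_lost = all(len(owners) >= 1 for owners in survivors.values())
```

With this placement rule, adding a node relocates only the partitions that the new node takes over, and keeping a backup copy on a second node means a single node failure leaves every partition available.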

In addition to the high-performance GridGain in-memory data grid, the GridGain in-memory computing platform includes powerful features such as an in-memory database, streaming analytics, and a continuous learning framework for real-time machine learning and deep learning.