The GridGain® in-memory computing platform is used for streaming analytics by companies worldwide to ingest, process, store and publish streaming data for large-scale, mission-critical business applications. It is used by some of the world's largest banks for trade processing, settlement and compliance; by telecommunications companies to deliver call services over telephone networks and the Internet; by retailers and ecommerce vendors to deliver an improved real-time experience; and by leading cloud infrastructure and SaaS vendors as the in-memory computing foundation of their offerings. Companies have ingested and processed streams of millions of events per second on a moderately sized cluster.

GridGain integrates with major streaming technologies, including Apache Camel, Apache Kafka, Apache Spark®, Apache Storm, Java Message Service (JMS) and MQTT, to ingest, process and publish streaming data. Once the data is loaded into the cluster, companies can leverage GridGain’s built-in massively parallel processing libraries for concurrent data processing, including concurrent SQL queries and machine and deep learning. Clients can then subscribe to continuous queries, which execute as streams are processed and identify important events.
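The ingest-then-notify pattern described above can be illustrated with a small, self-contained sketch. This is a toy, single-process model written in plain Python and is not the GridGain API: the names `ToyCache`, `ContinuousQuery` and `ingest` are hypothetical and exist only for illustration, and the trade amounts are made-up sample values. It shows the general idea that a subscriber registers a filter once and is then notified of each matching event as the stream is loaded.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, List, Tuple


@dataclass
class ContinuousQuery:
    """Hypothetical stand-in for a continuous query: a filter plus a callback."""
    predicate: Callable[[Any, Any], bool]   # decides whether an update is "important"
    listener: Callable[[Any, Any], None]    # invoked for every matching update


class ToyCache:
    """Hypothetical in-process key-value store; NOT a GridGain class."""

    def __init__(self) -> None:
        self._data: Dict[Any, Any] = {}
        self._queries: List[ContinuousQuery] = []

    def subscribe(self, query: ContinuousQuery) -> None:
        # Register the query; it will see every update made after this call.
        self._queries.append(query)

    def put(self, key: Any, value: Any) -> None:
        # Store the entry, then notify each subscriber whose filter matches.
        self._data[key] = value
        for query in self._queries:
            if query.predicate(key, value):
                query.listener(key, value)

    def get(self, key: Any) -> Any:
        return self._data.get(key)


def ingest(cache: ToyCache, events: Iterable[Tuple[Any, Any]]) -> int:
    """Load a stream of (key, value) events into the cache; return the count."""
    count = 0
    for key, value in events:
        cache.put(key, value)
        count += 1
    return count


# Usage: flag large trades as they arrive (sample data, assumed for illustration).
alerts: List[Tuple[str, int]] = []
cache = ToyCache()
cache.subscribe(ContinuousQuery(
    predicate=lambda trade_id, amount: amount > 1_000_000,
    listener=lambda trade_id, amount: alerts.append((trade_id, amount)),
))

stream = [("t1", 250_000), ("t2", 4_000_000), ("t3", 90_000), ("t4", 1_500_000)]
ingested = ingest(cache, stream)
print(ingested, alerts)
```

In a real deployment the store would be distributed across cluster nodes and the filter would typically run where the data lives; the sketch collapses all of that into one process to keep the control flow visible.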

Streaming Analytics

GridGain also provides the broadest in-memory computing integration with Apache Spark. The integration includes native support for Spark DataFrames, a GridGain RDD API for reading data from and writing data to GridGain as mutable Spark RDDs, optimized SQL, and an in-memory implementation of HDFS with the GridGain File System (GGFS). When the two are deployed together, Spark can:

  • Access all of the in-memory data in GridGain, not just data streams
  • Share data and state across all Spark jobs
  • Take advantage of all of GridGain’s in-memory loading and processing capabilities, including machine and deep learning, to train models in real time and improve outcomes for in-process HTAP applications