
GridGain ML Overview

What is GridGain ML?

The GridGain ML Inference Framework closes the gap between ML models and data by bringing inference directly into the database layer. Rather than moving data to your models, you deploy trained models to GridGain, where the data already resides, maximizing performance through data colocation and enabling real-time insights from operational data.

Key Concepts

ML Inference

The process of using a trained machine learning model to make predictions on new data.

Data Colocation

Running predictions where data lives, eliminating network transfer overhead.

Model Deployment

Distributing ML models across cluster nodes as deployment units.

Batch Processing

Efficiently processing multiple predictions in a single operation.

Benefits

Inference where data lives

Run predictions directly on database nodes to eliminate data transfer bottlenecks.

Support for diverse models

Use models from PyTorch, TensorFlow, and ONNX without conversion or retraining.

Integrated Infrastructure

Remove the need for separate ML serving infrastructure.

Scalable Inference

Distribute prediction workloads across the cluster via GridGain Compute.

Real-time decision making

Enable immediate insights from operational data.

Use Cases

Single Prediction

Run inference on individual data points for testing or real-time processing.

Example

Test a sentiment analysis model on a single product review to validate model behavior.
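As a conceptual sketch of this validation step: the `SentimentModel` class below is a toy stand-in, not the GridGain API. In practice the model would be a deployed PyTorch, TensorFlow, or ONNX artifact served by the cluster; the point is simply that a single call on one input lets you check model behavior before running at scale.

```python
# Toy stand-in for a deployed sentiment model (hypothetical class, not the
# GridGain client API). A real deployment would invoke a model artifact
# distributed to the cluster as a deployment unit.

class SentimentModel:
    """Keyword-based stub: counts positive vs. negative words."""
    POSITIVE = {"great", "excellent", "love"}
    NEGATIVE = {"poor", "broken", "terrible"}

    def predict(self, text: str) -> str:
        words = set(text.lower().split())
        score = len(words & self.POSITIVE) - len(words & self.NEGATIVE)
        return "positive" if score > 0 else "negative" if score < 0 else "neutral"

model = SentimentModel()
# Validate behavior on one review before batch or SQL-driven inference.
print(model.predict("Great product, I love it"))   # positive
```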

Batch Predictions

Process multiple inputs together in a single operation for improved throughput.

Example

Score 500 customer profiles simultaneously to generate recommendation lists.
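The throughput benefit comes from submitting all inputs in one operation instead of one call per profile. The sketch below uses a toy scorer (hypothetical, not the GridGain API) to show the shape of a batch call: a list of inputs in, a list of predictions out.

```python
# Conceptual batch-scoring sketch. score_batch is a stand-in for a deployed
# recommendation model; a single batch call amortizes per-call overhead
# compared with scoring each profile individually.

def score_batch(profiles: list[dict]) -> list[float]:
    """Toy scorer: processes every profile in one pass."""
    return [2 * p["purchases"] + p["visits"] for p in profiles]

profiles = [
    {"purchases": 4, "visits": 10},
    {"purchases": 0, "visits": 3},
]
scores = score_batch(profiles)   # one operation, many predictions
# Pick the highest-scoring profile for the recommendation list.
top = max(range(len(scores)), key=scores.__getitem__)
```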

SQL-Based Predictions

Execute model inference directly on query results from database tables.

Example

Run anomaly detection on sensor data selected by SQL WHERE clauses.
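A query of roughly this shape illustrates the idea; the `PREDICT` function name, model identifier, and table schema here are hypothetical placeholders, not GridGain's actual SQL syntax.

```sql
-- Hypothetical syntax: function and model names are placeholders.
-- Inference runs on the single selected column, within the row limit.
SELECT sensor_id,
       PREDICT('anomaly_model', reading) AS anomaly_score
FROM sensor_data
WHERE reading_time > CURRENT_TIMESTAMP - INTERVAL '1' HOUR
LIMIT 5000;
```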

Colocated Predictions

Process data on the specific cluster node where it’s stored, eliminating network transfer.

Example

Generate personalized recommendations using profile data on the node that owns the user's partition.
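The mechanics can be sketched as follows: a key is hashed to a partition, the partition maps to an owning node, and the inference job is sent to that node so the user's data never crosses the network. The hash and partition-to-node mapping below are toy illustrations, not GridGain's actual affinity function.

```python
# Conceptual colocation sketch (toy hashing, not GridGain's affinity function).
NODES = ["node-0", "node-1", "node-2"]
PARTITIONS = 12

def partition_for(key: str) -> int:
    # Toy hash: real systems use stronger, evenly distributed hashes.
    return sum(key.encode()) % PARTITIONS

def node_for(partition: int) -> str:
    # Toy partition-to-node assignment.
    return NODES[partition % len(NODES)]

user_key = "user-42"
owner = node_for(partition_for(user_key))
# The inference job is routed to `owner`, where the model reads local data.
```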

Cross-Framework Model Loading

Load and run models from different ML frameworks through a unified API.

Supported Frameworks

PyTorch, TensorFlow, ONNX.
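One common way a unified API over multiple frameworks is built is the adapter pattern: each framework gets a thin wrapper behind one `predict` interface, selected by model format. The sketch below is illustrative only; the class and function names are hypothetical and the adapters are stubs rather than real PyTorch/ONNX loaders.

```python
# Illustrative adapter-pattern sketch for cross-framework loading
# (hypothetical names, stub adapters - not the GridGain API).

class Model:
    def predict(self, x):
        raise NotImplementedError

class OnnxAdapter(Model):
    def __init__(self, path: str):
        self.path = path            # a real adapter would open an ONNX session
    def predict(self, x):
        return f"onnx({x})"

class TorchAdapter(Model):
    def __init__(self, path: str):
        self.path = path            # a real adapter would load a PyTorch model
    def predict(self, x):
        return f"torch({x})"

ADAPTERS = {".onnx": OnnxAdapter, ".pt": TorchAdapter}

def load_model(path: str) -> Model:
    """Pick an adapter by file extension; callers see one predict() API."""
    for suffix, adapter in ADAPTERS.items():
        if path.endswith(suffix):
            return adapter(path)
    raise ValueError(f"unsupported model format: {path}")
```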

Limitations

The framework currently has the following limitations:

  • SQL Predictions: Inference on a single column only, with a maximum of 5,000 rows per query.

  • Colocated Predictions: Key-based inference on a single value column only.

  • Model Size: Large models may increase memory usage across the cluster.