Product
Overview
We have a pretty simple formula for our product :)
GridGain = Java * Scala + Compute grid + Data grid + Cloud auto-scaling
GridGain is the first 100% integrated distributed software middleware that combines compute and data
grid technology with unique auto-scaling capabilities on any managed infrastructure - from a
single laptop to a large hybrid cloud consisting of thousands of nodes. GridGain provides
native support for both Java and Scala programming languages.
Using GridGain you can quickly build distributed applications in Java or Scala that work
natively in a grid or cloud environment:
- scale up or down based on demand
- cache distributed data in a data grid
- speed up long-running tasks using MapReduce
GridGain can work standalone but also integrates natively with many projects in the
Java and Scala ecosystem, such as servlet containers, JEE servers, DI and web frameworks, JMS middleware, etc.
GridGain 3.0 comes in two editions:
- Community Edition, open source and free.
- Enterprise Edition, many additional features.
Download GridGain 3.0 White Paper
Key Features
GridGain 3.0 represents more than 20 months of cumulative research and development performed by
GridGain Systems.
Although the product effectively doubled in size from GridGain 2.0 and most of its APIs have been
significantly enhanced, it is largely backward compatible, and in the majority of cases migration
from the previous version requires only a simple recompilation. Such longevity of the design is a
sound testament to GridGain's core design principles.
Some of the key features of GridGain 3.0 are:
Functional Programming
GridGain 3.0 is the industry's first JVM-based post-functional distributed middleware that
combines the traditional object-oriented programming approach with comprehensive functional
programming support. The result is a set of powerful, flexible and highly expressive APIs that
are simple and productive to use.
To support functional programming, GridGain 3.0 introduces two new features:
- A Java-based functional programming framework, built from the ground up in GridGain 3.0,
providing its users with the most comprehensive functional programming capabilities
for the Java programming language.
- Scalar - a Scala-based internal Domain Specific Language (DSL) built on top of the
GridGain 3.0 Java-based functional core, giving Scala developers native access to
GridGain functionality.
Native Java & Scala Support
With support for functional programming in Java and the availability of an internal DSL for Scala,
GridGain 3.0 becomes the only distributed middleware that is native to both the Java and Scala
programming languages.
The new Java and Scala functional APIs (in addition to the object-oriented and AOP-based ones) enable
developers to significantly simplify many commonly used distributed operations while gaining
readability and expressiveness in the business logic built on top of GridGain 3.0.
Consider the following example of a method that calculates the number of non-whitespace
characters in a string.
In Java:
public int length(String msg) {
    return msg.replaceAll("\\s", "").length();
}
In Scala:
def length(msg: String) = msg.replaceAll("\\s", "").length
In order to grid-enable it (for the sake of this example) we would split the input string into
substrings separated by whitespace, distribute the calculation of the length of each substring to
the remote nodes, and finally aggregate all sub-lengths from the remote nodes into the final
non-whitespace length of the input string.
Here is how this could be coded in Java using the GridGain 2.0 AOP-based approach:
@Gridify(taskClass = Task.class)
public int length(String msg) {
    return msg.replaceAll("\\s", "").length();
}

class Task extends GridifyTaskSplitAdapter<Integer> {
    @Override
    protected Collection<? extends GridJob> split(int gridSize,
        GridifyArgument arg) throws GridException {
        String msg = (String)arg.getMethodParameters()[0];
        // Split on whitespace ("\\s"), so that each job receives one word.
        String[] words = msg.split("\\s", 0);
        Collection<GridJob> jobs = new ArrayList<GridJob>();
        for (final String word : words) {
            // One job per word; each job returns the length of its word.
            jobs.add(new GridJobAdapter() {
                @Override
                public Object execute() throws GridException {
                    return word.length();
                }
            });
        }
        return jobs;
    }

    @Override
    public Integer reduce(List<GridJobResult> results) throws GridException {
        // Sum the per-word lengths returned by the remote jobs.
        int length = 0;
        for (GridJobResult res : results) {
            length += (Integer)res.getData();
        }
        return length;
    }
}
The code above creates a grid task that "knows" how to split the work of the original method
and aggregate its results. The method is marked with the @Gridify annotation, which links it
to its task.
The boilerplate code for such a task, however, can in most cases be eliminated since the actual
logic consists of just two simple functions:
- one that produces a collection of closures based on the input string,
- and a second one that sums up a collection of integers.
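To make those two functions concrete, here is a minimal plain-Java sketch that uses no GridGain classes at all. A local thread pool stands in for remote nodes, and the names WordLengthSketch, toJobs and sum are hypothetical, chosen only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Plain-Java illustration only: a local thread pool stands in for remote grid nodes.
final class WordLengthSketch {
    // Function 1: produce one closure per whitespace-separated substring.
    static List<Callable<Integer>> toJobs(String msg) {
        List<Callable<Integer>> jobs = new ArrayList<>();
        for (final String word : msg.split("\\s", 0)) {
            jobs.add(() -> word.length());
        }
        return jobs;
    }

    // Function 2: sum up a collection of integers.
    static int sum(List<Integer> parts) {
        int total = 0;
        for (int part : parts) {
            total += part;
        }
        return total;
    }

    // Runs the closures in parallel and reduces their results to one value.
    static int length(String msg) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Integer> parts = new ArrayList<>();
            for (Future<Integer> result : pool.invokeAll(toJobs(msg))) {
                parts.add(result.get());
            }
            return sum(parts);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("word-length job failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(length("hello grid world")); // prints 14
    }
}
```

Everything besides toJobs and sum is plumbing, which is the part a grid framework would be expected to take over.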
With GridGain 3.0 this problem can be solved more effectively using the functional APIs.
Instead of using AOP, we simply modify the original code to work in a distributed fashion.
This usage paradigm is central to the new GridGain 3.0 functional design.
In Java (using GridGain 3.0 functional APIs):
public int length(String msg) {
    return G.grid().reduce(SPREAD, F.yield(msg.split("\\s", 0),
        F.<String, Integer>c1("length")), F.sumIntReducer());
}
In Scala (using GridGain 3.0 and Scalar DSL):
def length(msg: String) = grid !!< (
    for (w <- msg.split("\\s", 0)) yield () => w.length(),
    (s: Seq[Int]) => (0 /: s)(_+_)
)
As seen in these examples, the simplification is dramatic. And it is not only cosmetic -
it highlights the actual business logic, leaving the distribution plumbing completely
behind the scenes.
Another important point is that distribution logic is brought to the language level in
Java, and even more so in Scala. In Java, closures and typedefs significantly reduce the
boilerplate code, and in Scala the distributed operations are brought all the way to the
operator-syntax level via the internal Scalar DSL, making complex distributed cloud operations
no different syntactically or semantically from standard Scala code.
100% Integrated Platform
GridGain = Compute + Data + Cloud
GridGain 3.0 is the first version of GridGain software that delivers on its original vision:
to provide highly integrated, cohesive and easy-to-use distributed middleware that combines
a computational grid, a data grid and auto-scaling on any managed infrastructure.
It is also the first distributed middleware of any kind to provide such a fully integrated platform
for developing applications that work and scale natively on any managed infrastructure. The
benefits of such integration are plentiful for end users: from a dramatically shortened
learning curve and development cycle to unique technical capabilities not found in any
other product, such as system-wide zero deployment, full functional APIs, unified configuration and
management, etc.
Advanced Data Grid
The GridGain 3.0 Data Grid subsystem is fully integrated into the core of GridGain and is built on
top of existing functionality such as pluggable auto-discovery, communication and marshaling,
peer-to-peer on-demand class loading, and support for functional programming. Among its key
features are:
- Expiration policies (LFU, LRU, time-based)
- Named caches
- Read-through and write-through logic with pluggable cache store
- Synchronous and asynchronous cache operations
- Pluggable data overflow storage via new swap space SPI
- Pessimistic, optimistic and eventually consistent transactions
- JTA/JCA integration
- Data replication and data invalidation in synchronous and asynchronous modes
- Partitioned cache with active replicas
- Advanced distributed query capability including SQL based, Lucene based, H2 text
based and predicate based scanning with support for pagination, local and remote
filtering, transformation and reduction
- Full integration with the compute grid for non-trivial affinity-based routing
- Functional and object-oriented APIs
GridGain 3.0 is also the first data grid featuring zero-deployment capability: users simply
bring default GridGain nodes online, and those nodes immediately become part of the data grid
topology and can store any user objects without explicit deployment of the user's classes.
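Two of the features listed above, LRU expiration and read-through/write-through with a pluggable store, can be illustrated with a small single-node sketch in plain Java. This is not the GridGain cache API: SketchStore, InMemorySketchStore and ReadThroughLruCache are hypothetical names, and a real data grid would add distribution, transactions and the other listed capabilities.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical store interface for this sketch; not the GridGain cache store API.
interface SketchStore<K, V> {
    V load(K key);
    void save(K key, V value);
}

// In-memory store used only to exercise the sketch; it counts how often load() is called.
final class InMemorySketchStore<K, V> implements SketchStore<K, V> {
    final Map<K, V> data = new HashMap<>();
    int loads;

    public V load(K key) {
        loads++;
        return data.get(key);
    }

    public void save(K key, V value) {
        data.put(key, value);
    }
}

// Local illustration of LRU eviction combined with read-through and write-through.
final class ReadThroughLruCache<K, V> {
    private final SketchStore<K, V> store;
    private final Map<K, V> entries;

    ReadThroughLruCache(SketchStore<K, V> store, final int maxEntries) {
        this.store = store;
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // Read-through: on a miss, load from the store and keep the value in the cache.
    synchronized V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            value = store.load(key);
            if (value != null) {
                entries.put(key, value);
            }
        }
        return value;
    }

    // Write-through: update the store first, then the cache.
    synchronized void put(K key, V value) {
        store.save(key, value);
        entries.put(key, value);
    }

    // Reports whether the key is currently held in memory (does not touch the LRU order).
    synchronized boolean isCached(K key) {
        return entries.containsKey(key);
    }
}
```

Because the cache talks to the store only through the interface, the backing storage can be replaced without changing the cache access logic.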
Advanced Compute Grid
The MapReduce paradigm is at the core of GridGain's industry-leading compute grid technology. It
defines the process of splitting the original computational task into multiple sub-tasks, executing
these sub-tasks in parallel on any managed infrastructure, and aggregating (reducing) the results
back into one final result.
GridGain provides the most advanced implementation of the MapReduce paradigm, with the
following features:
- Direct API support for split and aggregation
- Pluggable failover and topology resolution
- Distributed task session
- AOP, OOP-based, FP-based, synch/asynch execution models
- Cron-based scheduling
- Redundant mapping
- Zero deployment with peer-to-peer class loading
- Partial asynchronous reduction
- Support for weighted and adaptive split
- Checkpoints for long running tasks
- Early and late load balancing
- Affinity routing with data grids
Seamless Cloud Enabling
GridGain is the only grid computing platform that allows you to grid-enable existing code without
any modification. This is achieved by using Java annotations and AOP-style crosscutting, effectively
forming a grid-enabling DSL. In some cases, for example with JBoss, you may not even need
the source code to grid-enable it, as annotations and AOP crosscutting can be applied via
an external file.
GridGain Shell
GridGain 3.0 Enterprise Edition introduces GridGain Shell - a pluggable and scriptable command
line management and monitoring tool. Some of its key features are:
- Allows scripting of various operations on a GridGain deployment
- Interactive and command modes
- Fully extensible via user defined pluggable commands
- Seamless connectivity to the running GridGain deployment
- Some of the available out-of-the-box commands:
- Review and monitor topology
- Execute grid tasks
- Monitor status
- Get statistics
- Query data grid
- Query distributed events
Pay-Per-Usage Pricing Model
GridGain 3.0 introduces a new pay-per-usage pricing model for its Enterprise Edition.
GridGain 3.0 becomes the first distributed software middleware working on any managed
infrastructure - from a single laptop to thousands of nodes in the cloud - to offer a single,
unified pay-per-usage pricing model.
GridGain 3.0 also incorporates a unique built-in idle-detection technology that prevents
charging for an active and running node if it has been idling for more than an hour. Idling is
defined as not performing any user operations and not storing any user data as part of the data grid.
This technology allows GridGain customers to safely over-provision yet pay only for actual use.
This removes the cost penalty for frequent over-provisioning scenarios such as planned
over-capacity for anticipated load spikes, staging and QA environments, hot standbys, scheduled builds,
disaster recovery sites, etc.
Hybrid Cloud Support
To go along with the cloud enhancements, GridGain 3.0 Enterprise Edition features two SPI
implementations for discovery and communication, both built from the ground up. These new
implementations are designed specifically to work in a large hybrid cloud environment with
one-directional connections and through-cloud routing capabilities.
These new discovery and communication implementations support the following non-trivial hybrid deployments:
- Multiple private/public clouds with no direct connectivity
- Multiple private/public clouds with out-connectivity or in-connectivity only
- Geographically distributed hybrid cloud
- Single cloud with WAN/LAN/VPN connectivity between nodes
API-Level Cloud Control
GridGain 3.0 also introduces advanced API-level control of cloud operations. Anchored by a new
cloud SPI with three out-of-the-box implementations for EC2, RackSpace and in-memory cloud, it enables
developers to control any managed infrastructure right from the code of their applications, largely
removing the need for third-party cloud management solutions to perform such operations as:
- Starting, stopping and managing virtual instances
- Querying cloud resources such as images, storage devices, network quotas, etc.
- Changing virtual instance profile (where supported)
In hybrid cloud deployments, GridGain 3.0 provides a transparent topology and a unified view, with a
single API, of the entire virtual cloud comprised of any number of physical cloud providers.
This unification greatly simplifies the auto-scaling of business applications built
with GridGain.
GridGain 3.0 also provides cloud strategies and policies that enable automated
elasticity as well as automated SLA/QoS control capabilities. Both are completely pluggable
and are managed by GridGain runtime.
SPI-Based Architecture
A Service Provider Interface (SPI)-based architecture is at the core of the configuration and
customization capabilities of GridGain. GridGain exposes all major functional areas of
its infrastructure via SPIs, allowing developers to customize practically every aspect of
GridGain functionality, from communication between nodes, auto-discovery and topology management
to deployment, load balancing and collision resolution.
Unified configuration of SPIs enables a LEGO-like approach to assembling your GridGain framework with
a specific set of SPI implementations. At the same time, grid task implementations (the actual business
logic that is running on the grid) and data grid access logic are unaffected by the different SPIs
running underneath.
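The general idea of swapping implementations underneath unchanged business logic can be sketched in plain Java. The interface and class names below (NodePickerSketchSpi, RoundRobinPicker, FirstNodePicker, SketchTask) are hypothetical and are not GridGain's actual SPI interfaces; the sketch only shows the shape of the pattern.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical SPI for this sketch; it is not one of GridGain's actual SPI interfaces.
interface NodePickerSketchSpi {
    String pickNode(List<String> nodes, int jobIndex);
}

// Implementation 1: spread jobs evenly across all nodes.
final class RoundRobinPicker implements NodePickerSketchSpi {
    public String pickNode(List<String> nodes, int jobIndex) {
        return nodes.get(jobIndex % nodes.size());
    }
}

// Implementation 2: always use the first node (for example, for local debugging).
final class FirstNodePicker implements NodePickerSketchSpi {
    public String pickNode(List<String> nodes, int jobIndex) {
        return nodes.get(0);
    }
}

// The task logic depends only on the interface, so swapping implementations does not touch it.
final class SketchTask {
    static List<String> map(NodePickerSketchSpi picker, List<String> nodes, int jobCount) {
        List<String> assignment = new ArrayList<>();
        for (int i = 0; i < jobCount; i++) {
            assignment.add(picker.pickNode(nodes, i));
        }
        return assignment;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("node-a", "node-b");
        System.out.println(map(new RoundRobinPicker(), nodes, 3)); // [node-a, node-b, node-a]
        System.out.println(map(new FirstNodePicker(), nodes, 3));  // [node-a, node-a, node-a]
    }
}
```

Which implementation is passed in would normally be a configuration decision, which is what keeps the task code stable.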
Zero Deployment
GridGain 3.0 is also the first fully integrated cloud platform providing zero-deployment
capability, where all necessary classes and resources are loaded on demand. GridGain further
provides 4 different modes of peer-to-peer deployment, supporting the most complex deployment
environments such as custom class loaders, WAR/EAR files, etc.
Zero deployment technology lets users simply bring default GridGain nodes online; those nodes
immediately become part of the data and compute grid topology and can store any user
objects or perform any user tasks without explicit deployment of the user's classes
or resources.
Advanced Load Balancing
GridGain provides both early and late load balancing, defined by the load balancing and
collision (scheduling) resolution SPIs respectively - effectively enabling full customization of the
entire load balancing process. Early and late load balancing allow the grid task execution to adapt
to the non-deterministic nature of execution on the grid.
A grid environment is often heterogeneous and non-static: tasks can change their complexity
profiles dynamically at runtime, and external resources can affect the execution of a task at any
point. All these factors underscore the need for proactive load balancing during the initial mapping
operation as well as on the destination node, where jobs can be in waiting queues.
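The difference between the two stages can be sketched in plain Java. The selection rules used here (lowest reported load at mapping time, highest priority first on the destination node) are assumptions made for illustration, and LoadBalancingSketch and WaitingJob are hypothetical names rather than GridGain classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

final class LoadBalancingSketch {
    // A job sitting in the waiting queue of a destination node.
    static final class WaitingJob {
        final String name;
        final int priority;

        WaitingJob(String name, int priority) {
            this.name = name;
            this.priority = priority;
        }
    }

    // Early load balancing: at mapping time, pick the node with the lowest reported load.
    static String pickLeastLoaded(Map<String, Double> loadByNode) {
        String best = null;
        double bestLoad = Double.MAX_VALUE;
        for (Map.Entry<String, Double> entry : loadByNode.entrySet()) {
            if (entry.getValue() < bestLoad) {
                best = entry.getKey();
                bestLoad = entry.getValue();
            }
        }
        return best;
    }

    // Late load balancing: on the destination node, decide which waiting jobs may start now.
    // Here: highest priority first, and at most freeSlots jobs are activated.
    static List<String> activate(List<WaitingJob> waiting, int freeSlots) {
        PriorityQueue<WaitingJob> queue = new PriorityQueue<>(
            Comparator.comparingInt((WaitingJob job) -> job.priority).reversed());
        queue.addAll(waiting);
        List<String> started = new ArrayList<>();
        while (started.size() < freeSlots && !queue.isEmpty()) {
            started.add(queue.poll().name);
        }
        return started;
    }
}
```

The first method runs once per job when the task is mapped, while the second runs on the receiving node each time its queue changes, so it can react to conditions that were unknown at mapping time.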
Pluggable Fault Tolerance
Failover management and the resulting fault tolerance are key properties of any grid computing
infrastructure. Based on its SPI-based architecture, GridGain provides fully pluggable failover
logic with several popular implementations available out-of-the-box. Unlike other grid computing
frameworks, GridGain allows you to fail over the logic and not only the data.
With the grid task being the atomic unit of execution on the grid, the fully customizable failover
logic enables developers to choose a specific policy much the same way as one would choose a
concurrency policy in RDBMS transactions.
This allows you to fine-tune how a grid task reacts to failure, for example:
- fail the entire task immediately upon failure of any of its jobs (fail-fast approach)
- fail over any failed job to other nodes until all nodes are exhausted for this job
(fail-slow approach)
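The two approaches above can be sketched as a single retry loop in plain Java. FailoverSketch and its Policy enum are hypothetical names for this illustration, and the job is modeled as a function from a node name to a result that throws when the node fails; none of this is GridGain's failover SPI.

```java
import java.util.List;
import java.util.function.Function;

final class FailoverSketch {
    enum Policy { FAIL_FAST, FAIL_SLOW }

    // Runs one job under the given policy. The job receives a node name and either
    // returns a result or throws a RuntimeException to signal failure on that node.
    static <R> R run(Function<String, R> job, List<String> nodes, Policy policy) {
        RuntimeException last = null;
        for (String node : nodes) {
            try {
                return job.apply(node);
            } catch (RuntimeException e) {
                last = e;
                if (policy == Policy.FAIL_FAST) {
                    break; // fail-fast: give up after the first failure
                }
                // fail-slow: fall through and try the next node
            }
        }
        throw new IllegalStateException("job failed under policy " + policy, last);
    }
}
```

With a job that fails on the first node only, FAIL_SLOW returns the result from the second node, while FAIL_FAST surfaces the failure immediately.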
REST APIs for GridGain Access
GridGain 3.0 introduces a new SPI that allows access to GridGain from outside a
GridGain deployment. The default implementation uses a built-in Jetty container and a REST-style
API supporting XML and JSON data formats.
This addition allows non-Java environments, such as JavaScript running in a browser or Flex
applications, to access GridGain functionality.
Management and Monitoring
GridGain comes with an extensive collection of JMX MBeans that expose all major monitoring and
statistical information about all nodes in the grid. This information is available via a programmatic
interface as well as through any JMX-compliant Web or standalone GUI viewer such as VisualVM.
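As a rough idea of what programmatic access looks like, the sketch below lists MBeans registered in the local JVM using only the standard JMX API. The JMX domain and bean names that GridGain registers are not given here, so the example queries the always-present java.lang domain; JmxListingSketch and listMBeans are hypothetical names.

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import java.util.TreeSet;
import javax.management.MBeanServer;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

final class JmxListingSketch {
    // Returns the canonical names of all MBeans in this JVM whose domain matches the pattern.
    static Set<String> listMBeans(String domainPattern) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            Set<String> names = new TreeSet<>();
            for (ObjectName name : server.queryNames(new ObjectName(domainPattern + ":*"), null)) {
                names.add(name.getCanonicalName());
            }
            return names;
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("bad domain pattern: " + domainPattern, e);
        }
    }

    public static void main(String[] args) {
        // "java.lang" is the JVM's own domain; a grid node would register beans under
        // its own domain, which is an assumption here since the name is not documented above.
        for (String name : listMBeans("java.lang")) {
            System.out.println(name);
        }
    }
}
```

The same query run inside a JVM that hosts a grid node, with that node's domain substituted, would be expected to list the node's monitoring beans.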
GridGain also comes pre-integrated with NewRelic RPM
technology for advanced SaaS-based monitoring capabilities.