http://www.gridgain.com/images/gg_main_logo.jpg

GridGain is a software middleware that enables development of high performance compute and data intensive distributed applications for real-time Big Data processing.

This book provides an in-depth knowledge on how to use GridGain software.

About this Book

This book is a current work in progress by GridGain team. Chapters are written often not in a logical order and until this book is finished we apologize for this inconvenience.

1. Introduction

http://www.gridgain.com/images/faces.gif

First of all, the whole GridGain team thanks you for picking up this book and devoting your time to know more about GridGain project. Although this book is being written primarily by Nikita Ivanov and Dmitri Setrakyan - all of us here at GridGain are pitching in with reviews, tests, proof-reading, examples and creative ideas.

GridGain project has been an amazing journey for all of us to this point and as you read these lines we are continuing our work on adding new features and improving existing ones, fixing bugs (happens to the best of us) and keep thinking on how to make GridGain more enjoyable and productive to use.

GridGain open source project started in the spring of 2005 with just Nikita and Dmitriy working on it in their spare time. We’ve managed to get our first official release out in the summer of 2007. In three years since then - in the late 2010 (when this introduction is written) - our software is now starting every 10 seconds around the globe and undoubtedly is one of the most popular distributed programing frameworks in JVM ecosystem - so we must be doing something right.

This is even more exciting for us since GridGain Systems is an engineering company first and foremost. We remain small as we believe in small "surgical" teams and every member of our company still writes code (some of us less so as we need to travel, speak and write books like this one - which we enjoy greatly). We’ve visited more than 50 conferences and Java Use Groups around the globe in the last 3 years to talk about GridGain - and we are grateful to each and everyone of you who came up to our talks. That was the only "marketing" that we could afford but hour and a half was usually enough to convince folks to try GridGain out. In fact, where else you could see a full-fledged MapReduce application running on multiple nodes written from scratch in front of your eyes in less than 5 minutes?

We want to thank you again and we hope that you’ll find this book useful and effective guide for discovering GridGain.

1.1. What is this book about?

This book is about how to use GridGain software to develop innovative distributed compute and data intensive applications that run on any managed infrastructure - from a simple laptop on which this book was written to a large grids and all types of clouds. When developing with GridGain and reading this book you can use both Java, Scala or Groovy programming languages. At the time of this writing - GridGain was the only distributed computing middleware with native Scala support.

This book does not replace API manual references, and you would still need to consult them from time to time for up to date method signatures, parameter description, etc. As we are always saying in our project - documentation is the code too and we pay great deal of attention to API References (Javadoc, Scaladoc and Groovydoc) - in fact we have one of the best organized and maintained code level documentation among any relevant projects.

This book is short considering its subject - and it is on purpose. We strongly believe that one of the main reason for slow adoption of grid and cloud computing in the last decade was over-complication and unnecessary "dramatization" of entire subject. In fact, the original idea behind GridGain came from rather unproductive experience developing application with Globus toolkit - innovative piece of software for early 90s but awfully over-engineered and out of place by the turn of the century with rapid advancements in server-side JVM-based programming.

Note We’ve designed this book in a way that you can take a first pass over it during the weekend and have a pretty good grasp on all major APIs and concepts. As you continue to work with GridGain you’ll be coming back to specific chapters for more details and to refresh on some of the less obvious parts. This is a perfectly normal way to read this book and we highly encourage it.

As you discover in this book the distributed computing is not necessarily complex or unwieldy as it may seem from the outside - in fact, most of its concepts are readily familiar for the most of you. Where it was usually getting complicated in the past is the tooling and framework support. That was a sore state of the affairs for a long time - most of engineering community just accepted that the distributed programming (grid and cloud computing included) just has to be "involved" because…. it is so, right?

In reality - it’s largely inaccurate. With the right tooling and framework support the distributed computing can be relatively simple and very productive. One of the main ideas in this book is to try to convince you that it is so. Every month or so we at GridGain receive email or a forum post where someone relates his or her experience of downloading GridGain during the day and by the midnight having the first MapReduce application running on Amazon EC2. By the time you finish this book it won’t seem like an exaggeration and you will have your first application running on EC2 much, much quicker.

This book covers GridGain starting with version 3.0 and provides in-depth manual for all three main technologies that are tightly integrated into GridGain cloud application platform (as of October 2010 GridGain is the only distribute middleware that provides all three technologies in the same platform - let alone for Java, Scala and Groovy languages):

  • computational grids (a.k.a. MapReduce)

  • data grid (a.k.a. distributed caching)

  • zero deployment with auto-scaling (a.k.a. elasticity on clouds)

Each of these main topics is covered in depth with plenty of examples in both Java, Scala and Groovy.

All in all, this book’s character perfectly reflects on what we think modern distributed programming should be - simple, effective and amazingly productive.

That is if you use GridGain…

1.2. About Authors…

http://www.gridgain.com/images/nivanov_dsetrakyan.png

This book is written primarily by Nikita Ivanov and Dmitriy Setrakyan. As always we encourage to contact us directly should you have any questions about this book or about GridGain. You can reach Nikita at nivanov@gridgain.com and Dmitriy at dsetrakyan@gridgain.com.

Both Nikita and Dmitriy have over 30 years of combined experience in distributed programming mostly in Java (some MPI C/C++ back in 90s - ouch) and now Scala. Nikita and Dmitriy were at the beginning of the project and to this day coordinate and lead most of the development work.

They both write code almost every day for GridGain despite heavy travel and frequent speaking engagements. And when they don’t write code - you’ll find them rooting for San Jose Sharks - the favorite hockey team of GridGain.

There are many way to stay in touch with GridGain project:

2. Overview

In this chapter we’ll lay down some of the basic ideas about grid and cloud computing and how GridGain fits into it. The goal of this chapter is to make sure we are on the same page with you, the reader, as far as fundamentals of grid and cloud computing (and you won’t believe how far apart we can be on these…)

2.1. What is GridGain?

In a nutshell - GridGain is a JVM-based middleware software that enables the development of compute and data intensive High Performance Distributed Applications. Applications developed with GridGain can scale up on any infrastructure - from a single Android device to a large cloud.

GridGain provides two major areas of functionality:

  • Compute Grids

  • In-Memory Data Grids

What is GridGain?

High Performance Cloud Computing

or

GridGain = (Java + Scala + Groovy) * (Compute Grid + In-Memory Data Grid)

On top of that it provides the multitude of surrounding technologies many of which are frequently used by our clients on their own.

With GridGain your applications can:

  • Work in a zero-deployment mode.

  • Scale up or down based on demand.

  • Cache distributed data in data grid.

  • Co-locate data and computations.

  • Run sql queries against cached data.

  • Store and query JSON objects.

  • Speed up task using MapReduce processing.

  • Use distributed thread pools.

  • Distribute the workload on the grid.

  • Use distributed queues and atomics.

  • Effectively exchange messages.

  • Auto-discover all grid resources.

  • Execute closures on the grid.

  • Grid-enable Java, Groovy and Scala code.

  • … and much more

2.2. Why Compute and In-Memory Data Grid?

Compute and In-Memory Data Grid act as two main axiomatic technologies for a modern distributed programming. They are fundamental because they solve two underlying problems faced by any distributed system:

  • distribution of computations

  • distribution of data

I always like to provide this analogy: every computing device - from Turing machine to the latest iPod - contains memory and a processing unit. Think about it… memory and the processing form the foundation of our computing capabilities. And so is the ability to distribute computations and data form the foundation of distributed programming.

Axiomatic Technologies

Compute and In-Memory Data Grid technologies are fundamental to any distributed systems as they address two underlying principles we use to gain scalability in the distributed context:

  • parallelization of computations

  • parallelization of data storage & access

And just like in late 1960s we’ve had first "system on the chip" where memory and processing units were finally integrated and combined on the same chip providing for cheaper, more energy efficient and much faster overall systems - GridGain has pioneered integrated middleware that combines compute and in-memory data grids in one cohesive and integrated distributed middleware software. This has resulted in similar benefits of simplified programming model, easier applicability and unified configuration and management.

Compute and in-memory data grids are the key topics in this book and we’ll talk a lot more about these two in the following chapters.

2.3. Why High Performance and Cloud Computing?

You noticed in the previous chapter that we call GridGain as a software middleware for developing High Performance Cloud Computing application. But why we focus on High Performance and what do we mean by that? And what does it have to do with Cloud Computing?

Important
GridGain Philosophy
The term High Performance Cloud Computing really came in the 2010 after almost 5 years of GridGain development. We believe it reflects perfectly the design goals that we originally had. Many, if not all, of GridGain’s features, designs and approaches stem from these goals.

Let’s talk about Cloud Computing first.

2.3.1. Cloud Computing

Despite all the buzz about cloud computing we believe strongly that from the software development perspective the cloud computing is almost synonymous with a traditional distributed programming.

In fact, that “almost” above accounts simply for a fact that unlike the traditional data centers, grids and clusters of the last decade clouds offer more fine grained resources virtualization and more management options. In a nutshell - your application development principles remain largely the same - but you have more options and more choices in how your application is deployed and how it utilizes available computing resources. Rest of the parallel distributed programming challenges of the last 25 years remain fully intact.

Note
10 Years Ago…
While ten years ago the most you should have accounted for was a new server coming up in your local grid - today you need to be prepared for not just a new server but an extra CPUs or extra disk storage or extra RAM appearing for your application that is snapshot and migrated potentially half way across the globe (just look at RackSpace Flavors, for example).

Still, the absolutely majority of the problems and challenges you are facing today while developing distributed software systems coalesce around parallelization of computing and data high availability in the distributed context - both are which are absolutely critical for any scalable distributed software system.

So, when we say Cloud Computing we mean Distributed Computing plus a few new important details.

Important
When We Say "Cloud Computing".
Cloud Computing = Distributed Computing + Data Center Virtualization

2.3.2. High Performance (HP)

High Performance aspect is equally interesting. When we present about GridGain on the conferences we inevitably get asked about this… why High Performance and Cloud in the same sentence?

The answer is very simple: not every cloud application needs to be high performance. In fact, most of today’s cloud applications (i.e. the application that are deployed on the clouds) are not high performance.

Note
Cloud Applications
Most of today cloud applications (i.e. application deployed in the clouds) are not high performance.

We use the term High Performance to specifically categorize applications that use distribution as means for processing parallelization, i.e. to achieve the scalability and/or performance that is theoretically unattainable on a single processing unit.

On the other hand, distributed applications that are not High Performance use cloud deployment as more convenient or economical deployment option without much need for improved scalability or performance.

Note
High Performance vs. Not High Performance

The distinction is very important:

  • HP applications use (cloud) distribution to achieve the scalability and/or performance that is theoretically unattainable on a single processing unit.

  • Non-HP applications use (cloud) distribution as more convenient or economical deployment option.

Naturally, some HP applications use cloud deployment because of convenience and economy as well.

So, tieing it all together GridGain is a High Performance Cloud Application platform because:

  1. It is designed to dramatically simplify the development of distributed software systems, including those that are deployed in the clouds

  2. It is designed to provide extreme scalability and performance for the software systems in the distributed context

2.3.3. Real-Time Cloud Applications

For another take on High Performance Cloud Application look at this blog entry I wrote in the middle of 2011:

Real-Time - A New Era of Cloud Applications

There’s a significant shift that has been happening in the last 12 months for many, if not all, BigData and BigCompute cloud applications - a shift to real-time processing.

Note This shift is nothing short of tectonic change and it is disrupting many software design approaches that are utilized today.

Now, when we talk about real-time processing we, of course, mean a near real-time (nR/T) since nothing can be really real-time in JVM world. Essentially, anything that can be processed within a reasonable user response time expectation (typically no longer than a coupe of seconds) can be considered a real-time for enterprise applications.

…Many analysts first got a hunch of this change when Google decided to drop a batch-oriented MapReduce design towards more real-time approach in their search implementations with what they call a Streaming MapReduce. Facebook followed earlier this year with dumping Hadoop-like processing in favor of different design that would have finally allowed them to tackle real-time performance.

Now, why all the fuss?

Fundamentally, the answer is pretty simple. First, just look around at devices and services you use every day: your TV, your iPhone or Android, Google or Bing, Facebook and Twitter, eBay and Amazon… Apart from slow internet connections what was the last time you needed to wait for 10 or even 5 seconds to get your result?

Your TV switches program instantly, Google and Bing return search results within a few seconds at most, almost all of the apps on iPhone and Android work in real-time (or it seems so), eBay processes your bids seemingly in real-time, and Amazon can put suggestions for you to purchase instantly. So as everyday users of these devices and services we are accustomed to instant response or… a real-time capabilities of these services.

However, when we apply the same expectation to today’s enterprise and business applications the picture is very different. And while delays in consumer devices and services lead to mostly frustrations - the delays in business applications often lead to broken business processes and significant revenue loss. Just a few real-life examples we at GridGain have witnessed:

  • In insurance industry many complex products cannot be currently priced or quoted on the spot (i.e. while having customer on the phone) because they require compute and data intensive processing and are usually done overnight. Sales reps have to hang up on customer and promise him to call back with the numbers next day (or worse - send in a letter).

    Up to 30% of customers lost due to this awkward process.

  • In investment banks and hedge funds automated or algorithmic trading is often done on models that are regenerated overnight or even less frequently - typically as part of pre-trade activity. Options and futures are prime examples… If market conditions change beyond the model’s parameters from pre-trade the auto-trading may be stopped all together since models are no longer valid - hence the loss of the revenue. What’s even worse is that less than critical deviation on the market are not accounted in rigid models and revenue is lost still even if trading continues.

Note Quite simply - inability to maintain complex quantitative financial models live in real-time is the main reason for this obvious hole in otherwise highly effective financial world.

But how do you implement complex business algorithms in real-time?

The answer is the ability to massively parallelize the business algorithm in such a way that its processing happens entirely in memory and can linearly scale up (and down) on demand. I call it a PDC theorem as in " Processing, Data, and Co-Location".

The idea behind PDC is pretty simple. Real-time response in a highly distributed system is not achievable unless the following 3 rules are followed:

  1. Processing must be distributable for in-memory computation

  2. Data storage must be distributable (i.e. partitioned) for in-memory storage

  3. Co-location must be ensured between processing and data units to provide locality of remote operations

Few important notes:

  • We, of course, are talking about business or perceptual real-time (a.k.a. Near Real-Time or nRT) and not about hardware real-time. Perceptual real-time response is not well defined but you can conceptually visualize it as the time the user of the system willing to wait for the response that he or she expects right away… In most cases it means few seconds or less. In rarer cases like FOREX trading, for example, the real-time would mean microseconds.

  • It is critically important that your processing supports algorithmic parallelization. Not all tasks can be parallelized and therefore not all tasks can be optimized for real-time processing. However, many of the typical business and social graph tasks can be split into multiple sub-tasks executing in parallel – and therefore are trivially parallelizable.

  • Data have to be partitioned and stored in-memory. Any outside calls to get data from NoSQL storage, file systems like HDFS, or traditional SQL storage renders any real-time attempts useless in most cases. This is one of the most critical design element and it is often overlooked. In other words – in no time the remote processing should escape the boundaries of the local JVM it is executing on.

  • Co-location of the processing and data (a.k.a affinity-based routing referring to the fact that should be an affinity between the computation and the data this computation needs) is the main mechanism to ensure that there is no noise data transfer between remote nodes where a task is being processed. Such unnecessary data transfer will violate the locality principle of the remote operations making real-time processing often unachievable.

It’s also quite obvious that PDC theorem doesn’t guarantee the real-time processing – it merely states these three rules are necessary but not enough on their own for a real-time response. Latencies of the atomic remote operations will often dictate whether or not real-time response is achievable in practice.

It is interesting to note that combination of PDC and CAP theorems really defines the fundamentals of high performance distributed programming today.

We at GridGain have been working on real-time BigData and BigCompute processing for several years now. These ideas led to develop the first middleware that natively combines both Compute Grid and In-Memory Data Grid into one product - making an ideal middleware software to build real-time cloud applications.

Note
GridGain
Using GridGain you can easily build systems that span 100s and 1000s of nodes while maintaining all necessary data cached in-memory and all computational processing fully parallelized and co-located.

2.4. Grid and Cloud Computing

To start off this sub-chapter I’ll tell you one story that happened to me few years ago.

I was presenting about GridGain at Java User Group at Dayton, Ohio. For those of you who don’t know - Dayton, OH is essentially a "sleeping quarters" for Wright-Patterson Air Force Base, one of the largest military research facilities in the world. It employees almost 30,000 people and headquarters massive Air Force Institute of Technology and Air Force Research Laboratory just to boot and it is known to have one of the largest super-computing centers in the world too. So, I’ve had few people from the base on my presentation…

Note Wright-Patterson Air Force Base is rumored to be the center of UFO research or so many people believe after Nevada’s Area 51 was closed few decades ago…

After the presentation I’ve chatted with some folks and conversation drifted into general topic of grid computing and what we all understand by that (and by relatively new back then concept of cloud computing). It was a real surprise to me (to say the least) that I got three very different answers from three guys working on the base - essentially working in one big company.

One answer was that grid is really nothing new alluding to parallel Fortran of almost 50 years old, another one was more in line with common understanding of grid computing being just a new re-tooling of traditional parallel processing, and yet another answer was that the whole thing is just a hype and multi-core CPUs will displace it all together in 5-10 years max.

I remember on my flight back I was a bit perplexed by how far apart those guys were while working side by side in albeit large organization and not just on technicality - but on fundamental view of distributed programming (grid or cloud - doesn’t matter). Starting with my next presentation onward I’ve made a rule to always state what I believe about grids and clouds to at least make sure we have common frame of reference. Whether or not you agree with - is another question…

So, here’s my take. I don’t particularly like terms grid and cloud computing. There’s nothing that resembles a "grid" in grid computing and obviously there’s nothing that is performed in the "cloud" when it comes to cloud computing. Both marketing terms "grid computing" and "cloud computing" represent slight variations of traditional distributed programming.

Note Put in more canonical form, if you have more than one computing resource working on the same problem in parallel - you have a grid and you are engaged in grid computing. If these computing resources (all or in part) are virtualized and available to you on demand - they represent a cloud and you are doing cloud computing.

In the nutshell - that’s it.

As you can see the difference between grid computing and cloud computing is only in representation of computing resources which in most cases should be irrelevant. The lines between grids and clouds are getting blurrier by the day and we use both terms interchangeably throughout this book (as long as context is clear).

It is, however, important to know the difference and new challenges that cloud infrastructure brings to the table for you as a software developer.

2.5. IaaS, PaaS, and SaaS

Only at the pick of the hype around the cloud computing can you get a chapter named like that…

Nowadays these terms are thrown often without much regard or understanding and while IaaS and SaaS are somewhat well defined, the PaaS is something that poorly defined if at all. Let us try to define these terms and see where GridGain fits in the picture.

Note Understanding this nomenclature is not essential for everyday usage of GridGain and most of the concepts in GridGain stay clear from these high level marketing terms. Yet - a cursory look is worth while and it will help you navigate the plethora of marketing literature that surrounds the cloud computing today.

Picture below provide basic overview of how GridGain related to IaaS and PaaS:

http://www.gridgain.com/images/iaas_paas.png

2.5.1. IaaS - Infrastructure As a Service

IaaS stands for "Infrastructure As a Service". IaaS is often (wrongly) synonymous with cloud computing. It essentially means providing virtualized computing resources as a services. Think of Amazon EC2, for example.

Note And who would ever thought that a nascent online book seller and one of the few dotcom survivors would spearhead the revolution that is so much bigger that just an online retailing - a revolution that is radically changing the way we think about information systems!

Amazon had its own data center and sometime ago decided to earn extra money by renting out their often unused computing capacity. So, they put a hardware virtualization (like VMs from VMWare or Citrix) on their servers and exposed the management of these VMs via Web browser so that anyone could create an account and start managing the VM instances.

In a nutshell - that’s all there’s to it.

It is important to note that IaaS (or clouds) can be public, private or hybrid. Public clouds are based on infrastructure that is publicly available, i.e. IaaS provides generally gives access to its data center to anyone. Private clouds are built by individual organizations for their internal use. And hybrid clouds exhibit both types of behavior.

While public and private clouds are usually a physical infrastructures with difference being who gets the access to the clouds - the hybrid clouds are almost always a virtual clouds. Similar the Virtual Private Networks (VPN), the virtual clouds are created on top of one or more physical public and private clouds and provide its end users seamless cloud/IaaS transparency. These hybrid clouds are often created for business applications by either PaaS or software middleware like GridGain.

Note Clouds can be public, private, and hybrid. While public and private clouds are usually physical infrastructures - the hybrid clouds are always virtualized clouds built via software on top of one or more physical public or private clouds. Analogy between hybrid clouds and VPN is almost one-to-one.

There are plenty of IaaS provides all following in steps with Amazon all providing different sets of functionality and different twists. A quick look at Amazon AWS offerings will show how complex and diverse it has become in recent years.

It is also a strong sign that hardware virtualization and services around it will be rapidly advancing. We are just few years away from being able to add an extra core to our application or acquire extra network bandwidth capacity on demand for our system’s pick time and scale it down the moment it doesn’t need it anymore.

These capabilities will undoubtedly change the way we develop our applications.

2.5.2. PaaS - Platform As a Service

PaaS stands for "Platform As a Service". PaaS essentially provides an abstraction over various IaaS providers and adds additional services. Additional services mostly consist of some set of deployment and provisioning services aiming at supporting application multi-tenancy, i.e. ability to host multiple applications in secure isolation on the same VM or a set of VMs (don’t confuse it with Java VM).

Note
VMs vs. Java VMs

Throughout this book we’ll use terms VM to denote hardware virtualization virtual machine (VM). To denote Java Virtual Machine we’ll use term JVM.

The problem with PaaS is that no one has a precise definition of what PaaS really is… Its definition is largely based on specific vendor capabilities. There is, however, one clear trait of PaaS: it abstracts out its users from worrying about specifics of various IaaS providers and differences in their operations and functionality.

PaaS also sprung out the notion of DevOps - a symbiosis of application development and traditional IT functions. It is often said that PaaS provides abstraction over IaaS and DevOps services.

Note PaaS provides abstraction over IaaS and DevOps services.

Most of the PaaS vendors today (early 2011) concentrate mainly on providing deployment and application provisioning services. PaaS from VMWare/Spring, Google AppEngine, CloudBees and RedHat/JBoss, for example, do exactly that. They all allow you to take your whole application and through a serious of manual steps move or deploy it onto IaaS infrastructure with some limited, if any, scale out functionality.

PaaS as a technology today is in its very early stages. It is clear that PaaS as a concept and technology will likely see the most amount of changes in the coming years - and by the time you read this book some of these changes may significantly affect your understanding of what PaaS can do for you.

Note IaaS abstracts out data center and exposes it as a service.
PaaS abstracts out IaaS providers and adds DevOps.

2.5.3. SaaS - Software As a Service

IaaS stands for "Software As a Service". Surprisingly for most casual observers, SaaS has relatively nothing to do with cloud computing or IaaS and PaaS specifically. Essentially, if you run your "application" in the browser - it is SaaS application.

That’s it.

Historically, SaaS came from ASP (Application Service Provider) businesses and it shares almost everything with ASP (except for more catchy name). Interestingly enough SaaS was the first "as a service" abbreviation long before IaaS and PaaS came to light. But when hardware virtualization and surrounding services became popular, "as a service" moniker was a logical progression for providing computing Infrastructure and Platforms "as a service".

2.5.4. How GridGain Fits?

By looking at the picture few paragraphs above you can see that GridGain can easily work directly with IaaS (like Amazon AWS) or through the PaaS. In fact, GridGain is completely independent from either PaaS or IaaS - it can work without any specific cloud or grid or cluster infrastructure.

This ability, this lightweight approach, is one of the key design advantages of GridGain. It provides you, the developer, exactly the same services whether you run GridGain on a simple Android device, a laptop, few servers, small grid or a large cloud.

So, you can select the desired DevOps approach for your application and GridGain will happily support it!

2.6. License

GridGain is dual-licensed:

  • GridGain Community Edition is free open-sourced and licensed under GPL version 3.

  • GridGain Enterprise Edition is closed-sourced and commercially licensed.

EULA (end-used license agreement) for Enterprise Edition is available at GridGain root installation folder after you install GridGain and it is also available on download page on the http://www.gridgain.com website.

GridGain System also provides OEM, Enterprise an Academic licenses with further details available upon request at sales@gridgain.com

Important
GridGain 1.x-2.x
Note also that previous version of GridGain 1.x-2.x were licensed under LGPL.

Note that throughout the book we don’t directly distinguish between features available in Enterprise and Community Edition. In cases where it is important we make a note.

2.7. Support

The best way to get a free support on GridGain software is to dip into our active community with wealth of information on our free support forum: http://jive.gridgain.org. This forum is closely monitored by GridGain System’s engineers and we try our best to provide free support there when applicable.

GridGain Systems, Inc., as a company behind GridGain project, also provides full spectrum of commercial services around GridGain software including:

  • Commercial subscription for Community and Enterprise editions

  • Consulting and professional services

  • OEM licensing

  • "Bronze", "Silver" and "Gold" levels of support

  • GridGain Training seminars

When it comes to training and support - we at GridGain have a very simple philosophy: we do our own heavy lifting. We believe that we are the best people to support our own software and who is it better to learn about GridGain from but the people who develop it daily?

All information about services provided by GridGain Systems can be found at http://www.gridgain.com/services.html

3. Taste of GridGain

Before we start digging into nitty-gridy details of GridGain functionality let’s quickly look at what we can accomplish in 10-15 minutes. We’ll create one application in Java and one in Scala utilizing Compute and In-Memory Data Grids.

Let’s see how quickly we can do both.

Note
Installation

We have a whole chapter dedicated to installation. However, installation of GridGain is rather trivial - and if you haven’t done it already here is a quick 3 steps:

  • Download GridGain ZIP archive from http://www.gridgain.com/downloads.shtml

  • Unzip ZIP archive into installation folder in your system

  • Set GRIDGAIN_HOME environment variable to point to installation folder

You are done!

Note
Unix

For the rest of this chapter (and most of this book) we will assume the JetBrain IDEA 10 and Unix environment like Unix, Linux or Mac OSX. Most of the steps and instructions apply almost verbatim to Eclipse, Emacs or NetBeans running on Windows with obvious changes to paths and certain project management capabilities of IDEs.

Now - the first step when developing distributed application is to have a… grid. With GridGain you can have it anywhere but for the purpose of this example we will create one right on the same computer where you will be running the main example.

Note
Multiple GridGain Nodes

One the coolest capability of GridGain is its ability to run multiple GridGain nodes on the same computer or even… inside the same JVM. Think about it - you can launch the entire cloud in a single JVM and enjoy local debugging of your application while it is running in the local virtualized cloud. Now - that’s pretty powerful and that’s exactly how we at GridGain to debug and test most of our complex internal distributed logic.

To have a local grid we are going to have two GridGain nodes running standalone and the third node will be embedded into our applications that we will develop. When our application starts - it will join the grid (i.e. join the topology) making the grid of three nodes.

Open the command shell and assuming you are in GRIDGAIN_HOME folder just type this:

$ bin/ggstart.sh

If everything is fine (you set GRIDGAIN_HOME environment variable properly and you have Java installed) you will see the output similar to this:

[11:52:50]   _____     _     _______      _         ____   ____
[11:52:50]  / ___/____(_)___/ / ___/___ _(_)___    |_  /  / __/
[11:52:50] / (_ // __/ // _  / (_ // _ `/ // _ \  _/_ <_ /__ \
[11:52:50] \___//_/ /_/ \_,_/\___/ \_,_/_//_//_/ /____(_)____/
[11:52:50]
[11:52:50]  ---==++ HIGH PERFORMANCE CLOUD COMPUTING ++==---
[11:52:50]                ver. x.x.x-DDMMYYYY
[11:52:50]  Copyright (C) 2005-2011 GridGain Systems, Inc.
[11:52:50]
[11:52:50] Quiet mode.
[11:52:50]   ^-- To disable add -DGRIDGAIN_QUIET=false or "-v" to ggstart.{sh|bat}
[11:52:50] << Enterprise Edition >>
[11:52:50] Daemon mode: off
[11:52:50] Language runtime: Java Platform API Specification ver. 1.6
[11:52:50] Remote Management [restart: on, REST: on, JMX (remote: on, port: 49113, auth: off, ssl: off)]
[11:52:50] GRIDGAIN_HOME=/Users/nivanov/svnroot/gg-trunk
[11:52:50] (!) SMTP is not configured - email notifications are off.
[11:52:50] (!) Cache is not configured - data grid is off.
[11:52:53] Topology snapshot [nodes=1, CPUs=4, hash=0xB12A5F18]
[11:52:54] License info:
[11:52:54]     Licensed to 'GridGain Systems, Internal Development Only' on Feb 3, 2011
[11:52:54]     License [ID=7D5CB773-225C-4165-8162-3BB67337894B, type=ENT]
[11:52:54]       ^--License limits [<none>]
[11:52:54] System info:
[11:52:54]     JVM: Apple Inc., Java(TM) SE Runtime Environment ver. 1.6.0_26-b03-383-11A511
[11:52:54]     OS: Mac OS X 10.7.1 x86_64, nivanov
[11:52:54]     VM name: 4837@NIKITA-IVANOVs-MacBook-Pro.local
[11:52:54] Local ports used [TCP:8080 TCP:47100 UDP:47200 TCP:47300]
[11:52:54] GridGain started OK
[11:52:54]   ^-- [grid=default, nodeId8=a26743ce, order=1315939970778, CPUs=4, addrs=[192.168.1.103]]
[11:52:54] ZZZzz zz z...

Start another command shell and type the same command again:

$ bin/ggstart.sh

This time you get almost identical output with few important changes:

[12:34:09]   _____     _     _______      _         ____   ____
[12:34:09]  / ___/____(_)___/ / ___/___ _(_)___    |_  /  / __/
[12:34:09] / (_ // __/ // _  / (_ // _ `/ // _ \  _/_ <_ /__ \
[12:34:09] \___//_/ /_/ \_,_/\___/ \_,_/_//_//_/ /____(_)____/
[12:34:09]
[12:34:09]  ---==++ HIGH PERFORMANCE CLOUD COMPUTING ++==---
[12:34:09]                ver. x.x.x-DDMMYYYY
[12:34:09]  Copyright (C) 2005-2011 GridGain Systems, Inc.
[12:34:09]
[12:34:09] Quiet mode.
[12:34:09]   ^-- To disable add -DGRIDGAIN_QUIET=false or "-v" to ggstart.{sh|bat}
[12:34:09] << Enterprise Edition >>
[12:34:09] Daemon mode: off
[12:34:09] Language runtime: Java Platform API Specification ver. 1.6
[12:34:09] Remote Management [restart: on, REST: on, JMX (remote: on, port: 49114, auth: off, ssl: off)]
[12:34:09] GRIDGAIN_HOME=/Users/nivanov/svnroot/gg-trunk
[12:34:09] (!) SMTP is not configured - email notifications are off.
[12:34:09] (!) Cache is not configured - data grid is off.
[12:34:10] Node JOINED [nodeId8=a26743ce, addr=[192.168.1.103], CPUs=4] 1
[12:34:12] Topology snapshot [nodes=2, CPUs=4, hash=0xC287D25B] 2
[12:34:14] (!) Jetty failed to start (retrying every 3000 ms). Another node on this host?
[12:34:14] License info:
[12:34:14]     Licensed to 'GridGain Systems, Internal Development Only' on Feb 3, 2011
[12:34:14]     License [ID=7D5CB773-225C-4165-8162-3BB67337894B, type=ENT]
[12:34:14]       ^--License limits [<none>]
[12:34:14] System info:
[12:34:14]     JVM: Apple Inc., Java(TM) SE Runtime Environment ver. 1.6.0_26-b03-383-11A511
[12:34:14]     OS: Mac OS X 10.7.1 x86_64, nivanov
[12:34:14]     VM name: 5227@NIKITA-IVANOVs-MacBook-Pro.local
[12:34:14] Local ports used [TCP:47101 UDP:47200 TCP:47301]
[12:34:14] GridGain started OK
[12:34:14]   ^-- [grid=default, nodeId8=b8aac044, order=1315942449848, CPUs=4, addrs=[192.168.1.103]]
[12:34:14] ZZZzz zz z...

This output is a bit more interesting as it shows that both nodes discovered each other:

1 Event of node joining the topology
2 Snapshot of the topology showing total number of nodes and CPUs

So at this point we have two nodes running (they are simply idling since we are not processing anything). Notice how didn’t have specify any configuration properties or configure anything at all. Everything works out-of-the-box as expected.

Note
SPIs

As you will learn later in the book GridGain is composed of almost a dozen of different SPIs each providing pluggable kernel-level functionality. Two of these SPIs are discovery and communication SPIs that are responsible for maintaining distributed topology and exchanging the data between nodes.

When you start GridGain with default configuration (like we just did) it starts with default SPI implementations (IP-multicast discovery and TCP/IP-based communication respectively) - and they work perfectly in our case.

3.1. First GridGain Java App

Now that we have topology set we are going to switch to writing the actual code of our first application. Code examples below don’t have any dependencies on IDE - and you can follow up using any text editor of your choice.

The first app we are going to write would be computational MapReduce. It will calculate number of non-space characters in a given string by splitting the string into individual words, calculating word’s length on the remote nodes and aggregating results back.

We’ll use FP-based approach that GridGain natively support (even in Java):

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
import org.gridgain.grid.*;
import org.gridgain.grid.typedef.*;
import java.util.*;
import static org.gridgain.grid.GridClosureCallMode.*;

public class GridFunctionalMapReduceExample {
    public static void main(final String[] args) throws GridException {
        if (args.length == 1 && args[0].length() > 0)
            GridFactory.in(new GridInClosureX<Grid>() {
                @Override public void applyx(Grid g) throws GridException {
                    System.out.println("Length of input argument is " + g.reduce(
                        SPREAD,
                        GridFunc.<String, Integer>cInvoke("length"),
                        Arrays.asList(args[0].split(" ")),
                        GridFunc.sumIntReducer()
                    ));
                }
            });
    }
}

Not that there are many ways you can write this particular program in GridGain:

  • We can use direct Grid Task and Grid Jobs approach

  • We can use AOP-based grid enablement

  • We can GridGain 3.0 imperative APIs

  • Even using GridGain FP APIs there are different ways to code this program

FP-based approach above, however, yields probably the shortest program but can be initially confusing since Java isn’t really supporting FP natively. So let me explain step by step how this works.

Lines 1-4
We start by importing all necessary classes and constants into the scope.

Line 7
We define a main(…) method that will take input string.

Line 9
Method GridFactory.in(…) simply takes a closure and:

  • Starts the default GridGain node (custom configuration can be passed in as a parameter)

  • Executes passed in closure

  • Stop the GridGain node

So, essentially, it allows for a quick execution of the piece of code within context of a running GridGain node.

Line 9, 10
As a parameter to GridFactory.in(…) method call we are passing a newly created closure of type GridInClosureX<Grid>. The body of this closure will be executed within context of a running GridGain node.

Line 11-15
GridInClosureX<Grid> has one method applyx(Grid) which passed a Grid interface instance. Inside of applyx(Grid) we call reduce(…) method that performs MapReduce operation. It accepts four parameters:

  • Distribution mode. In our case we use GridClosureCallMode.SPREAD to spread the processing to all available nodes

  • A closure to execute on every remote node. We use utility method GridFunc.cInvoke(…) that creates closure via reflection based on the method name

  • A list of arguments that will be passed to each closure on the remote nodes

  • Reducing closure that takes results from the remote nodes and aggregates them into one final result. We, again, use pre-defined integer accumulator returned by GridFunc.sumIntReducer() method.

The logic of this computational MapReduce should be clear by now. We split input string by spaces into individual words, we then send every word to a remote node where a method length will be called on that word, and results of these calls will be returned back to reducer that will simply sum them up.

Now that we have a basic understanding of what is happening inside of this code let’s run this example. Depending on whether or not you are using IDE, Maven or Ant build you simply need to include gridgain.jar file that’s located in GRIDGAIN_HOME directory and all JARs under GRIDGAIN_HOME/libs folder to your classpath. Also, if you use IDEs - make sure that either environment variable GRIDGAIN_HOME is inherited by IDE process or system property with the same name GRIDGAIN_HOME is setup in your runtime configuration.

When it’s all set and done I’ve passed "GridGain is awesome" string my IDEA 10 runtime configuration and the got the following output in my IDEA output window:

/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home/bin/java -ea -DGRIDGAIN_HOME=...
[12:28:11]   _____     _     _______      _         ____   ____
[12:28:11]  / ___/____(_)___/ / ___/___ _(_)___    |_  /  / __/
[12:28:11] / (_ // __/ // _  / (_ // _ `/ // _ \  _/_ <_ /__ \
[12:28:11] \___//_/ /_/ \_,_/\___/ \_,_/_//_//_/ /____(_)____/
[12:28:11]
[12:28:11]  ---==++ HIGH PERFORMANCE CLOUD COMPUTING ++==---
[12:28:11]                ver. x.x.x-DDMMYYYY
[12:28:11]  Copyright (C) 2005-2011 GridGain Systems, Inc.
[12:28:11]
[12:28:11] Quiet mode.
[12:28:11]   ^-- To disable add -DGRIDGAIN_QUIET=false or "-v" to ggstart.{sh|bat}
[12:28:11] << Enterprise Edition >>
[12:28:11] (!) SMTP is not configured - email notifications are off.
[12:28:11] (!) Cache is not configured - data grid is off.
[12:28:11] Daemon mode: off
[12:28:11] Language runtime: Java Platform API Specification ver. 1.6
[12:28:11] Remote Management [restart: off, REST: on, JMX (remote: off)]
[12:28:11] GRIDGAIN_HOME=/Users/nivanov/svnroot/gg-trunk
[12:28:13] Node JOINED [nodeId8=b324d6c3, addr=[192.168.1.103], CPUs=4]
[12:28:15] Topology snapshot [nodes=2, CPUs=4, hash=0xCFFF5AA0]
[12:28:15] Node JOINED [nodeId8=e854d435, addr=[192.168.1.103], CPUs=4]
[12:28:15] Topology snapshot [nodes=3, CPUs=4, hash=0xF7C10287]
[12:28:15] License info:
[12:28:15]     Licensed to 'GridGain Systems, Internal Development Only' on Feb 3, 2011
[12:28:15]     License [ID=7D5CB773-225C-4165-8162-3BB67337894B, type=ENT]
[12:28:15]       ^--License limits [<none>]
[12:28:15] New version is available at www.gridgain.com: 3.2.1c.05082011
[12:28:15] System info:
[12:28:15]     JVM: Apple Inc., Java(TM) SE Runtime Environment ver. 1.6.0_26-b03-383-11A511
[12:28:15]     OS: Mac OS X 10.7.1 x86_64, nivanov
[12:28:15]     VM name: 9791@NIKITA-IVANOVs-MacBook-Pro.local
[12:28:15] Local ports used [TCP:8080 TCP:47102 UDP:47200 TCP:47302]
[12:28:15] GridGain started OK
[12:28:15]   ^-- [grid=default, nodeId8=1cdf9436, order=1316028492427, CPUs=4, addrs=[192.168.1.103]]
[12:28:15] ZZZzz zz z...
Length of input argument is 16
[12:28:17] GridGain stopped OK [uptime=00:00:01:363]

As you can see the output is very similar to standalone nodes we’ve started few minutes ago. But in the end we have output of our MapReduce task which says:

Length of input argument is 16

correctly computing number of non-empty characters in input string "GridGain is awesome". Notice also we’ve had topology snapshot with three nodes (as expected). If you check other standalone nodes you will see the similar output to this:

[12:28:13] Node JOINED [nodeId8=1cdf9436, addr=[192.168.1.103], CPUs=4]
[12:28:13] Topology snapshot [nodes=3, CPUs=4, hash=0xF7C10287]
[12:28:17] Node LEFT [nodeId8=1cdf9436, addr=[192.168.1.103], CPUs=4]
[12:28:17] Topology snapshot [nodes=2, CPUs=4, hash=0x71DE65CC]

indicating that when our application started we’ve had three nodes in the topology and when our application completed and stopped we were back to two nodes (i.e. two standalone nodes).

Note
Zero Deployment

Did you notice any deployment steps? Any Ant or Maven build? Did we create any JAR files to copy to remote nodes?

As you probably guessed - the answer is no. We didn’t need to do any of these awkward and expensive steps because GridGain sports pretty unique technology that allows it deploy necessary classes on-demand in a distributed fashion completely transparently to the developer. In fact - you just write the code as if it is completely local and GridGain will take care of proper distribution, versioning, class loading, etc.

Now, let’s advance our example. As you probably noticed we don’t see any evidence on remote nodes that any processing is happening there. In fact, by default GridGain starts in QUIET mode and most of the output is suppressed (if you need to start in normal mode - use -v flag for ggstart.sh script).

Let’s modify our example so that we’ll see what is being process and where:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
import org.gridgain.grid.*;
import org.gridgain.grid.typedef.*;
import java.util.*;
import static org.gridgain.grid.GridClosureCallMode.*;

public class GridFunctionalMapReduceExample {
    public static void main(final String[] args) throws GridException {
        if (args.length == 1 && args[0].length() > 0)
            GridFactory.in(new GridInClosureX<Grid>() {
                @Override public void applyx(Grid g) throws GridException {
                    System.out.println("Length of input argument is " + g.reduce(
                        SPREAD,
                        new GridClosure<String, Integer>() {
                            @Override public Integer apply(String s) {
                                System.out.println("Calculating for: " + s);

                                return s.length();
                            }
                        },
                        //GridFunc.<String, Integer>cInvoke("length"),
                        Arrays.asList(args[0].split(" ")),
                        GridFunc.sumIntReducer()
                    ));
                }
            });
    }
}

We’ve commented out the reflection-based closure and added direct closure creation that prints out what string it is working on and returns its length. If we re-run our application we’ll now get the following output on three nodes:

Remote Node 1

[14:14:27] Node JOINED [nodeId8=6e77be3c, addr=[192.168.1.103], CPUs=4]
[14:14:27] Topology snapshot [nodes=3, CPUs=4, hash=0x89DEB0E3]
Calculating for: is 1
[14:14:31] Node LEFT [nodeId8=6e77be3c, addr=[192.168.1.103], CPUs=4]
[14:14:31] Topology snapshot [nodes=2, CPUs=4, hash=0x71DE65CC]

Remote Node 2

[14:14:27] Node JOINED [nodeId8=6e77be3c, addr=[192.168.1.103], CPUs=4]
[14:14:27] Topology snapshot [nodes=3, CPUs=4, hash=0x89DEB0E3]
Calculating for: awesome 1
[14:14:31] Node LEFT [nodeId8=6e77be3c, addr=[192.168.1.103], CPUs=4]
[14:14:31] Topology snapshot [nodes=2, CPUs=4, hash=0x71DE65CC]

Local node in IDE

[14:14:29] GridGain started OK
[14:14:29]   ^-- [grid=default, nodeId8=6e77be3c, order=1316034866664, CPUs=4, addrs=[192.168.1.103]]
[14:14:29] ZZZzz zz z...
Calculating for: GridGain 1
Length of input argument is 16 2
[14:14:31] GridGain stopped OK [uptime=00:00:01:237]
1 - That’s the output from our closure executing on remote nodes.
2 - That’s the output from the reduction step executing on the local (initiating) node.

Note that local node (i.e. the node running in IDE) performs calculation as well as performing the final reduction step. If we don’t want it to participate in the actual calculation and only perform the final reduction - we can simply change this line:

System.out.println("Length of input argument is " + g.reduce(

to this

System.out.println("Length of input argument is " + g.remoteProjection().reduce(
Note
Consider this…

Look at these 20 lines of code and consider that this application includes:

  • auto topology discovery

  • auto load balancing

  • distributed fail over

  • collision resolution

  • zero code deployment & provisioning

  • pluggable marshalling & communication

Pretty neat, right? And so just in about two dozens lines of code and 10 minutes we’ve got our first MapReduce application running.

3.2. First GridGain Scala App

Now - let’s move to In-Memory Data Grid application and we’ll use Scala for that, more specifically - Scalar, our Scala-based DSL for GridGain.

Note
Scalar DSL

The idea behind Scalar is to simply adopt Java-side APIs for usage in Scala. Scalar by design does not add any additional new functionality to GridGain but adopts Java APIs to Scala. This is a very important point to understand that there’s no additional or left out functionality when you are switching between Java and Scala - 100% of GridGain is available in both languages (nad natively so).

Note that GridGain also comes with Grover - Groovy++ DSL.

Since we are going to use In-Memory Data Grid in this example we need to restart our standalone nodes with enabled data grid. Note that by default there are no caches configured (for obvious performance reasons). GridGain comes with handy example configuration that comes with three caches configurated examples/config/spring-cache.xml.

You can stop existing nodes by simply Ctrl-C and then start them again using:

bin/ggstart.sh examples/config/spring-cache.xml

The output from the nodes is almost the same as in previous example with one notable change:

...
[13:28:05] Configured caches ['partitioned', 'replicated', 'local']
...

indicating that we now have three configured caches that are named based on their type.

Now that we have standalone nodes running with necessary configuration let’s turn to writing our application that will utilize the In-Memory Data Grid (we’ll use shorter term data grid going forward) side of GridGain. We are going to create an application that will populate data grid with a set of key-value pairs and then execute the set of closures where each closure will have an affinity with the specific key in the data grid - and therefore it will be co-located with the data for that key instead of just randomly be executed on some node in the grid.

Important
Affinity Co-Location

Affinity co-location is extremely important use case in real-time processing as it underpins the system design that can scale linearly regardless of the size of the data set.

Scalar-based (Scalar is GridGain’s DSL based on Scala) application looks pretty simple:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
import org.gridgain.scalar.scalar
import scalar._
import org.gridgain.grid.cache.GridCache

object ScalarCacheAffinitySimpleExample {
    /** Number of keys. */
    private val KEY_CNT = 20

    def main(args: Array[String]) {
        scalar("examples/config/spring-cache.xml") {
            val c = grid$.cache[Int, String]("partitioned")

            populate(c) // Comment out on subsequent runs.
            colocate(c)
        }
    }

    private def populate(c: GridCache[Int, String]) {
        (0 until KEY_CNT).foreach(i => c += (i -> i.toString))
    }

    private def colocate(c: GridCache[Int, String]) {
        (0 until KEY_CNT).foreach(i =>
            grid$.affinityRun("partitioned", i,
                () => println("Co-located [key= " + i + ", value=" + c.peek(i) + ']'))
        )
    }
}

Even if you are not familiar with Scala - the code looks pretty self-explanatory. Let’s go line by line like we did for Java example above.

Line 1-3
Necessary imports including import for Scalar

Line 7
Defines number of keys we’ll be storing in the data grid and number of closures we’ll be executing later.

Line 10
Initializes Scalar with the same configuration file as we used for standalone nodes examples/config/spring-cache.xml. Note that initializing Scalar essentially means starting up the local node.

Line 11
We are getting an instance of the cache named partitioned (which is a partitioned cache named so for clarity). Cache is typed for Int keys and String values.

Line 13, 14
Calls functions populate() and colocate() that are defined later.

Line 18-20
Function populate() simply puts key/value pairs into data grid. Node that partitioned cache will store particular key/value pair on one of the nodes (potentially including the local one) as well as on one back up node. Since we have three nodes in the topology (remember - one local node and two standalone noes) - each key/value pair will be store on two nodes.

Note
Running Multiple Time

Note that if you run this example multiple time you need to comment out line 13 since we don’t need to override value that are already in the data grid. Don’t get confused here: even though we are stopping the local node when our application finished and all data that was stored on this node will be lost - the key/value pairs are duplicated on backup nodes (i.e. stored twice in the data grid).

When we start our application again the pre-loading process will optimally reshuffle the data from two existing nodes to new three nodes topology.

Note that number of backup nodes and details of pre-loading process are fully configurable.

Line 22-26
Function colocate() executes number of closures where each closure gets affinity co-located with some key in the data grid. Note that the closure itself simply prints the trace message and uses function peek() that gets the value only if it’s locally available - which should be since we are co-locating closure with the node where data is stored (so called master node).

Note
Affinity Co-Location

The colocate() function is the key functionality here. Look how simple it is to co-locate the computational logic (a closure) with the data this logic need to process (data in cache).

Let’s go ahead and start our application. Starting Scala application is no different than starting Java application (at least if you use IDEs). Local node running in IDEA prints out the following log (abbreviated):

[22:57:01]   _____     _     _______      _         ____   ____
[22:57:01]  / ___/____(_)___/ / ___/___ _(_)___    |_  /  / __/
[22:57:01] / (_ // __/ // _  / (_ // _ `/ // _ \  _/_ <_ /__ \
[22:57:01] \___//_/ /_/ \_,_/\___/ \_,_/_//_//_/ /____(_)____/
[22:57:01]
[22:57:01]  ---==++ HIGH PERFORMANCE CLOUD COMPUTING ++==---
[22:57:01]                ver. x.x.x-DDMMYYYY
[22:57:01]  Copyright (C) 2005-2011 GridGain Systems, Inc.
[22:57:01]
[22:57:01] Quiet mode.
[22:57:01]   ^-- To disable add -DGRIDGAIN_QUIET=false or "-v" to ggstart.{sh|bat}
[22:57:01] << Enterprise Edition >>
...
[22:57:02] Topology snapshot [nodes=3, CPUs=4, hash=0xAB10A0C]
...
[22:57:05] ZZZzz zz z...
Co-located [key= 0, value=0]
Co-located [key= 1, value=1]
Co-located [key= 7, value=7]
Co-located [key= 10, value=10]
Co-located [key= 18, value=18]
[22:57:09] GridGain stopped OK [uptime=00:00:03:171]

and the remote nodes print:

Remote Node 1

[22:56:40]   _____     _     _______      _         ____   ____
[22:56:40]  / ___/____(_)___/ / ___/___ _(_)___    |_  /  / __/
[22:56:40] / (_ // __/ // _  / (_ // _ `/ // _ \  _/_ <_ /__ \
[22:56:40] \___//_/ /_/ \_,_/\___/ \_,_/_//_//_/ /____(_)____/
[22:56:40]
[22:56:40]  ---==++ HIGH PERFORMANCE CLOUD COMPUTING ++==---
[22:56:40]                ver. x.x.x-DDMMYYYY
[22:56:40]  Copyright (C) 2005-2011 GridGain Systems, Inc.
[22:56:40]
[22:56:40] Quiet mode.
[22:56:40]   ^-- To disable add -DGRIDGAIN_QUIET=false or "-v" to ggstart.{sh|bat}
[22:56:40] << Enterprise Edition >>
 ...
[22:56:42] Topology snapshot [nodes=2, CPUs=4, hash=0x99D66AF5]
 ...
[22:57:02] Node JOINED [nodeId8=3a890c51, addr=[127.0.0.1], CPUs=4]
[22:57:02] Topology snapshot [nodes=3, CPUs=4, hash=0xAB10A0C]
Co-located [key= 2, value=2]
Co-located [key= 3, value=3]
Co-located [key= 4, value=4]
Co-located [key= 5, value=5]
Co-located [key= 11, value=11]
Co-located [key= 12, value=12]
Co-located [key= 13, value=13]
Co-located [key= 16, value=16]
[22:57:08] Node LEFT [nodeId8=3a890c51, addr=[127.0.0.1], CPUs=4]
[22:57:08] Topology snapshot [nodes=2, CPUs=4, hash=0x99D66AF5]

Remote Node 2

[22:56:40]   _____     _     _______      _         ____   ____
[22:56:40]  / ___/____(_)___/ / ___/___ _(_)___    |_  /  / __/
[22:56:40] / (_ // __/ // _  / (_ // _ `/ // _ \  _/_ <_ /__ \
[22:56:40] \___//_/ /_/ \_,_/\___/ \_,_/_//_//_/ /____(_)____/
[22:56:40]
[22:56:40]  ---==++ HIGH PERFORMANCE CLOUD COMPUTING ++==---
[22:56:40]                ver. x.x.x-DDMMYYYY
[22:56:40]  Copyright (C) 2005-2011 GridGain Systems, Inc.
[22:56:40]
[22:56:40] Quiet mode.
[22:56:40]   ^-- To disable add -DGRIDGAIN_QUIET=false or "-v" to ggstart.{sh|bat}
[22:56:40] << Enterprise Edition >>
 ...
[22:56:39] Topology snapshot [nodes=1, CPUs=4, hash=0xF15A46A8]
 ...
[22:56:42] Node JOINED [nodeId8=de5c9bb0, addr=[127.0.0.1], CPUs=4]
[22:56:42] Topology snapshot [nodes=2, CPUs=4, hash=0x99D66AF5]
[22:57:02] Node JOINED [nodeId8=3a890c51, addr=[127.0.0.1], CPUs=4]
[22:57:02] Topology snapshot [nodes=3, CPUs=4, hash=0xAB10A0C]
Co-located [key= 6, value=6]
Co-located [key= 8, value=8]
Co-located [key= 9, value=9]
Co-located [key= 14, value=14]
Co-located [key= 15, value=15]
Co-located [key= 17, value=17]
Co-located [key= 19, value=19]
[22:57:08] Node LEFT [nodeId8=3a890c51, addr=[127.0.0.1], CPUs=4]
[22:57:08] Topology snapshot [nodes=2, CPUs=4, hash=0x99D66AF5]

As you can see the key/value pairs got distributed roughly equal (the more keys the better the distribution will be obviously). What’s also important to note is that we didn’t get any null as values proving the fact that co-location work (remember: we’ve used function peek() that only return locally stored value or null if value for given key is not stored locally).

All in all - these were two quick examples of Java and Scala based applications that demonstrate some of the basics of GridGain functionality. The following chapters in the book will explain why these are just a scratch on the surface…

4. Getting and Installing

Now that we’ve looked briefly at what GridGain can do let’s start… from the beginning: how to get GridGain software and how to install it.

Note
Screenshots
Note that most of the website screenshots will change by the time you read this book. However, you can easily navigate the website as it is right now as its main parts remained relatively the same.

4.1. Download

There are three ways how you can get GridGain:

We highly recommend to use the first method and simply download ZIP file from http://www.gridgain.com website. To do so - simply open http://www.gridgain.com in your favorite browser and locate the download link that usually on the right side:

http://www.gridgain.com/book/images/screenshot-31.png

Once you clicked on the download link you’ll be on download page and you’ll need to enter your name and email:

http://www.gridgain.com/book/images/screenshot-30.png

Keep in mind several things:

  • There are two editions available for the download - enterprise and community

  • Community Edition is licensed under GPLv3 and Enterprise Edition comes with evaluation license

  • There is a link on top of the page for past downloads that contains selected previous releases of GridGain

  • In the download table (see above) you can see date of the build, its version, and the link to Release Notes

There are six downloads (as of version 3.0.2):

  • Enterprise and Community for Windows

  • Enterprise and Community for Linux/Unix/Mac OS.

  • Amazon AMI images for Enterprise and Community editions

All downloads are simple ZIP files. ZIP files are versioned and clearly named to indicate for what OS family they are intended to.

4.1.1. Maven2

Maven repository available only for version 3.0.0c-RC1 and up.

If you decide to use Maven please keep in mind:

  • Only Maven2 repository currently available (as of version 3.0.2)

    • Maven3 is not supported yet.

  • Only community edition is available in our public repository

    • Enterprise edition can only be downloaded directly from http://www.gridgain.com website

    • Maven2 POM file is included with distribution

  • Only the main GridGain JAR file is available in Maven repository

    • Depending on your usage of GridGain you may need configuration files, working directly, etc. that won’t be created when using Maven to get GridGain.

To utilize our Maven repository you’ll need to make the following changes. In your POM file you need to add dependency for GridGain:

POM File
<dependencies>
    .
    .
    .
    <dependency>
        <groupId>org.gridgain</groupId>
         <artifactId>gridgain</artifactId>
         <version>3.0.0c-rc1</version> <!-- CHANGE IT! -->
    </dependency>
</dependencies>

Make sure to properly change the version of the GridGain.

You will need to add GridGain repository to your POM file as well:

POM File
<repositories>
    .
    .
    .
    <repository>
        <id>gridgain</id>
        <url>http://www.gridgainsystems.com/maven2/</url>
    </repository>
</repositories>

Once you have it done - you are ready for Maven-based usage of GridGain.

Note
Internal Repository
We recommend to use internal Maven repositories for your projects if Maven is something you like to use. You can download GridGain as usual through www.gridgain.com website and deploy necessary files to your local repository for the rest of the team to use. This way you have full control on how GridGain is available via Maven for your particular project.

4.1.2. Versions

GridGain follows traditional rules on versioning and what specific version number means:

Version Description

X.X.1…9

Point release.
Usually contains bug fixes, documentation and example improvements.
It is backward compatible unless specified otherwise.

X.1…9.X

Mid-point release.
Usually contains bug fixes, documentation and example improvements.
Backward compatible only if specified

1…9.X.X

Major release.
Usually contains bug fixes, documentation and example improvements.
Backward compatible only if specified.

In general, we target one major release every 12 months and mid-point release every 6 months. Point releases are being cut as we see need to patch issues or provide hot bug fixes.

4.1.3. Supported Operating Systems

GridGain is actively developed and tested on three major operating systems:

http://www.gridgain.com/images/macos_logo_24x24.gif

Mac OS X

http://www.gridgain.com/images/win_logo_24x24.gif

Windows 7

http://www.gridgain.com/images/linux_logo_24x24.gif

Linux (Ubuntu & Fedora)

Being JVM-based software GridGain has minimal dependency on particular operating system (as long as Java is available for it). Most of the dependencies are in scripts. With every release of GridGain we thoroughly testing the software against the following version of operating systems:

  • Mac OS X 10.x

  • Linux Ubuntu (current active release)

  • Linux Fedora (current active release)

  • Windows XP/Vista/2007 (as of 3.0.2 version)

Note that we do not actively test against the following operating system but verified independently that GridGain 3.0 or later works stable and correct on them:

  • Solaris (current release)

  • HP-UX (current release)

  • Window 2003, Windows 2000

Important
Less Tested
GridGain is less tested on: Solaris, HP-UX, Window 2003, and Windows 2000.

In general, with extremely rare exceptions, GridGain will work out-of-the-box on any Windows or Linux/Unix-based system as long as Java 6 (and Scala 2.9 or later) is available on it.

4.1.4. Java, Scala and Groovy

As of version 3.0 GridGain requires Java 6. Note that GridGain 3.0 has not been tested with upcoming Java 7 as of May 2011.

Starting with version 3.0.9 GridGain requires Scala 2.9 or later (if Scala is used which is optional). Note that original release of GridGain 3.0.0 came before Scala 2.8 GA was released and was compatible only with Scala 2.7.

Starting with version 3.1.1 GridGain requires Groovy 1.8 and corresponding version of Groovy++ (if Groovy is used which is optional).

Keep in mind that you can develop with either Java, Scala, Groovy or any combination of thereof. Specifically, Scala is not required to develop with GridGain but some of the tools, like GridGain Visor - monitoring and interpreting tool in Enterprise Edition, use Scala REPL and therefore Scala is required for its usage.

Note also that as of GridGain 3.0.2 - none of the functionality in community edition explicitly require Scala or Groovy.

Note
Java, Scala and Groovy
As of version 3.1.1 GridGain requires Java 6, Scala 2.9 and Groovy 1.8.

As of November 2010 you can download both Java, Scala, and Groovy from:

Important
Java on Mac OS X
Note that Java download for Mac OSX may change its location as its development is shifting from Apple to Oracle as of November 2010.

4.2. Installation

Once you download whatever ZIP file your have selected - the installation process is rather trivial:

  • Unzip it to any location you prefer.

  • Set up GRIDGAIN_HOME environment variable pointing to installation folder

Note that installation does not perform any new-line translations and text files may have wrong new-lines depending on what OS installation is performed.

Unix/Linux/Mac OSX ZIP file has all Shell scripts with executable flag set so that they can be called directly.

Note
GRIDGAIN_HOME
Note that strictly speaking GRIDGAIN_HOME is not required for GridGain operation - and if you know that your setup won’t require it (explained later in the book) - you can skip it. If you are new to GridGain - it’s very advisable to set GRIDGAIN_HOME right after the unzipping the downloaded file.
Important
Trailing Spaces
Make sure there is no trailing \ in GRIDGAIN_HOME path.

4.2.1. Installing On Shared Location

One good practice for testing, staging or production setups is to install GridGain into shared location like a network share or shared hard drive. This way multiple grid nodes can share single configuration, libraries and working directory. This significantly simplifies management of GridGain installation in a distributed environment.

4.3. Uninstallation

Uninstalling GridGain is even simple than installing - you simply remove the GRIDGAIN_HOME folder where GridGain was installed. If it was configured to use paths outside of GRIDGAIN_HOME you will need to delete them too (if necessary).

4.4. Upgrading

Due to complexity of GridGain (mostly due to its distributed nature) we have decided not to provide incremental upgrade (or patching) capabilities. We recommend upgrading GridGain by cleanly uninstalling and installing a new upgrade version.

5. Configuration

5.1. Overview

GridConfigurationdoc interface defines grid runtime configuration. This configuration is passed to GridFactory.start(GridConfiguration)doc method. It defines all configuration parameters required to start a grid instance. Usually, a special class called "loader" will create an instance of this interface and call GridFactory.start(GridConfiguration)doc method to initialize GridGain instance.

Note, that absolutely every configuration property in GridConfigurationdoc is optional. You can simply create a new instance of GridConfigurationAdapterdoc, for example, and pass it to GridFactory.start(GridConfiguration)doc as is to start grid with default configuration. See GridFactorydoc documentation for information about default configuration properties used and more information on how to start grid.

The following configuration parameters can be used to configure grid node with GridConfigurationAdapter:

Setter Method Description Optional Default

setGridName(String)doc

Grid name.

Yes

null

setGridGainHome(String)doc

GridGain installation folder.

Yes

GRIDGAIN_HOME system property or environment variable.

setLocalHost(String)doc

System-wide local address or host for all GridGain components to bind to.

Yes

null

setNodeId(UUID)doc

Unique identifier for local node.

Yes

Random UUID.

setNetworkTimeout(long)doc

Maximum timeout in milliseconds for network requests.

Yes

5000ms

setLicenseUrl(String)doc

License URL different from the default location of the license file.

Yes

GRIDGAIN_HOME/gridgain-license.xml

setUserAttributes(Map<String,? extends Serializable>)doc

User specific attributes to attach to this node. Available via GridNode.getAttribute(String)doc method. Very useful for segmenting grid nodes into subgroups or identifying nodes based on certain property.

Yes

All System Properties and Environment Variables are set as node attributes automatically by GridGain.

setDaemon(boolean)doc

Daemon flag.

Yes

false

setIncludeProperties(String…)doc

Array of system or environment property names to include into node attributes.

Yes

All properties are included by default.

setIncludeEventTypes(int…)doc

Array of event types, which will be recorded by GridEventStorageManager. Note, that either the include event types or the exclude event types can be established.

Yes

All events are recorded by default.

setExcludeEventTypes(int…)doc

Array of event types, which will not be recorded by GridEventStorageManager. Note, that either the include event types or the exclude event types can be established.

Yes

All events are recorded by default.

setLifecycleBeans(GridLifecycleBean…)doc

Collection of lifecycle beans.

Yes

null

setLifeCycleEmailNotification(boolean)doc

Whether or not to enable lifecycle email notifications.

Yes

false

setDiscoveryStartupDelay(long)doc

Time in milliseconds after which a certain metric value is considered expired.

Yes

1 minute

setGridLogger(GridLogger)doc

Logger to use within grid.

Yes

GridLog4jLoggerdoc

setMarshaller(GridMarshaller)doc

Marshaller to use for serialization/deserialization of objects (available from ver. 2.1).

Yes

GridOptimizedMarshallerdoc

setDeploymentMode(GridDeploymentMode)doc

Deployment mode for task/query requests initiated from this node (available from ver. 2.1).

Yes

SHAREDdoc

setPeerClassLoadingEnabled(boolean)doc

Enables/disables peer class loading.

Yes

true

setPeerClassLoadingMissedResourcesCacheSize(int)doc

Specifies internal cache size for missed resources. If attempt to load a resource failed, then it will be cached, and following attempts will not make remote calls (available from ver. 2.1).

Yes

100

setP2PLocalClassPathExclude(List<String>)doc

List of packages in a system class path that should be to P2P loaded even if they exist locally.

Yes

null

setMetricsExpireTime(long)doc

Time in milliseconds after which a certain metric value is considered expired.

Yes

600000ms

setMetricsHistorySize(int)doc

Number of metrics kept in history to compute totals and averages.

Yes

10000

setMetricsLogFrequency(int)doc

Frequency of metrics log print out.

Yes

0, which means that metrics print out is disabled.

setExecutorService(ExecutorService)doc

Thread pool to use mainly for task and job execution.

Yes

GridThreadPoolExecutordoc with 100 threads.

setExecutorServiceShutdown(boolean)doc

Executor service shutdown flag.

Yes

true

setSystemExecutorService(ExecutorService)doc

Thread pool to use for processing job and task session asynchronous responses (available from ver. 2.1).

Yes

GridThreadPoolExecutordoc with 100 threads.

setSystemExecutorServiceShutdown(boolean)doc

System executor service shutdown flag.

Yes

true

setPeerClassLoadingExecutorService(ExecutorService)doc

Thread pool to use for processing peer class loading requests and responses (available from ver. 2.1).

Yes

GridThreadPoolExecutordoc with 20 threads.

setPeerClassLoadingExecutorServiceShutdown(boolean)doc

Peer class loading executor service shutdown flag.

Yes

true

setMBeanServer(MBeanServer)doc

MBean server for exposing GridGain MBeans.

Yes

The default MBean Server provided by JDK.

setSegmentationPolicy(GridSegmentationPolicy)doc

Segmentation policy.

Yes

STOPdoc

setSegmentationResolvers(GridSegmentationResolver…)doc

Segmentation resolvers.

Yes

null

setSegmentCheckFrequency(int)doc

Network segment check frequency.

Yes

10000ms

setWaitForSegOnStart(boolean)doc

Wait for segment on start flag.

Yes

true

setAllSegmentationResolversPassRequired(boolean)doc

All segmentation resolvers pass required flag.

Yes

true

setRestEnabled(boolean)doc

Flag indicating whether external REST access is enabled or not.

Yes

true

setRestJettyPath(String)doc

Path, either absolute or relative to GRIDGAIN_HOME, to JETTY XML configuration file.

Yes

null

setRestSecretKey(String)doc

Secret key to authenticate REST requests.

Yes

null, which means that authentication is disabled.

setSmtpHost(String)doc

SMTP host.

Yes

null, which disables sending emails.

setSmtpPort(int)doc

SMTP port.

Yes

25

setSmtpUsername(String)doc

SMTP username.

Yes

null

setSmtpPassword(String)doc

SMTP password.

Yes

null

setAdminEmails(String[])

Set of admin emails where email notifications will be set.

Yes

null

setSmtpFromEmail(String)doc

FROM email address for email notifications.

Yes

info@gridgain.com

setSmtpSsl(boolean)doc

Whether or not SMTP uses SSL.

Yes

false

setSmtpStartTls(boolean)doc

Whether or not SMTP uses STARTTLS.

Yes

false

setLocalEventListeners(Map<GridLocalEventListener, int[]>)doc

Pre-configured local event listeners.

Yes

null

setLoadBalancingSpi(GridLoadBalancingSpi…)doc

Fully configured instances of GridLoadBalancingSpi. Starting with GridGain 2.1 you can provide multiple instances of Load Balancing SPIs and then specify which one to use on per-task level via @GridTaskSpisdoc annotation attached to your GridTask implementation.

Yes

GridRoundRobinLoadBalancingSpidoc

setCheckpointSpi(GridCheckpointSpi…)doc

Fully configured instances of GridCheckpointSpi. Starting with GridGain 2.1 you can provide multiple instances of Checkpoint SPIs and then specify which one to use on per-task level via @GridTaskSpisdoc annotation attached to your GridTask implementation.

Yes

GridSharedFsCheckpointSpidoc

setCollisionSpi(GridCollisionSpi)doc

Fully configured instance of GridCollisionSpi.

Yes

GridFifoQueueCollisionSpidoc

setCommunicationSpi(GridCommunicationSpi)doc

Fully configured instance of GridCommunicationSpi.

Yes

GridTcpCommunicationSpidoc

setDeploymentSpi(GridDeploymentSpi)doc

Fully configured instance of GridDeploymentSpi.

Yes

GridLocalDeploymentSpidoc

setDiscoverySpi(GridDiscoverySpi)doc

Fully configured instance of GridDiscoverySpi.

Yes

GridMulticastDiscoverySpidoc

setEventStorageSpi(GridEventStorageSpi)doc

Fully configured instance of GridEventStorageSpi.

Yes

GridMemoryEventStorageSpidoc

setFailoverSpi(GridFailoverSpi…)doc

Fully configured instances of GridFailoverSpi. Starting with GridGain 2.1 you can provide multiple instances of Failover SPIs and then specify which one to use on per-task level via @GridTaskSpisdoc annotation attached to your GridTask implementation.

Yes

GridAlwaysFailoverSpidoc

setTopologySpi(GridTopologySpi…)doc

Fully configured instances of GridTopologySpi. Starting with GridGain 2.1 you can provide multiple instances of Topology SPIs and then specify which one to use on per-task level via @GridTaskSpisdoc annotation attached to your GridTask implementation.

Yes

GridBasicTopologySpidoc

setMetricsSpi(GridLocalMetricsSpi)doc

Fully configured instance of GridLocalMetricsSpi.

Yes

GridJdkLocalMetricsSpidoc

Some of the most commonly used configuration properties are explained in more detail below.

5.1.1. Grid Name

Use grid name configuration property whenever you would like to identify your grid by name. Usually, if you have only one grid node within your VM, you don’t have to configure grid name explicitly and use the default no-name grid node. However, if you start multiple grid node instances in the same VM, say for unit testing or debugging, then properly configuring grid name for every grid node instance will allow you to access multiple grid nodes by name via GridFactory.getGrid(String gridName)doc method.

5.1.2. User Attributes

User attributes allow you to attach various custom attributes to your nodes. This attributes can then be used to identify node topology for your task execution or load balancing, segmenting your grid into multiple sub-grids, etc… By default, GridGain will automatically attach or System and Environment properties to your node.

You can query node attributes practically from anywhere in your code, be that your task or job logic, or implementation of topology or load-balancing SPI’s. Simply get a handle on GridNode and check its attributes via GridNode.getAttribute(String)doc method.

5.1.3. Grid Logger

Configuring proper grid logger will allow you to integrate your logging with any environment. By default, GridLog4jLoggerdoc is used which gets its logging configuration from GRIDGAIN_HOME/config/default-log4j.xml.

Below is the list of supported loggers:

  • GridLog4jLoggerdoc - Log4j-based implementation for logging. This logger should be used by loaders that have prefer log4j-based logging. By default, GridGain will use this logger with configuration from GRIDGAIN_HOME/config/default-log4j.xml.

  • GridJavaLoggerdoc - Logger to use with Java logging. Implementation simply delegates to Java Logging.

  • GridJbossLoggerdoc - Logger to use in JBoss loaders. Implementation simply delegates to JBoss logging.

  • GridJclLoggerdoc - This logger wraps any JCL (Jakarta Commons Logging) loggers. Implementation simply delegates to underlying JCL logger. This logger should be used by loaders that have JCL-based internal logging (e.g., Websphere).

5.1.4. Grid Marshaller

Starting with GridGain 2.1 release you are able to configure different marshallers, and if needed provide your own. GridMarshallerdoc allows to marshal or unmarshal objects. It provides serialization/deserialization mechanism for all instances that are sent across network or are otherwise serialized.

GridGain provides the following GridMarshaller implementations:

  • GridJBossMarshallerdoc - this is the default marshaller used by GridGain. It used JBoss implementation of java.io.ObjectOutputStream for object serialization. All marshalled instances must implement java.io.Serializable.

  • GridJdkMarshallerdoc - this marshaller uses standard JDK java.io.ObjectOutputStream for object serialization.. All marshalled instances must implement java.io.Serializable.

  • GridXstreamMarshallerdoc - this marshaller uses Codehaus XStream for serialization of objects into XML. It does not require that marshalled instances implement java.io.Serializable, however, it performs slower than other marshaller implementations as XML is a verbose protocol.

  • GridOptimizedMarshallerdoc - Unlike GridJdkMarshaller, which is based on standard ObjectOutputStream, this marshaller does not enforce that all serialized objects implement java.io.Serializable. It is also generally much faster as it removes lots of serialization overhead that exists in default JDK implementation.

5.1.5. Executor Services

Starting with version 2.1, GridGain exposes configuration for 3 threads pools:

  • ExecutorServicedoc - Implementation of java.util.concurrent.ExecutorService to be used for task and job executions. By default, standard ThreadPoolExecutor thread pool is provided and is configured to use 100 threads. Change this configuration parameter whenever you need to change the number of threads participating in GridTask/GridJob execution.

  • SystemExecutorServicedoc - Implementation of java.util.concurrent.ExecutorService to be used for processing of asynchronous job and task session responses. By default, standard ThreadPoolExecutor thread pool is provided and is configured to use 100 threads. Change this configuration parameter whenever you set task session attributes frequently or feel that responses are not processed fast enough.

  • PeerClassLoadingExecutorServicedoc - Implementation of java.util.concurrent.ExecutorService to be used for processing of all Peer Class Loading requests. By default, standard ThreadPoolExecutor thread pool is provided and is configured to use 20 threads. Change this configuration parameter whenever you feel that class-loading requests don’t get processed fast enough.

Tip Do not confuse executor services provided in configuration for thread pooling with grid-enabled executor service provided by GridGain.

5.1.6. Grid Lifecycle Beans

See Grid Lifecycle Beans documentation for information on how to specify lifecycle beans and examples.

5.1.7. SPIs - Server Provider Interfaces

Server Provider Interfaces allow you to configure virtually every aspect of GridGain, such as communication, discovery, topology and failover, load-balancing, etc… in LEGO-like fashion. For information on available SPI’s and their configuration refer to SPI’s documentation.

5.2. Specifying Different SPIs Per GridTask

Starting with GridGain 2.1 you can start multiple instances of Topology SPI, Load Balancing SPI, Failover SPI and Checkpoint SPI. If you do that, you need to tell a task which SPI to use (by default it will use the first SPI in the list).

Add @GridTaskSpisdoc annotation for your task to specify what SPIs it wants to use. If this annotation is omitted, then by default GridGain will pick the first corresponding SPI implementation from the array of SPIs provided in configuration.

This example shows how to configure different SPI’s for different tasks. Let’s assume that you have two worker nodes, Node1 and Node2. Let’s also assume that you configure Node1 to belong to SegmentA and Node2 to belong to SegmentB. Here is a sample configuration for Node1:

<bean id="grid.cfg" class="org.gridgain.grid.GridConfigurationAdapter" scope="singleton">
    <property name="userAttributes">
        <map>
            <entry key="segment" value="A"/>
        </map>
    </property>
</bean>

Node2 configuration looks similar to Node1 with segment attribute set to B:

<bean id="grid.cfg" class="org.gridgain.grid.GridConfigurationAdapter" scope="singleton">
    <property name="userAttributes">
        <map>
            <entry key="segment" value="B"/>
        </map>
    </property>
</bean>

Now, if you have Task1 and Task2 starting from some master node NodeM, you can easily configure Task1 to only run on SegmentA and Task2 to only run on SegmentB. Here is how configuration on master node NodeM would look like:

<bean id="grid.cfg" class="org.gridgain.grid.GridConfigurationAdapter" scope="singleton">
    <!--
        Topology SPIs. We have two named SPIs: One picks up nodes
        that have attribute "segment" set to "A" and another one sees
        nodes that have attribute "segment" set to "B".
    -->
    <property name="topologySpi">
        <list>
            <bean class="org.gridgain.grid.spi.topology.nodefilter.GridNodeFilterTopologySpi">
                <property name="name" value="topologyA"/>
                <property name="filter">
                    <bean class="org.gridgain.grid.GridJexlNodeFilter">
                        <property name="expression" value="node.attributes['segment'] == 'A'"/>
                    </bean>
                </property>
            </bean>
            <bean class="org.gridgain.grid.spi.topology.nodefilter.GridNodeFilterTopologySpi">
                <property name="name" value="topologyB"/>
                <property name="filter">
                    <bean class="org.gridgain.grid.GridJexlNodeFilter">
                        <property name="expression" value="node.attributes['segment'] == 'B'"/>
                    </bean>
                </property>
            </bean>
        </list>
    </property>
</bean>

Then your Task1 and Task2 would look as follows (note the @GridTaskSpis annotation):

@GridTaskSpis(topologySpi="topologyA")
public class GridSegmentATask extends GridTaskSplitAdapter<String, Integer> {
    ...
}

and

@GridTaskSpis(topologySpi="topologyB")
public class GridSegmentBTask extends GridTaskSplitAdapter<String, Integer> {
    ...
}

5.3. GridSpringBean

Grid Spring bean allows to bypass GridFactorydoc methods. In other words, this bean class allows to inject new grid instance from Spring configuration file directly without invoking static GridFactorydoc methods. This class can be wired directly from Spring and can be referenced from within other Spring beans. By virtue of implementing org.springframework.beans.factory.DisposableBean and org.springframework.beans.factory.InitializingBean interfaces, GridSpringBean automatically starts and stops underlying grid instance.

The following configuration parameters are optional:

  • Grid configuration (see setConfiguration(GridConfiguration)doc)

5.3.1. Spring Configuration Example

<bean id="mySpringBean" class="org.gridgain.grid.GridSpringBean" scope="singleton">
    <property name="configuration">
        <bean id="grid.cfg" class="org.gridgain.grid.GridConfigurationAdapter" scope="singleton">
            <property name="gridName" value="mySpringGrid"/>
        </bean>
    </property>
</bean>

Or use default configuration:

<bean id="mySpringBean" class="org.gridgain.grid.GridSpringBean" scope="singleton"/>

5.3.2. Java Example

Here is how you may access this bean from code:

AbstractApplicationContext ctx = new FileSystemXmlApplicationContext("/path/to/spring/file");

// Register Spring hook to destroy bean automatically.
ctx.registerShutdownHook();

Grid grid = (Grid)ctx.getBean("mySpringBean");

5.4. Examples

GridConfiguration may be defined in code:

GridConfigurationAdapter cfg = new GridConfigurationAdapter();

// Override default values for grid node.
cfg.setGridName("mygrid");

...

// Start grid.
GridFactory.start(cfg);

or from Spring configuration file (default Spring configuration file can be found in GRIDGAIN_HOME/config/default-spring.xml file):

<bean id="grid.cfg" class="org.gridgain.grid.GridConfigurationAdapter" scope="singleton">
    ...
    <property name="gridName" value="mygrid"/>
    ...
</bean>

5.5. AOP Configuration

In order to use annotation based @Gridifydoc AOP-based grid-enabling the following AOP configuration needs to be in place depending on which AOP implementation you choose to use. Note that you only need to pick one AOP implementation.

5.5.1. JBoss AOP

Standalone application

Note that GridGain is not shipped with JBoss and doesn’t include necessary JBoss libraries. We assume that if you choose to use JBoss AOP you would have these libraries anyways. The following configuration needs to be applied to enable JBoss byte code weaving:

  • The following JVM configuration must be present (make sure to replace com.foo.bar with your domain package):

    • -javaagent:[path to jboss-aop-jdk50-4.x.x.jar]

    • -Djboss.aop.class.path=[path to gridgain.jar]

    • -Djboss.aop.exclude=org,com -Djboss.aop.include=com.foo.bar

  • The following JARs should be in a classpath:

    • javassist-4.x.x.jar

    • jboss-aop-jdk50-4.x.x.jar

    • jboss-aspect-library-jdk50-4.x.x.jar

    • jboss-common-4.x.x.jar

    • trove-1.0.x.jar

JBoss AOP with JBoss AS
  • Install JBoss AOP deployer.

    • Remove the jboss-aop-jdk50.deployer directory of "server/your_server_name/deploy" in your JBoss AS

    • Download the latest stable version of JBoss AOP (1.5.5 GA)

    • Unzip it and make sure that all directories were unzipped case-sensitive

    • Copy the appropriate jboss-aop-jdk50.deployer directory from your JBoss AOP installation to your "server/your_server_name/deploy"

    • Edit jboss-aop-jdk50.deployer/jboss-service.xml, setting "EnableLoadTimeWeaving" with a true value, like follows:

      <attribute name="EnableLoadtimeWeaving">true</attribute>
      <attribute name="Exclude">java,javax,org,com,net,sun,oracle,EDU,antlr</attribute>
      <attribute name="Include">com.foo.bar</attribute>
      

      Make sure to replace com.foo.bar with your domain package. Also make sure to edit the exclude list if it does not have some packages that you would not like to weave.

    • Follow the instructions of the jboss-aop-jdk50.deployer/ReadMe.txt file

    • Copy pluggable-instrumentor.jar file (located in the lib-50 directory of your JBoss AOP installation) to the bin directory of your server

    • Edit your run.sh or run.bat to include -javaagent:pluggable-instrumentor.jar in the JAVA_OPTS

  • Deploy Gridgain as SAR.

    • Copy gridgain.sar directory from GRIDGAIN_HOME/config/jboss folder into your "server/your_server_name/deploy" folder.

    • Make sure to update classpath in gridgain.sar/META-INF/jboss-service.xml to point to all libs under GRIDGAIN_HOME and GRIDGAIN_HOME/libs.

Note
JBoss AOP and JSP
JBoss AOP CFLOW pointcut does not properly work JSP-compiled classes (it does not properly handle JSP classes on the stack). The workaround is to include pre-compiled JSP classes into your WAR file. Tomcat provides instructions on how to do that with JSPC here - Web Application Compilation.

5.5.2. AspectJ AOP

The following configuration needs to be applied to enable AspectJ byte code weaving:

  • JVM configuration should include: -javaagent:[GRIDGAIN_HOME]/libs/aspectjweaver-1.5.3.jar

  • Classpath should contain the [GRIDGAIN_HOME]/config/aop/aspectj folder.

5.5.3. Spring AOP

Spring AOP framework is based on dynamic proxy implementation and doesn’t require any specific runtime parameters for online weaving. All weaving is on-demand and should be performed by calling method GridifySpringEnhancer.enhance(Object) for the object that has method with Gridify annotation.

Note that since this method of weaving requires manual enhancing of participating classes, it is rather inconvenient in most cases, and AspectJ or JbossAOP are recommended over it. Spring AOP can be used in situations when code augmentation is undesired and cannot be used. It also allows for very fine grained control of what gets weaved.

BEA Weblogic AS

Weblogic application server does not support AspectJ and JBoss AOP officially and the only way to use AOP is a Spring AOP. One needs to enhance classes as described above using Spring AOP. See http://springide.org/blog/2006/05/24/implementing-jee-with-spring-and-weblogic for details.

6. Main Abstractions

This chapter will list some of the main concepts in GridGain that you need to understand to move forward. Most of them will be discussed in greater depth later on in the book - but it’s helpful to lay them out upfront so that you can follow up examples. This also gives you the bird view on GridGain architecture and API design.

Note Keep in mind that we don’t expect you to fully understand each topic below just yet - all of them will be discussed in-depth much later in the book. In fact, you can skim this chapter quickly - but we recommend at least that.

GridGain has several key abstractions that are essential for understanding pretty much everything else in GridGain. We’ll begin with them first.

6.1. GridNode Interface

GridNodedoc interface defines a logical grid node in the network topology. Note that a physical node (like a computer on the network) can have multiple logical grid nodes running on it. In fact, a single JVM can run multiple logical grid nodes - note that GridGain is the only software in the world allowing this unique capability.

GridNode interface has very concise API and deals only with a notion of a logical network endpoint, a node, in the topology: it has globally unique ID, node metrics, set of static attributes provided by the user and few other parameters.

Note
GridNode
GridNode has globally unique ID, set of static attributes provided by the user, node metrics and few other parameters.

The unique characteristic of GridGain is that it uses Peer-To-Peer (P2P) topology meaning that all nodes in GridGain are equal. There is no master or server nodes, and there are no worker or client nodes either. All nodes are equal from GridGain’s point of view - yet all these and any other roles can be assigned logically to the nodes.

This unique design gives GridGain tremendous flexibility: not only you are not limited to the master-worker mold - you can define any application specific roles and assign them to the nodes dynamically. More over, since these roles are logical, they can change and "migrate" from node to node as topology changes or based on your application logic.

GridNode interface is used primarily by internal kernal code and by discovery and communication SPI implementations and rarely used directly. GridRichNodedoc interface, its rich counterpart, is what used instead for majority of GridGain operations. More on this below.

6.2. Local Grid Node

Local grid node is an instance of GridNode interface that is instantiated in the local JVM runtime for a specific grid. In general, JVM process that runs GridGain runtime can have zero, one or more local grid nodes, but only one local node per specific grid’s topology.

6.3. Grid Topology

As the logical extension of the network endpoint - a grid node as defined above - we use the term topology throughout the GridGain documentation to reference a set of all logical grid nodes where each node "knows" every other nodes (in other words, topology is a fully connected graph of all grid nodes including the local node). We often refer to such topology as simply a grid.

Note
Topology is Associative
Note that the key characteristic of the topology is its associative property, i.e. the fact that each node "knows" every other node.

Depending on configured discovery SPI implementation (discussed later) the topology can be guaranteed to be consistent on all nodes at any point of time or be eventually consistent only. The eventual consistency means that all nodes will eventually get into fully consistent view but there a short time window where nodes can have a different view on the global topology. The guaranteed consistency is expensive to implement and it is optional in GridGain.

Note, however, that for data grid, for example, the configured SPI must provide consistent topology (i.e. support guaranteed discovery discussed later).

It is important to note that a single GridGain runtime can support any number of topologies or grids in the same time. Nodes from one topology have no knowledge about the nodes in other topologies. In such cases, single GridGain runtime (JVM process) will have several local grid nodes where each local node would belong to a different grid. Again, there is an important distinction between GridGain runtime (a JVM process) and a logical grid node running inside of that runtime.

Important
GridGain Runtime vs. Grid Node
There is an important distinction between GridGain runtime (a JVM process) and a logical grid node running inside of that runtime.

Note that we often refer to virtual sub-grid or sub-topology which is essentially just a subset of grid nodes from one topology. More on all that later.

6.4. GridProjection Interface

One of the major addition in GridGain 3.0 was introduction of a grid projection (and corresponding GridProjectiondoc interface. The important observation that led to this addition was the fact there is a large set of GridGain operations that can be defined uniformly on any arbitrary set of grid nodes.

Put it differently, each such operation can be performed on zero, one, two or any other number of grid nodes. For example, you can send a message to one, or more nodes in the grid. Just as you can listen for messages from one, two or any other number of nodes. And so on. To use functional programming terminology - these are the monadic operations defined on a set of grid nodes (which correspondingly makes GridProjection a monad).

Note
GridProjection is a Monad
GridProjection exposes monadic set of operations defined on an arbitrary non-empty set of grid nodes.

To logically group such operations the GridProjection interface is introduced and it defines all major operations in the GridGain that can be performed on a arbitrary set of nodes. You can think about grid projection as a specific view on topology. Projection can be static or dynamic and there are many ways how a projection can be defined.

As you will see throughout this book the elements of functional programming or functional API design are central to GridGain. You’ll discover that even when working with Java APIs you are dealing with functional constructs most of the time - even though Java is not a functional language to being with! This is one the unique sides of GridGain and it leads to extremely elegant and simple to use APIs.

We are going to cover projections and functional programming framework in Java in much greater details in subsequent chapters.

6.5. Rich Interfaces

Once we introduced grid projection it is pretty logical to extend definition of a grid node as a grid projection with just that one node it. Similarly, one can extend grid cloud definition as a grid projection that contains all nodes belonging to that specific physical cloud. And finally, it is only logical to provide a conveniently defined global projection that contains all the nodes in the topology.

This is exactly what GridRichNodedoc, and Griddoc interfaces do. They all extend GridProjection interface and add all necessary additional operations that are specific to a grid node, grid cloud or a global topology.

The idea of rich interfaces is central in Scala and Ruby libraries, for example. By having both thin and rich interfaces (GridNode and GridRichNode) we can satisfy both types of interface usages:

  • thin interface that needs to be implemented by the end user (and therefore should be as simple as possible), and

  • rich interface that is actually used by the end user (and therefore should be as rich as possible).

Note
Rich vs. Thin Interfaces
Thin interface that needs to be implemented by the end user and therefore should be as simple as possible. Rich interface that is actually used by the end user and therefore should be as rich as possible.

Historically, Grid interface - being a global all-inclusive projection, has an additional special purpose. Grid interface acts as a main entry point for entire GridGain functionality. In fact, most of the operations you perform on GridGain originate on Grid interface. That is where you can obtain the instance of data grid, get an instance of rich cloud interface for a specific cloud, create and manage grid projections, manually deploy grid tasks and perform multitude of other operations.

To get an instance of Grid interface you need to use GridFactory.

6.6. GridFactory Class

GridFactorydoc class is a life-cycle factory for Grid instances. Its purpose is to provide various ways to start and stop instances of Grid interface. Note that Grid interface - being the main entry point for GridGain APIs - has a strict life-cycle and state machine that is controlled by GridFactory. As noted before, single GridGain runtime (JVM process) can have zero or more Grid instances each providing a local view on a different grid.

The usual way to work with GridGain is to use GridFactory class to start Grid instance using specific (or default) configuration file (usually at the beginning of your application). Starting Grid instance means, among other things, starting a local grid node and have it join the topology. Once you have started Grid instance you can use any APIs provided by GridGain. When GridGain is no longer needed, you use GridFactory to stop the Grid instance and its node will leave the topology (usually at the end of your application).

6.7. GridCache Interface

GridCachedoc interface represent the main API entry point for data grid functionality. GridCache instance always refers to a single named cache. You can configure as many named caches as you like. You receive the GridCache instance from Grid instance (as anything else in GridGain). GridCache is a rich interface and represents global cache projection (see below).

6.8. GridCacheProjection Interface

GridCacheProjectiondoc interface is analogous to GridProjection but it defines a cache projection over specified set of key-value pairs. In fact, GridCache interface extends GridCacheProjection and simply represents a global cache projection, i.e. the projection over all key-value pairs in this cache.

Cache projections are extremely powerful technique in GridGain’s data grid. It provides a monadic set of operations that is defined on any arbitrary set of:

  • keys,

  • values, or

  • key-value pairs

giving data grid a distinct functional flavor and providing consistent API design between compute and data grids.

Important
Compute and In-Memory Data Grid Design Unification
This logical and design unification between compute and data grids around functional monadic concept is one of the unique characteristics of GridGain architecture.

6.9. MapReduce

MapReduce is a relatively new name for very old concept. In a strict terms the term MapReduce refers to patented algorithm introduced by Google in their internal distributed data processing systems and closely mirrored by Hadoop project developed by a competing Yahoo!.

We (as well as distributed programming community in general) tend to use term MapReduce in more wider sense since it was extensively publicized and we refer to any divide-and-concur design strategies or traditional parallel computing as MapReduce. In fact, if you have a long running task, you can split this task into multiple sub-tasks, execute these tasks in parallel, aggregate their results back and get you final result in a fraction of time. That’s a classic parallel programming, or compute grid - and to avoid myriads of names we call it MapReduce too.

Note
Google and MapReduce
It is important, to note, however, that specific implementation that we chose in GridGain has relatively little, if anything, to do with algorithm used by Google (or Hadoop). Not only Google’s algorithm is patented, but it is also very specific to Google’s needs and rarely applicable outside of extreme big-data use cases.

Hadoop provides one implementation of MapReduce that is closely matching Google definition as an Apache project. Note that Google granted the license to Hadoop to use patented Google algorithm.

6.10. Streaming MapReduce

Streaming MapReduce is a less-defined term but often refers to a type of processing similar to MapReduce (i.e. tasks gets split and reduced) but with input data is not finite in general. Typical example is a search in a live video feed: the obvious problem is that you can’t load the entire feed first and then partition it into small parts to be processed in parallel (like you would do in a traditional MapReduce); you need to somehow map and reduce the incoming data as it comes and in the same time keep providing failover, topology resolution, collision resolution, back pressure control, and all other necessary services.

GridGain provide several unique ways of how this type of processing can be implemented.

6.11. Real-Time Processing

GridGain is all about processing large data sets (a.k.a. BigData) in real-time. But what real-time are we talking about?

In GridGain - we are talking about perceptual real-time, or a software real-time (Java Virtual Machine doesn’t technically support hardware real-time processing). Perceptual real-time is defined by a maximum response time that a typical user will wait for the task he or she expects to be "instant" before cancelling the task. For example, when a typical user clicks "Add to Basket" button on the website anything beyond couple of seconds will probably make that user to click "Back" or otherwise cancel the task. For a online trading application the delay of few seconds on submitting the order will indicate something wrong with the system. Portfolio management application that takes 10 seconds to open last minute moving average chart is practically unusable. And so on…

6.12. Closures & Predicates

We mention closures and predicates here only because we provide their full implementation on Java side. Unlike Scala or Groovy, where closures (functions) are part of the language, in Java they are not - and we had to develop an entire state of the art distributed functional framework for Java that is included with GridGain.

Closure is a block of code that encloses its body and any outside variables used inside of it as a function object. You can then pass such function object anywhere you can pass a variable and execute it. Predicate is a special type of closure that simply returns boolean value.

Scala as a hybrid OOP and FP language offers natural advantage over Java since it provides native in-language support for closures which enables much more concise and elegant APIs provided by GridGain’s Scalar DSL. However, with GridGain’s functional framework we brought Java functional usage as close to Scala as possible.

Below is a simple example that broadcasts and prints string on all nodes in the topology. Just compare how close Java, Groovy and Scala implementations are:

GridGain Scalar DSL:
...
object Test {
    // Broadcast "Howdy!" string to all nodes.
    def main(args: Array[String]) = scalar {
        grid$ *< (BROADCAST, () => println("Howdy!")
    }
}
...
GridGain Grover DSL:
...
@Typed
@Use(GroverProjectionCategory)
class Test {
    // Broadcast "Howdy!" string to all nodes.
    static void main(String[] args) {
        grover {
            Grid g -> g.run(BROADCAST) { println("Howdy!") }
        }
    }
}
...
GridGain Java APIs:
...
public class Test {
    // Broadcast "Howdy!" string to all nodes.
    public void main(String[] args) {
        G.in(null, new CIX1() {
            @Override public void applyx(Grid g) throws GridException {
                g.run(BROADCAST, F.println("Howdy!"));
            }
        };
    }
}
...

Closures and predicates used extensively in GridGain APIs and at the core of our design. You will discover plenty of examples later in the book of how closures and predicates - in Java, Groovy and Scala - used throughout the GridGain.

6.13. Type Aliases & Typedefs

One of the unusual features of our Java side APIs is introduction of type aliases (also known as typedefs) in GridGain 3.0. With introduction of functional design in GridGain 3.0 we have quickly discovered that Java APIs are just way too "chatty" due to lack of any type inference by the Java compiler. The only way to combat that problem in Java is to introduce type alias - a subclass with a shorter name.

Obviously, it works best (or at all) for static, factory-type classes but it also works for object instantiations but unfortunately you can’t use type aliases in method signature since they are different types. Such as live in a Java world…

Note
Aliases Applicability
Due to Java-based implementation type aliases or typedefs are used only for static, factory-type classes.

We’ve introduced aliases only for key GridGain types and there’s only about a dozen of aliases in GridGain APIs. It’s pretty easy to memorize the key ones that are used most frequently. Usage of aliases is, of course, optional as you can always use full (original) names of the types. But we are pretty sure that you’ll find aliases on Java side useful to make your code more readable.

Here’s good example demonstrating how aliases can Java code slightly more readable:

Without Typedefs:
GridFunc.copy(res, goods,
    GridFunc.<Item>and(
        GridFunc.<Item>notNull(),
        GridFunc.<Item>or(
            new GridPredicate<Item>() {
                @Override public boolean apply(Item item) {
                    return item.novelty;
                }
            },
            new GridPredicate<Item>() {
                @Override public boolean apply(Item item) {
                    return item.price < 150;
                }
            }
        )
    )
);

GridFunc.forEach(res, GridFunc.<Item>println());
With Typedefs:
F.copy(res, goods,
    F.<Item>and(
        F.<Item>notNull(),
        F.<Item>or(
            new P1<Item>() {
                @Override public boolean apply(Item item) {
                    return item.novelty;
                }
            },
            new P1<Item>() {
                @Override public boolean apply(Item item) {
                    return item.price < 150;
                }
            }
        )
    )
);

F.forEach(res, F.<Item>println());

6.14. Grid Task and Job

Grid task and job are the main abstractions in the compute grid. As you recall the compute grid is about parallelization the processing, i.e. splitting a long running task into multiple sub-tasks, executing those sub-tasks in parallel and aggregating their results into one final result.

GridTaskdoc defines the descriptor for the overall task to be processed and grid job defines a sub-tasks, i.e. the piece of code that will travel to remote nodes for execution. Essentially a grid task defines mapping and reducing logic, while GridJobdoc is very similar to java.util.Callable interface by defining a simple executable body.

Note
GridTask and GridJob
GridTask defines overall task descriptor as well as mapping and reducing logic. GridJob defines units of work that tasks get split into and that travel to remote nodes for execution. The result of their execution will be reduced by the task into the final result.

Note that anything that gets executed on the compute grid should be defined as a grid task. Closures and AOP-based executions get converted to grid task automatically when needed.

6.15. Service Provider Interface (SPI)

Service Provider Interface (SPI) is at core of GridGain architecture as it provide pluggable modularization to GridGain. SPI concept is simply a component interface with multiple pluggable implementations that all share unified life cycle management. There are two key benefits in this approach:

  • Rest of GridGain (including all user application code) only uses the SPI interface and is totally independent from its specific implementation.

  • User can provide its own pluggable implementation for SPIs and therefor not limited to the one provided by GridGain itself.

The example of SPI is communication subsystem. It has simple interface GridCommunicationSPIdoc and half a dozen implementations that GridGain provides out of the box. These implementations can be set in grid configuration GridConfiguration (and TCP/IP-based implementation is always set by default so that you don’t have to set anything to get GridGain working right away).

Note
GridGain SPI - Service Provider Implementation
Experienced reader can quickly notice that SPI is very similar to OSGI. Although we looked carefully at OSGI number of years ago we quickly concluded that we needed more custom functionality that OSGI provided at the time. In the same time our SPI architecture and OSGI share the same goals of componentization and modularity.

GridGain is "sliced" into dozen of different SPIs for all major subsystems and each such subsystem can be fully replaced by user’s specific implementation - an extremely powerful feature of GridGain.

7. Functional Programming Framework

Introduction of Functional Programming (FP) into Java-based GridGain has its own unique story in GridGain that is worth repeating here.

It was anything but a straightforward decision… In early 2008 when we released GridGain 2.0 we’ve started looking for a new ways to simplify the usability of GridGain. We’ve had a pretty good story with AOP-based grid enabling and a direct API support for MapReduce type of processing - in fact we’ve been way ahead of everyone else in this department. But we felt there’s still lots of plumbing exposed for many use cases where such exposition was clearly unnecessary.

The obvious thought for us was to look at Domain Specific Language (DSL) route. We quickly realized that Java-based DSL is simply out of question (it would be just another set of Java APIs not much different from we had already). XML-based DSL (or any type of external DSL) was considered a non-started even in a hay days of DSLs of 2006-2007.

So, we started looking at other JVM-based languages that would be much more appropriate for DSL development yet let us reuse GridGain Java-based APIs. During surprisingly short evaluation period (which we’ll chronicle in Scala section later on) we quickly and decisively settled on Scala - relatively new than language that so powerfully combined OOP and FP into one cohesive and expressive language.

As we dived into Scala-based DSL development with a renewed energy we quickly realized that in order to provide truly powerful DSL in Scala (utilizing Scala’s functional core including partial functions, closures, etc.) we essentially had to re-implement most of the main APIs in Scala, i.e. duplicate GridGain in Scala. That was a pretty rude awakening for us as it was simply a no-go to have GridGain implemented in two parallel tracks: one in Java and one in Scala.

And that’s where functional story for Java begins. After some research on our part we noticed that if we could make Java side APIs functional in their design - we could largely reduce the Scala-based DSL to a collection of implicit conversions from functional Java parts to Scala parts (and back). That would also allow to have all implementation in Java (where it is originally was) and keep Scala APIs as a layer on top without any duplication of code what-so-ever.

We’ve set off to develop a first truly distributed Functional Programming framework for Java.

After a few false starts we’ve got a first working prototype (that didn’t suck) and started the refactoring process of our Java APIs into functional mold. What we started noticing along the way is that new APIs (newly added or refactored) were becoming much more powerful and elegant when used with functional constructs such as predicate-based node filters or closure-based executions. In about that time we also came up with grid and cache projections that truly revolutionized the GridGain usability. Many implementations became shorter and yet more expressive when we started using our GridFunc class as well as newly introduced typedefs. All these positive effects on our internal software development solidified our resolve to provide the same capabilities to the users of GridGain.

Note
Scala Leads to FP in Java

All in all, the introduction of Scala support and Scala DSL into GridGain led us to develop one of the most comprehensive Functional Programming frameworks in Java that at the core of most of the functionality in GridGain.

What is even more interesting (and exciting for us) is that FP focus on Java side made the development of Grover, our Groovy++ based DSL for GridGain, simple if not downright trivial. It took just a week to release first beta version of it and it was already mighty useful for large Groovy community.

More on that later…

7.1. Type Aliases and Typedefs

Type aliases or typedefs used in programming languages to give shorter name to existing type name to make the code that is using it more readable and easer to understand.

Many languages provide built-in support for type aliases or typedefs.

C-based languages (C/C++/Objective-C) provide direct support for it:

Objective-C
typedef enum {A, B, C} myEnum;
C/C++
typedef int myInt, yoursInt;

Scala also provides excellent built-in support for type alias that goes beyond C-based capabilities. You can declare a type alias right during importing the original type:

import foo.bar.{Original => O} // 'O' becomes alias of 'Original'

And you can also declare the type alias in your code much the same as declaring method or a field (and similar to C/C++):

type Call[R] = () => R // A shorthand for function
type OneWayTicket = () => Unit // Another shorthand for function

You often use structural types in Scala with type aliases:

type T = { def foo: Unit }

declares T as an alias for any type that has method foo with return type Unit.

Java, unfortunately, does not provide any support for type aliases… Yet Java needs them more than any language above due to its syntactic bloat and lack of any reasonable type inference. In fact, look at this "typical" Java code:

private HashMap<Collection<String>, Set<Integer>> map =
    new HashMap<Collection<String>, Set<Integer>>();

Since there is no type inference you have to repeat bloated HashMap definition twice in this definition for no reason what-so-ever - it only makes code look more busy and less readable. And if this type of HashMap is used frequently - and there’s no way to shorthand it - you have to repeat this over and over again in every place where it is used.

Note Java needs typedefs more than any language above due to its syntactic bloat and lack of any reasonable type inference.

Surprisingly enough, this shortcoming of Java is often sighted as one of the major reason the Java code "feels" bloated and unnecessary verbose. This gets even more pronounced when you start using Scala that, like Java, is fully statically typed but removes most, if not all, bloat and unnecessary repetition from the code.

With introduction of Functional Programming in GridGain 3.0 (explained in the following chapter) we were faced with this very problem in Java APIs: we had plenty of new interfaces and classes that were used literally everywhere in our code and it was becoming unwieldy in many places since many of them require parametrization and code was becoming simply ridiculous in some place… Needless to say that users of GridGain APIs would have been faced with the same problems.

To solve this problem (somewhat) we have introduced typedefs to our Java APIs (Scala APIs naturally use Scala language type aliases).

Important
Typedef
Essentially, a typedef is simply a subclass with a shorter name - as there is no other way to do that in Java.

In package org.gridgain.grid.typedefdoc we have few dozens of typedefs defined as sub-classes with short one-two letter names for various frequently used types in GridGain. The following table shows all typedefs shipped with GridGain:

Typedef or Type Alias Original Type

C1<E1,R>doc

org.gridgain.grid.lang.GridClosure<E1,R>doc

C2<E1,E2,R>doc

org.gridgain.grid.lang.GridClosure2<E1,E2,R>doc

C3<E1,E2,E3,R>doc

org.gridgain.grid.lang.GridClosure3<E1,E2,E3,R>doc

CAdoc

org.gridgain.grid.lang.GridAbsClosuredoc

CAXdoc

org.gridgain.grid.lang.GridAbsClosureXdoc

CI1<T>doc

org.gridgain.grid.lang.GridInClosure<T>doc

CI2<E1,E2>doc

org.gridgain.grid.lang.GridInClosure2<E1,E2>doc

CI3<E1,E2,E3>doc

org.gridgain.grid.lang.GridInClosure3<E1,E2,E3>doc

CIX1<T>doc

org.gridgain.grid.lang.GridInClosureX<T>doc

CIX2<E1,E2>doc

org.gridgain.grid.lang.GridInClosure2X<E1,E2>doc

CIX3<E1,E2,E3>doc

org.gridgain.grid.lang.GridInClosure3X<E1,E2,E3>doc

CO<T>doc

org.gridgain.grid.lang.GridOutClosure<T>doc

COX<T>doc

org.gridgain.grid.lang.GridOutClosureX<T>doc

CX1<E1,R>doc

org.gridgain.grid.lang.GridClosureX<E1,R>doc

CX2<E1,E2,R>doc

org.gridgain.grid.lang.GridClosure2X<E1,E2,R>doc

CX3<E1,E2,E3,R>doc

org.gridgain.grid.lang.GridClosure3X<E1,E2,E3,R>doc

Fdoc

org.gridgain.grid.lang.GridLangdoc

Gdoc

org.gridgain.grid.GridFactorydoc

P1<E1>doc

org.gridgain.grid.lang.GridPredicate<E1>doc

P2<T1,T2>doc

org.gridgain.grid.lang.GridPredicate2<T1,T2>doc

P3<T1,T2,T3>doc

org.gridgain.grid.lang.GridPredicate3<T1,T2,T3>doc

PAdoc

org.gridgain.grid.lang.GridAbsPredicatedoc

PCE<K,V>doc

org.gridgain.grid.lang.GridPredicate<GridCacheEntry<K, V>>

PEdoc

org.gridgain.grid.lang.GridPredicate<GridEvent>

PKV<K,V>doc

org.gridgain.grid.lang.GridPredicate2<K, V>

PNdoc

org.gridgain.grid.lang.GridPredicate<GridRichNode>

PX1<E1>doc

org.gridgain.grid.lang.GridPredicateX<E1>doc

PX2<T1,T2>doc

org.gridgain.grid.lang.GridPredicate2Xdoc

PX3<T1,T2,T3>doc

org.gridgain.grid.lang.GridPredicate3Xdoc

R1<E1,R>doc

org.gridgain.grid.lang.GridReducerdoc

R2<E1,E2,R>doc

org.gridgain.grid.lang.GridReducer2doc

R3<E1,E2,E3,R>doc

org.gridgain.grid.lang.GridReducer3doc

RX1<E1,R>doc

org.gridgain.grid.lang.GridReducerXdoc

RX2<E1,E2,R>doc

org.gridgain.grid.lang.GridReducer2Xdoc

RX3<E1,E2,E3,Rdoc

org.gridgain.grid.lang.GridReducer3Xdoc

As you can see typedefs defined primarily for functional classes (tuples, closures, and predicates) as well as for a few factory classes like GridFactorydoc and GridFuncdoc. Here’s a short sub-list of the most frequently used typedefs in GridGain:

Typedef or Type Alias Original Type

C1<E1,R>doc

org.gridgain.grid.lang.GridClosure<E1,R>doc

CAdoc

org.gridgain.grid.lang.GridAbsClosuredoc

CO<T>doc

org.gridgain.grid.lang.GridOutClosure<T>doc

Fdoc

org.gridgain.grid.lang.GridLangdoc

P1<E1>doc

org.gridgain.grid.lang.GridPredicate<E1>doc

PAdoc

org.gridgain.grid.lang.GridAbsPredicatedoc

PNdoc

org.gridgain.grid.lang.GridPredicate<GridRichNode>

Here’s a code snipped from GridFunctionalCopyExample example that is shipped with GridGain. First version does not use typedefs and uses full names of the types:

Without Typedefs:
GridFunc.copy(res, goods,
    GridFunc.<Item>and(
        GridFunc.<Item>notNull(),
        GridFunc.<Item>or(
            new GridPredicate<Item>() {
                @Override public boolean apply(Item item) {
                    return item.novelty;
                }
            },
            new GridPredicate<Item>() {
                @Override public boolean apply(Item item) {
                    return item.price < 150;
                }
            }
        )
    )
);

GridFunc.forEach(res, GridFunc.<Item>println());
With Typedefs:
F.copy(res, goods,
    F.<Item>and(
        F.<Item>notNull(),
        F.<Item>or(
            new P1<Item>() {
                @Override public boolean apply(Item item) {
                    return item.novelty;
                }
            },
            new P1<Item>() {
                @Override public boolean apply(Item item) {
                    return item.price < 150;
                }
            }
        )
    )
);

F.forEach(res, F.<Item>println());

I would argue that the second version gains more readability and easier to understand since we don’t have to repeat ad nauseum GridFunc and GridPredicate in every line.

Note also that we have couple of typedefs that shorten parameterized types that allows for greater brevity and more concise code:

Typedef or Type Alias Original Type

PCE<K,V>doc

org.gridgain.grid.lang.GridPredicate<GridCacheEntry<K, V>>

PEdoc

org.gridgain.grid.lang.GridPredicate<GridEvent>

PKV<K,V>doc

org.gridgain.grid.lang.GridPredicate2<K, V>

PNdoc

org.gridgain.grid.lang.GridPredicate<GridRichNode>

Warning
Limitation

Now, this approach obviously has limitation:

  • Since typedefs are sub-classes - you can’t use them in signatures unless you expect to be passed a typedef itself - a rather bad approach.

  • Another limitation is that typedef is a new type and a different one from the original type - so any code that relies on exact types (AOP, IoC, etc.) may no longer work. So, essentially, in Java you can only use typedefs during instance creation - similar, to a certain degree, to factory methods.

Note
Scala
You can freely use Java-based typedefs in Scala code - but we suggest to use native type alias support provided by Scala

7.1.1. Typedefs vs. Factory Methods

Now, you may ask why not use factory methods, a standard Java idiom, instead? GridGain actually provides plenty of factory methods in GridFuncdoc class (that itself has F typedef). But factory methods often tend to be more verbose and sometime hide the "creation of new instance" context.

Factory Method
Foobar v = FoobarFactory.newFoobar(...);

or

Typedefs
Foobar v = new T(...); // 'T' is a typedef for 'Foobar'

In our experience working with GridGain source code we’ve found that typedefs generally provide for the most terse code without loosing context or readability.

7.1.2. Where To Use Typedefs

The answer here is simple - everywhere unless you lose in readability of your code.

Important
Do Not Trade In Readability
We strongly believe that you should never trade in few saved characters for poorer code readability.

Once you get familiar with the some of the most frequently used typedefs - you will start using them freely and in most situation they will improve your code - make it less bloated and concentrate reader’s attention on the actual business logic and away from unnecessary repetitive declarations.

8. GridGain Basics

In this pretty long chapter we’ll cover all basic functionality available in GridGain apart from the "big two" - compute grids and data grids - which will be covered in subsequent individual chapters. Both big subsystems are fundamentally based on the functionality explained in this chapter and therefore the following material is pretty important.

Note What’s more interesting is that fact that some GridGain-based applications don’t even use any of the two main technologies we have - but utilize, for example, actor-based message passing, distributed functional programming, zero deployment or event-based processing provided by GridGain.

8.1. Logging

GridGain provides pluggable logging capability by allowing the user to specify his own logging framework. This is especially convenient when GridGain runs inside of the hosting environment such as servlet container or application server.

In such case, GridGain can be easily configured to route all its logging through host’s logging framework eliminating the need to have multiple log file locations. This also dramatically simplifies the debugging since multiple log file don’t have to be line-by-line synchronized.

To provide this pluggability GridGain relies on its own interface GridLoggerdoc that provides absolute minimal API for logging. While GridGain uses this interface throughout entire product it further provides out-of-the-box complete implementations for this interface using the following popular logging frameworks:

Users, of course, are free to provide their own implementations and many often do for integration with existing log analysis or health monitoring solutions.

8.1.1. Configuration

GridGain logger could be configured either from code by modifying GridConfiguration during start of GridGain or via Spring XML. Following examples demonstrate both ways for Log4J and JCL loggers:

GridConfiguration cfg = new GridConfigurationAdapter();
...
// Log4J logger.
URL xml = U.resolveGridGainUrl("modules/tests/config/log4j-test.xml");
GridLogger log = new GridLog4jLogger(xml);
...
cfg.setGridLogger(log);
...
<property name="gridLogger">
    <bean class="org.gridgain.grid.logger.jcl.GridJclLogger">
        <constructor-arg type="org.apache.commons.logging.Log">
            <bean class="org.apache.commons.logging.impl.Log4JLogger">
                <constructor-arg type="java.lang.String" value="config/default-log4j.xml"/>
            </bean>
        </constructor-arg>
    </bean>
</property>
...
Configuring Java Logging

Here is an example of configuring Java logger in GridGain configuration Spring file to work over Log4J implementation. Note that we use the same configuration file as we provide by default:

...
<property name="gridLogger">
    <bean class="org.gridgain.grid.logger.java.GridJavaLogger">
        <constructor-arg type="java.util.logging.Logger">
            <bean class="java.util.logging.Logger">
                <constructor-arg type="java.lang.String" value="global"/>
            </bean>
        </constructor-arg>
    </bean>
</property>
...

or

...
<property name="gridLogger">
    <bean class="org.gridgain.grid.logger.java.GridJavaLogger"/>
</property>
...

And the same configuration if you’d like to configure GridGain in your code:

GridConfiguration cfg = new GridConfigurationAdapter();
...
GridLogger log = new GridJavaLogger(Logger.global);
...
cfg.setGridLogger(log);

or which is actually the same:

GridConfiguration cfg = new GridConfigurationAdapter();
...
GridLogger log = new GridJavaLogger();
...
cfg.setGridLogger(log);
Configuring Log4j

Here is a typical example of configuring log4j logger in GridGain configuration file:

<property name="gridLogger">
    <bean class="org.gridgain.grid.logger.log4j.GridLog4jLogger">
        <constructor-arg type="java.lang.String" value="config/default-log4j.xml"/>
    </bean>
</property>

and from your code:

GridConfiguration cfg = new GridConfigurationAdapter();
...
URL xml = U.resolveGridGainUrl("modules/tests/config/log4j-test.xml");
GridLogger log = new GridLog4jLogger(xml);
...
cfg.setGridLogger(log);
Configuring JBoss logging

Information about configuring JBoss logging with GridGain can be found at http://docs.jboss.org/process-guide/en/html/logging.html.

Configuring Tomcat logging

Please refer to http://tomcat.apache.org/tomcat-6.0-doc/logging.html for more information on how to configure GridGain with Tomcat logging.

Configuring JCL

This logger wraps any JCL - Jakarta Commons Logging loggers. Implementation simply delegates to underlying JCL logger. This logger should be used by loaders that have JCL-based internal logging (e.g., Websphere).

Here is an example of configuring JCL logger in GridGain configuration Spring file to work over Log4J implementation. Note that we use the same configuration file as we provide by default:

...
<property name="gridLogger">
    <bean class="org.gridgain.grid.logger.jcl.GridJclLogger">
        <constructor-arg type="org.apache.commons.logging.Log">
            <bean class="org.apache.commons.logging.impl.Log4JLogger">
                <constructor-arg type="java.lang.String" value="config/default-log4j.xml"/>
            </bean>
        </constructor-arg>
    </bean>
</property>
...

If you are using system properties to configure JCL logger use following configuration:

...
<property name="gridLogger">
    <bean class="org.gridgain.grid.logger.jcl.GridJclLogger"/>
</property>
...

And the same configuration if you’d like to configure GridGain in your code:

GridConfiguration cfg = new GridConfigurationAdapter();
...
GridLogger log = new GridJclLogger(new Log4JLogger("config/default-log4j.xml"));
...
cfg.setGridLogger(log);

or following for the configuration by means of system properties:

GridConfiguration cfg = new GridConfigurationAdapter();
...
GridLogger log = new GridJclLogger();
...
cfg.setGridLogger(log);
Configuring SLF4J

This logger should be used by hosts that have slf4j-based logging.

Here is an example of configuring SLF4J logger in GridGain configuration Spring file:

<property name="gridLogger">
    <bean class="org.gridgain.grid.logger.slf4j.GridSlf4jLogger"/>
</property>

8.1.2. Injection vs. Instantiation

Instance of GridLogger interface can be obtain at any point via Grid.log() method. However, when logger is needed in grid task and/or jobs it is preferable to use resource injection via @GridLoggerResourcedoc annotation that annotates a field or a setter method for injection of GridLogger instance.

Logger can be injected into instances of following classes:

  • GridTask

  • GridJob

  • GridSpi (and its all implementations)

  • GridLifecycleBean

  • Any object annotated with @GridUserResource annotation

Here is how injection would typically happen:

public class MyGridJob implements GridJob {
     ...
     @GridLoggerResource
     private GridLogger log;
     ...
 }

or

public class MyGridJob implements GridJob {
    ...
    private GridLogger log;
    ...
    @GridLoggerResource
    public void setGridLogger(GridLogger log) {
         this.log = log;
    }
    ...
}

8.1.3. Quiet Mode

GridGain 3.0 introduced a quiet logging mode. Essentially, this mode suppresses most of the INFO and all DEBUG logging and provides very concise logging output. This mode is very useful for examples and demonstration as well as for everyday development where full output of INFO or DEBUG is not necessary.

By default starting with version 3.0 GridGain starts in quite mode suppressing INFO and DEBUG log output. If system property GRIDGAIN_QUIET is set to false than GridGain will operate in normal un-suppressed logging mode (with whatever logging back-end is configured). Note that all output in quiet mode is done through standard output (STDOUT).

Note that GridGain’s standard startup scripts $GRIDGAIN_HOME/bin/ggstart.{sh|bat} starts by default in quiet mode. Both scripts accept -v arguments to turn off quiet mode.

8.2. Loaders

Grid loaders are used to start grid in different environments. Loaders provide basic boilerplate code for starting GridGain in various environments. For example, when starting within application servers such as JBoss, Weblogic or Websphere, provided loaders will configure GridGain to use "native" logging, JMX facility, discovery and execution services (JSR-237) which makes GridGain basically blend into hosting environment. Loaders do not need to implement any interface and their sole responsibility is to configure and start grid.

8.2.1. Command Line Loader

Command line loader located in org.gridgain.grid.loaders.cmdline package.

Command line loader is used to start grid from a command line script. Grid comes with ggstart.{sh|bat} startup script located in $GRIDGAIN_HOME/bin folder. By default this script will use configuration defined in $GRIDGAIN_HOME/config/default-spring.xml. This configuration will pick default configuration for all grid internal components and SPI’s which is sufficient for running examples and doing your own development and testing.

If you wish to provide your own configuration file, simply pass its path as parameter to the script.

ggstart.sh C:\myfolder\mygrid.xml

To stop grid, simply press CTRL-C which will initiate GridGain stop routine.

Tip
Script Startup
Note that in addition to starting grid nodes on separate physical machines, GridGain supports starting multiple grid nodes on the same machine as well as in the same VM. The only requirement for default configuration is that IP-Multicast is supported.
Tip
Custom Jars
Starting from 2.1.0 you can add your libraries to the class path without changing startup scripts. Just put them into the $GRIDGAIN_HOME/libs/ext directory and GridGain will pick them up automatically.
Warning
GRIDGAIN_HOME Environment Variable
If you get the following error: Exception in thread "main" java.lang.NoClassDefFoundError: org/gridgain/grid/loaders/cmdline/GridCommandLineLoader, then your $GRIDGAIN_HOME environment variable is not set or is set incorrectly. Please set $GRIDGAIN_HOME environment variable to your GridGain installation folder.

8.2.2. GlassFish Loader

GlassFish loader located in org.gridgain.grid.loaders.glassfish package.

GlassFish loader is used to start GridGain within GlassFish application server. GridGain loader implemented as GlassFish life-cycle listener module. GlassFish loader should be used to provide tight integration between GridGain and GlassFish AS. Current loader implementation works on both GlassFish v1 and GlassFish v2 servers.

The following steps should be taken to configure this loader:

  1. Add GridGain libraries in GlassFish common loader. See GlassFish Class Loaders

  2. Create life-cycle listener module. Use command line or administration GUI:

    asadmin> create-lifecycle-module --user admin --passwordfile ../adminpassword.txt --classname "org.gridgain.grid.loaders.glassfish.GridGlassfishLoader" --property cfgFilePath="config/default-spring.xml" GridGain

For more information consult GlassFish Project - Documentation Home Page.

Note that GlassFish is not shipped with GridGain. If you don’t have GlassFish, you need to download it separately. See https://glassfish.dev.java.net for more information.

8.2.3. Tomcat Loader

Tomcat loader located in org.gridgain.grid.loaders.tomcat package.

Tomcat loader is used to start GridGain within Tomcat server. GridGain loader implemented as Tomcat LifecycleListener. Tomcat loader should be used to provide tight integration between GridGain and Tomcat web container (logging, MBean server).

The following steps should be taken to configure this loader:

  1. Add GridGain libraries in Tomcat common loader. Add in file $TOMCAT_HOME/conf/catalina.properties for property common.loader the following $GRIDGAIN_HOME/gridgain.jar,$GRIDGAIN_HOME/libs/*.jar (replace $GRIDGAIN_HOME with absolute path).

  2. Add GridGain LifeCycle Listener in $TOMCAT_HOME/conf/server.xml.

<!-- GridGain loader -->
<Listener className="org.gridgain.grid.loaders.tomcat.GridTomcatLoader"
          configurationFile="config/default-spring.xml"/>

Note that Tomcat is not shipped with GridGain. If you don’t have Tomcat, you need to download it separately. See http://tomcat.apache.org for more information.

8.2.4. JBoss Loader

JBoss loader located in org.gridgain.grid.loader.jboss package.

JBoss loader is used to start GridGain within JBoss as a JBoss service. Note that jboss-service.xml has a configuration parameter pointing to Spring XML configuration. At startup, JBoss loader will look for the Spring configuration XML file specified in jboss-service.xml.

GridGain ships with pre-built SAR directory. SAR directory located in $GRIDGAIN_HOME/config/jboss folder. You can simply deploy GridGain into JBoss into 2 steps:

  • Change the codebase in jboss-service.xml in META-INF sub-folder to point to correct location.

  • Copy entire SAR directory from $GRIDGAIN_HOME/config/jboss folder to deploy directory of the JBoss.

Here is how $GRIDGAIN_HOME/config/jboss/jboss-service.xml looks (note we use 1.5.0 version as an example):

<!DOCTYPE server PUBLIC "-//JBoss//DTD MBean Service 4.0//EN" "http://www.jboss.org/j2ee/dtd/jboss-service_4_0.dtd">

<!--
    JBoss service descriptor for GridGain JBoss Loader.

    Classpath should contain the following libraries:
    - $GRIDGAIN_HOME/libs/*.jar
    - $GRIDGAIN_HOME/gridgain_1.5.0.jar

    For example, if GridGain is installed into /opt/gridgain-1.5.0 then
    you can use the following classpath settings to includes all
    necessary JARs:

    <classpath codebase="/opt/gridgain-1.5.0/gridgain-1.5.0.jar"/>
    <classpath codebase="/opt/gridgain-1.5.0/libs" archives="*"/>
-->
<server>
    <classpath codebase=".." /> <!-- FIX IT BEFORE USING. -->

    <mbean code="org.gridgain.grid.loaders.jboss.GridJbossLoader" name="gridgain:service=loader">
        <!--
            config/default-spring.xml - Default GridGain configuration.
            config/jboss/ha/jboss-gridgain-ha-spring.xml - JBoss specific configuration that
                will use JBoss SPIs for communication and discovery. Requires JBoss HA enabled.
        -->
        <attribute name="ConfigurationFile">config/default-spring.xml</attribute>
    </mbean>
</server>
Warning Currently provided JBoss loader doesn’t work with JBoss 7. Use servlet context listener loader instead.

8.2.5. WebLogic Loader

Weblogic loader located in org.gridgain.grid.loader.weblogic package.

Weblogic loader is used to start GridGain within Weblogic application server. GridGain loader for WebLogic implemented as a pair of start and shutdown classes. Please consult WebLogic documentation on how to configure startup classes in Weblogic. Weblogic loader is used for tight integration with Weblogic AS. Specifically, Weblogic loader integrates GridGain with Weblogic logging, MBean server, and work manager (JSR-237).

The following steps should be taken to configure startup and shutdown classes:

  1. Add Startup and Shutdown Class in admin console (Environment → Startup & Shutdown Classes → New).

  2. Add the following parameters for startup class:

    • Name: GridWeblogicStartup

    • Classname: org.gridgain.grid.loaders.weblogic.GridWeblogicStartup

    • Arguments: cfgFilePath=config/default-spring.xml

  3. Add the following parameters for shutdown class:

    • Name: GridWeblogicShutdown

    • Classname: org.gridgain.grid.loaders.weblogic.GridWeblogicShutdown

  4. Change classpath for WebLogic server in startup script: CLASSPATH="$CLASSPATH:$GRIDGAIN_HOME/gridgain.jar:$GRIDGAIN_HOME/libs/"

Note that Weblogic is not shipped with GridGain. If you don’t have Weblogic, you need to download it separately. See http://www.bea.com more information.

8.2.6. WebSphere Loader

Websphere loader located in org.gridgain.grid.loaders.websphere package.

Websphere loader is used to start GridGain within Websphere application server. This is GridGain loader implemented as Websphere custom service (MBean). Websphere loader should is used to provide tight integration between GridGain and Websphere AS. Specifically, Websphere loader integrates GridGain with Websphere logging, MBean server and work manager (JSR-237).

The following steps should be taken to configure this loader:

  1. Add CustomService in admin console (Application Servers → server1 → Custom Services → New).

  2. Add custom property for this service: cfgFilePath=config/default-spring.xml.

  3. Add the following parameters:

    • Classname: org.gridgain.grid.loaders.websphere.GridWebsphereLoader

    • Display Name: GridGain

    • Classpath (replace $GRIDGAIN_HOME with absolute path): "$GRIDGAIN_HOME/gridgain.jar:$GRIDGAIN_HOME/libs/". Note that forward slash (/) at the end is critical.

Note that Websphere is not shipped with GridGain. If you don’t have Websphere, you need to download it separately. See http://www.ibm.com/software/websphere/ for more information.

8.2.7. Servlet context listener loader

Servlet context listener loader located in org.gridgain.grid.loaders.servlet package.

This loader can be used to startup GridGain grid inside any web container as servlet context listener. Loader must be defined in web.xml file.

<context-param>
    <param-name>cfgFilePath</param-name>
    <param-value>config/default-spring.xml</param-value>
</context-param>

<listener>
    <listener-class>org.gridgain.grid.loaders.servlet.GridServletContextListenerLoader</listener-class>
</listener>

Servlet-based loader may be used in any web container like Tomcat, Jetty and etc. Depending on the way this loader is deployed the GridGain instance can be accessed by either all web applications or by only one. See web container class loading architecture:

Tip

To start GridGain in a web container, you have to create WAR file with the following structure:

gridgain.war
    |-- WEB-INF/
        |-- lib/
        |   |-- gridgain.jar
        |   `-- GridGain libraries (contents of $GRIDGAIN_HOME/libs folder)
        `-- web.xml (shipped with GridGain in $GRIDGAIN_HOME/config/servlet folder)

This file should be copied to deployments directory.

8.3. Life Cycle Beans

GridLifecycleBeandoc reacts to grid lifecycle events defined in GridLifecycleEventTypedoc. Use this bean whenever you need to plug some custom logic before or after grid startup and stopping routines.

There are four events you can react to:

GridLifecycleEventType.BEFORE_GRID_START

Invoked before grid startup routine is initiated. Note that grid is not available during this event, therefore if you injected a grid instance via GridInstanceResourcedoc annotation, you cannot use it yet.

GridLifecycleEventType.AFTER_GRID_START

Invoked right after grid has started. At this point, if you injected a grid instance via GridInstanceResourcedoc annotation, you can start using it.

GridLifecycleEventType.BEFORE_GRID_STOP

Invoked right before grid stop routine is initiated. Grid is still available at this stage, so if you injected a grid instance via GridInstanceResourcedoc annotation, you can use it.

GridLifecycleEventType.AFTER_GRID_STOP

Invoked right after grid has stopped. Note that grid is not available during this event.

8.3.1. Resource Injection

Lifecycle beans can be injected using IoC (dependency injection) with grid resources. Both, field and method based injection are supported. The following grid resources can be injected:

  • GridLoggerResourcedoc

  • GridLocalNodeIdResourcedoc

  • GridHomeResourcedoc

  • GridMBeanServerResourcedoc

  • GridExecutorServiceResourcedoc

  • GridMarshallerResourcedoc

  • GridSpringApplicationContextResourcedoc

  • GridSpringResourcedoc

  • GridInstanceResourcedoc

8.3.2. Usage

If you need to tie your application logic into GridGain lifecycle, you can configure lifecycle beans via standard grid configuration, add your application library dependencies into GRIDGAIN_HOME/libs/ext folder, and simply start GRIDGAIN_HOME/ggstart.(sh|bat) scripts.

8.3.3. Configuration

Grid lifecycle beans can be configured programmatically as follows:

GridConfigurationAdapter cfg = new GridConfigurationAdapter();

cfg.setLifecycleBeans(new FooBarLifecycleBean1(), new FooBarLifecycleBean2());

GridFactory.start(cfg);

or from Spring XML configuration file as follows:

<bean id="grid.cfg" class="org.gridgain.grid.GridConfigurationAdapter" scope="singleton">
    ...
    <property name="lifecycleBeans">
        <list>
            <bean class="foo.bar.FooBarLifecycleBean1"/>
            <bean class="foo.bar.FooBarLifecycleBean2"/>
        </list>
    </property>
    ...
</bean>

8.4. Metadata & Meta Programming

TODO

8.5. Marshaling

8.6. Messaging

Messaging - an exchange of the messages between grid nodes - is one of the main functional areas that often used standalone in GridGain (without using main Compute and Data Grid capabilities). Given GridGain’s sophisticated topology management and auto-discovery it just makes sense for many applications to simply piggy-back on this functionality and use GridGain as an intelligent message bus.

Note
Intelligent Message Bus

GridGain messaging support provides unique features that makes it an advanced message bus:

  • Zero Deployment support

  • Auto-discovery

  • Actor-based message exchange

  • Pluggable discovery implementation

  • Pluggable communication transport implementation

  • Pluggable security and QoS via communication SPI

8.7. Events

TODO

8.8. Grid-Enabled Executor Service

Grid.executor()doc method creates ExecutorService which will execute all submitted Callable and Runnable tasks on the grid. User may run Callable and Runnable tasks just like normally with java.util.ExecutorService, but these tasks must implement Serializable interface.

The execution will happen either locally or remotely, depending on configuration of Load Balancing SPI and Topology SPI. Distributed ExecutorService delegates commands execution to already started Grid instance. Every submitted task will be serialized and transfered to any node in grid.

Here is an example of an ExecutorService to show how it can be used.

public static void main(String[] args) throws GridException {
    GridFactory.start();

    try {
        Grid grid = GridFactory.grid();

        ExecutorService srvc = grid.executor();

        List<Callable<String>> cmds = new ArrayList<Callable<String>>(2);

        String testVal1 = "test-value-1";
        String testVal2 = "test-value-2";

        cmds.add(new FooCallable<String>(testVal1));
        cmds.add(new FooCallable<String>(testVal2));

        List<Future<String>> futures = srvc.invokeAll(cmds);

        // Wait for command completion.
        String res1 = futures.get(0).get();
        String res2 = futures.get(1).get();

        // Print out results.
        System.out.println("Results [res1=" + res1 + ", res2=" + res2 + ']');
    }
    finally {
        GridFactory.stop(true);
    }
}

where simple FooCallable is:

private static class FooCallable<T> implements Callable<T>, Serializable {
    /** */
    private T data = null;

    /**
     * @param data Some data.
     */
    FooCallable(T data) {
        this.data = data;
    }

    /**
     * {@inheritDoc}
     */
    public T call() throws Exception {
        System.out.println("Message: " + data);

        return data;
    }
}

8.9. Segmenting Grid Nodes

8.9.1. Why Segment Nodes?

Often in deployments you need to segment your grid nodes into several groups, having each group perform one or more subsets of jobs only. For example, let’s say you have a scenario where you have some nodes only submitting jobs to grid (masters), and other groups of nodes only executing these jobs (workers). Then you would segment your grid into 2 groups, masters and workers, and have each group do only what it is supposed to do.

Multiple Sub-Grids

Node segmentation allows you to create multiple sub-grids within your grid. Every sub-grid may have it’s own static physical characteristics and logical responsibilities. All node characteristics, physical or logical, if they are static, can be specified in Spring configuration and used in your Topology SPI or GridTask.map(..)doc logic to implement the segmentation (this is shown in example below).

Note, that based on its attributes, every node can participate in one or multiple segments.

Dynamic Sub-Grids

You may also wish to segment your grid based on dynamic characteristics, not static. For example, what if you only want to include nodes that have less than 50% CPU utilization. In GridGain you can achieve this by using dynamic GridNodeMetricsdoc provided by GridNodedoc. All you would have to do is grab current CPU utilization from node metrics and in your GridTask.map(..)doc method only pick the nodes with CPU’s loaded under 50%.

8.9.2. Node Segmentation Example

This example shows how you can segment your grid into static segments using GridGain. In GridGain such segmentation can be easily achieved with node attributes (see GridNode.getAttribute(String)doc). Let’s say that you want to segment your grid into 3 segments: master, worker1, and worker2.

Every node at startup should get a certain number of attributes assigned to it. Here is how this can be done from Spring XML configuration:

<bean id="grid.cfg" class="org.gridgain.grid.GridConfigurationAdapter" scope="singleton">
    ...
    <property name="userAttributes">
        <map>
            <!--
                In our example, segment value can be either
                'master', 'worker1', or 'worker2'.
            -->
            <entry key="segment" value="worker1"/>
        </map>
    </property>
    ...
</bean>

Then you can restrict the topology passed into GridTask.map(..)doc method by properly configuring GridNodeFilterTopologySpi to only include nodes from segments worker1 and worker2 and always exclude nodes belonging to master segment. Here is an example:

<bean id="grid.custom.cfg" class="org.gridgain.grid.GridConfigurationAdapter" singleton="true">
    ...
    <property name="topologySpi">
        <bean class="org.gridgain.grid.spi.topology.nodefilter.GridNodeFilterTopologySpi">
            <property name="filter">
                 <bean class="org.gridgain.grid.lang.GridJexlPredicate2">
                     <constructor-arg index="0">
                         <value>
                             <![CDATA[
                                 node.attributes().get('segment') == 'worker1' ||
                                 node.attributes().get('segment') == 'worker2'
                             ]]>
                         </value>
                     </constructor-arg>
                     <constructor-arg index="1" value="node"/>
                 </bean>
             </property>
        </bean>
    </property>
    ...
</bean>

Alternatively, you can also implement your GridTask.map(..)doc method to map your jobs only to worker node segments. You can check which node segment a node belongs to by checking its attributes via GridNode.getAttribute(String)doc method. Here is an example:

public class FooBarGridTask extends GridTaskAdapter<String, String> {
    ...
    public Map<GridJob, GridNode> map(List<GridNode> topology, String arg) {
        Map<GridJob, GridNode> jobs = new HashMap<GridJob, GridNode>(topology.size());

        for (GridNode node : topology) {
            String segment = node.attribute("segment");

            if (segment != null) {
                if (segment.equals("worker1"))
                    // This type of job should only execute on 'worker1' segment.
                    jobs.put(new FooBarWorker1Job(arg), node);
                else if (segment.equals("worker2"))
                    // This type of job should only execute on 'worker2' segment.
                    jobs.put(new FooBarWorker2Job(arg), node);
            }
            else
                throw new GridException("Node does not belong to any segment.");
        }

        return jobs;
    }
    ...
}

8.9.3. Grid Node Filters

You are able to filter nodes by providing your implementation of GridPredicate<? super GridRichNode>doc interface. Instances of classes that implement this interface are used to filter grid nodes. These instances are used to filter nodes in method GridProjection.nodes(GridPredicate<? super GridRichNode>…)doc. They are also used by GridNodeFilterTopologySpi to provide task topology based on user-defined node filters.

GridGain also comes with GridJexlPredicatedoc implementation which allows you to conveniently filter nodes based on Apache JEXL expression language. For information about specifics of JEXL expression language refer to Apache JEXL documentation.

Together with GridNodeFilterTopologySpi, GridJexlPredicate2doc allows for a fairly simple way to provide complex SLA-based task topology specifications. For example, expression below shows how the SPI can be configured with GridJexlPredicate2doc to include all Windows XP nodes with more than one processor or core and that are not loaded over 50%.

GridNodeFilterTopologySpi topSpi = new GridNodeFilterTopologySpi();

GridJexlPredicate2<GridRichNode> filter = new GridJexlPredicate2<GridRichNode>(
    "node.metrics().availableProcessors > 1 && " +
    "node.metrics().averageCpuLoad < 0.5 && " +
    "node.attributes().get('os.name') == 'Windows XP'",
    "node");

// Add filter.
topSpi.setFilter(filter);

GridConfigurationAdapter cfg = new GridConfigurationAdapter();

// Override topology SPI.
cfg.setTopologySpi(topSpi);

// Start grid.
GridFactory.start(cfg);

or from Spring configuration file:

<bean id="grid.custom.cfg" class="org.gridgain.grid.GridConfigurationAdapter" singleton="true">
    ...
    <property name="topologySpi">
        <bean class="org.gridgain.grid.spi.topology.nodefilter.GridNodeFilterTopologySpi">
            <property name="filter">
                 <bean class="org.gridgain.grid.lang.GridJexlPredicate2">
                     <constructor-arg index="0">
                         <value>
                             <![CDATA[
                                 node.metrics().availableProcessors > 1 &&
                                 node.metrics().averageCpuLoad < 0.5 &&
                                 node.attributes().get('os.name') == 'Windows XP'
                             ]]>
                         </value>
                     </constructor-arg>
                     <constructor-arg index="1" value="node"/>
                 </bean>
             </property>
        </bean>
    </property>
    ...
</bean>

8.9.4. GridProjection-based Segmentation

GridGain 3.0 introduced another way to segment topology by using GridProjection. Described later in more details GridProjection represents dynamic view on global topology filtered by a predicate. GridProjection is also a monad providing monadic set of operations for any arbitrary set of nodes in the projection.

8.10. Resources Injection

Resource is a GridGain internal object or user defined one (either by Spring or set up manually) that is relevant to the context like task session for the task and job, current node id or grid instance. There is fixed numbers of GridGain internal resources that can be injected into the job, task or SPIs. Prior to their initialization and availability all resources that have corresponding annotations will be injected into the task/job/SPI. Both, field and method based injection are supported. The following grid resources can be injected:

  • Executor Service Resource

  • GridGain Home Path Resource

  • Grid Instance Resource

  • Job Id Resource

  • Job Context Resource

  • Load Balancer Resource

  • Local Node Id Resource

  • Logger Resource

  • Marshaller Resource

  • MBean Server Resource

  • Spring Application Context Resource

  • Spring Bean Resource

  • Task Session Resource

  • User Resource

8.10.1. Executor Service Resource

Executor service is a Java java.util.concurrent.ExecutorService that GridGain uses to pool threads and execute jobs, process incoming messages and P2P requests. It can be injected through the @GridExecutorServiceResourcedoc annotation. It can be injected into the tasks, jobs and SPIs. Here is how injection would typically happen:

Variable injection
public class MyGridJob implements GridJob {
    ...
    @GridExecutorServiceResource
    private ExecutorService execSvc;
    ...
}

or

Method injection
public class MyGridJob implements GridJob {
    ...
    private ExecutorService execSvc = null;
    ...
    @GridExecutorServiceResource
    public void setExecutor(GridExecutorService execSvc) {
        this.execSvc = execSvc;
    }
    ...
}

8.10.2. GridGain Home Path Resource

GridGain home path is a Java String that points to the installation directory (gives user value of environment variable named GRIDGAIN_HOME or Java system property with the same name). One should use @GridHomeResourcedoc annotation to inject this resource. It can be injected into the tasks, jobs and SPIs. Here is how injection would typically happen:

Variable injection
public class MyGridJob implements GridJob {
    ...
    @GridHomeResource
    private String home;
    ...
}

or

Method injection
public class MyGridJob implements GridJob {
    ...
    private String home = null;
    ...
    @GridHomeResource
    public void setGridGainHome(String home) {
        this.home = home;
    }
    ...
}

8.10.3. Grid Instance Resource

Grid instance is a Griddoc for executing and deploying tasks, sending messages, etc… Prior to task/job execution resources will be injected into it and one could use grid to obtain local node or remote nodes for example. Use @GridInstanceResourcedoc annotation to inject Grid resource. It can be injected into grid tasks and grid jobs. Here is how injection would typically happen:

Variable injection
public class MyGridJob implements GridJob {
    ...
    @GridInstanceResource
    private Grid grid;
    ...
}

or

Method injection
public class MyGridJob implements GridJob {
    ...
    private Grid grid = null;
    ...
    @GridInstanceResource
    public void setGrid(Grid grid) {
        this.grid = grid;
    }
    ...
}

8.10.4. Job Context Resource

Context attached to every job executed on the grid. Note that unlike GridTaskSessiondoc, which distributes all attributes to all jobs in the task including the task itself, job context attributes belong to a particular job only and do not get sent over network unless a job moves from one node to another. This resource can only be injected into Grid Jobs and not Grid Tasks or SPIs. @GridJobContextResourcedoc annotation can be used to inject this resource. Here is how injection would typically happen:

Variable injection
public class MyGridJob implements GridJob {
    ...
    @GridJobContextResource
    private GridJobContext jobCtx;
    ...
}

or

Method injection
public class MyGridJob implements GridJob {
    ...
    private GridJobContext jobCtx = null;
    ...
    @GridJobContextResource
    public void setJobContext(GridJobContext jobCtx) {
        this.jobCtx = jobCtx;
    }
    ...
}

8.10.5. Load Balancer Resource

Load balancer can be injected into grid tasks only. Specific implementation for grid load balancer is defined by GridLoadBalancingSpidoc which is provided to grid via GridConfigurationdoc. @GridLoadBalancerResourcedoc annotation can be used to inject this resource. Here is how injection would typically happen:

Variable injection
public class MyGridJob implements GridJob {
    ...
    @GridLoadBalancerResource
    private GridLoadBalancer balancer = null;
    ...
}

or

Method injection
public class MyGridJob implements GridJob {
    ...
    private GridLoadBalancer balancer = null;
    ...
    @GridLoadBalancerResource
    public void setLoadBalancer(GridLoadBalancer balancer) {
        this.balancer = balancer;
    }
    ...
}

8.10.6. Local Node Id Resource

This resource injects local node ID of type java.util.UUID into an instance of Grid Job, task or SPI. Node that is is the ID of a node your code is executed on which is not necessarily ID of a node that started the execution. @GridLocalNodeIdResourcedoc annotation can be used to inject this resource. Here is how injection would typically happen:

Variable injection
public class MyGridJob implements GridJob {
    ...
    @GridLocalNodeIdResource
    private UUID locNodeId = null;
    ...
}

or

Method injection
public class MyGridJob implements GridJob {
    ...
    private UUID locId = null;
    ...
    @GridLocalNodeIdResource
    public void setLocNodeId(UUID locNodeId) {
        this.locNodeId = locNodeId;
    }
    ...
}

8.10.7. Logger Resource

Grid logger is provided to grid via GridConfigurationdoc. It can be injected into grid tasks, grid jobs, and SPI’s. Use @GridLoggerResourcedoc annotation to inject this resource. Here is how injection would typically happen:

Variable injection
public class MyGridJob implements GridJob {
    ...
    @GridLoggerResource
    private GridLogger log = null;
    ...
}

or

Method injection
public class MyGridJob implements GridJob {
    ...
    private GridLogger log = null;
    ...
    @GridLoggerResource
    public void setLogger(GridLogger log) {
        this.log = log;
    }
    ...
}

8.10.8. Marshaller Resource

Injects marshaller to the task/job/SPI. Marshaller allows to serialize/deserialize data the same way as Grid does. Marshaller can be configured via GridConfigurationdoc. Use @GridMarshallerResourcedoc annotation to inject this resource. Here is how injection would typically happen:

Variable injection
public class MyGridJob implements GridJob {
    ...
    @GridMarshallerResource
    private GridMarshaller marshaller = null;
    ...
}

or

Method injection
public class MyGridJob implements GridJob {
    ...
    private GridMarshaller marshaller = null;
    ...
    @GridMarshallerResource
    public void setMarshaller(GridMarshaller marshaller) {
        this.marshaller = marshaller;
    }
    ...
}

8.10.9. MBean Server Resource

Injects MBean server into the task, job or SPI. MBean server is the same as Grid uses to register its own MBeans. Use @GridMBeanServerResourcedoc annotation to inject this resource. Here is how injection would typically happen:

Variable injection
public class MyGridJob implements GridJob {
    ...
    @GridMBeanServerResource
    private MBeanServer mbeanSrv = null;
    ...
}

or

Method injection
public class MyGridJob implements GridJob {
    ...
    private MBeanServer mbeanSrv = null;
    ...
    @GridMBeanServerResource
    public void setMBeanServer(MBeanServer srv) {
        this.srv = srv;
    }
    ...
}

8.10.10. Spring Application Context Resource

Injects Spring ApplicationContext resource. When GridGain starts using Spring configuration, the Application Context for Spring Configuration is either created or passed to the startup routine. It can be injected into grid tasks, grid jobs, and SPI’s. @GridSpringApplicationContextResourcedoc annotation is to inject this resource. Here is how injection would typically happen:

Variable injection
public class MyGridJob implements GridJob {
    ...
    @GridSpringApplicationContextResource
    private ApplicationContext springCtx = null;
    ...
}

or

Method injection
public class MyGridJob implements GridJob {
    ...
    private ApplicationContext springCtx = null;
    ...
    @GridSpringApplicationContextResource
    public void setApplicationContext(MBeanServer springCtx) {
        this.springCtx = springCtx;
    }
    ...
}

8.10.11. Spring Bean Resource

Injects any custom resources declared in provided Spring ApplicationContext. It can be injected into grid tasks and grid jobs. Use it when you would like, for example, to inject something like JDBC connection pool into tasks or jobs - this way your connection pool will be instantiated only once per task and reused for all executions of this task. You can inject other resources into your user resource. User resources may contain fields or setters with other resource annotations . The resource will be picked up from provided Spring ApplicationContext by name value. Note, that injected spring bean must be declared in Spring ApplicationContext on every grid node where they get accessed. Use @GridSpringResourcedoc annotation to inject this resource. Here is how injection would typically happen:

Variable injection
public class MyGridJob implements GridJob {
    ...
    @GridSpringResource(resourceName = "bean-name")
    private transient MyUserBean rsrc = null;
    ...
}

or

Method injection
public class MyGridJob implements GridJob {
    ...
    private transient MyUserBean rsrc = null;
    ...
    @GridSpringResource(resourceName = "bean-name")
    public void setMyUserBean(MyUserBean rsrc) {
        this.rsrc = rsrc;
    }
    ...
}

on Spring side it would look like following:

Spring MyUserBean description
<bean id="bean-name" class="my.foo.MyUserBean" singleton="true">
    ...
</bean>

8.10.12. Task Session Resource

Injects Distributed Grid Task Session into the job or task, but not in SPI. Task session gives a simple way to set task/job attributes. Use @GridTaskSessionResourcedoc annotation to inject this resource. Here is how injection would typically happen:

Variable injection
public class MyGridJob implements GridJob {
    ...
    @GridTaskSessionResource
    private GridTaskSession taskSes = null;
    ...
}

or

Method injection
public class MyGridJob implements GridJob {
    ...
    private GridTaskSession taskSes = null;
    ...
    @GridTaskSessionResource
    public void setTaskSession(GridTaskSession taskSes) {
        this.taskSes = taskSes;
    }
    ...
}

8.10.13. User Resource

@GridUserResourcedoc injects any custom user resource into grid tasks or grid jobs. Use it when you would like, for example, to use something like JDBC connection pool from your tasks or jobs - this way your connection pool will be instantiated only once per task and reused for all executions of this task.

The resource will be created based on the resourceClass value. If resourceClass is not specified, then the field type or setter parameter type will be used to infer the class type of the resource. Set resourceClass to a specific value if the class of resource cannot be inferred from field or setter declaration (for example, if field is an interface).

User resource will be instantiated once on every node where task is deployed. Basically there will always be only one instance of resource on any grid node for any given task class. Every node will instantiate it’s own copy of user resources used for every deployed task (see GridUserResourceOnDeployeddoc and GridUserResourceOnUndeployeddoc annotation for resource deployment and undeployment callbacks).

Tip User resources are never serialized (they get instantiated) and should always be declared as transient.
Tip The scope of user resource depends on Deployment Mode used. You can configure your user resources to be deployed on per-task, per-class-loader, or per-grid basis. Take a look at Deployment Mode documentation for more information.

Use @GridUserResourcedoc annotation to inject this resource. Here is how injection would typically happen:

Variable injection
public class MyGridJob implements GridJob {
    ...
    @GridUserResource
    private transient MyUserResource rsrc = null;
    ...
}

or

Method injection
public class MyGridJob implements GridJob {
    ...
    private transient MyUserResource rsrc = null;
    ...
    @GridUserResource
    public void setMyUserResource(MyUserResource rsrc) {
        this.rsrc = rsrc;
    }
    ...
}

where resource class can look like this:

MyUserResource class
public class MyUserResource {
    ...
    // Inject logger (or any other resource).
    @GridLoggerResource
    private GridLogger log = null;

    // Inject grid instance (or any other resource).
    @GridInstanceResource
    private Grid grid = null;

    // Deployment callback.
    @GridUserResourceOnDeployed
    public void deploy() {
        // Some initialization logic.
        ...
    }

    // Undeployment callback.
    @GridUserResourceOnUndeployed
    public void undeploy() {
        // Some clean up logic.
        ...
    }
}

9. Deployment

Prior to being used, a Grid Task needs to be deployed:

  • If peer class loading is enabled (see property GridConfiguration.isPeerClassLoadingEnabled() in Grid Configuration):

    • Task class loaded from local class path if it is not defined as Local P2P Exclude

    • If there is no task class in local class path or task class needs to be peer loaded it is downloaded from task originating node.

  • If peer class loading is disabled:

    • Check that task class was deployed. If you are using GAR Deployment, then your task will be implicitly deployed every time GAR file or directory is changed. Otherwise, task can be deployed explicitly in code via Grid.deployTask(Class<? extends GridTask<?>>)doc method.

    • If task class was not deployed then we try to find it in local class path by task name. If you are not using @GridTaskNamedoc annotation to provide a custom task name, then your task name will default to the actual class name of the task and the task will be auto-deployed first time it’s executed (no explicit deployment step is required in this case).

    • If task has custom name (that does not correspond task class name), and this task was not deployed before, then exception will be thrown.

9.1. Peer Class Loading

Peer class loading (P2P) is turned on by default. To turn it off set GridConfiguration.isPeerClassLoadingEnabled()doc property to false in Grid Configuration.

Although internals of peer class loading are rather complex, what it means in a nutshell is that when a JVM on the remote node needs to find a certain class as part of the grid task execution, it will check the local class loader first and if such class cannot be found, it will ask the node that originated grid task execution (one that should have this class by design) to provide it. In an essence, GridGain class loading becomes grid-aware.

This technique is invaluable during grid application development. It allows for absolutely grid-transparent development cycle: you write your application as you normally do (in a single node environment of Eclipse, IDEA, etc.), compile and run it - and it seamlessly runs on the grid without any extra deployment or build steps what-so-ever. More over, GridGain supports hot-redeployment so you don’t have to restart GridGain every time you change the grid task code - again, just modify the code, compile and run and all your changes will be picked up on the grid.

Peer class loading sequence works as follows:

  1. GridGain will check if class was loaded at system startup, and if it was, it will be returned. No class loading from a peer node will take place in this case.

  2. If class is not locally loaded, then a request will be sent to task originating node to provide class definition. Originating node will send class byte code definition and the class will be loaded on a peer node.

Peer class loading should be used in most situations, especially during development with Java IDEs. It allows to dramatically reduce overhead of grid-enabled application development effectively making it as quick and productive as local application development. You simply change code and run - and your modified application seamlessly runs on the grid.

Note When utilizing peer class loading, you should be aware of the libraries that get loaded from peer nodes vs. libraries that are already available locally in the class path. Our suggestion is to include all 3rd party libraries into class path of every node. This way you will not transfer megabytes of 3rd party classes to remote nodes every time you change a line of code.
Tip
Error Messages
Some frameworks like Spring or CGLib ask for the certain classes to identify whether another frameworks are available or not. For example Spring being started looks for the Groovy framework and thus when peer-to-peer feature is on this class/resource request might be sent to remote node. If there is no such class/resource available then you may get message "Requested resource not found" on remote (task originating node) and "Failed to get resource due to remote failure" on local node. They are printed out for your information only.

9.1.1. Local P2P Exclude

Note that giving preference to local deployment (as GridGain does by default) does not always work. For example, GridGain utilizes Spring for its own implementation, so Spring is always loaded locally by system class loader at startup. This may create a problem if user also utilizes Spring to load some beans reflectively. For example, Spring Hibernate support will attempt to load Hibernate classes with its own class loader (system class loader in this case) and if Hibernate is not in local class path, class definitions will not be found.

There are 2 ways to solve this problem:

  1. Include Hibernate jars into class path on every node. This will perform better, as Hibernate jars will not have to be loaded with every task deployment.

  2. If above does not work, you can make sure that Spring and Hibernate classes will always be loaded from a peer node by specifying their packages in GridConfiguration.getP2PLocalClassPathExclude()doc configuration property in Grid Configuration.

9.2. Deployment Modes

Deployment mode is specified at grid startup via GridConfiguration.getDeploymentMode()doc configuration property (it can also be specified in Spring XML configuration file). The main difference between all deployment modes is how classes and user resources are loaded on remote nodes via peer-class-loading mechanism. User resources can be instances of caches, databased connections, or any other class specified by user with @GridUserResourcedoc annotation.

The following deployment modes are supported:

Mode Description

GridDeploymentMode.PRIVATEdoc

In this mode deployed classes do not share user resources (see @GridUserResourcedoc).

Basically, user resources are created once per deployed task class and then get reused for all executions. Note that classes deployed within the same class loader on master node, will still share the same class loader remotely on worker nodes. However, tasks deployed from different master nodes will not share the same class loader on worker nodes, which is useful in development when different developers can be working on different versions of the same classes. Also note that resources are associated with task deployment, not task execution. If the same deployed task gets executed multiple times, then it will keep reusing the same user resources every time.

GridDeploymentMode.ISOLATEDdoc

Unlike PRIVATE mode, where different deployed tasks will never use the same instance of user resources, in ISOLATED mode, tasks or classes deployed within the same class loader will share the same instances of user resources (see @GridUserResourcedoc). This means that if multiple tasks classes are loaded by the same class loader on master node, then they will share instances of user resources on worker nodes. In other words, user resources get initialized once per class loader and then get reused for all consequent executions. Note that classes deployed within the same class loader on master node, will still share the same class loader remotely on worker nodes. However, tasks deployed from different master nodes will not share the same class loader on worker nodes, which is especially useful when different developers can be working on different versions of the same classes.

GridDeploymentMode.SHAREDdoc

Same as GridDeploymentMode.ISOLATED, but now tasks from different master nodes with the same user version and same class loader will share the same class loader on remote nodes.

Classes will be undeployed whenever all master nodes leave grid or user version changes. The advantage of this approach is that it allows tasks coming from different master nodes share the same instances of user resources (see @GridUserResourcedoc) on worker nodes. This allows for all tasks executing on remote nodes to reuse, for example, the same instances of connection pools or caches. When using this mode, you can startup multiple stand-alone GridGain worker nodes, define user resources on master nodes and have them initialize once on worker nodes regardless of which master node they came from. This method is specifically useful in production as, in comparison to GridDeploymentMode.ISOLATED deployment mode, which has a scope of single class loader on a single master node, GridDeploymentMode.SHARED mode broadens the deployment scope to all master nodes. Note that classes deployed in GridDeploymentMode.SHARED mode will be undeployed if all master nodes left grid or if user version changed. User version can be specified in META-INF/gridgain.xml file as a Spring bean property with name userVersion. This file has to be in the class path of the class used for task execution. SHARED deployment mode is default mode used by the grid.

GridDeploymentMode.CONTINUOUSdoc

Same as SHARED deployment mode, but user resources (see @GridUserResourcedoc) will not be undeployed even after all master nodes left grid.

Tasks from different master nodes with the same user version and same class loader will share the same class loader on remote worker nodes. Classes will be undeployed whenever user version changes. The advantage of this approach is that it allows tasks coming for different master nodes share the same instances of user resources (see @GridUserResourcedoc) on worker nodes. This allows for all tasks executing on remote nodes to reuse, for example, the same instances of connection pools or caches. When using this mode, you can startup multiple stand-alone GridGain worker nodes, define user resources on master nodes and have them initialize once on worker nodes regardless of which master node they came from. This method is specifically useful in production as, in comparison to ISOLATED deployment mode, which has a scope of single class loader on a single master node, CONTINUOUS mode broadens the deployment scope to all master nodes. Note that classes deployed in CONTINUOUS mode will be undeployed only if user version changes. User version can be specified in META-INF/gridgain.xml file as a Spring bean property with name userVersion. This file has to be in the class path of the class used for task execution.

9.2.1. User Version

User version comes into play whenever you would like to redeploy tasks deployed in SHARED or CONTINUOUS modes. By default, GridGain will automatically detect if class-loader changed or a node is restarted. However, if you would like to change and redeploy code on a subset of nodes, or in case of CONTINUOUS mode to kill the ever living deployment, you should change the user version.

User version is specified in META-INF/gridgain.xml file as follows:

<!-- User version. -->
<bean id="userVersion" class="java.lang.String">
    <constructor-arg value="0"/>
</bean>

By default, all gridgain startup scripts (ggstart.sh or ggstart.bat) pick up user version from GRIDGAIN_HOME/config/userversion folder. Usually, it is just enough to update user version under that folder, however, in case of GAR or JAR deployment, you should remember to provide META-INF/gridgain.xml file with desired user version in it.

9.2.2. Always-Local Development

GridGain deployment (regardless of mode) allows you to develop everything as you would locally. You never need to specifically write any kind of code for remote nodes. For example, if you need to use a distributed cache from your GridJobdoc, then you can the following:

  1. Simply startup stand-alone GridGain nodes by executing GRIDGAIN_HOME/ggstart.{sh|bat} scripts.

  2. Inject your cache instance into your jobs via @GridUserResourcedoc annotation. The cache can be initialized and destroyed with @GridUserResourceOnDeployeddoc and @GridUserResourceOnUndeployeddoc annotations.

  3. Now, all jobs executing locally or remotely can have a single instance of cache on every node, and all jobs can access instances stored by any other job without any need for explicit deployment.

9.3. JEE Deployment

When deploying grid tasks into JEE container, you can keep using standard JEE deployment artifacts. For example, if you are deploying a WAR file into JEE container, simply add your grid task classes to the WAR file and that’s it.

9.4. GAR Deployment

GAR deployment is a traditional deployment model, similar to JAR/WAR/EAR deployment in JEE, where you create the *G*rid *AR*chive file that contains all necessary classes for the grid task and deploy it. GridGain comes with URL-based GridDeploymentSpidoc implementation so that you can deploy your GAR files on any URLs accessible via FTP, HTTP(S), POP3 or FILE protocols.

For example, when properly configured, you can just drop your GARs into certain folder on your web server and they will be deployed on the grid.

9.4.1. GAR File

GAR file is a deployable unit used by GridUriDeploymentSpidoc. GAR file is based on ZLIB compression format like simple JAR file and its structure is similar to WAR archive. GAR file has .gar extension.

GAR file structure (file or directory ending with .gar):

META-INF/
        |
        - gridgain.xml
        - ...
lib/
   |
   -some-lib.jar
   - ...
xyz.class
...
  • META-INF entry may contain gridgain.xml file which is a task descriptor XML file. The purpose of task descriptor XML file is to specify all tasks to be deployed. This file is a regular Spring XML definition file. META-INF entry may also contain any other file specified by JAR format.

  • lib entry contains all library dependencies.

  • Compiled Java classes must be placed in the root of a GAR file.

GAR file may be deployed without descriptor file. If there is no descriptor file, GridDeploymentSpidoc will scan all classes in archive and instantiate those that implement GridTaskdoc interface. In that case, all grid task classes must have a public no-argument constructor (you can always use GridTaskAdapterdoc adapter for convenience when creating grid tasks).

Note
gridgain.xml
GAR Descriptor gridgain.xml is optional. If not provided - GridDeploymentSpidoc will scan all classes in GAR archive.

By default, all downloaded GAR files that have digital signature in META-INF folder will be verified and deployed only if signature is valid.

gridgain.xml

gridgain.xml GAR descriptor file is a standard Spring XML that should contain zero or more java.util.List beans. Each list should contain fully qualified class names for grid tasks. Here’s an example of gridgain.xml:

<?xml version="1.0" encoding="UTF-8"?>

<!--
    Spring configuration file for test classes in gar-file.
-->
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:util="http://www.springframework.org/schema/util"
       xsi:schemaLocation="
        http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-2.0.xsd
        http://www.springframework.org/schema/util http://www.springframework.org/schema/util/spring-util-2.0.xsd">
    <description>Gridgain Spring configuration file in gar-file.</description>

    <!--
        Test tasks specification.
    -->
    <util:list id="tasks">
        <value>foo.bar.SomeGridTask1</value>
        <value>foo.bar.SomeGridTask2</value>
    </util:list>
</beans>

9.4.2. Ant GAR Task

GridGain is shipped with GAR Ant task: GridGarAntTaskdoc. This task extends zip Ant task and can be used exactly like standard jar Ant task. GAR Ant task allows to archive class files and necessary dependencies (like resource files and libraries) with optional descriptor file (gridgain.xml). GridGain comes with an example of using GAr deployment including build.xml. Here’s an example of how to use GAR Ant task in typical build.xml:

<!--
    Special task for creating GAR files.
-->
<taskdef name="gar" classname="org.gridgain.grid.tools.ant.gar.GridGarAntTask"
    classpathref="gg.libs.path"/>

<!-- Create GAR file. -->
<gar destfile="${examples.gar.deploy.dir}/${gar.name}"
    descrdir="${examples.gar.dir}/META-INF"
    basedir="${examples.gar.deploy.dir}/tmpgar"/>

9.5. Deployment SPIs

GridGain comes with 2 deployment SPIs:

  • GridLocalDeploymentSpidoc - use it for P2P and JEE deployments.

  • GridUriDeploymentSpidoc - use it for GAR deployment.

10. Zero Deployment

TODO

11. Compute Grid

11.1. MapReduce

11.1.1. Overview

MapReduce is a distributed computing paradigm which allows to map your task into smaller jobs based on some key, execute these jobs on Grid nodes, and reduce multiple job results into one task result.

Here is a diagram that explains how MapReduce works based on Shape Counter example. Given a collection of Shapes we split this collection into 2 parts and send every part to a grid node. Each node will count number of Shapes provided and will return it back to caller. The caller then will add results received from remote nodes and provide the reduced result back to the user (the counts are displayed next to every shape).

http://www.gridgain.com/images/mapreduce_small.png

In GridGain, MapReduce paradigm is implemented via GridTaskdoc interface.

Map Operation

The GridTask.map(..) method splits a task into multiple instances of GridJobdoc and maps every GridJobdoc to a grid node.

Result Operation

Upon completion of any job, GridTask.result(..) method is invoked which is responsible to tell GridGain whether to Wait for more job results, Reduce now, or Failover this job to another node.

Reduce Operation

This operation is responsible for taking multiple results from remote jobs and reducing them into one aggregate result. This aggregated result will be returned to the user.

11.1.2. Pull vs. Push MapReduce

One of the fundamental differences between GridGain’s implementation of MapReduce and the ones in the existing or legacy systems like Sun GridEngine, GigaSpaces, Hadoop and Globus is the cardinality or the type of the mapping operation. In conventional approach the worker nodes pull the sub-tasks for execution. In GridGain, sub-tasks are pushed to the worker nodes and this process is initially controlled by the task. The latter has fundamental advantage that was largely missing in grid computing frameworks before GridGain.

Note GridGain approach of giving task the control of sub-task distribution enables early and late load balancing algorithms. This effectively helps to adapt task execution to non-deterministic nature of execution on the grid. Not having this capability significantly narrows deployment options where optimal performance and scalability can be achieved.

This unique property of GridGain’s MapReduce implementation has profound effect on ability to develop grid applications with the advanced load balancing, failover and collision resolution logic.

See Early And Late Load Balancing for more information.

11.2. GridTask and GridJob

11.2.1. GridTask And GridJob Interfaces

To create a grid task you need to implement GridTaskdoc interface. When implementing this interface you will also need to be aware of GridJobdoc interface. Basically, both of these interfaces define practically everything you need to know to create a grid task. In a nutshell, GridTask is responsible for splitting business logic into multiple grid jobs, receiving results from individual grid jobs executing on remote nodes, and reducing (aggregating) received jobs' results into final grid task result.

Grid task gets split into jobs when GridTask.map(List, Object)doc method is called. This method returns all jobs for the task mapped to their corresponding grid nodes for execution. Grid will then serialize this jobs and send them to requested nodes for execution.

See GridTaskdoc and GridJobdoc Javadoc documentation for more information about their API.

11.2.2. Executing Grid Tasks

Grid-enabling is a process of making a piece of Java code to execute on the grid. In GridGain, there are two ways to do grid-enabling: API-based and annotation-based.

Tip
Direct Execution vs. Annotation-Based AOP
There is no better or worse between these two methods. They both have their areas of applicability. When creating grid task you basically have the same programming and development model as in JEE: you create a component, deploy it and execute it. With annotation-based grid-enabling you have an extra option of transparently attaching grid-enabling logic to existing code without modifying it (except for additional annotation).
API-Based Grid Task Execution

This method allows to grid-enable any arbitrary Java code. You have a full control on split and aggregate logic and all other aspects of grid task execution. Here is an example of direct grid task execution:

public static void main(String[] args) throws GridException {
    GridFactory.start();

    try {
        Grid grid = GridFactory.getGrid();

        // Execute task.
        GridTaskFuture<String> future = grid.execute(FooBarTask.class, "Argument");

        // Wait for task completion.
        String result = fugure.get();

        // Print out task result.
        System.out.println("Task result: " + result);
    }
    finally {
        GridFactory.stop(true);
    }
}
Annotate Existing Method With Gridify Annotation

The only difference of this method vs. directly executing grid task is that you can annotate a regular Java method and it will become grid-enabled. Using this technique you can still have custom grid task that will handle annotation-based grid-enabling (including split & aggregate logic or passing state to remote jobs) but you will be limited to the boundaries of the method you are grid-enabling. Here is an example of such usage:

@Gridify(taskClass = FooBarTask.class, timeout = 3000)
public void sayIt(String arg) {
    // Some business logic.
}

For information on how to configure AOP, refer to AOP Configuration section.

Tip
Serializable State
Note that when using @Gridifydoc annotation on non-static methods without specifying explicit grid task, the state of the whole instance will be serialized and sent out to remote node. Therefore the class must implement java.io.Serializable interface. If you cannot make the class Serializable, then you must implement custom grid task which will take care of proper state initialization. In either case, GridGain must be able to serialize the state passed to remote node.

11.2.3. Configuring Grid Tasks

Starting with GridGain 2.1 you can start multiple instances of Topology SPI, Load Balancing SPI, Failover SPI and Checkpoint SPI. If you do that, you need to tell a task which SPI to use (by default it will use the fist SPI in the list).

Add @GridTaskSpisdoc annotation to your task to specify which SPIs it wants to use. If this annotation is omitted, then by default GridGain will pick the first corresponding SPI implementation from the array provided in configuration.

For more information and examples refer to Specifying Different SPIs Per GridTask documentation.

11.2.4. Grid Task Execution Sequence

The sequence of task execution can be described as following:

  1. Upon request to execute a grid task with given task name system will find deployed task with given name.

  2. System will create new Distributed Grid Task Session. Also see GridTaskSessiondoc.

  3. System will inject all annotated resources (including Distributed Grid Task Session) into grid task instance. See Resources Injection for more information.

  4. System will call method map(…) on GridTaskdoc interface. These method is basically responsible for splitting business logic of grid task into multiple grid jobs (units of execution) and mapping them to grid nodes. Method map(…) returns a map of grid jobs keyed by the grid nodes. Consider using @GridLoadBalancerResourcedoc to inject load balancer into task for assigning jobs to the best available nodes.

  5. System will start sending grid jobs to their respective nodes.

  6. Upon arrival to remote node, grid job gets put on waiting list which is passed to underlying GridCollisionSpidoc SPI.

  7. The Collision SPI on remote node will decide one of the following scheduling policies:

    Policy Description

    WAIT

    Grid Job will be kept on waiting list. In this case, job will not get a chance to execute until next time the Collision SPI is called. Collision SPI gets called every time a new job arrives or an active one completes.

    EXECUTE

    Grid Job will be moved to active list (i.e. activated). In this case system will proceed with job execution.

    REJECT

    Job on the waiting list can be rejected before they get a chance to start executing. In this case the GridJobResultdoc passed into GridTask.result(GridJobResult, List)doc method will contain GridExecutionRejectedExceptiondoc exception. If you are using any of the task adapters shipped with GridGain, then job will be failed over automatically for execution on another node.

    CANCEL

    If GridJob is on the active list and is currently executing, then it can be canceled by calling GridJob.cancel()doc method. Note that in this case job will still complete and return a result from GridJob.execute()doc method.

  8. For activated jobs on remote nodes, system will inject all annotated resources (including Distributed Grid Task Session) into grid job instance. See Resources Injection for more information.

  9. Remote nodes will execute the jobs by calling GridJob.execute()doc method.

  10. If job gets canceled while executing on remote node, then GridJob.cancel()doc method will be called. Note that just like with Thread.interrupt() method, grid job cancellation serves as a hint that a job should stop executing or exhibit some other user defined behavior. Generally it is up to a job to decide whether it wants to react to cancellation or ignore it. Job cancellation can happen for several reasons:

    • Collision SPI has canceled an active job.

    • Parent task has completed without waiting for this job’s result.

    • User canceled task by calling GridTaskFuture.cancel()doc method.

  11. Once job execution is complete, the return value will be sent back to parent task and will be passed into GridTask.result(GridJobResult, List)doc method. If job execution resulted in a checked exception, then GridJobResult.getException()doc method will contain that exception. If job execution threw a runtime exception or error, then it will be wrapped into GridUserUndeclaredExceptiondoc exception. # Method GridTask.result(GridJobResult, List)doc is called for each job result and decides whether or not to continue waiting for the remaining results, failover current result or reduce immediately based on returned policy.

    Policy Description

    GridJobResultPolicy.WAITdoc

    If this policy is returned, then Grid Task will continue to wait for other job results. If this result is the last job result, then GridTask.reduce(List)doc method will be called.

    GridJobResultPolicy.REDUCEdoc

    If this policy is returned, then method GridTask.reduce(List)doc will be called right away without waiting for other jobs' completion (all remaining jobs will receive a cancel request).

    GridJobResultPolicy.FAILOVERdoc

    If this policy is returned, then job will be failed over to another node for execution. The node to which job will get failed over to is decided by GridFailoverSpidoc SPI implementation. Note that if you use any of task adapters then they will automatically fail-over jobs to ther nodes for 2 known failure cases: node crash and job rejection.

  12. When enough results are received, method GridTask.reduce(List)doc is called to aggregate (reduce) these results into one final grid task result.

  13. After reduce(…) is complete - the result is returned to user as grid task result and can be retrieved from GridTaskFuture.get()doc method.

  14. System will clean up all task session resources (such as checkpoints with session scope). Execution of the grid task is considered finished at this point.

11.2.5. Grid Task Coding Guidelines

There are certain known patterns and anti-patterns to be aware of when developing grid task and jobs.

Serialization and Deserialization

Jobs created by task are moved from one grid node to another. Before sending they are serialized into the byte stream and thus need to implement java.io.Serializable interface. On remote node every job is deserialized with a class loader that depends on deployment method (see Grid Deployment).

Prior to GridGain 2.1, every grid job class member (including super classes) except for static members need to implement java.io.Serializable. Static class members will not be sent to remote node and should be initialized on remote node. Note also that task parameters passed into GridJob.execute()doc method are sent to remote nodes and need to implement java.io.Serializable as well.

Starting with GridGain 2.1, you can configure different Grid Marshallers and depending on a marshaller, serialization may either be required or not.

Inner and Anonymous Classes

Any kind of inner classes or anonymous classes are allowed. Write your code as you usually do and GridGain will distribute it. You can implement your job as anonymous class within grid task class and use task class members inside your job. Here is an example of anonymous job:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
import java.io.*;
import java.util.*;
import org.gridgain.grid.*;

/**
 * Test task with anonymous job which uses method scope variable.
 */
public class TestGridTask extends GridTaskSplitAdapter<String> {
    /** Dummy multiplier. */
    private int multiplier = 3;

    /**
     * This method is responsible for splitting a task into multiple jobs.
     */
    @Override
    protected Collection<? extends GridJob> split(int gridSize, final String arg) throws GridException {
        List<GridJobAdapter<String>> jobs = new ArrayList<GridJobAdapter<String>>(gridSize);

        for (int i = 0; i < gridSize; i++) {
            jobs.add(new GridJobAdapter<String>() {
                /**
                 * Every job simply multiplies number of characters in the argument by some multiplier.
                 */
                public Serializable execute() throws GridException {
                    return multiplier * arg.length();
                }
            });
        }

        return jobs;
    }

    /**
     * Reduces multiple job results into one task result.
     */
    public Object reduce(List<GridJobResult> results) throws GridException {
        int sum = 0;

        // For the sake of this example, let's sum all results.
        for (GridJobResult res : results) {
            sum += (Integer)res.getData();
        }

        return sum;
    }
}

Here we have anonymous job class created at line 20 which uses method-scope variable arg of task class declared in method signature at line 16 and used in job at line 25 as well as task class member multiplier declared at line 10 and used at line 25.

Overriding Methods with Gridify Annotation

If you have following code:

public class A {
    @Gridify
    protected methodA() {
        ...
    }
}

public class B extends A {
    @Override
    protected methodA() {
        ...
        super.methodA();
        ...
    }
}

and use aspects you should get B.methodA() called twice, first on your local node and second time on remote node regardless of class or method modifiers. This is a feature of aspects implementation and we don’t recommend to use @Gridify in parent classes.

Here is step by step explanation:

  1. You create object of class B.

  2. You make a call to B.methodA() and since this method does not have annotation in class B aspects will not work.

  3. Your B.methodA() executes and it calls super.methodA()

  4. A.methodA() has annotation and thus aspect will call GridGain and distribute your object of class B and method call to a grid node.

  5. On the grid node (local or remote) B.methodA() will be called (note that you have object of class B) again.

  6. Your B.methodA() executes and it calls super.methodA()

  7. Method A.methodA() has annotation but GridGain will catch this situation and it won’t be distributed twice but instead will be just called.

As you can see we have 2 executions of B.methodA() and only one A.methodA().

11.2.6. Resources Injection

GridTaskdoc and GridJobdoc implementations can be injected using IoC (dependency injection) with grid resources. Both, field and method based injection are supported.

The following grid resources can be injected:

Resource Description

@GridTaskSessionResourcedoc

@GridInstanceResourcedoc

Injects the actual instance of Griddoc this task is executed on.

@GridLoggerResourcedoc

Injects an instance of GridLoggerdoc logger used by this grid instance.

@GridHomeResourcedoc

Injects a path to GridGain installation home.

@GridExecutorServiceResourcedoc

Injects an instance of java.util.concurrent.ExecutorService used by this grid.

@GridLocalNodeIdResourcedoc

Injects local grid node ID of type java.util.UUID.

@GridMBeanServerResourcedoc

Injects an instance of javax.management.MBeanServer used by this grid node.

@GridJobIdResourcedoc

This resource can only be injected into Grid Jobs and not Grid Tasks. It injects unique job execution ID of type java.util.UUID into an instance of Grid Job.

@GridSpringApplicationContextResourcedoc

This resource injects the Spring application context into tasks and jobs. You can use it for accessing Spring beans or any other information available in Spring application context. By default, this application context is the same as the one used for configuring GridGain, but you can pass a custom one by calling GridFactory.start(GridConfiguration, ApplicationContext)doc method.

Note Note that Spring Application Context is local to every node and is not distributed. Make sure that all bean classes and resources declared in Spring file are available on the node’s classpath.

@GridUserResourcedoc

Use this annotation to inject custom resources into tasks and jobs. The scope of this resource is per-task, so it will be initialized once the task is deployed and de-initialized once task is undeployed. Also see @GridUserResourceOnDeployeddoc and @GridUserResourceOnUndeployeddoc for controlling resource life cycle.

@GridMarshallerResourcedoc

Resource can be injected into the task, job or SPI and gives you simple way of marshalling/unmarshalling data or objects (since 2.1.0).

@GridSpringBeanResourcedoc

Injects any custom resources declared in provided Spring ApplicationContext. It can be injected into grid tasks and grid jobs. The resource will be picked up from provided Spring ApplicationContext by name value. Note, that injected spring bean must be declared in Spring ApplicationContext on every grid node where they get accessed (since 2.1.0).

Refer to Resources Injection for more information.

11.2.7. Convenience Adapters

GridTaskdoc and GridJobdoc come with several convenience adapters to make the usage easier:

Adapter Description

GridTaskAdapterdoc

Grid Task adapter that provides default implementation for GridTask.result(GridJobResult, List)doc method which implements automatic fail-over to another node if remote job has failed due to a node crash (detected by GridTopologyExceptiondoc exception) or due to job execution rejection (detected by GridExecutionRejectedExceptiondoc exception).

GridTaskSplitAdapterdoc

Grid Task adapter that hides the job-to-node mapping logic from user and provides convenient GridTaskSplitAdapter.split(int, Object)doc method for splitting task into sub-jobs in homogeneous environments.

GridJobAdapterdoc

Grid Job adapter that provides default empty implementation for GridJob.cancel()doc method and also allows user to set and get job argument, if there is one.

Refer to corresponding adapter documentation for more information.

11.2.8. Distributed Session Attributes And Checkpoints

Both, Grid Tasks and Grid Jobs can utilize Distributed Grid Task Session for coordination with each other via session attributes and checkpoints.

Session Attributes

Jobs can communicate with parent task and with other job siblings from the same task by setting session attributes (see GridTaskSessiondoc). Other jobs can wait for an attribute to be set either synchronously or asynchronously. Such functionality allows jobs to synchronize their execution with other jobs at any point and can be useful when other jobs within task need to be made aware of certain event or state change that occurred during job execution.

Saving Checkpoints

Long running jobs may wish to save intermediate checkpoints to protect themselves from failures. There are three checkpoint management methods available on Grid Task Session which allow user to save, load, and remove checkpoints.

Jobs that utilize checkpoint functionality should attempt to load a check point at the beginning of execution. If a non-null value is returned, then job can continue from where it failed last time, otherwise it would start from scratch. Throughout it’s execution job should periodically save its intermediate state to avoid starting from scratch in case of a failure.

Refer to Distributed Grid Task Session documentation for more information.

11.2.9. MapReduce Paradigm

The design of GridTaskdoc is heavily influenced by Google MapReduce paradigm. For more information about MapReduce paradigm, refer to MapReduce: Simplified Data Processing on Large Clusters article from Google.

11.2.10. Example

Below is a grid task implementation that is responsible for split and aggregate (a.k.a map/reduce) logic. Note that this implementation uses GridTaskSplitAdapterdoc that simplifies API for grid tasks in homogeneous grids (which is often the case). Main two methods that are implemented here are split and reduce. Method reduce aggregates result (number of characters in the string) returned from every node.

package org.gridgain.examples.helloworld.api;

import org.gridgain.grid.*;
import java.util.*;
import java.io.*;

public class GridHelloWorldTask extends GridTaskSplitAdapter<String, Integer> {
    /** Auto-injected grid logger. */
    @GridLoggerResource
    private GridLogger log = null;

    @Override
    public Collection<? extends GridJob> split(int gridSize, String phrase) throws GridException {
        // Split the passed in phrase into multiple words separated by spaces.
        String[] words = phrase.split(" ");

        List<GridJob> jobs = new ArrayList<GridJob>(words.length);

        for (String word : words) {
            // Every job gets its own word as an argument.
            jobs.add(new GridJobAdapter<String>(word) {
                /*
                 * Simply prints the word passed into the job and
                 * returns number of letters in that word.
                 */
                public Serializable execute() {
                    String word = getArgument();

                    if (log.isInfoEnabled() == true) {
                        log.info(">>>");
                        log.info(">>> Printing '" + word + "' on this node from grid job.");
                        log.info(">>>");
                    }

                    // Return number of letters in the word.
                    return word.length();
                }
            });
        }

        return jobs;
    }

    /**
     * Sums up all letters from all jobs and returns a
     * total number of letters in the phrase.
     *
     * @param results Job results.
     * @return Number of letters for the phrase passed into
     *      <tt>split(gridSize, phrase)</tt> method above.
     * @throws GridException If reduce failed.
     */
    public Integer reduce(List<GridJobResult> results) throws GridException {
        int totalCharCnt = 0;

        for (GridJobResult res : results) {
            // Every job returned a number of letters
            // for the word it was responsible for.
            Integer charCnt = res.getData();

            totalCharCnt += charCnt;
        }

        // Account for spaces. For simplicity we assume one space between words.
        totalCharCnt += results.size() - 1;

        // Total number of characters in the phrase
        // passed into task execution.
        return totalCharCnt;
    }
}

11.3. GridProjection

TODO

11.4. GridTaskSession

11.4.1. Overview

Distributed task session is created for every task execution. It is defined by GridTaskSessiondoc interface. Task session is distributed across the parent task and all grid jobs spawned by it, so attributes set on a task or on a job can be viewed on other jobs. Correspondingly attributes set on any of the jobs can also be viewed on a task.

Session has 2 main features: attribute and checkpoint (see Checkpoint SPI for more details) management. Both, attributes and checkpoints, can be used from task itself and from the jobs belonging to this task. Session attributes and checkpoints can be set from any task or job methods. Session attribute and checkpoint consistency is fault tolerant and is preserved whenever a job gets failed over to another node for execution. Whenever task execution ends, all checkpoints saved within session with GridTaskSessionScope.SESSION_SCOPEdoc scope will be removed from checkpoint storage. Checkpoints saved with GridTaskSessionScope.GLOBAL_SCOPEdoc will outlive the session and can be viewed by other tasks.

The sequence in which session attributes are set is consistent across the task and all job siblings within it. There will never be a case when one job sees attribute A before attribute B, and another job sees attribute B before A. Attribute order is identical across all session participants. Attribute order is also fault tolerant and is preserved whenever a job gets failed over to another node.

11.4.2. Connected Tasks

Note that apart from setting and getting session attributes, tasks or jobs can choose to wait for a certain attribute to be set using any of the GridTaskSession.waitForAttribute(..) methods. Tasks and jobs can also receive asynchronous notifications about a certain attribute being set through GridTaskSessionAttributeListenerdoc listener. Such feature allows grid jobs and tasks remain connected in order to synchronize their execution with each other and opens a solution for a whole new range of problems.

Imagine for example that you need to compress a very large file (let’s say terabytes in size). To do that in grid environment you would split such file into multiple sections and assign every section to a remote job for execution. Every job would have to scan its section to look for repetition patterns. Once this scan is done by all jobs in parallel, jobs would need to synchronize their results with their siblings so compression would happen consistently across the whole file. This can be achieved by setting repetition patterns discovered by every job into the session. Once all patterns are synchronized, all jobs can proceed with compressing their designated file sections in parallel, taking into account repetition patterns found by all the jobs in the split. Grid task would then reduce (aggregate) all compressed sections into one compressed file. Without session attribute synchronization step this problem would be much harder to solve.

11.4.3. Session Injection

Session can be injected into a task or a job using IoC (dependency injection). See [Resources Injection] page for additional details.

11.4.4. Example

Below is a grid task implementation that is responsible for split and aggregate (a.k.a map/reduce) logic. Note that this implementation uses GridifyTaskSplitAdapterdoc that simplifies API for grid tasks in homogeneous grids (which is often the case). Main two methods that are implemented here are split and reduce. Method reduce aggregates results (sums up all numbers returned by jobs) to calculate length of initial string.

This task will split passed in string into separate word and then pass each word into its own job for execution on different nodes. Every job will do the following:

  1. Execute grid-enabled method with argument passed in.

  2. Add its argument to the session.

  3. Wait for other jobs to add their arguments to the session.

  4. Execute grid-enabled method with all session attributes concatenated into one string as an argument.

package org.gridgain.examples.helloworld.gridify.session;

import java.io.*;
import java.util.*;
import org.gridgain.grid.*;
import org.gridgain.grid.gridify.*;
import org.gridgain.grid.resources.*;

/**
 * Grid task for {@link GridifyHelloWorldSessionExample} example. It handles spiting
 * this example into multiple jobs for execution on remote nodes.
 * <p>
 * Every job will do the following:
 * <ol>
 * <li>Execute grid-enabled method with argument passed in.</li>
 * <li>Add its argument to the session.</li>
 * <li>Wait for other jobs to add their arguments to the session.</li>
 * <li>Execute grid-enabled method with all session attributes concatenated into one string as an argument.</li>
 * </ol>
 */
public class GridifyHelloWorldSessionTask extends GridifyTaskSplitAdapter<Integer> {
    /** Grid task session will be injected. */
    @GridTaskSessionResource
    private GridTaskSession ses = null;

    /**
     * {@inheritDoc}
     */
    @Override
    protected Collection<? extends GridJob> split(int gridSize, GridifyArgument arg) throws GridException {
        String[] words = ((String)arg.getMethodParameters()[0]).split(" ");

        List<GridJobAdapter<String>> jobs = new ArrayList<GridJobAdapter<String>>(words.length);

        for (String word : words) {
            jobs.add(new GridJobAdapter<String>(word) {
                /** Job context will be injected. */
                @GridJobContextResource
                private GridJobContext jobCtx = null;

                /**
                 * Executes grid-enabled method once with all
                 * session attributes concatenated into string
                 * as an argument and again with passed in argument.
                 */
                public Serializable execute() throws GridException {
                    String word = getArgument();

                    // Set session attribute with value of this job's word.
                    ses.setAttribute(jobCtx.getJobId(), word);

                    try {
                        // Wait for all other jobs within this task to set their attributes on
                        // the session.
                        for (GridJobSibling sibling : ses.getJobSiblings()) {
                            // Waits for attribute with sibling's job ID as a key.
                            if (ses.waitForAttribute(sibling.getJobId()) == null) {
                                throw new GridException("Failed to get session attribute from job: " +
                                    sibling.getJobId());
                            }
                        }
                    }
                    catch (InterruptedException e) {
                        throw new GridException("Got interrupted while waiting for session attributes.", e);
                    }

                    // Create a string containing all attributes set by all jobs
                    // within this task (in this case an argument from every job).
                    StringBuilder msg = new StringBuilder();

                    // Formatting.
                    msg.append("All session attributes [ ");

                    for (Serializable jobArg : ses.getAttributes().values()) {
                        msg.append(jobArg).append(' ');
                    }

                    // Formatting.
                    msg.append(']');

                    // For the purpose of example, we simply log session attributes.
                    log.info(msg.toString());

                    // Execute gridified method again and return the number
                    // characters in the passed in word.
                    return GridifyHelloWorldSessionExample.sayIt(word);
                }
            });
        }

        return jobs;
    }


    /**
     * Sums up all characters from all jobs and returns a
     * total number of characters in the initial phrase.
     *
     * @param results Job results.
     * @return Number of letters for the word passed into
     *      {@link GridifyHelloWorldSessionExample#sayIt(String)} method.
     * @throws GridException If reduce failed.
     */
    public Integer reduce(List<GridJobResult> results) throws GridException {
        int totalCharCnt = 0;

        for (GridJobResult res : results) {
            // Every job returned a number of letters
            // for the phrase it was responsible for.
            Integer charCnt = res.getData();

            totalCharCnt += charCnt;
        }

        // Account for spaces. For simplicity we assume one space between words.
        totalCharCnt += results.size() - 1;

        // Total number of characters in the phrase
        // passed into task execution.
        return totalCharCnt;
    }
}

11.5. Zero Deployment

TODO

11.6. Task Resources

TODO

11.7. GridNodeLocal

When working in distributed environment often you need to have a consistent local state per grid node that is reused between various job executions. For example, what if multiple jobs require database connection pool for their execution - how do they get this connection pool to be initialized once and then reused by all jobs running on the same grid node? Essentially you can think about it as a per-grid-node singleton service, but the idea is not limited to services only, it can be just a regular Java bean that holds some state to be shared by all jobs running on the same grid node.

Before GridGain 3.0 this approach was handled by using @GridUserResourcedoc annotation to annotate fields within GridTaskdoc or GridJobdoc classes to specify singleton beans. However, this approach was dependent on GridDeploymentModedoc configuration and, for ISOLATED or PRIVATE deployment modes, resource could be initialized multiple times, once per GridTask. This forced users to use various hacks in their logic and generally was not very convenient to use.

Starting with GridGain 3.0 GridNodeLocaldoc per-grid-node local cache was introduced. The name was borrowed from ThreadLocal class in Java, because just like ThreadLocal provides unique space per-thread in Java, GridNodeLocal provides unique space per-grid-node in GridGain. GridNodeLocal implements java.util.concurrent.ConcurrentMap interface and is absolutely lock-free. In fact, it simply extends java.util.concurrent.ConcurrentHashMap implementation and, therefore, inherits all the methods available there.

Here is an example of how GridNodeLocal could be used to create some user specific singleton connection pool from a simple GridGain job:

GridNodeLocal Example
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
final Grid grid = G.start(..);
...
// Execute runnable job on some remote grid node.
grid.run(GridClosureCallMode.BALANCE, new Runnable() {
    public void run() {
        GridNodeLocal<String, MySingletonConnectionPool> nodeLocal = grid.nodeLocal();

        // 1. First see if someone already stored connection pool in node-local storage.
        MySingletonConnectionPool pool = nodeLocal.get("connPool");

        if (pool == null) {
            // 2. Create new connection pool and store it in node-local storage.
            MySingletonConnectionPool old = pool.putIfAbsent("connPool", pool = new MySingletonConnectionPool(..));

            if (old != null)
                pool = old;
        }

        // Perform operations with connection pool.
        ...
    }
});

11.8. Failover

TODO

11.9. Collision Resolution

TODO

11.10. Topology Management

TODO

11.11. Load Balancing

11.11.1. Overview

In MapReduce pattern the mapping is a process of splitting the initial task into sub-tasks and assigning them to the grid nodes. Mapping generally involves the splitting logic itself, mapping sub-tasks to the nodes including load balancing, and potential failover and collision resolution. In conventional approach the worker nodes pull the sub-tasks for execution. In GridGain, sub-tasks are pushed to the worker nodes and this process is initially controlled by the task. The later has fundamental advantage that was largely missing in grid computing frameworks before GridGain.

Tip GridGain approach of giving task the control of sub-task distribution enables early and late load balancing algorithms. This effectively helps to adapt task execution to non-deterministic nature of execution on the grid. Not having this capability significantly narrows deployment options where optimal performance and scalability can be achieved.

11.11.2. Early And Late Load Balancing

The sequence of steps described below shows when Early and Late load balancing policies come into play:

  1. Someone calls one of GridProjection.execute(..) methods passing grid task and its argument to initiate grid task execution in the system.

  2. Method GridTask.map(..) will be called on the task to perform the initial mapping. This method is responsible for taking a task, splitting it into number of sub-tasks and mapping every sub-task with one or more grid nodes. This method returns set of sub-task:node pairs. This is what we call Early Load Balancing as it is done right during initial mapping operation and with only information available at the execution initiation time (see Load Balancing SPI documentation).

  3. Once mapping is done the sub-tasks will travel to respective remote nodes for execution.

  4. When sub-task arrives to the destination grid node it will be subject for collision (scheduling) resolution via Collision SPI. This SPI is called every time when new sub-task arrived, existing sub-task finished its execution or a metrics update is received (with every heartbeat). Collision SPI looks into the queue of its sub-tasks (including a newly received one, if any) and can either cancel sub-task, leave it waiting in the queue, transfer it to another node for execution, or start its execution locally. This is what we call Late Load Balancing. This load balancing happens later in the process of execution and it happens on destination node right where sub-task is about to get executed.

The important characteristic of the late load balancing is that there can be a significant time difference between mapping (early load balancing) and actual time when execution of the sub-task commences on the remote node - and late load balancing allows to account for this non-deterministic aspect of grid execution and potentially re-balance the sub-task on the grid.

For example, our Job Stealing Collision SPI does exactly that. It monitors number of queued sub-tasks on each node and preemptively moves waiting sub-tasks from "busy" node to the "idle" node for execution.

Load balancing capabilities in GridGain are more of the advanced features and not everyone would need them. For example, in homogeneous grid with homogeneous tasks load balancing achieved naturally. However, in many other cases when conditions are more real-life - sophisticated load balancing capabilities are about the only way to get the most out of your grid.

For more information on MapReduce refer to Map/Reduce: Simplified Data Processing on Large Clusters article from Google.

Early Load Balancing

Load balancing is a simple process of the optimal assignment of jobs to the nodes where these jobs to be executed. As almost all kernel level functionality in GridGain the load balancing is designed as SPI (Service Provider Interface). It consists of the public SPI and several implementations. Number of pre-built implementations are shipped with GridGain and user can develop one easily.

Load balancing SPI provides the next best balanced node for job execution. This SPI is used either implicitly or explicitly whenever a job gets mapped to a node during GridTask.map(..) invocation

This load balancing is usually referred as early load balancing as it happens early in the process of the grid task execution during mapping phase of MapReduce process. Note that late load balancing happens during collision resolution and is handled by Collision SPI.

Late Load Balancing

Grid jobs are said to be in collision when a job arrives onto node that already has one or more jobs either waiting or executing on it. Job collision resolution provides means to resolve this collision by basically allowing to:

  • put newly arrived job into the waiting queue

  • schedule it for immediate execution

  • cancel it (and preempt it by failing it over to another node)

  • wake up already waiting job from the queue and schedule it for immediate execution

As almost any kernel level functionality in GridGain collision is designed as SPI (Service Provider Interface). It consists of the public API and several implementations. As always, several pre-built implementations are shipped with GridGain and available for the developer - and custom ones can be easily built.

Collision SPI allows to regulate how grid jobs get executed when they arrive on a destination node for execution. In general a grid node will have multiple jobs arriving to it for execution and potentially multiple jobs that are already executing or waiting for execution on it. There are multiple possible strategies dealing with this situation: all jobs can proceed in parallel, or jobs can be serialized i.e., only one job can execute in any given point of time, or only certain number or types of grid jobs can proceed in parallel, etc.

Collision SPI doesn’t expose any public APIs and works implicitly behind the scenes. As with any SPI, developer can provide its own implementation and plug it into GridGain.

Collision is generally referred as late load balancing as it happens late in the execution process when job has already arrived onto destination node. In fact, it allows to load balance jobs in the context of the given node. Note that early load balancing handled by Load Balancing SPI and occurs during initial mapping phase of MapReduce process.

11.12. AOP-Based Grid-Enabling

TODO

11.13. Closure Execution

TODO

11.14. Executor Service

TODO

11.15. Cron-Based Scheduling

TODO

11.16. Remote Actors vs. GridGain and Concurrency Unification

This is a bit of off-topic chapter discussing the differences between popular Actors concept (remote actors specifically) and functionality provided by GridGain. I was convinced to write about it after I got questions on Actors vs. GridGain Scalar at almost every conference I spoke about our Scalar DSL.

When talking about Actors there is always a bigger topic of Concurrency Unification trend that aims to combine principles of local multithreading concurrency and distributed programming. We at GridGain are strong supporters of concurrency unification. The example later on in this chapter will show some of our current work in this direction.

Note
actor vs. Actor

We use lowercase actor to denote an instance of actor class or type, and uppercase Actor to denote the concept of actors.

Back to Actors… After my presentation at GeeCON in the spring of 2011 I got the email asking for GridGain version of Pi-calculation example from Akka, a very popular and deservingly so, Actor framework in Scala. The sender was asking for help to compare Akka/Scala actors approach to basic distributed programming and GridGain Scalar’s approach.

I haven’t seen the Akka’s example before so I first downloaded the Akka 1.1 and looked at it…

Now, before we get to it I want to re-iterate few points related to Actor-based concurrency (note that these points are implementation-agnostic and apply equally to Scala actors or Akka actors).

I believe that Actors is an important "new" abstraction for elegantly resolving multithreading concurrency. I’m not, however, subscribing to an idealistic view that they are drop-in replacement for threads and java.util.concurrent utilities. Most of the real-life examples and applications that I’ve seen use all of these mechanisms together with actors.

Note
Actors

I believe that Actors is an important "new" abstraction for elegantly resolving multithreading concurrency.

It is often repeated that Actors do work best when they are used throughout the application - not just for solving one particular synchronization problem - but to build the entire subsystem or a module based on Actors. I tend to agree. Mixing and matching shared-state concurrency with actors produce rather awkward combination and significantly negates any advantages Actors bring.

There are, of course, many use cases where Actors simply don’t work well. Anytime you need a fine grain control, general performance fine-tuning or determinism on threading, or when you need more sophisticated locking algorithms (read/write, counting critical sections, etc.), or shared state is unavoidable - Actors tend to produce more verbose and less flexible solution. For example, I’ve seen several times attempts to implement pseudo-semaphore synchronization on the group of actors - and this was rather ugly.

Now, despite my positive outlook about general Actors for better concurrency management I have more reservations about Remote Actors - i.e. applying the same Actors concept in the distributed context.

11.16.1. Remote Actors and Concurrency Unification

Remote Actors basically allow to exchange messages between actors in different JVMs.

One of the major appeal of remote actors is that they attempt to bridge local JVM multithreading and distributed concurrency. At the first glance it seems rather elegant to extend the share-nothing message passing metaphor into distributed context to provide long sought-after Concurrency Unification.

The illusive advantage of that, however, is ill-fit.

The obvious contention is that in the distributed systems:

  • State is by default not shared by the same JVM

  • Not shared state exposed to parallel access from multiple JVMs

  • Data is already passed as serialized messages

Important
Actors In Distributed Context

The key features of actors in local JVM multithreading are already present in distributed systems.

Yet, distributed systems introduce the host of their own challenges comparatively to local JVM multithreading (JVM-M):

  • much larger latencies

  • cost of message passing is not negligible anymore and can easily exceed the processing time

  • resource starvation and deadlocking due to conditions not present in JVM-M

  • topology management & discovery

  • heterogenous environment (different CPUs, number of cores, memory sizes, OSes, networking, language runtime, etc.)

  • failover is very different conceptually from JVM-M

  • distributed load balancing not present in JVM-M

  • data sharing (a.k.a In-Memory Data Grid) is fundamentally different from sharing data in JVM-M

  • compute sharing (a.k.a. Compute Grid) is fundamentally different from sharing computations in JVM-M

  • deployment and provisioning of the code is dramatically different from JVM-M

It should be pretty obvious that challenges of distributed systems is far wider and more complex in nature than local JVM multithreading and therefore it makes more sense to adopt distributed practices to local JVM multithreading (and not vice verse) to gain true Concurrency Unification. In fact, you need to design from more general to more specific APIs when attempting to unify two related concepts.

11.16.2. Back To Example

So, here is the example of Pi calculation verbatim from the Akka 1.1 tutorial. I took the liberty to remove some excessive comments to make the code shorter:

Pi Calculation using Akka 1.1 Actors
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
package akka.tutorial.first.scala

import akka.actor.{Actor, PoisonPill}
import Actor._
import akka.routing.{Routing, CyclicIterator}
import Routing._
import java.util.concurrent.CountDownLatch

object Pi extends App {
    calculate(nrOfWorkers = 4, nrOfElements = 10000, nrOfMessages = 10000)

    sealed trait PiMessage
    case object Calculate extends PiMessage
    case class Work(start: Int, nrOfElements: Int) extends PiMessage
    case class Result(value: Double) extends PiMessage

    class Worker extends Actor {
        // define the work
        def calculatePiFor(start: Int, nrOfElements: Int): Double = {
            var acc = 0.0
            for (i <- start until (start + nrOfElements))
            acc += 4.0 * (1 - (i % 2) * 2) / (2 * i + 1)
            acc
        }

        def receive = {
            case Work(start, nrOfElements) =>
                self reply Result(calculatePiFor(start, nrOfElements)) // perform the work
        }
    }

    class Master(nrOfWorkers: Int, nrOfMessages: Int, nrOfElements: Int, latch: CountDownLatch)
        extends Actor {
        var pi: Double = _
        var nrOfResults: Int = _
        var start: Long = _

        // create the workers
        val workers = Vector.fill(nrOfWorkers)(actorOf[Worker].start())

        // wrap them with a load-balancing router
        val router = Routing.loadBalancerActor(CyclicIterator(workers)).start()

        // message handler
        def receive = {
            case Calculate =>
                // schedule work
                for (i <- 0 until nrOfMessages) router ! Work(i * nrOfElements, nrOfElements)

                // send a PoisonPill to all workers telling them to shut down themselves
                router ! Broadcast(PoisonPill)

                // send a PoisonPill to the router, telling him to shut himself down
                router ! PoisonPill
            case Result(value) =>
                // handle result from the worker
                pi += value
                nrOfResults += 1
                if (nrOfResults == nrOfMessages) self.stop()
        }

        override def preStart() {
            start = System.currentTimeMillis
        }

        override def postStop() {
            // tell the world that the calculation is complete
            pri