GridGain™ 4.3.1e
Enterprise "Big Data" Edition

org.gridgain.grid
Interface GridTask<T,R>

Type Parameters:
T - Type of the task argument that is passed into map(List, Object) method.
R - Type of the task result returning from reduce(List) method.
All Superinterfaces:
Serializable
All Known Implementing Classes:
GridifyTaskAdapter, GridifyTaskSplitAdapter, GridTaskAdapter, GridTaskNoReduceAdapter, GridTaskNoReduceSplitAdapter, GridTaskSplitAdapter

public interface GridTask<T,R>
extends Serializable

Grid task interface defines a task that can be executed on the grid. Grid task is responsible for splitting business logic into multiple grid jobs, receiving results from individual grid jobs executing on remote nodes, and reducing (aggregating) received jobs' results into final grid task result.

Grid Task Execution Sequence

  1. Upon a request to execute a grid task with a given task name, the system will find the deployed task with that name. A task needs to be deployed prior to execution (see Grid.deployTask(Class) method); however, if the task does not specify its name explicitly via @GridTaskName annotation, it will be auto-deployed the first time it is executed.
  2. System will create new distributed task session (see GridTaskSession).
  3. System will inject all annotated resources (including task session) into grid task instance. See org.gridgain.grid.resources package for the list of injectable resources.
  4. System will apply map(List, Object). This method is responsible for splitting business logic of grid task into multiple grid jobs (units of execution) and mapping them to grid nodes. Method map(List, Object) returns a map with grid jobs as keys and grid nodes as values.
  5. System will send mapped grid jobs to their respective nodes.
  6. Upon arrival on the remote node a grid job will be handled by collision SPI (see GridCollisionSpi) which will determine how a job will be executed on the remote node (immediately, buffered or canceled).
  7. Once job execution results become available, method result(GridJobResult, List) will be called for each received job result. The policy returned by this method determines how the task reacts to each job result: wait for more results, reduce the results received so far, or fail the job over to another node (see GridJobResultPolicy).
  8. Once all results are received or result(GridJobResult, List) method returns GridJobResultPolicy.REDUCE policy, method reduce(List) is called to aggregate received results into one final result. Once this method is finished, the execution of the grid task is complete. This result will be returned to the user through GridTaskFuture.get() method.
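
The map, result and reduce steps of the sequence above can be sketched locally as follows. This is a minimal illustration, not GridGain code: Task, Job, Policy and the execute(...) runner are hypothetical stand-ins that only mirror the shape of the GridTask methods, nodes are plain strings, and jobs run in the calling thread instead of on remote nodes. Deployment, session creation, resource injection, collision handling and failover are not modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins that only mirror the shape of GridTask.
// None of these types belong to the GridGain API.
final class TaskLifecycleSketch {
    enum Policy { WAIT, REDUCE }

    interface Job { int execute(); }

    interface Task<T, R> {
        Map<Job, String> map(List<String> subgrid, T arg);
        Policy result(int res, List<Integer> rcvd);
        R reduce(List<Integer> results);
    }

    /** One job per word; each job returns the length of its word. */
    static final class WordJob implements Job {
        private final String word;
        WordJob(String word) { this.word = word; }
        @Override public int execute() { return word.length(); }
    }

    static final class CharCountTask implements Task<String, Integer> {
        @Override public Map<Job, String> map(List<String> subgrid, String arg) {
            Map<Job, String> jobs = new LinkedHashMap<>();
            int i = 0;
            // Assign words to the offered nodes in round-robin order.
            for (String word : arg.split(" "))
                jobs.put(new WordJob(word), subgrid.get(i++ % subgrid.size()));
            return jobs;
        }

        @Override public Policy result(int res, List<Integer> rcvd) {
            return Policy.WAIT; // keep waiting until every job has reported
        }

        @Override public Integer reduce(List<Integer> results) {
            int sum = 0;
            for (int r : results) sum += r;
            return sum;
        }
    }

    /** Local runner: map, run each job, call result per job, then reduce. */
    static <T, R> R execute(Task<T, R> task, List<String> subgrid, T arg) {
        Map<Job, String> mapped = task.map(subgrid, arg);
        List<Integer> rcvd = new ArrayList<>();
        for (Job job : mapped.keySet()) {
            int res = job.execute();
            Policy policy = task.result(res, rcvd); // rcvd holds earlier results only
            rcvd.add(res);
            if (policy == Policy.REDUCE) break;     // reduce early on request
        }
        return task.reduce(rcvd);
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("nodeA", "nodeB");
        System.out.println(execute(new CharCountTask(), nodes, "Hello Grid World")); // prints 14
    }
}
```

In a real task the value returned by reduce would reach the caller through GridTaskFuture.get(); here it is simply the return value of execute(...).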

Continuous Job Mapper

For cases when the jobs within a split are too large to fit in memory at once, or when not all jobs in the task are known during the GridTask.map(List, Object) step, use GridTaskContinuousMapper to continuously stream jobs from the task even after the map(...) step is complete. With continuous mapper the number of jobs within a task may grow very large; in this case it may make sense to use it in combination with @GridTaskNoResultCache annotation.
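
The idea of streaming jobs instead of materializing them all in the map step can be sketched as below. This is a local illustration under stated assumptions, not the GridTaskContinuousMapper API: the lazy job source, the window parameter and the in-memory queue are hypothetical, and jobs are plain IntSupplier instances executed in the calling thread.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.IntSupplier;

// Hypothetical sketch of continuous mapping; NOT the GridGain API.
final class ContinuousMapperSketch {
    /** Lazy source that creates jobs returning 1..n only when asked for them. */
    static Iterator<IntSupplier> countingJobs(final int n) {
        return new Iterator<IntSupplier>() {
            private int next = 1;
            @Override public boolean hasNext() { return next <= n; }
            @Override public IntSupplier next() {
                if (!hasNext()) throw new NoSuchElementException();
                final int value = next++;
                return () -> value;
            }
        };
    }

    /** Runs all jobs from the source, holding at most `window` pending jobs in memory. */
    static long run(Iterator<IntSupplier> source, int window) {
        if (window < 1)
            throw new IllegalArgumentException("window must be at least 1");
        Deque<IntSupplier> pending = new ArrayDeque<>();
        long sum = 0;
        // "map" step: send only the first window of jobs.
        while (pending.size() < window && source.hasNext())
            pending.add(source.next());
        // "result" step: every finished job frees a slot for the next job.
        while (!pending.isEmpty()) {
            sum += pending.poll().getAsInt();
            if (source.hasNext())
                pending.add(source.next());
        }
        return sum;
    }
}
```

Because the total is accumulated as results arrive, no list of all results is needed, which is the same reason the text suggests combining continuous mapping with @GridTaskNoResultCache.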

Task Result Caching

Sometimes job results are too large, or a task simply has too many jobs to keep track of, which may hinder performance. In such cases it may make sense to disable task result caching by attaching @GridTaskNoResultCache annotation to the task class and processing all results as they arrive in the GridTask.result(GridJobResult, List) method. When GridGain sees this annotation, it disables tracking of job results, and the list of all job results passed into GridTask.result(GridJobResult, List) or GridTask.reduce(List) methods will always be empty. Note that the list of job siblings on GridTaskSession will also be empty, to prevent the number of job siblings from growing as well.
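
When the result lists are always empty, the task has to keep its own running aggregate. A minimal sketch of that pattern follows; the class and its method signatures are hypothetical stand-ins shaped after result(GridJobResult, List) and reduce(List), with job results reduced to plain long values, and it assumes the accumulator lives in the task instance that receives the result callbacks.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a task with result caching disabled; NOT the GridGain API.
final class NoResultCacheSketch {
    /** Running total kept by the task itself because no result list is retained. */
    private final AtomicLong total = new AtomicLong();

    /** Shaped after result(GridJobResult, List): rcvd is always empty here. */
    void result(long jobResult, List<Long> rcvd) {
        if (!rcvd.isEmpty())
            throw new IllegalStateException("sketch assumes result caching is disabled");
        total.addAndGet(jobResult); // process the result as it comes
    }

    /** Shaped after reduce(List): the list is empty, so the accumulator is the answer. */
    long reduce(List<Long> results) {
        return total.get();
    }

    /** Feeds the given job results through result(...) and then reduces. */
    static long run(long... jobResults) {
        NoResultCacheSketch task = new NoResultCacheSketch();
        for (long r : jobResults)
            task.result(r, Collections.<Long>emptyList());
        return task.reduce(Collections.<Long>emptyList());
    }
}
```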

Resource Injection

Grid resources can be injected into a grid task implementation using IoC (dependency injection). Both field-based and method-based injection are supported. See the org.gridgain.grid.resources package for the list of injectable resources, and refer to the corresponding resource documentation for more information.

Grid Task Adapters

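GridTask comes with several convenience adapters to make its usage easier, including GridTaskAdapter, GridTaskSplitAdapter, GridTaskNoReduceAdapter, and GridTaskNoReduceSplitAdapter. Refer to the corresponding adapter documentation for more information.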

Examples

Many task usage examples are available on the GridGain website. For an example of how to use GridTask for basic split/aggregate logic, refer to the HelloWorld Task Example. For an example of how to use GridTask with automatic grid-enabling via @Gridify annotation, refer to the Gridify HelloWorld Example.

 

Method Summary
 Map<? extends GridJob,GridNode> map(List<GridNode> subgrid, T arg)
          This method is called to map or split grid task into multiple grid jobs.
 R reduce(List<GridJobResult> results)
          Reduces (or aggregates) results received so far into one compound result to be returned to caller via GridTaskFuture.get() method.
 GridJobResultPolicy result(GridJobResult res, List<GridJobResult> rcvd)
          Asynchronous callback invoked every time a result from remote execution is received.
 

Method Detail

map

@Nullable
Map<? extends GridJob,GridNode> map(List<GridNode> subgrid,
                                             @Nullable
                                             T arg)
                                    throws GridException
This method is called to map or split grid task into multiple grid jobs. This is the first method that gets called when task execution starts.

Throws:
GridException - If mapping could not complete successfully. This exception will be thrown out of GridTaskFuture.get() method.
Parameters:
arg - Task execution argument. Can be null. This is the same argument as the one passed into Grid#execute(...) methods.
subgrid - Nodes available for this task execution. Note that the order of nodes is guaranteed to be randomized by the container. This ensures that every time you simply iterate through the grid nodes, the order of nodes will be random, which over time should result in all nodes being used equally.
Returns:
Map of grid jobs to the subgrid nodes they are assigned to. Unless GridTaskContinuousMapper is injected into the task, an exception will be thrown if null or an empty map is returned.
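
Since the subgrid order is randomized by the container, a map implementation can simply walk the node list in order. The helper below is a hypothetical sketch of such a round-robin assignment; it uses strings for nodes and a generic type for jobs rather than GridNode and GridJob, and assumes job objects are distinct map keys.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; nodes are plain strings here, NOT GridNode instances.
final class RoundRobinMapSketch {
    /** Assigns each job to a node, cycling through the subgrid in the order given. */
    static <J> Map<J, String> assign(List<J> jobs, List<String> subgrid) {
        if (subgrid.isEmpty())
            throw new IllegalArgumentException("no nodes available for this execution");
        Map<J, String> mapped = new LinkedHashMap<>();
        int i = 0;
        for (J job : jobs)
            mapped.put(job, subgrid.get(i++ % subgrid.size())); // jobs are keys, nodes are values
        return mapped;
    }
}
```

With three jobs and two nodes, the first and third job land on the first node and the second job on the second node.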

result

GridJobResultPolicy result(GridJobResult res,
                           List<GridJobResult> rcvd)
                           throws GridException
Asynchronous callback invoked every time a result from remote execution is received. It is ultimately up to this method to return a policy based on which the system will either wait for more results, reduce the results received so far, or fail this job over to another node. See GridJobResultPolicy for more information about result policies.

Throws:
GridException - If handling a job result caused an error. This exception will be thrown out of GridTaskFuture.get() method.
Parameters:
res - Received remote grid executable result.
rcvd - All previously received results. Note that if task class has GridTaskNoResultCache annotation, then this list will be empty.
Returns:
Result policy that dictates how to process further upcoming job results.
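
One possible decision rule for this callback is sketched below. The Policy enum is a hypothetical stand-in whose constants are named after the three behaviors described above (only REDUCE is named explicitly on this page), and the error and count parameters are simplifications of what a job result and the rcvd list would carry.

```java
// Hypothetical decision rule; Policy is a stand-in, NOT GridJobResultPolicy itself.
final class ResultPolicySketch {
    enum Policy { WAIT, REDUCE, FAILOVER }

    /**
     * @param jobError error reported by the job, or null if the job succeeded
     * @param previouslyReceived number of results received before this one
     * @param enoughResults number of successful results after which reducing may start
     */
    static Policy decide(Throwable jobError, int previouslyReceived, int enoughResults) {
        if (jobError != null)
            return Policy.FAILOVER;                        // retry the job on another node
        if (previouslyReceived + 1 >= enoughResults)
            return Policy.REDUCE;                          // enough results, reduce now
        return Policy.WAIT;                                // keep waiting for more results
    }
}
```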

reduce

@Nullable
R reduce(List<GridJobResult> results)
         throws GridException
Reduces (or aggregates) results received so far into one compound result to be returned to caller via GridTaskFuture.get() method.

Note that if some jobs did not succeed and could not be failed over, the list of results passed into this method will include the failed results. Otherwise, failed results will not be in the list.

Throws:
GridException - If reduction of results caused an error. This exception will be thrown out of GridTaskFuture.get() method.
Parameters:
results - Received results of broadcast remote executions. Note that if the task class has GridTaskNoResultCache annotation, then this list will be empty.
Returns:
Grid task result constructed from results of remote executions.
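
Because the list may contain failed results, a reduce implementation usually has to decide what to do with them. The sketch below skips them and sums the rest; JobResult is a hypothetical stand-in that carries either data or an error, not the GridJobResult interface, and skipping failures is only one possible choice (throwing an exception is another).

```java
import java.util.List;

// Hypothetical stand-in for a job result; NOT the GridJobResult interface.
final class ReduceSketch {
    static final class JobResult {
        final Integer data;      // null when the job failed
        final Exception error;   // null when the job succeeded

        private JobResult(Integer data, Exception error) {
            this.data = data;
            this.error = error;
        }

        static JobResult ok(int data) { return new JobResult(data, null); }
        static JobResult failed(Exception error) { return new JobResult(null, error); }
    }

    /** Sums successful results and ignores results of jobs that failed. */
    static int reduce(List<JobResult> results) {
        int sum = 0;
        for (JobResult r : results) {
            if (r.error != null)
                continue;        // failed job that could not be failed over
            sum += r.data;
        }
        return sum;
    }
}
```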

GridGain - In-Memory Big Data
Enterprise "Big Data" Edition, ver. 4.3.1e.10112012
2012 Copyright © GridGain Systems