Advanced Load Balancing
GridGain provides both early and late load balancing for the Compute Grid. The two stages are defined by the load balancing and collision resolution SPIs, which together allow full customization of the entire load balancing process. Early and late load balancing let grid task execution adapt to the non-deterministic nature of execution on the grid.
Early load balancing is supported via the mapping operation of the MapReduce process. Mapping, the process of assigning jobs to nodes in the resolved topology, happens right at the beginning of task execution and is therefore considered early load balancing.
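The mapping step can be illustrated with a minimal sketch. It is plain Java and does not use the GridGain load balancing SPI; the class EarlyMappingSketch, the method mapJobs, the node names, and the use of a simple job count as the load metric are all assumptions made for illustration. Each job is assigned to the node whose projected load is currently lowest.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: plain Java, not the GridGain load balancing SPI.
final class EarlyMappingSketch {

    // Assigns each job to the node with the lowest projected load.
    // Load is modeled as a simple job count, which is an assumption.
    static Map<String, List<String>> mapJobs(List<String> jobs, Map<String, Integer> currentLoad) {
        if (currentLoad.isEmpty()) {
            throw new IllegalArgumentException("topology must contain at least one node");
        }
        // Copy the load so the caller's map is not modified.
        Map<String, Integer> projected = new LinkedHashMap<>(currentLoad);
        Map<String, List<String>> assignment = new LinkedHashMap<>();
        for (String node : currentLoad.keySet()) {
            assignment.put(node, new ArrayList<>());
        }
        for (String job : jobs) {
            // Pick the least loaded node; ties go to the first node in iteration order.
            String best = null;
            for (Map.Entry<String, Integer> e : projected.entrySet()) {
                if (best == null || e.getValue() < projected.get(best)) {
                    best = e.getKey();
                }
            }
            assignment.get(best).add(job);
            projected.merge(best, 1, Integer::sum);
        }
        return assignment;
    }

    public static void main(String[] args) {
        Map<String, Integer> load = new LinkedHashMap<>();
        load.put("nodeA", 2);
        load.put("nodeB", 0);
        System.out.println(mapJobs(Arrays.asList("j1", "j2", "j3"), load));
    }
}
```

In this example nodeB starts idle, so it receives the first two jobs, and the third goes to nodeA once both nodes have the same projected load. The decision is made once, up front, from the load known at mapping time, which is why a later stage is still useful.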
Once a job is scheduled and has arrived on the remote node for execution, it is queued up on that node. How long the job stays in the queue and when it gets executed is controlled by the collision SPI, which effectively defines the late load balancing stage.
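The queueing behavior on the remote node can be sketched as follows. This is not the GridGain collision SPI; the class FifoCollisionSketch and its methods are hypothetical, and the policy shown (first in, first out, with a fixed cap on concurrently active jobs) is just one assumed example of a collision resolution rule.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: plain Java, not the GridGain collision SPI.
final class FifoCollisionSketch {
    private final int maxActive;
    private final Deque<String> waiting = new ArrayDeque<>();
    private final List<String> active = new ArrayList<>();

    FifoCollisionSketch(int maxActive) {
        if (maxActive < 1) {
            throw new IllegalArgumentException("maxActive must be positive");
        }
        this.maxActive = maxActive;
    }

    // A job arrives on the node: it is queued, then the policy is re-evaluated.
    void onJobArrived(String job) {
        waiting.addLast(job);
        resolve();
    }

    // A job finishes: its slot is freed, then the policy is re-evaluated.
    void onJobFinished(String job) {
        active.remove(job);
        resolve();
    }

    // Activates waiting jobs in arrival order while free slots remain.
    private void resolve() {
        while (active.size() < maxActive && !waiting.isEmpty()) {
            active.add(waiting.pollFirst());
        }
    }

    List<String> activeJobs() {
        return new ArrayList<>(active);
    }

    List<String> waitingJobs() {
        return new ArrayList<>(waiting);
    }

    public static void main(String[] args) {
        FifoCollisionSketch node = new FifoCollisionSketch(2);
        node.onJobArrived("a");
        node.onJobArrived("b");
        node.onJobArrived("c");
        System.out.println("active=" + node.activeJobs() + " waiting=" + node.waitingJobs());
        node.onJobFinished("a");
        System.out.println("active=" + node.activeJobs() + " waiting=" + node.waitingJobs());
    }
}
```

The point of the sketch is that the policy is consulted on every arrival and completion, so the decision about when a job runs is made late, on the node itself, rather than at mapping time.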
One implementation of this load balancing orchestration provided out of the box is a job stealing algorithm. It detects imbalance at the late stage and sends jobs from busy nodes to nodes that are considered free, right before the actual execution.
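The idea behind job stealing can be sketched in a few lines. This is a simplified, single-process illustration and not GridGain's actual algorithm; the class JobStealingSketch, the method rebalance, and the rule used here (move waiting jobs from the longest queue to the shortest until the queue lengths differ by at most one) are assumptions chosen to keep the example small.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: plain Java, not GridGain's job stealing implementation.
final class JobStealingSketch {

    // Moves waiting jobs from the busiest node to the freest node until
    // queue lengths differ by at most one. Returns the number of jobs moved.
    static int rebalance(Map<String, Deque<String>> queues) {
        int moved = 0;
        while (true) {
            String busiest = null;
            String freest = null;
            for (Map.Entry<String, Deque<String>> e : queues.entrySet()) {
                int size = e.getValue().size();
                if (busiest == null || size > queues.get(busiest).size()) {
                    busiest = e.getKey();
                }
                if (freest == null || size < queues.get(freest).size()) {
                    freest = e.getKey();
                }
            }
            // Stop when there are no nodes or the imbalance is small enough.
            if (busiest == null
                    || queues.get(busiest).size() - queues.get(freest).size() <= 1) {
                return moved;
            }
            // Steal from the tail: the job that would have waited longest moves first.
            queues.get(freest).addLast(queues.get(busiest).pollLast());
            moved++;
        }
    }

    public static void main(String[] args) {
        Map<String, Deque<String>> queues = new LinkedHashMap<>();
        queues.put("nodeA", new ArrayDeque<>(Arrays.asList("j1", "j2", "j3", "j4")));
        queues.put("nodeB", new ArrayDeque<>());
        int moved = rebalance(queues);
        System.out.println("moved=" + moved + " queues=" + queues);
    }
}
```

In a real grid the free node would request work over the network and only jobs that have not yet started could be moved; the sketch leaves both aspects out and models only the balancing decision.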
Grid and cloud environments are often heterogeneous and non-static: tasks can change their complexity profiles dynamically at runtime, and external resources can affect execution of a task at any point. All these factors underscore the need for proactive load balancing both during the initial mapping operation and on destination nodes, where jobs can sit in waiting queues.