GridGain Developers Hub

Generic Performance Tips

GridGain and Ignite as distributed storages and platforms require certain optimization techniques. Before you dive into the more advanced techniques described in this and other articles, consider the following basic checklist:

  • GridGain is designed and optimized for distributed computing scenarios. Deploy and benchmark a multi-node cluster rather than a single-node one.

  • GridGain can scale horizontally and vertically equally well. Thus, consider allocating all the CPU and RAM resources available on a local machine to a GridGain node. A single node per physical machine is a recommended configuration.

  • In cases where GridGain is deployed in a virtual or cloud environment, it’s ideal (but not strictly required) to pin a GridGain node to a single host. This provides two benefits:

    • Avoids the "noisy neighbor" problem where GridGain VM would compete for the host resources with other applications. This might cause performance spikes on your GridGain cluster.

    • Ensures high-availability. If a host goes down and you have two or more GridGain server node VMs pinned to it, then it can lead to data loss.

  • If resources allow, store the entire data set in RAM. Even though GridGain can keep and work with on-disk data, its architecture is memory-first. In other words, the more data you cache in RAM the faster the performance. Configure and tune memory appropriately.

  • It might seem counter to the bullet point above but it’s not enough just to put data in RAM and expect an order of magnitude performance improvements. Be ready to adjust your data model and existing applications if any. Use the affinity colocation concept during the data modelling phase for proper data distribution. For instance, if your data is properly colocated, you can run SQL queries with JOINs at massive scale and expect significant performance benefits.

  • If Native persistence is used, then follow these persistence optimization techniques.

  • If you are going to run SQL with GridGain, then get to know SQL-related optimizations.

  • Adjust data rebalancing settings to ensure that rebalancing completes faster when your cluster topology changes.