How to Use Spark With Apache Ignite for Big Data Processing

Apache Ignite and Apache Spark are complementary in-memory computing solutions. Used together, they can deliver better performance and richer functionality for processing SQL data than either provides alone.

When Ignite is used as a data source for Spark, it provides mechanisms, such as SQL indexes and data co-location across tables, that optimize Spark SQL queries. In addition, although the order in which data is inserted matters to many data sources, it does not matter to Ignite. As a result, Spark SQL queries that use Ignite as a data source can run much faster than Spark SQL queries that use other data sources, such as Hadoop.

In this webinar, you will learn when to use Ignite and, through real-world examples, how to configure the Spark-Ignite integration. You will also learn optimization techniques and best practices. Topics include the following:

Methods for loading data from Spark to Ignite
Methods for reading data from Ignite via Spark
Methods for working with Ignite data frames
Benchmark results for loading and querying large amounts of data with the Ignite-Spark integration
Tips for debugging Ignite-Spark integrations

Andrey Alexandrov
Senior Software Engineer at GridGain