The GridGain in-memory computing platform includes ANSI SQL-99 compliant SQL capabilities that let you run in-memory SQL across your data. GridGain supports free-form SQL queries with virtually no limitations, and queries can use any SQL function, aggregation, or grouping. GridGain also supports distributed SQL joins, including cross-cache joins, so it performs like an in-memory distributed SQL database. Joins between partitioned and replicated caches work without limitations, while joins between partitioned data sets require that the joined keys be collocated, that is, stored on the same node. GridGain also supports fields queries, which return only the selected fields rather than whole objects and so help minimize network and serialization overhead.
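The collocation requirement for joins between partitioned data sets can be illustrated with a toy model. The sketch below is plain Java and does not use the GridGain API: the class `CollocatedJoinSketch`, the `nodeFor` placement rule (key modulo node count), and the sample customers and orders are all assumptions made for illustration. It joins each order to its customer using only rows that live on the same simulated node, which is roughly what a purely node-local join can see.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model: plain Java lists stand in for cluster nodes. No GridGain API is used.
class CollocatedJoinSketch {
    static final class Customer {
        final int id;
        final String name;
        Customer(int id, String name) { this.id = id; this.name = name; }
    }

    static final class Order {
        final int id;
        final int customerId;
        Order(int id, int customerId) { this.id = id; this.customerId = customerId; }
    }

    // Hypothetical placement rule: a row lives on node (key mod nodeCount).
    static int nodeFor(int key, int nodeCount) {
        return Math.floorMod(key, nodeCount);
    }

    // Joins orders to customers using only rows stored on the same simulated node.
    // collocated = true  -> an order is placed by its customerId (same node as its customer).
    // collocated = false -> an order is placed by its own id (may land on another node).
    static List<String> localJoin(List<Customer> customers, List<Order> orders,
                                  int nodeCount, boolean collocated) {
        List<String> result = new ArrayList<>();
        for (int node = 0; node < nodeCount; node++) {
            for (Order o : orders) {
                int orderNode = nodeFor(collocated ? o.customerId : o.id, nodeCount);
                if (orderNode != node) {
                    continue; // this order is not stored on the current node
                }
                for (Customer c : customers) {
                    boolean customerIsLocal = nodeFor(c.id, nodeCount) == node;
                    if (customerIsLocal && c.id == o.customerId) {
                        result.add(c.name + ":" + o.id);
                    }
                }
            }
        }
        return result;
    }

    static List<Customer> sampleCustomers() {
        return Arrays.asList(new Customer(1, "Ann"), new Customer(2, "Bob"));
    }

    static List<Order> sampleOrders() {
        return Arrays.asList(new Order(10, 1), new Order(11, 1),
                             new Order(12, 2), new Order(13, 2));
    }

    public static void main(String[] args) {
        System.out.println("collocated:     "
                + localJoin(sampleCustomers(), sampleOrders(), 2, true));
        System.out.println("not collocated: "
                + localJoin(sampleCustomers(), sampleOrders(), 2, false));
    }
}
```

With the sample data and two simulated nodes, the collocated placement finds all four order/customer pairs, while the non-collocated placement finds only two, because the other two orders sit on a different node from their customer.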
GridGain's ANSI SQL-99 compliance supports big data analytics use cases in which ad hoc SQL queries running at in-memory speeds open up application opportunities that disk-based data analytics solutions cannot offer. Existing applications that depend on a nightly ETL and take multiple hours per SQL query can instead deliver real-time insights into current operational data when moved onto the GridGain in-memory computing platform. Functioning like an in-memory SQL database, GridGain can address emerging hybrid transactional/analytical processing use cases and eliminate the need for a separate analytics infrastructure that requires nightly ETL.
When GridGain is used as a complement to Apache Spark or Apache Cassandra, the SQL support in GridGain can power major improvements in the underlying technology:
- The GridGain SQL indexing capability can improve Spark query times by 1,000x or more. Spark supports a fairly rich SQL syntax but does not support data indexing, so without GridGain each query requires a full data scan.
- GridGain enables ad hoc SQL queries on Cassandra data loaded into GridGain, using the platform's in-memory distributed SQL capabilities. Cassandra itself has no SQL support, so it cannot serve ad hoc SQL queries on its own.
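
The difference between an indexed lookup and a full data scan, mentioned in the first bullet, can be shown with a small stand-alone sketch. It is plain Java and uses neither GridGain nor Spark: the class `IndexVsScanSketch`, its method names, and the use of a `TreeMap` as a stand-in for a sorted index are assumptions made for illustration only. Both methods answer the same range predicate, but the scan examines every row while the index visits only the matching keys.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

// Toy model: a TreeMap stands in for a sorted SQL index. No GridGain or Spark API is used.
class IndexVsScanSketch {
    // Full scan: every row is examined to answer "value BETWEEN lo AND hi".
    static List<Integer> fullScan(int[] values, int lo, int hi) {
        List<Integer> rowIds = new ArrayList<>();
        for (int row = 0; row < values.length; row++) {
            if (values[row] >= lo && values[row] <= hi) {
                rowIds.add(row);
            }
        }
        return rowIds;
    }

    // Builds a sorted index mapping each value to the ids of the rows that hold it.
    static TreeMap<Integer, List<Integer>> buildIndex(int[] values) {
        TreeMap<Integer, List<Integer>> index = new TreeMap<>();
        for (int row = 0; row < values.length; row++) {
            index.computeIfAbsent(values[row], k -> new ArrayList<>()).add(row);
        }
        return index;
    }

    // Indexed lookup: only keys inside [lo, hi] are visited (requires lo <= hi).
    static List<Integer> indexedLookup(TreeMap<Integer, List<Integer>> index, int lo, int hi) {
        List<Integer> rowIds = new ArrayList<>();
        for (List<Integer> ids : index.subMap(lo, true, hi, true).values()) {
            rowIds.addAll(ids);
        }
        return rowIds;
    }

    public static void main(String[] args) {
        int[] values = {5, 42, 17, 42, 8};
        List<Integer> viaIndex = indexedLookup(buildIndex(values), 10, 50);
        Collections.sort(viaIndex); // the index returns rows in value order, the scan in row order
        System.out.println("scan:  " + fullScan(values, 10, 50));
        System.out.println("index: " + viaIndex);
    }
}
```

The index costs extra memory and build time up front; the payoff is that repeated selective queries no longer touch rows that cannot match.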