Google Big Table Deep Dive and Spark SQL Acceleration with Apache Ignite (Chicago)

We have teamed up with the Chicago Data Engineering meetup group to host two technical deep-dive talks on Google Big Table and Apache Ignite for Spark. QuantumBlack has agreed to sponsor our February Meetup. There will be plenty of networking, along with pizza and beer!

Please bring an ID to get into the building.

Agenda:
6:00pm - 6:30pm: Networking and snacks

6:30pm - 7:00pm: "How to Speed Up Spark SQL With In-Memory Computing Stack?" talk by Denis Magda

7:00pm - 7:30pm: "Google Big Table - Store data at scale" talk by Piyush Sanghi

7:30pm - 8:00pm: Q&A with Speakers and QuantumBlack

How to find us

McKinsey & Company

300 E Randolph St #3100 · Chicago, IL

About Talks

- - - -
Talk #1: How to Speed Up Spark SQL With In-Memory Computing Stack?

With Spark SQL based on the Catalyst optimizer, we can query and join various data sources, including Hive, relational databases, Avro, and Parquet. Catalyst’s extensible design lets us add data source-specific rules that push the execution of aggregations and filters down into external storage systems. Such optimizations speed up Spark SQL operations significantly by reducing data shuffling between Spark workers and an external data source.
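To make the pushdown idea concrete, here is a minimal toy sketch in plain Python. It does not use Spark, Catalyst, or Ignite; the `ExternalStore` class and all function names are hypothetical and exist only to show why evaluating a filter inside the storage system moves fewer rows to the query engine.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class ExternalStore:
    """Hypothetical stand-in for an external data source (not a real Spark or Ignite API)."""
    rows: List[Row]
    rows_transferred: int = 0  # counts rows shipped to the query engine

    def scan(self) -> List[Row]:
        # No pushdown: every row is shipped to the query engine.
        self.rows_transferred += len(self.rows)
        return list(self.rows)

    def scan_filtered(self, predicate: Callable[[Row], bool]) -> List[Row]:
        # Pushdown: the store evaluates the filter and ships only the matches.
        matches = [r for r in self.rows if predicate(r)]
        self.rows_transferred += len(matches)
        return matches


def is_chicago(row: Row) -> bool:
    return row["city"] == "Chicago"


def build_store() -> ExternalStore:
    # 1000 sample rows spread evenly over four cities (made-up data).
    cities = ["Chicago", "Boston", "Austin", "Denver"]
    return ExternalStore([{"id": i, "city": cities[i % 4]} for i in range(1000)])


def query_without_pushdown(store: ExternalStore) -> List[Row]:
    # The engine pulls everything, then filters on its own side.
    return [r for r in store.scan() if is_chicago(r)]


def query_with_pushdown(store: ExternalStore) -> List[Row]:
    # The filter travels to the store; only matching rows come back.
    return store.scan_filtered(is_chicago)


if __name__ == "__main__":
    naive, pushed = build_store(), build_store()
    assert query_without_pushdown(naive) == query_with_pushdown(pushed)
    print("rows moved without pushdown:", naive.rows_transferred)
    print("rows moved with pushdown:", pushed.rows_transferred)
```

Both queries return the same rows; only the number of rows transferred differs. A real optimizer rule does far more than this, so treat the sketch as an illustration of the principle rather than of how Catalyst or Ignite is implemented.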

This talk aims to explain how Apache Ignite’s in-memory store and internal SQL engine were integrated into the Catalyst optimizer to accelerate real-time analytics workloads with a high-performance in-memory computing stack. We’ll start from the basics, showing how to gain a performance boost by merely running Spark and Ignite together. Next, we’ll dive into more sophisticated optimizations to achieve an order-of-magnitude improvement.

- - - -
Talk #2: Google Big Table - Store data at scale

With data being collected at massive speeds, we need new ways of persisting data at scale. Enter Google Big Table, which provides sub-10 ms latency and scales to petabytes. We will look at how Google Big Table scaling works and when to use it.

Speakers

Denis Magda
VP, Developer Relations in R&D at GridGain; Apache Ignite committer and PMC member

Thursday, February 20 2020

6:00PM
