Google Big Table Deep Dive and Spark SQL Acceleration with Apache Ignite (Chicago)

We have teamed up with the Chicago Data Engineering meetup group to host two technical deep-dive talks, one on Google Big Table and one on Apache Ignite for Spark. QuantumBlack has agreed to sponsor our February meetup. Expect lots of networking, along with pizza and beer!

Please bring an ID to get into the building.


Agenda:
6:00pm - 6:30pm: Networking and snacks

6:30pm - 7:00pm: "How to Speed Up Spark SQL With In-Memory Computing Stack?" talk by Denis Magda

7:00pm - 7:30pm: "Google Big Table - Store data at scale" talk by Piyush Sanghi

7:30pm - 8:00pm: Q&A with the speakers and QuantumBlack

 

How to find us

McKinsey & Company

300 E Randolph St #3100 · Chicago, IL

Talk Details

- - - -
Talk #1: How to Speed Up Spark SQL With In-Memory Computing Stack?

With Spark SQL, which is built on the Catalyst optimizer, we can query and join various data sources, including Hive, relational databases, Avro, and Parquet. Catalyst’s extensible design lets us add data source-specific rules that push the execution of aggregations and filters down into external storage systems. Such optimizations speed up Spark SQL operations significantly by reducing data shuffling between Spark workers and an external data source.

This talk explains how Apache Ignite’s in-memory store and internal SQL engine were integrated with the Catalyst optimizer to accelerate real-time analytics workloads with a high-performance in-memory computing stack. We’ll start with the basics, showing how to gain a performance boost by merely running Spark and Ignite together. Next, we’ll dive into more sophisticated optimizations to achieve an order-of-magnitude performance increase.

- - - -
Talk #2: Google Big Table - Store data at scale


With data being collected at massive speeds, we need new ways of persisting it at scale. Enter Google Big Table, which provides sub-10 ms latency and scales to petabytes. We will look at how Google Big Table scaling works and when to use it.

Speakers
Denis Magda
VP, Developer Relations in R&D at GridGain; Apache Ignite committer and PMC member
Biography

Denis Magda is an open-source software enthusiast who began his journey by working first with the technology evangelism group of Sun Microsystems and then with the Java engineering team of Oracle. During his years at Sun and Oracle, Denis became a seasoned Java professional, deepening and expanding his knowledge of the technology by contributing to the Java Development Kit, architecting Java solutions, and building local Java communities. Denis now continues his journey by supporting the Apache Software Foundation and working with GridGain Systems. For the foundation, he contributes to Apache Ignite as a committer and a member of the Project Management Committee. As the head of the GridGain Developer Relations team, Denis works with software engineers and architects to help them develop their expertise in in-memory computing. You will find Denis at conferences, workshops, and other events sharing his knowledge about Apache Ignite, distributed systems, and open-source communities.

 
