On Tuesday I was in New York City hosting the NYC In-Memory Computing Meetup. I started the meetup 18 months ago and there are about 1,400 members. Our April 23 event attracted some 80 in-memory computing enthusiasts – a new meetup record!
We had two speakers: Dave Rubin, senior director for Oracle NoSQL and Embedded Database, and Pat Patterson, director of evangelism at StreamSets. Dave’s talk was titled, "Oracle NoSQL Database - The most flexible NoSQL Data Store you’ve NEVER heard of" and Pat’s was titled, "Ingesting Streaming Data for Analysis in Apache Ignite."
The main Ignite feature that Pat highlighted in his talk is called "Continuous Queries." Ignite’s continuous queries are ideal for use cases where you want to execute a query and then keep getting notified about data changes that match your query filter.
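To make the idea concrete, here is a minimal sketch of continuous-query semantics in plain Python. It is a toy in-process model, not Ignite's API: `ToyCache`, `continuous_query`, the sensor keys and the 30-degree threshold are all made up for illustration. In Ignite itself the equivalent pieces are, as far as I know, a `ContinuousQuery` configured with an initial query, a remote filter and a local listener.

```python
from typing import Any, Callable, Dict, List, Tuple

Predicate = Callable[[Any, Any], bool]
Listener = Callable[[Any, Any], None]


class ToyCache:
    """Hypothetical in-process stand-in for a cache with continuous queries."""

    def __init__(self) -> None:
        self._data: Dict[Any, Any] = {}
        self._listeners: List[Tuple[Predicate, Listener]] = []

    def put(self, key: Any, value: Any) -> None:
        """Store the entry, then notify every listener whose filter matches."""
        self._data[key] = value
        for predicate, on_update in self._listeners:
            if predicate(key, value):
                on_update(key, value)

    def continuous_query(self, predicate: Predicate, on_update: Listener) -> List[Tuple[Any, Any]]:
        """Register the listener, then return the entries that already match."""
        self._listeners.append((predicate, on_update))
        return [(k, v) for k, v in self._data.items() if predicate(k, v)]


cache = ToyCache()
cache.put("sensor-1", 18.5)
cache.put("sensor-2", 31.0)

notifications: List[Tuple[str, float]] = []


def is_hot(key: str, temperature: float) -> bool:
    # Filter: only readings above 30 degrees are of interest.
    return temperature > 30.0


# The initial result covers data that was already in the cache.
initial = cache.continuous_query(is_hot, lambda k, v: notifications.append((k, v)))

# Later updates reach the listener only if they pass the filter.
cache.put("sensor-3", 35.2)  # matches, so the listener is called
cache.put("sensor-1", 19.0)  # does not match, so it is ignored
```

After this runs, `initial` holds the one matching entry that existed before the query was registered, and `notifications` holds only the later update that passed the filter.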
Apache Ignite provides a distributed platform for a wide variety of workloads, but often the issue is simply in getting data into the database in the first place. The wide variety of data sources and formats presents a challenge to any data engineer; in addition, “data drift,” the constant and inevitable mutation of the incoming data's structure and semantics, can break even the most well-engineered integration.
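As a rough illustration of what tolerating data drift can look like, the sketch below normalizes incoming records whose field names change over time. This is a generic example in plain Python, not how StreamSets Data Collector works internally; the field names, the alias table and the `normalize` and `run_pipeline` helpers are all assumptions invented for this example.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical aliases: names the same field has appeared under over time.
FIELD_ALIASES = {"temp": "temperature", "temp_c": "temperature", "device": "device_id"}
REQUIRED_FIELDS = ("device_id", "temperature")


def normalize(record: Dict[str, Any]) -> Dict[str, Any]:
    """Map drifting field names onto a canonical shape, keeping unknown fields."""
    clean: Dict[str, Any] = {}
    extras: Dict[str, Any] = {}
    for name, value in record.items():
        canonical = FIELD_ALIASES.get(name, name)
        if canonical in REQUIRED_FIELDS:
            clean[canonical] = value
        else:
            # New, unexpected fields are preserved rather than dropped.
            extras[canonical] = value
    missing = [field for field in REQUIRED_FIELDS if field not in clean]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    # Tolerate type drift as well, e.g. "21.5" arriving as a string.
    clean["temperature"] = float(clean["temperature"])
    clean["extras"] = extras
    return clean


def run_pipeline(records: List[Dict[str, Any]]) -> Tuple[List[Dict[str, Any]], List[Tuple[Dict[str, Any], str]]]:
    """Route bad records to an error list so good data keeps flowing."""
    good, errors = [], []
    for record in records:
        try:
            good.append(normalize(record))
        except (ValueError, TypeError) as exc:
            errors.append((record, str(exc)))
    return good, errors


good, errors = run_pipeline([
    {"device_id": "a", "temperature": 20},
    {"device": "b", "temp_c": "21.5", "firmware": "1.2"},  # renamed and added fields
    {"device": "c"},                                        # no temperature at all
])
```

The point of the design is that one malformed record lands in `errors` instead of halting the pipeline, while renamed or extra fields pass through without a schema change.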
Pat’s session, aimed at data architects, data engineers and developers, explored how we can use the open-source StreamSets Data Collector to build robust data pipelines. Attendees learned how to collect data from cloud platforms such as Amazon and Salesforce, devices, relational databases and other sources, continuously stream it to Ignite, and then use features such as Ignite's continuous queries to perform streaming analysis.
He started by covering the basics of reading files from disk, moved on to relational databases, and then looked at more challenging sources such as APIs and message queues.
He then demonstrated how to:
* Build data pipelines to ingest a wide variety of data into Apache Ignite
* Anticipate and manage data drift to ensure that data keeps flowing
* Perform simple and complex ad-hoc queries in Ignite via SQL
* Write applications using Ignite to run continuous queries, combining data from multiple sources
Pat will also be delivering this same talk next Tuesday (April 30) at the Bay Area In-Memory Computing Meetup in Menlo Park. Here's the video of his talk from this past Tuesday in NYC.