Held on June 6, 2023, this year's Ignite Summit showcased the breadth and flexibility that make Apache Ignite a foundational technology for many demanding data use cases. Presenters represented a wide range of organizations, from financial services and telecommunications to programming language development.
Organized by GridGain and the Apache Ignite community, Ignite Summits attract engineers interested in learning and sharing their experiences with Apache Ignite. Ignite Summit 2023 included a virtual “Ask an Ignite Expert” booth with Ignite committers and GridGain engineers available to answer questions and help Igniters tackle challenges in their own work.
Read through session recaps below and check out the full on-demand videos here.
Whiskey Clustering with Groovy and Apache Ignite
Paul King, VP at Apache Groovy, Principal Software Engineer at OCI
After discussing the benefits of using Groovy for projects based on Apache Ignite and Java, Dr. Paul King detailed his quest to analyze single malt Scotch whiskeys produced by the world’s top 86 distilleries. He based his analysis on 12 different categories defined by whiskey experts, including sweetness, maltiness, and floral, fruity, medicinal, and smoky notes.
Dr. King began by showing different strategies for using K-means algorithms to create basic clusters, and along the way, discussed “dimension reduction,” a way to make visualizing the 12 dimensions easier – without losing any information. Humans, he said, struggle to visualize more than four dimensions.
Dr. King then demonstrated the benefits of using the distributed K-means clustering algorithm from the Apache Ignite machine learning library to enable more effective clustering. A key advantage of Ignite, said Dr. King, is that even with thousands and thousands of data points, Ignite needs to share only center points – a very small amount of data – to distribute the algorithm. Ignite also has features beyond clustering to help organizations analyze data with machine learning algorithms.
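The centroid-sharing idea can be illustrated with a plain-Java sketch. To be clear, this is not Ignite’s ML API, just a toy one-dimensional model: each data partition reduces its local points to per-centroid sums and counts, and only those small aggregates – never the raw points – have to reach the coordinator that computes the updated centers.

```java
import java.util.Arrays;

// Toy sketch (not Ignite's ML API) of one distributed K-means step:
// nodes aggregate locally, so only tiny per-centroid summaries cross the network.
public class DistributedKMeansStep {

    // Index of the nearest centroid for a 1-D point.
    static int nearest(double point, double[] centroids) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++)
            if (Math.abs(point - centroids[c]) < Math.abs(point - centroids[best]))
                best = c;
        return best;
    }

    // Local ("map") phase: per-centroid sum and count for one partition's points.
    static double[][] localAggregate(double[] partition, double[] centroids) {
        double[] sums = new double[centroids.length];
        double[] counts = new double[centroids.length];
        for (double p : partition) {
            int c = nearest(p, centroids);
            sums[c] += p;
            counts[c] += 1;
        }
        return new double[][] { sums, counts };
    }

    // Coordinator ("reduce") phase: merge the small aggregates, emit new centroids.
    static double[] merge(double[] centroids, double[][]... partials) {
        double[] sums = new double[centroids.length];
        double[] counts = new double[centroids.length];
        for (double[][] part : partials)
            for (int c = 0; c < centroids.length; c++) {
                sums[c] += part[0][c];
                counts[c] += part[1][c];
            }
        double[] updated = centroids.clone();
        for (int c = 0; c < centroids.length; c++)
            if (counts[c] > 0) updated[c] = sums[c] / counts[c];
        return updated;
    }

    public static void main(String[] args) {
        double[] nodeA = { 1.0, 2.0, 9.0 };   // points held by one node
        double[] nodeB = { 1.5, 10.0, 11.0 }; // points held by another
        double[] centroids = { 0.0, 8.0 };
        double[] next = merge(centroids,
                localAggregate(nodeA, centroids),
                localAggregate(nodeB, centroids));
        System.out.println(Arrays.toString(next)); // [1.5, 10.0]
    }
}
```

However many points each node holds, the data exchanged per centroid is just one sum and one count, which is why Dr. King could scale to thousands of points without moving them.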
Ignite Powered Quantitative Analytics
Ali Ferda Arikan, UK Quantitative Analytics Lead at Cardano Risk Management
In a return visit to the Ignite Summit, Ali Ferda Arikan updated attendees on his company’s work with Apache Ignite technology to solve quantitative analytics challenges. Cardano Risk Management has a cloud-deployed Apache Ignite-powered platform that leverages API access to the company’s analytics library. Using the platform, the company performs valuations and sensitivity analyses on financial instruments and derivatives. The results are used in hedging and portfolio management decisions, as well as reporting, so there are various modes of operation, including batch execution and ad hoc queries.
A key development over the last year is that the Ignite-powered platform is now the main calculation engine for projects that require quantitative analytics, and the company is getting positive feedback from users about the platform’s performance and availability, which provides a competitive edge. Arikan also reported that his small team can maintain the platform because it doesn't require any specialist knowledge. Arikan then discussed the benefits of Apache Ignite Persistence, including greater scalability, faster recovery to a previous state, and more efficient access to the stored data.
Arikan concluded with his plans for the future, including adding more instruments and capabilities, extending cache usage with higher-level objects to reduce the dependence on market data, and using more distributed data structures.
A Double Victory: Fast Retrieval of Data From Ignite In-Memory Cache and Reduce Load on Classical DB
Mirko Cambi, Executive Director at Mediobanca S.p.A., and Justin Mathew, Quant Developer (VP) at Mediobanca S.p.A.
Mediobanca, an Italian investment bank established in 1946, is one of the leading financial institutions in Italy. The bank’s Financial Engineering Group is responsible for the pricing and valuation of structured products using sophisticated mathematical models, as well as risk management and modeling using quantitative models and risk management techniques.
Mirko Cambi laid out the original architecture of the Financial Engineering Group’s customer web application, Argo. A web server runs three services: one serves HTML pages, and another receives requests from clients and routes them through the third, a queue manager running on the application server. These services connect to the other elements of the infrastructure, including the databases. When the team identified performance issues with database access, they decided to add Apache Ignite to the technology stack.
Justin Mathew then reviewed the first use case for Apache Ignite – enabling instant loading of data – which resulted in a completely new data architecture without any disruption to the existing infrastructure. According to Mathew, the key Ignite features for Mediobanca included the In-Memory Data Grid and In-Memory Database for fast retrieval based on key-value access, plus Ignite’s inclusion of both a thin client and a thick client, letting the team choose the best strategy for the application and UI layers depending on the need. The team also leverages Ignite Persistence with flexible configuration options, affinity colocation with affinity keys to reduce latency, support for the development of microservices, lock-free/wait-free algorithms, and grid computing.
The Future of Apache Ignite in the Open Source Economy
Lalit Ahuja, SVP of Product at GridGain
As data requirements have evolved, a new data architecture has emerged to address the need for data processing and analytics across data at-rest and data in-motion, all at ultra-low latencies. Gartner calls this architecture a "Unified Real-Time Data Platform."
In his Apache Ignite Summit keynote, Lalit Ahuja detailed the requirements of a Unified Real-Time Data Platform and how it relates to the future of Apache Ignite. He also shared how attendees, as members of the Ignite Community, could benefit from and contribute to the success of Apache Ignite.
Building Cloud-Native Telco 5G Convergent Charging Using Apache Ignite for Telco HA & Low Latency
Bernhard Kraft, Senior Director of Technical Product Management at Optiva and Keith Mellor, Principal Software Architect at Optiva
Optiva, Inc. provides the cloud-native Optiva BSS Platform™, a telecommunications revenue-management solution that supports real-time convergent charging of voice and data sessions to billing and CRM solutions. To ensure a frictionless user experience with no interruptions in service, the company implemented Apache Ignite at the heart of its architecture.
Bernhard Kraft discussed the new and often complex requirements around 5G, including the move to standard APIs, the need for massive low-latency transactions, and new deployment requirements, such as edge deployments, hybrid models, support for cloud ecosystems, and multi-region deployments with disaster recovery and business continuity capabilities. He then discussed how they decided on Apache Ignite as the right architecture to meet the company’s evolving needs.
Keith Mellor provided details on why Apache Ignite was a great fit to enable increased throughput via horizontal scaling in Kubernetes, while also managing what data would reside in memory and what could persist on disk. He shared insights into what the company learned from the deployment, including the need to carefully assess the difference in performance between public and private clouds, a requirement for a regular cycle of automated performance testing and observability, and the need to support geographical redundancy, including resolving conflicts when updating data asynchronously from different regions.
Security for Apache Ignite on the Cloud
Stanislav Lukyanov, Director, Product Management at GridGain
Stanislav Lukyanov discussed how to address security issues when deploying Apache Ignite or the GridGain platform in the cloud, reviewing the key elements every team should consider. He also discussed best practices when deploying the out-of-the-box GridGain Nebula solution.
Lukyanov focused first on network protection and described the need to choose the right network topology, noting that AWS VPC Peering is the most common and flexible strategy, while AWS PrivateLink is the most secure, though not as flexible. He noted that VPC Peering is coming soon to GridGain Nebula. He then recommended protecting the network by setting up at least a basic firewall and enabling SSL/TLS encryption. Advanced firewalls can add intelligent traffic inspection, but they can be quite costly in the cloud.
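As a concrete illustration, enabling TLS on an Ignite node comes down to a small configuration fragment like the sketch below. The keystore paths and passwords here are placeholders for illustration, not values from the talk.

```java
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.ssl.SslContextFactory;

// Minimal sketch of enabling SSL/TLS on an Ignite node.
// Paths and passwords are placeholders, not recommended values.
public class SecureNodeConfig {
    public static IgniteConfiguration secureConfig() {
        SslContextFactory sslFactory = new SslContextFactory();
        sslFactory.setKeyStoreFilePath("/opt/ignite/keystore/node.jks");
        sslFactory.setKeyStorePassword("changeit".toCharArray());
        sslFactory.setTrustStoreFilePath("/opt/ignite/keystore/trust.jks");
        sslFactory.setTrustStorePassword("changeit".toCharArray());

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setSslContextFactory(sslFactory); // encrypts node-to-node traffic
        return cfg;
    }
}
```

This covers transport encryption only; as Lukyanov stressed, data-at-rest encryption and authentication are separate layers on top of it.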
Lukyanov then discussed how to implement a data protection layer to ensure that even if the network is breached, access to the data is limited. He started with transparent volume-level and database-level data encryption, which is available in Apache Ignite, the GridGain platform, and GridGain Nebula. He concluded with various strategies for authentication and authorization available within the platforms.
How to Get the Most Out of Affinity and Data Co-Location in Apache Ignite
Peter Whitney, Solutions Architect at GridGain
Solutions Architect Peter Whitney started his talk by explaining the benefits of using affinity and data co-location in Apache Ignite. In brief, affinity and data co-location are related concepts. Affinity allows data in one table to be pre-grouped based on one or more columns of that table: rows that share the same column value are stored on one host instead of being distributed across many hosts. Data co-location refers to two or more tables that are frequently joined on a specific column. To support efficient join operations, all joined tables share a common affinity key or column, which forces Apache Ignite to locate records with matching column values on the same host.
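The mechanics Whitney described can be sketched in a few lines of Java. This is a deliberately simplified stand-in for Ignite’s real affinity function (the default is RendezvousAffinityFunction, typically with 1024 partitions); the point is only that two rows sharing an affinity key always map to the same partition, and therefore the same host.

```java
// Simplified illustration of affinity-based placement.
// Ignite's actual default is RendezvousAffinityFunction; this toy
// version just shows that equal keys always land in the same partition.
public class AffinitySketch {

    // Stand-in affinity function: affinity key -> partition number.
    static int partition(Object affinityKey, int partitions) {
        return Math.abs(affinityKey.hashCode() % partitions);
    }

    public static void main(String[] args) {
        int partitions = 1024;
        // Suppose trades and quotes are both keyed by customerId...
        String customerId = "customer-42";
        int tradePartition = partition(customerId, partitions);
        int quotePartition = partition(customerId, partitions);
        // ...then both rows map to the same partition, hence the same host,
        // so a join on customerId never has to cross the network.
        System.out.println(tradePartition == quotePartition); // true
    }
}
```

Each partition lives on exactly one primary node, which is why co-located joins avoid the network shuffle that dominates distributed join cost.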
Whitney demonstrated the performance improvement when using affinity. He used publicly available New York taxi trip data – 7,019,375 rows of data – and configured two identical data sets, one using an affinity key. He showed in detail how the affinity key changed the structure of the data in the Ignite cluster, then made multiple queries against the two data sets. The results were impressive. A query that took five seconds on the data set without an affinity key took only three seconds on the data set with the key. As Whitney explained, this is a 40 percent improvement, which translates into returning 9.6 hours of execution time to an organization every 24 hours.
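The arithmetic behind those figures is simple to verify, assuming the query workload runs around the clock:

```java
// Back-of-the-envelope check of the savings Whitney quoted.
public class AffinitySavings {
    public static void main(String[] args) {
        double before = 5.0, after = 3.0;                 // seconds per query
        double improvement = (before - after) / before;   // 0.40, i.e. 40 percent
        double hoursReturnedPerDay = improvement * 24.0;  // 9.6 hours per 24 hours
        System.out.printf("%.0f%% faster, %.1f hours returned per day%n",
                improvement * 100, hoursReturnedPerDay);
    }
}
```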
Real-Time Digital Transformation at M&T Bank – Integrating z/OS Core Systems
Timothy Anderson, Software Engineer at M&T Bank
M&T Bank, headquartered in Buffalo, NY, operates more than 1,000 branches in 12 states across the Eastern U.S. Timothy Anderson dove into an explanation of how the bank integrated Apache Ignite into its core banking system running on IBM z/OS to achieve real-time access to data. After describing some of the challenges of the bank’s previous architecture – including limited REST APIs, non-real-time ETL processes, no streaming/event processing capabilities, and more – Anderson laid out the new digital integration hub (DIH) architecture that is enabling real-time data access for customers and internal users.
In the new architecture, the bank still relies on nightly batch updates of millions of customer transactions to the system of record, but real-time updates to those transactions are achieved through REST APIs, with access to new transactions enabled through a z/OS log stream. All this data – 33 caches, each holding between 5 and 18 million records – is loaded into Apache Ignite. A polling process that reads the z/OS log stream keeps the data in Ignite up to date, with most polling intervals between 25 milliseconds and one second. These intervals are far shorter than a good customer experience requires, yet the bank has seen no performance issues at those speeds.
Today, all key customer-facing applications, including mobile banking, web banking, access to information at branches, etc., are consuming real-time data from Apache Ignite running on z/OS.
Cluster Membership Improvements in Apache Ignite 3
Aleksandr Polovtsev, Senior Developer at GridGain
In his talk about cluster membership improvements in Apache Ignite 3.0, Aleksandr Polovtsev explained how node discovery worked in Ignite 2.0, how it will work in Ignite 3.0, and how cluster management performance improves in Ignite 3.0. Polovtsev laid out the discovery components in Ignite 2.0, including a ring topology and failure detection. The one-way ring topology relies on a coordinator node to ensure proper handling of messages to all other nodes.
Every message, including node discovery and failure detection, must travel around the ring until it reaches the coordinator, and after the coordinator “verifies” a message, it sends the message around the ring again. Because every message must travel around the ring, often multiple times, latency is introduced that scales linearly based on the number of nodes.
Polovtsev then explained that Ignite 3.0 maintains two topologies. The physical topology, built on the SWIM protocol, handles node discovery and failure detection. Because SWIM is only weakly consistent, strong consistency is guaranteed by a second, logical topology, provided by an internal component called the cluster management group (CMG), which is based on the Raft consensus algorithm. With these two separate topologies in Ignite 3.0, membership update latency grows logarithmically with cluster size, not linearly.
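The scaling difference Polovtsev described can be sketched numerically. The model below is a back-of-the-envelope illustration, not Ignite code: worst-case hops around a one-way ring grow linearly with cluster size, while rounds of SWIM-style gossip dissemination grow roughly logarithmically, because each round can double the number of informed nodes.

```java
// Toy comparison of membership-update latency scaling:
// Ignite 2.x ring vs. SWIM-style gossip in Ignite 3.0.
public class MembershipLatencyScaling {

    // Ring: a message visits every node in sequence, so worst-case
    // hops grow linearly with cluster size.
    static int ringHops(int nodes) {
        return nodes;
    }

    // Gossip: each round roughly doubles the informed-node count,
    // so full dissemination takes about log2(n) rounds.
    static int gossipRounds(int nodes) {
        int informed = 1, rounds = 0;
        while (informed < nodes) {
            informed *= 2;
            rounds++;
        }
        return rounds;
    }

    public static void main(String[] args) {
        for (int n : new int[] { 8, 64, 512 })
            System.out.println(n + " nodes: ring=" + ringHops(n)
                    + " hops, gossip=" + gossipRounds(n) + " rounds");
    }
}
```

At 512 nodes the ring needs 512 hops per pass, while the gossip model needs only 9 rounds, which is the intuition behind the logarithmic improvement.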
With an excellent turnout and industry-leading speakers, this year’s Summit was a testament to the highly technical skillsets and collective enthusiasm of the thriving Ignite community. We can't wait to watch this powerful technology continue to unfold and transform businesses across the globe.
Register now to watch the 2023 Apache Ignite Summit talks and presentations.