Akmal B. Chaudhri

Position:
Technical Evangelist, GridGain Systems

Apache® Ignite™ is a very versatile product that supports a wide range of integrated components. These components include a Machine Learning (ML) library that supports popular ML algorithms, such as Linear Regression, k-NN Classification and K-Means Clustering. The ML capabilities of Ignite provide a wide range of benefits, as shown in Figure 1. For example, Ignite can work on the data in-place, avoiding costly ETL between different...
In this two-part series, we will look at how Apache® Ignite™ and Apache® Spark™ can be used together. Ignite is a memory-centric distributed database, caching, and processing platform. It is designed for transactional, analytical, and streaming workloads, delivering in-memory performance at scale. Spark is a streaming and compute engine that typically ingests data from HDFS or other storage. Historically, it has been inclined towards OLAP...
In the previous article in this Machine Learning series, we looked at k-NN Classification with Apache® Ignite™. We’ll now look at another Machine Learning algorithm and conclude our series. In this article, we’ll look at K-Means Clustering using the Titanic dataset. Very conveniently, Kaggle provides the dataset in a CSV form. For our analysis, we are interested in two clusters: whether passengers survived or did...
In the previous article in this Machine Learning series, we looked at Linear Regression with Apache® Ignite™. Now let’s take the opportunity to try another Machine Learning algorithm. This time we’ll look at k-Nearest Neighbor (k-NN) Classification. This algorithm is useful for determining class membership, where we classify an object based upon the most common class amongst its k nearest neighbors. A dataset that is...
In the previous article in this Machine Learning series, we looked at the Apache® Ignite™ Machine Learning Grid. Now let’s take the opportunity to drill down further into some of the Machine Learning algorithms that are supported in Apache Ignite and try out some examples using popular datasets. If we search for suitable datasets to use, we can find many that are available. However, one dataset...
In a previous article, we discussed the Apache® Ignite™ Machine Learning Grid. At that time, a beta release was available. Subsequently, in version 2.4, Machine Learning became Generally Available. Since the 2.4 release, more improvements and developments have been added, including support for Partition-Based Datasets and Genetic Algorithms. Many of the Machine Learning examples that are provided with Apache Ignite can work standalone, making it...
These are very exciting times for Apache Ignite. During this past year that I have been with GridGain, I have seen some significant technology additions to the Open Source project, such as support for SQL-99, Native Persistence, and Machine Learning to name but three. Earlier this year, new Genetic Algorithm (GA) code was donated to the Apache Software Foundation. Since I am not very familiar...
Throughout my IT career, I have sometimes heard about cases where colleagues needed to travel to faraway locations for business meetings just for a day. Early in my IT career, I even heard about a case where a senior manager was nearly on a Concorde flight from London to New York because of an emergency situation. This kind of...
This article is the last part of the Apache Ignite Transactions Architecture series. In the previous articles in this series, we discussed a range of topics associated with Apache Ignite's transaction handling in its Key-Value API. In the first article, we briefly reviewed the two-phase commit protocol and described how it worked with various types of cluster nodes. In the second article, we discussed locking...
The fourth and final meetup that I attended on my trip to Washington DC was held on Thursday 5 April at The Hotel at Arundel Preserve, 7795 Arundel Mills Boulevard, Hanover, Maryland. The meetup was organized by Apache Spark and Distributed Computing Maryland. The location of the meetup was quite far from where I was staying, so I had to journey out of Washington using...
The third meetup that I attended on my trip to Washington, D.C. was held on Wednesday 4 April at REI Systems, 45335 Vintage Park Plaza, Sterling, Virginia. The meetup was organized by Nova Data Science. The location of the meetup was quite far from where my colleague Chris Cook and I were staying, so we had to journey out of Washington taking the Metro and...
The second meetup that I attended on my trip to Washington DC was held on Tuesday 3 April at Pariveda Solutions, 11th Floor, 1616 Fort Myer Drive, Arlington, Virginia. I was again assisted by my colleague Chris Cook. This time, meetup attendance was lower than at the event held on the previous day, but the audience was very interactive and asked questions throughout my presentation. Pariveda...
On Sunday 1 April, I flew from London to Washington DC for several days of meetups in and around the DC area. The first of these was held on Monday 2 April at 1776 on the 12th floor of 1133 15th Street NW, Washington DC. 1776 is a startup incubator and was a great local venue not very far from where I was staying in...
In the previous article in this series, we looked at failover and recovery. Here are topics we will cover in the rest of this series: Transaction handling at the level of Ignite persistence (WAL, checkpointing, and more). Transaction handling at the level of 3rd party persistence. In this article, we will focus on transaction handling at the level of Ignite persistence. Those who use Apache...
In the previous article in this series, we looked at concurrency modes and isolation levels. Here are topics we will cover in the rest of this series: Failover and recovery. Transaction handling at the level of Ignite persistence (WAL, checkpointing, and more). Transaction handling at the level of 3rd party persistence. In this article, we will focus on how Apache Ignite handles failover and recovery...