GridGain Nebula Managed Service Offering: An Introduction

June 19, 2023

GridGain Nebula Introduction

In this post, we will be discussing the GridGain Nebula managed service offering for Apache Ignite, available on the GridGain portal. The main benefits of GridGain Nebula are that it simplifies cloud provisioning and provides managed service support.

What is Apache Ignite?

Before we begin, let's take a look at Apache Ignite. At a very high level, Apache Ignite and GridGain are both in-memory data platforms. They are used to build fast, scalable, and resilient solutions, providing speed and scale to new and existing applications. The core of the GridGain platform is built on the open-source Apache Ignite project.

One of the key functionalities of Apache Ignite is distributed memory-centric storage. This allows us to store gigabytes or terabytes of data in memory by scaling our clusters horizontally. With this capability, we can perform interesting tasks like co-located compute, where computations are brought to the servers where the data resides, eliminating network bottlenecks and ensuring fast performance.
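To make co-located compute concrete, here is a minimal Python sketch of how a key can be mapped to a partition, and a partition to an owning node, so that a computation can be sent to the node that already holds the data. It is an illustration only: Ignite's real affinity function is implemented in Java and differs in detail, and the node names, the SHA-256 hashing, and the helper names below are assumptions chosen for the example.

```python
import hashlib
from typing import Callable, Sequence, Tuple

# 1024 is the default partition count of Ignite's rendezvous affinity function;
# everything else in this sketch is a simplification.
PARTITIONS = 1024


def _hash(text: str) -> int:
    """Stable 64-bit hash (Python's built-in hash() is randomized per process)."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:8], "big")


def partition_for(key: str, partitions: int = PARTITIONS) -> int:
    """Map a cache key to a partition number."""
    return _hash(key) % partitions


def owner_of(partition: int, nodes: Sequence[str]) -> str:
    """Rendezvous hashing: every node scores the partition; the highest score owns it."""
    return max(nodes, key=lambda node: _hash(f"{node}:{partition}"))


def run_colocated(
    key: str, nodes: Sequence[str], task: Callable[[str], object]
) -> Tuple[str, object]:
    """Pick the node that owns the key and 'run' the task there (simulated locally)."""
    node = owner_of(partition_for(key), nodes)
    return node, task(key)


if __name__ == "__main__":
    cluster = ["node-a", "node-b"]  # hypothetical node names
    node, result = run_colocated("customer:42", cluster, lambda k: k.upper())
    print(node, result)
```

A useful property of this scheme is that adding a node only moves partitions onto the new node; the existing nodes never swap partitions among themselves, which keeps rebalancing traffic low when a cluster scales out.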

Apache Ignite also offers various APIs, including machine learning, SQL, messaging, and streaming APIs. It provides data access through key-value and SQL, allowing users to access the data stored in memory either as a traditional key-value cache or using SQL queries.

To connect to Apache Ignite instances, there are multiple client technologies available, such as Java, .NET, PHP, REST, Python, and JDBC. These connectors enable users to connect their custom applications to the cluster and access the data.

In addition to the core features of Ignite, GridGain provides enterprise features that enhance the platform's capabilities. The GridGain platform adds security features on top of Ignite. It also enables replication across clusters, allowing data to be replicated from one cluster to another. This replication can be done for various purposes, such as improving performance across regions or for backup and recovery. This feature is specifically designed for enterprise use.

In addition to replication, GridGain also supports rolling upgrades. This means that if you need to upgrade the underlying platform version, there are tools available to help you upgrade individual nodes without taking down the entire cluster. This allows for zero downtime during upgrades.

Another key feature of GridGain builds on persistence: snapshots. You can take snapshots of your data and caches for backup and recovery purposes, and you can also perform point-in-time recovery. These features provide an added layer of functionality on top of the Apache Ignite platform.

Now let's discuss how we can work with Ignite in other ways.

GridGain Nebula

Nebula simplifies the provisioning and creation of Ignite clusters. This is achieved by using standardized templates for configurations, application layout, logging, and backup schedules. Users no longer need to learn the intricate details of Ignite configurations. Instead, they can provision a cluster in just 10-15 minutes without having to download Ignite or learn how to run a cluster.

Currently, GridGain Nebula supports deployment to Amazon, and I will demonstrate this for you. In the future, we plan to add support for additional cloud providers. Our short-term targets include Azure and Google Cloud. It's important to note that Nebula clusters are a pay-as-you-go offering, meaning you only pay for what you use.

Once you start up the cluster, our wizard will provide you with information about the hourly cost. As long as the cluster is running, we will bill you accordingly. When you no longer need the cluster, you can simply take it down.

Another benefit of GridGain Nebula is that it includes typical managed service support. This includes a 24x7 team that manages and monitors the cluster for you. They also handle tasks such as backups, recovery, and restarts.

Having this level of support is helpful because it allows you to focus more on your application. You don't have to worry about configurations, restarts, and maintenance. This, in turn, helps lower operating costs as you don't need to maintain an operations team for the clusters. It also makes it much easier to get started with Apache Ignite and quickly begin working on your own application.

Best Practices

We have implemented best practices in the design of these clusters to ensure their optimal performance. As the main developer of Apache Ignite and GridGain, we have extensive experience working with enterprise customers. We assist them in fine-tuning their deployments to adhere to best practices, ensuring maximum resilience and facilitating tasks such as backups and log access.

The Nebula clusters are fully automated and follow a templated pattern, eliminating the need for additional configurations. They are supported by a 24/7 service, enabling continuous monitoring and support. Our team will monitor your clusters to ensure they are performing properly. In the event of any issues, we are equipped to perform restarts and schedule backups of your cluster data for easy restoration.

Now, let's discuss the main components of our clusters.

Main Components

The GridGain Nebula service is built on the core platform, which is based on Apache Ignite. In addition to this, we have incorporated extra features from the GridGain platform. These features focus specifically on security and on backup and recovery using snapshots.

As part of our services, we provide a 99.9% SLA, which includes a 24x7 team with extensive experience in monitoring and maintaining GridGain Nebula clusters. Our monitoring and management tool is also integrated into the user interface, giving you full access to all its features. You can create dashboards for your metrics, run SQL queries, and perform debugging tasks.

Nebula Security

Once you have created a Nebula cluster, there are several security configurations available to you. These include access lists, which allow you to define whitelists of machines that can connect to your Nebula cluster. We also support HTTPS, SSL, and TLS for secure connections. Additionally, we provide multiple clients, such as Java thin clients, Java thick clients, and others, for connectivity to the cluster. In a demonstration, I will show you how to connect using these clients. Moreover, we offer additional layers of cluster authentication, allowing you to define users, roles, and access for these Nebula clusters. As an administrator, you have the rights to revoke permissions and perform other administrative tasks.

Ignite Support

The goal is to support the entire Ignite stack for GridGain Nebula clusters. Once you have created an Ignite cluster, you can use all the typical functionalities that you would use with an on-premise Ignite cluster. This includes caching use cases, loading data, performing transactions, accessing data via key-value APIs or SQL, setting up a service grid, and running compute operations on the data in the Nebula cluster.

All the connectors that you are used to using to connect to your Ignite cluster are also supported, as well as native persistence, third-party persistence, and password-based authentication. In addition to Ignite, we also provide support for GridGain snapshots. Our operations team takes daily snapshots of your data, which can be used to restore your cluster to a specific point in time if needed. However, some enterprise GridGain functionality, such as data center replication and user-configurable snapshots, is not currently available; it is being considered for future releases and updates. The provisioning process is very efficient.

Provisioning

We have simplified the process into a few configuration steps. Underneath the surface, we utilize Amazon EC2 instances, specifically the high-performance memory type VMs known as i3. These environments are dedicated, ensuring that any cluster you provision is isolated and all resources are dedicated to your workloads. The configuration options include cluster size, resource allocation per node, provisioning location, and basic security settings.

To begin the provisioning process, navigate to the main GridGain Nebula UI at portal.gridgain.com. As a new user, you are presented with two options. If you already have an existing cluster, you can attach it to this environment for management. Alternatively, you can create new GridGain Nebula clusters.

Creating a new cluster involves three steps. First, select your cloud provider. Currently, we support Amazon Web Services (AWS), with plans to add more cloud providers in the future. Next, choose the region where you want to deploy your cluster. We support several AWS regions, such as US East. Lastly, determine the resource sizing for your cluster. We offer three node sizes: small, medium, and large, each with different CPU, RAM, and disk space allocations.

The next step is optional and allows you to customize storage allocation and whitelist IP addresses. You can configure persistent storage or split between in-memory and persistent storage. Additionally, you can add IP addresses to your whitelist for access control. Finally, provide credentials for the initial user of the cluster, such as a username and password. Review the summary of your configuration choices.
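As a rough illustration of what an IP whitelist does, the following Python sketch checks whether a client address falls inside any allowed entry. It is not Nebula's implementation; the function name and the example addresses (taken from the reserved documentation ranges) are assumptions made for the example.

```python
import ipaddress
from typing import Iterable


def is_allowed(client_ip: str, allowlist: Iterable[str]) -> bool:
    """Return True if client_ip matches any single address or CIDR range in allowlist."""
    address = ipaddress.ip_address(client_ip)
    return any(
        address in ipaddress.ip_network(entry, strict=False)
        for entry in allowlist
    )


if __name__ == "__main__":
    # Hypothetical entries: one office network and one single build server.
    allowlist = ["203.0.113.0/24", "198.51.100.7"]
    for ip in ("203.0.113.42", "198.51.100.7", "192.0.2.1"):
        print(ip, "allowed" if is_allowed(ip, allowlist) else "blocked")
```

Listing a range such as a /24 is convenient for an office network, while a single address is the tighter choice for a build server or a laptop with a fixed IP.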

When creating a new account with portal.gridgain.com, you receive a sign-up bonus of 500 credits. This credit allows you to explore clusters, experiment with different sizes, and run various use cases for a few weeks. Once the credit is depleted, you can add a credit card for pay-as-you-go billing integrated with Stripe. Click "Create" to initiate the provisioning process.
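To get a feel for how long the sign-up credit lasts, a quick back-of-the-envelope calculation helps. The sketch below assumes that one credit offsets one unit of the hourly cost shown by the wizard; that conversion and the hourly cost of 1.0 used in the example are made-up values, so substitute the figure the wizard actually shows for your cluster.

```python
def runtime_hours(credits: float, hourly_cost: float) -> float:
    """Hours a cluster can run before the credit balance reaches zero."""
    if credits < 0:
        raise ValueError("credits must not be negative")
    if hourly_cost <= 0:
        raise ValueError("hourly_cost must be positive")
    return credits / hourly_cost


if __name__ == "__main__":
    # 500 credits is the sign-up bonus; 1.0 per hour is a hypothetical cluster cost.
    hours = runtime_hours(credits=500, hourly_cost=1.0)
    print(f"{hours:.0f} hours, about {hours / 24:.1f} days")
```

Under those assumed numbers the credit covers 500 hours, or roughly three weeks of continuous use; a larger cluster with a higher hourly cost would use up the credit proportionally faster.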

Based on the information provided, new EC2 instances of the i3 type are provisioned in AWS. The cluster creation process typically takes around 10 to 15 minutes. While the cluster is being created, you can access client connections and template code to help you connect your clients to the Nebula cluster. This includes URLs, updates for Maven files, and examples for Java thin clients, .NET, Python, Java thick clients, and JDBC.

Monitoring and Troubleshooting

As mentioned previously, when working with Nebula clusters, once the cluster is operational, you will have access to a comprehensive set of management, monitoring, and troubleshooting tools. These tools are based on our GridGain Control Center, providing you with the ability to monitor various metrics of your application. GridGain and Ignite offer hundreds of metrics that you can track as your application runs on the cluster. You can monitor CPU performance, storage, heap usage, offheap usage, and transaction performance, among other things.

In addition to monitoring, we also provide an interface for working with your data using SQL. Within this environment, we have an integrated SQL IDE that simplifies the process of building and executing SQL queries. As the queries run, you can analyze their performance, including execution time, success or failure, and other relevant information. Even if the SQL queries were not initiated through the SQL IDE, they are still tracked, allowing you to assess their performance.

Furthermore, we offer tools to stop queries that are performing poorly or have become stuck. As your applications run, you can also enable application tracing. This feature is particularly useful when you notice a drop in performance or when the application does not meet your expectations. Ignite supports application tracing through OpenCensus, allowing you to selectively enable tracing for specific core APIs. With tracing enabled, you can analyze the application stack and identify any bottlenecks, crashes, or issues affecting the performance of transactions or other operations. This thorough approach enables effective log debugging across the nodes of your cluster.

Additionally, you can evaluate cache data performance and explore other features provided by GridGain Nebula clusters. All these capabilities are included out of the box, empowering you to effectively manage and optimize your cluster.

Connecting to a Cluster

To connect to one of these clusters, you can use the supported Ignite connectors. If you are already using Ignite client connectors to connect to an existing Ignite cluster, you can reuse similar patterns. We also provide sample templates to help you get started. Connecting to a GridGain Nebula cluster requires some specific configuration, and templates are available for our Java thin client, thick clients, Node.js clients, .NET, REST, and JDBC.

Demo of Client Access

Let's do a simple demo of client access. I have a Nebula cluster that is up and running. Let's connect via JDBC. I will switch to a different account that already has some other provisioned clusters. In the clusters view, I can see a two-node cluster that is running GridGain Nebula, version 8.8.10. To make a quick connection, I will show the connection templates and use JDBC. From there, I can see the JDBC connection string, URL patterns, and basic SQL.

I will use a local client called SQLLine to connect via JDBC and run SQL operations. I am running a local copy of GridGain, version 8.8.10. In my bin directory, I have the SQLLine script. I will call SQLLine and use the connection information from the JDBC template. This will connect from my local laptop to the GridGain Nebula cluster running in US East (N. Virginia). It seems like the connection was successful. Let's confirm that we have an open connection.
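The SQLLine call described above can be sketched as follows. This Python helper only assembles the command line; it does not connect to anything. The `jdbc:ignite:thin://` URL prefix, the `sslMode=require` parameter, and SQLLine's `-u`, `-n`, and `-p` options come from the Ignite JDBC thin driver and SQLLine; the host name, port, and credentials are placeholders, and the connection template in the Nebula UI remains the authoritative source for the real values.

```python
from typing import List

# Ignite's default thin-client port; a Nebula connection template may specify another.
DEFAULT_THIN_PORT = 10800


def jdbc_thin_url(host: str, port: int = DEFAULT_THIN_PORT, ssl: bool = True) -> str:
    """Build a JDBC thin driver URL, optionally requiring an SSL/TLS connection."""
    url = f"jdbc:ignite:thin://{host}:{port}"
    if ssl:
        url += ";sslMode=require"
    return url


def sqlline_command(
    host: str,
    user: str,
    password: str,
    port: int = DEFAULT_THIN_PORT,
    script: str = "./sqlline.sh",
) -> List[str]:
    """Argument list for running SQLLine from the bin directory of a local install."""
    return [script, "-u", jdbc_thin_url(host, port), "-n", user, "-p", password]


if __name__ == "__main__":
    # Placeholder host and credentials; copy the real ones from the JDBC template.
    print(" ".join(sqlline_command("my-cluster.example.com", "ignite", "secret")))
```

Returning the arguments as a list rather than one string makes it safe to hand them to `subprocess.run` without shell quoting problems, which matters because the URL contains semicolons.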

Another thing we can do is load some data. GridGain and Apache Ignite include code examples, such as the world.sql sample, which sets up tables with data. Let's run that script. We have connected to the cluster and executed the lines in the SQL script successfully. Now, let's try running a quick query. It looks like everything is working fine.

We have been able to connect from my local environment to the cluster using JDBC, load data, and run SQL queries. We have a convenient way to interact with the remote Ignite cluster. We can also use the UI to perform the same actions. In the SQL IDE, we can see the public schema and the city table. We can execute similar queries through the UI.

As we run these queries, we collect statistics about them, including performance and success rate. We can see the captured commands from executing the world.sql script. There is a query still running, which is loading data from an original query.

In summary, we provisioned a cluster in 10 to 15 minutes, connected a local JDBC client using a connection template, ran SQL queries, loaded data, and performed the same actions through the web UI.

Custom Code Deployment

In the past few weeks, we have introduced a new feature in our product that allows users to perform custom code deployment. If you are familiar with Apache Ignite or GridGain, you may have started using our Compute Grid or Service Grid. The Compute Grid allows you to load data into memory and run computations or code against that data. Traditionally, this would require loading additional libraries onto the class path of individual nodes, which involved copying the application or libraries and their dependencies, followed by a restart of the node.

To simplify this process, we have added a new API that enables code deployment without the need for a restart. In addition, we have introduced a user-friendly UI in our Nebula environment, where you can define a "deployment unit" consisting of your code artifacts and library dependencies. These deployment units can be automatically deployed to each node in the cluster, without requiring any node or cluster restarts.

Currently, you have two options for configuring the source of these libraries. You can either provide a direct link via HTTP, HTTPS, or FTP, or you can reference an existing Maven repository. In our roadmap, we have plans to allow users to upload library dependencies directly into the portal UI.

Once you have defined a deployment unit, we keep track of the versions and dependencies. This makes it easy to understand what has been deployed where and which version is being used. We also provide additional lifecycle management tools, supported by an API that can be accessed even in on-prem or local GridGain 8.8.10 environments. This means you are not restricted to using the Control Center UI to define these deployment units.

Deployment Example

Let's take a look at a deployment example. I will go back into my portal and check if everything is still up and running, which it is. Let's take a look at the dashboard and see the status of the cluster. It appears that the cluster is running fine with two nodes. I can see the version and IP address of each node. Currently, the CPU load is not high. I have set up a couple of tables to compare heap versus off-heap consumption. Additionally, I have created some custom dashboards to monitor persistent storage, disk usage on each node, write-ahead log performance, checkpointing, rebalance, and partition setups. If needed, I can easily add another table or dashboard. All these widgets are customizable, allowing me to drag, drop, and reposition them as needed. Overall, everything seems to be running fine.

However, there is an issue with the configured alert. I have set up an alert to notify me when there are no connected clients. Currently, I have disconnected all the clients, so the alert is indicating that the clients are not up and running. To resolve this, let's deploy some custom code and connect a thick client that will utilize that code.

To start, I will define a deployment unit called "streamer." A deployment unit consists of libraries and their dependencies. There are two options for adding artifacts to a deployment unit: Maven artifacts or direct links. In this case, I have a direct link to a library hosted on S3, which also has a dependency on Guava. Now that the deployment unit is defined, it is still in draft mode and has not been deployed yet.
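The idea of a deployment unit can be pictured as a small data model. The Python sketch below is purely illustrative and is not the GridGain deployment API: the class names, fields, and validation rules are assumptions, as are the example link and the Guava version. It only captures what is described here, namely that a unit has a name, a set of artifacts that are each either a direct link (HTTP, HTTPS, or FTP) or Maven coordinates, and a draft state that changes once the unit is deployed to every node.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Sequence
from urllib.parse import urlparse

ALLOWED_LINK_SCHEMES = {"http", "https", "ftp"}  # direct-link types mentioned in the text


class Status(Enum):
    DRAFT = "draft"
    DEPLOYED = "deployed"


@dataclass(frozen=True)
class Artifact:
    """Exactly one source: a direct link or Maven coordinates (group:artifact:version)."""
    link: Optional[str] = None
    maven: Optional[str] = None

    def __post_init__(self) -> None:
        if (self.link is None) == (self.maven is None):
            raise ValueError("set exactly one of link or maven")
        if self.link is not None and urlparse(self.link).scheme not in ALLOWED_LINK_SCHEMES:
            raise ValueError(f"unsupported link scheme: {self.link}")
        if self.maven is not None and len(self.maven.split(":")) != 3:
            raise ValueError(f"expected group:artifact:version, got {self.maven}")


@dataclass
class DeploymentUnit:
    name: str
    artifacts: List[Artifact] = field(default_factory=list)
    status: Status = Status.DRAFT
    deployed_to: List[str] = field(default_factory=list)

    def deploy(self, nodes: Sequence[str]) -> None:
        """Mark the unit as deployed to every node; no restart is modeled."""
        if not self.artifacts:
            raise ValueError("a deployment unit needs at least one artifact")
        self.deployed_to = list(nodes)
        self.status = Status.DEPLOYED


if __name__ == "__main__":
    unit = DeploymentUnit(
        name="streamer",
        artifacts=[
            Artifact(link="https://example.com/streamer.jar"),  # hypothetical link
            Artifact(maven="com.google.guava:guava:31.1-jre"),  # version is an assumption
        ],
    )
    print(unit.status.value)        # still a draft
    unit.deploy(["node-1", "node-2"])
    print(unit.status.value, unit.deployed_to)
```

Keeping the draft and deployed states separate mirrors the workflow in the UI: defining a unit is cheap and reversible, while deploying it is the step that actually touches the nodes.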

Next, I will deploy the code by executing the deployment command. This process involves looking up dependencies of the Maven artifact and the directly linked library. The libraries will then be deployed to each node of the cluster. Since this is a two-node cluster, both nodes will receive the libraries. Once the code is deployed, we can proceed to use it.

I have a simple Java thick client that will connect to the cluster. This client is part of our streaming application tutorial and can be accessed through GridGain docs. I have set up the connection, discovery, IP finder, address of the GridGain Nebula cluster, security credentials, and other necessary configurations. Once the client connects, we will perform some cache setup and make a call to the deployed code.

Now, let's run the client and observe the results. The client is a local Ignite node that joins the cluster. It has some limitations compared to a full-fledged server node. We should see a confirmation of the client connecting and an updated topology in the dashboard. As the client starts streaming data into the caches, we will notice an increase in load. If everything goes well, our previous alert about offline clients should disappear. Additionally, we should see new tables and information related to the streaming data and cache analysis in the schemas section.

Service Overview

So, what did we do? We started off by writing some custom code. We then deployed this code to the nodes of the cluster and attached a thick client to take advantage of it. Before we wrap up, let's talk a little bit more about what you should expect in terms of SLAs (Service Level Agreements) and service.

With GridGain Nebula, we support 99.9% uptime, which means that your cluster should be available and operational almost all of the time. Additionally, we take daily snapshots of your cluster state and data to ensure their safety. Our team actively monitors your clusters and can perform cluster restarts if necessary. In the event that a rollback is needed, we can use our snapshots and backups to revert the cluster to a specific point in time.
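To put an uptime percentage into perspective, it helps to convert it into a downtime budget. The short Python sketch below does that arithmetic. The 30-day and 365-day measurement windows are assumptions for illustration; how the SLA window is actually defined is a matter for the service terms.

```python
def allowed_downtime_minutes(uptime_percent: float, period_days: float = 30) -> float:
    """Minutes of downtime permitted in the period at the given uptime percentage."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - uptime_percent) / 100


if __name__ == "__main__":
    for days in (30, 365):
        minutes = allowed_downtime_minutes(99.9, days)
        print(f"99.9% over {days} days allows about {minutes:.1f} minutes of downtime")
```

At 99.9%, the budget works out to roughly 43 minutes in a 30-day window, or a little under 9 hours over a full year.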

Currently, we don't have auto-scaling functionality, so if you need to scale out your cluster, you would need to file a support request. When using GridGain Nebula, you can contact our support organization directly through the "Contact Us" link on our platform.

We currently support only one version of the underlying GridGain platform. However, once we add support for additional versions, we will be able to perform rolling upgrades without any downtime. We will notify you and schedule the upgrade accordingly.

Data loss and partition handling, as well as data persistence, are all taken care of by us on your behalf. If you need more information, the best thing to do is to sign up at portal.gridgain.com. Upon signing up, you will receive 500 credits that can be used for a few weeks of continuous cluster use. You can provision clusters of any size and try things out.

It's important to note that currently, we don't have an option to stop or pause a cluster. When you no longer need to use the cluster, you would need to destroy it and then recreate it. However, this is something we are working on for our upcoming updates.

For general information, you can visit gridgain.com and navigate to our products section, where you will find detailed information about GridGain Nebula. If you want a step-by-step guide to connecting a JDBC client and running SQL, see the "Getting Started" guide in our documentation. Additionally, we recommend checking out our webinars, as we have multiple new ones coming up. Starting next week, we will be running an Apache Ignite Summit where you can learn more about deploying Ignite to clouds and best practices. We will also provide insights into the development of Nebula.

How does the cost of Nebula compare to the on-prem GCU model?

The service that you're paying for is pretty comparable. There's an additional service included, which is the 24x7 support and management. But overall, it's similar to what you would pay for GCU plus support and maintenance. We've broken it down to an hourly charge.

Do you need an additional support agreement for Nebula?

No, when you sign up for Nebula clusters, you're entitled to support. This is essentially our basic support package. If you have any questions about your account, clusters, billing, or anything related to Nebula, you can contact us, and we'll be there to assist you. So, you don't need an external or additional support contract to work with GridGain Nebula and access our support.

However, please note that certain things are not covered under the included support, such as performance-related questions or filing bugs against the underlying Ignite or GridGain platform. We do offer higher-level support options that cover those types of services.

Do you restrict code uploads to the cluster, or do users have complete control over the added functionality?

It's essentially open. You have the ability to upload different types of apps and libraries.