GridGain Developers Hub

Python Database API Driver

GridGain 9 clients connect to the cluster via a standard socket connection. Clients do not become a part of the cluster topology, never hold any data, and are not used as a destination for compute calculations.

The GridGain DB API driver implements the standard Python Database API (PEP 249).

Getting Started

Prerequisites

To run the Python driver, the following is required:

  • CMake 3.18 or newer to build the driver

  • Python 3.9 or newer (3.9, 3.10, 3.11 and 3.12 are tested)

  • Access to a running Ignite 3 or GridGain 9 node

Limitations

Script execution of SQL statements is not supported in the current release.

Installation

To install the Python DB API driver, use pip:

pip install pygridgain_dbapi

After this, you can import pygridgain_dbapi into your project and use it.

Connecting to Cluster

To connect to the cluster, use the connect() method:

import pygridgain_dbapi

addr = ['127.0.0.1:10800']
conn = pygridgain_dbapi.connect(address=addr, timeout=10)

When you are done working with the cluster, always close the connection to it:

conn.close()

Alternatively, you can use the with statement to automatically close the connection when no longer necessary:

with pygridgain_dbapi.connect(address=addr, timeout=10) as conn:
  conn.cursor()

Configuring SSL for Connection

To ensure a secure connection to the cluster, you can enable SSL by providing the key file and certificate, for example:

def create_ssl_connection():
  """Create SSL-enabled connection to GridGain cluster."""
  addr = ['127.0.0.1:10800']
  return pygridgain_dbapi.connect(
      address=addr,
      timeout=10,
      use_ssl=True,
      ssl_keyfile='<path_to_ssl_keyfile.pem>',
      ssl_certfile='<path_to_ssl_certfile.pem>',
      # Optional: ssl_ca_certfile='<path_to_ssl_ca_certfile.pem>'
  )

Configuring Authorization

If the cluster uses basic authentication, you need to provide a user identity and secret to authenticate, for example:

def create_authenticated_connection():
  """Create authenticated connection to GridGain cluster."""
  addr = ['127.0.0.1:10800']
  return pygridgain_dbapi.connect(
      address=addr,
      timeout=10,
      identity='user',
      secret='password'
  )

Configuring Data Access

You can configure optional properties to fine-tune how data is accessed.

  • schema: The schema name to use by default. Default value: 'PUBLIC'.

  • page_size: The maximum number of rows that can be received or sent in a single request. Default value: 1024.

The example below shows how to set these properties:

def create_configured_connection():
  """Create authenticated connection to GridGain cluster."""
  addr = ['127.0.0.1:10800']
  return conn = pygridgain_dbapi.connect(
    address=addr,
    timeout=10,
    schema='CUSTOM',
    page_size=2048
  )

Getting Cursor Object

To work with tables from the Python client, use the cursor object, which can be retrieved from the connection object:

cursor = conn.cursor()

Similar to the connection, you can use the with statement when getting the cursor:

with conn.cursor() as cursor:

Executing Single Query

The cursor object can be used to execute SQL statements with the execute command:

# Create table
cursor.execute('''
          CREATE TABLE Person(
              id INT PRIMARY KEY, 
              name VARCHAR, 
              age INT
          )
      ''')
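The execute command also accepts positional parameters bound with ? placeholders (the qmark style used throughout this page). Because the driver follows the standard Python DB API, the pattern is the same as for any DB API module; the sketch below uses Python's built-in sqlite3 module only so that it runs without a cluster — with pygridgain_dbapi, substitute the connect() call shown earlier:

```python
import sqlite3

# sqlite3 implements the same DB API and the same qmark parameter style,
# so the execute() pattern carries over to pygridgain_dbapi unchanged.
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()
cursor.execute('CREATE TABLE Person(id INT PRIMARY KEY, name VARCHAR, age INT)')

# Parameters are passed as a sequence alongside the statement
cursor.execute('INSERT INTO Person VALUES(?, ?, ?)', (1, 'John', 30))
conn.commit()

cursor.execute('SELECT name FROM Person WHERE id = ?', (1,))
print(cursor.fetchone()[0])  # John
conn.close()
```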

Executing a Batched Query

You can use the executemany command to execute SQL queries with a batch of parameters. This kind of operation offers much higher performance than executing individual queries. The example below inserts three rows into the Person table:

# Sample data
sample_data = [
  [1, "John", 30],
  [2, "Jane", 32],
  [3, "Bob", 28]
]

# Insert data
cursor.executemany('INSERT INTO Person VALUES(?, ?, ?)', sample_data)

Getting Query Results

The cursor retains a reference to the operation. If the operation returns results (for example, a SELECT), they are also stored in the cursor. You can then use the fetch methods, such as fetchone() or fetchall(), to retrieve query results from the cursor. The example below retrieves all rows at once with fetchall():

# Query data
cursor.execute('SELECT * FROM Person ORDER BY id')
results = cursor.fetchall()

print("All persons in database:")
for row in results:
  print(f"ID: {row[0]}, Name: {row[1]}, Age: {row[2]}")
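For comparison, fetchone() returns one row per call and None once the result set is exhausted, while fetchall() drains all remaining rows at once. A minimal sketch of the fetchone() loop, shown with the built-in sqlite3 module (another DB API driver) so it runs without a cluster:

```python
import sqlite3

# Set up a small in-memory table; with pygridgain_dbapi the fetch loop
# below is identical, only the connect() call differs.
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()
cursor.execute('CREATE TABLE Person(id INT PRIMARY KEY, name VARCHAR, age INT)')
cursor.executemany('INSERT INTO Person VALUES(?, ?, ?)',
                   [(1, 'John', 30), (2, 'Jane', 32), (3, 'Bob', 28)])

cursor.execute('SELECT * FROM Person ORDER BY id')
# fetchone() yields rows one at a time and None when no rows remain
while (row := cursor.fetchone()) is not None:
    print(f"ID: {row[0]}, Name: {row[1]}, Age: {row[2]}")
conn.close()
```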

Working with Transactions

By default, transactions required for database operations are handled implicitly. However, you can disable automatic transaction handling and manually handle commits.

To do this, first, disable autocommit:

conn.autocommit = False

Once autocommit is disabled, you need to commit your operations manually:

# Insert records
cursor.execute('INSERT INTO Person VALUES(?, ?, ?)', [4, "Alice", 29])
cursor.execute('INSERT INTO Person VALUES(?, ?, ?)', [5, "Charlie", 31])

conn.commit()
print("Transaction committed successfully")

Operations that are not committed are sent to the cluster but not yet written to the table; the table is only updated when the commit method is called. You can roll back all uncommitted operations with the rollback command:

with conn.cursor() as cursor:
  try:
    # Insert valid records
    cursor.execute('INSERT INTO Person VALUES(?, ?, ?)', [4, "Alice", 29])
    cursor.execute('INSERT INTO Person VALUES(?, ?, ?)', [5, "Charlie", 31])


    # This insert fails (duplicate primary key) and triggers the rollback below
    cursor.execute('INSERT INTO Person VALUES(?, ?, ?)', [4, "Duplicate", 40])

    conn.commit()
    print("Transaction committed successfully")

  except Exception as e:
    # Rollback on any error
    conn.rollback()
    print(f"Transaction rolled back due to error: {e}")
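To see the rollback behavior end to end, the sketch below commits one row, then triggers a deliberate failure (a duplicate primary key) and rolls back, so the uncommitted insert never reaches the table. It uses the built-in sqlite3 module only so it runs without a cluster; with pygridgain_dbapi the flow is identical once autocommit is disabled as shown above:

```python
import sqlite3

# sqlite3 wraps DML statements in an implicit transaction by default,
# matching the autocommit-disabled flow described above.
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()
cursor.execute('CREATE TABLE Person(id INT PRIMARY KEY, name VARCHAR, age INT)')
cursor.execute('INSERT INTO Person VALUES(?, ?, ?)', (1, 'John', 30))
conn.commit()

try:
    cursor.execute('INSERT INTO Person VALUES(?, ?, ?)', (2, 'Jane', 32))
    # Duplicate primary key: this statement raises and aborts the batch
    cursor.execute('INSERT INTO Person VALUES(?, ?, ?)', (1, 'Dup', 40))
    conn.commit()
except Exception as e:
    conn.rollback()
    print(f"Transaction rolled back due to error: {e}")

cursor.execute('SELECT COUNT(*) FROM Person')
print(cursor.fetchone()[0])  # 1 -- Jane's uncommitted insert was rolled back
conn.close()
```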