Kafka Connector Data Schema
GridGain Kafka Connectors support data schemas. Schemas allow existing non-Ignite Sink Connectors to understand data injected by the Ignite Source Connector, and allow the Ignite Sink Connector to understand data injected by non-Ignite Source Connectors.
Ignite Type Support
Source and Sink Connectors work with Ignite data in Ignite Binary format.
The table below maps Kafka schema types, including the known logical types, to Ignite Binary types.
Kafka Type | Ignite Type |
---|---|
INT8 | BYTE |
INT16 | SHORT, CHAR |
INT32 | INT |
INT64 | LONG |
FLOAT32 | FLOAT |
FLOAT64 | DOUBLE |
BOOLEAN | BOOLEAN |
STRING | STRING, UUID, CLASS |
BYTES | BYTE_ARR |
ARRAY(valueSchema) | COL, SHORT_ARR, INT_ARR, LONG_ARR, FLOAT_ARR, DOUBLE_ARR, CHAR_ARR, BOOLEAN_ARR, DECIMAL_ARR, STRING_ARR, UUID_ARR, DATE_ARR, OBJ_ARR, ENUM_ARR, TIME_ARR, TIMESTAMP_ARR |
MAP | MAP |
STRUCT | OBJ, BINARY_OBJ |
Date (Logical Type) | DATE |
Time (Logical Type) | TIME |
Timestamp (Logical Type) | TIMESTAMP |
Decimal (Logical Type) | DECIMAL |
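For quick reference, the mapping table can be read as a lookup from an Ignite Binary type to its Kafka schema type. The following is a minimal Python sketch of that lookup only; the names NON_ARRAY_TYPES, ARRAY_TYPES, and kafka_type_for are hypothetical and are not part of any GridGain or Kafka API.

```python
# Hypothetical lookup derived from the mapping table; not a GridGain or Kafka API.
NON_ARRAY_TYPES = {
    "BYTE": "INT8",
    "SHORT": "INT16",
    "CHAR": "INT16",
    "INT": "INT32",
    "LONG": "INT64",
    "FLOAT": "FLOAT32",
    "DOUBLE": "FLOAT64",
    "BOOLEAN": "BOOLEAN",
    "STRING": "STRING",
    "UUID": "STRING",
    "CLASS": "STRING",
    "BYTE_ARR": "BYTES",
    "MAP": "MAP",
    "OBJ": "STRUCT",
    "BINARY_OBJ": "STRUCT",
    # Kafka logical types
    "DATE": "Date",
    "TIME": "Time",
    "TIMESTAMP": "Timestamp",
    "DECIMAL": "Decimal",
}

# Ignite types that the table maps to a Kafka ARRAY(valueSchema).
ARRAY_TYPES = {
    "COL", "SHORT_ARR", "INT_ARR", "LONG_ARR", "FLOAT_ARR", "DOUBLE_ARR",
    "CHAR_ARR", "BOOLEAN_ARR", "DECIMAL_ARR", "STRING_ARR", "UUID_ARR",
    "DATE_ARR", "OBJ_ARR", "ENUM_ARR", "TIME_ARR", "TIMESTAMP_ARR",
}


def kafka_type_for(ignite_type: str) -> str:
    """Return the Kafka schema type listed in the table for an Ignite type."""
    if ignite_type in ARRAY_TYPES:
        return "ARRAY"
    # Raises KeyError for types that the table does not list.
    return NON_ARRAY_TYPES[ignite_type]
```

Note that the mapping is not one-to-one: SHORT and CHAR both map to INT16, and STRING, UUID, and CLASS all map to STRING, so the Kafka type alone does not identify the original Ignite type.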
Updates and Removals
By default, the Source Connector does not process removed Ignite cache entries. Set the shallProcessRemovals configuration setting to true to make the Source Connector process removals. In this case, the Source Connector injects a record with a null value into Kafka to indicate that the key was removed. The Sink Connector removes keys with null values from the cache. Using null as a value to indicate a removed entry works because Ignite does not support null cache values.
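The Source-side removal rule can be sketched as follows. This is a simplified Python model of the behavior described above, not the connector's implementation; the function to_kafka_record and its parameters are hypothetical, with shall_process_removals standing in for the shallProcessRemovals setting.

```python
from typing import Any, Optional, Tuple


def to_kafka_record(
    key: Any,
    value: Any,
    removed: bool,
    shall_process_removals: bool = False,
) -> Optional[Tuple[Any, Any]]:
    """Model how a cache event becomes a Kafka record (hypothetical helper).

    Returns a (key, value) pair to inject into Kafka, or None when the
    event is skipped.
    """
    if removed:
        if not shall_process_removals:
            # Default behavior: removed entries are not processed.
            return None
        # A null value marks the key as removed.
        return (key, None)
    return (key, value)
```

Because Ignite does not store null cache values, a null record value in this model can only mean a removal and never a regular entry.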
For performance reasons, the Sink Connector does not update existing cache entries by default. Set the shallProcessUpdates configuration setting to true to make the Sink Connector update existing entries.
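The Sink-side rules for null values and updates can be modeled in a few lines. This is a minimal Python sketch in which a plain dictionary stands in for the Ignite cache; apply_record is a hypothetical helper, and the sketch assumes that a record for an existing key is simply ignored when updates are disabled, which the text above does not spell out.

```python
from typing import Any, Dict


def apply_record(
    cache: Dict[Any, Any],
    key: Any,
    value: Any,
    shall_process_updates: bool = False,
) -> None:
    """Apply one Kafka record to a dict that stands in for the cache."""
    if value is None:
        # A null value means the key was removed at the source.
        cache.pop(key, None)
    elif key in cache and not shall_process_updates:
        # Assumption: with updates disabled, the existing entry is kept as is.
        return
    else:
        cache[key] = value
```

For example, applying ("a", 1) and then ("a", 2) to an empty dictionary leaves the value 1 in place unless shall_process_updates is True, and applying ("a", None) removes the key in either mode.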
Schema Migration
Schema migration is implicit for GridGain Connectors. Both the Source and Sink Connectors pull and push cache entries in the cross-platform Ignite Binary format, which intrinsically supports changing schemas. Ignite cache keys and values are dynamic objects that can have different sets of fields.
For performance reasons, the Source Connector caches key and value schemas. The schemas are created when the first cache entry is pulled and are reused for all subsequent entries. This works only if the schemas never change. Set isSchemaDynamic to true to support schema changes.
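The trade-off between cached and dynamic schemas can be illustrated with a small model. This is a Python sketch under stated assumptions, not the connector's code: SchemaCache is a hypothetical class, a schema is represented as a sorted tuple of (field name, type name) pairs, and the dynamic mode is assumed to rebuild the schema for every entry.

```python
from typing import Any, Dict, Optional, Tuple

Schema = Tuple[Tuple[str, str], ...]


class SchemaCache:
    """Hypothetical model of schema caching in the Source Connector."""

    def __init__(self, is_schema_dynamic: bool = False) -> None:
        self.is_schema_dynamic = is_schema_dynamic
        self._cached: Optional[Schema] = None
        self.builds = 0  # counts how many times a schema was built

    def _build(self, entry: Dict[str, Any]) -> Schema:
        self.builds += 1
        return tuple(sorted((name, type(v).__name__) for name, v in entry.items()))

    def schema_for(self, entry: Dict[str, Any]) -> Schema:
        # Static mode builds the schema once, from the first entry.
        # Dynamic mode (an assumption here) rebuilds it for every entry.
        if self.is_schema_dynamic or self._cached is None:
            self._cached = self._build(entry)
        return self._cached
```

In static mode, an entry that gains a new field is still described by the schema of the first entry, so the new field is missing from the schema. In dynamic mode the schema follows the entry at the cost of rebuilding it each time.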
Schemaless Operation
The Source Connector does not generate schemas if the isSchemaless configuration setting is true.
Disabling schemas improves performance because the Connectors do not build schemas and do not convert keys and values into Kafka format. The cost is that non-Ignite Sink converters cannot understand data injected into Kafka in the Ignite Binary format.
Examples of when disabling Source schemas makes sense:
- You are ready to do some coding to extend a non-Ignite converter to process Ignite Binary objects in order to achieve higher performance.
- The Ignite Data Replication example does not need schemas since both the Source and the Sink are GridGain connectors.