public class DatasetFactory extends Object
Dataset construction is based on three major concepts: a partition upstream, a partition context, and
partition data. A partition upstream is a data source that is assumed to be available at all times, regardless
of node failures and rebalancing events. A partition context is the part of a partition that is maintained
during the whole computation process and kept in reliable storage, so that, like the upstream, it stays
available and consistent regardless of node failures and rebalancing events. Partition data is the part of a
partition that is maintained during the computation process in unreliable local storage, such as heap, off-heap
or GPU memory, on the node where the current computation is performed. Partition data can therefore be lost
as a result of a node failure or rebalancing, but it can be restored from the upstream and the partition context.
The partition context and partition data are built on top of the upstream by the specified
builders, PartitionContextBuilder and PartitionDataBuilder respectively. A generic
dataset is built as follows:
Dataset<C, D> dataset = DatasetFactory.create(
    ignite,
    cache,
    partitionContextBuilder,
    partitionDataBuilder
);
In addition to the generic create method, this factory provides methods that create
specific dataset types: createSimpleDataset creates a SimpleDataset, and
createSimpleLabeledDataset creates a SimpleLabeledDataset.
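
The relationship between upstream, partition context and partition data described above can be pictured with a small toy model. The sketch below deliberately does not use the Ignite API: `PartitionRecoverySketch`, `Partition`, `loseData` and the two builder lambdas are hypothetical names introduced only for illustration. It assumes the context is computed once from the upstream and then kept, while the data is rebuilt on demand whenever it has been lost.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiFunction;
import java.util.function.Function;

// Toy model only: none of these names belong to the Ignite API.
final class PartitionRecoverySketch {
    /** One partition: reliable upstream and context, plus rebuildable local data. */
    static final class Partition<K, V, C, D> {
        private final Map<K, V> upstream;                       // assumed to be always available
        private final C ctx;                                    // assumed to live in reliable storage
        private final BiFunction<Map<K, V>, C, D> dataBuilder;  // plays the role of a data builder
        private D data;                                         // unreliable local storage, may be lost
        private int rebuilds;

        Partition(Map<K, V> upstream,
                  Function<Map<K, V>, C> ctxBuilder,
                  BiFunction<Map<K, V>, C, D> dataBuilder) {
            this.upstream = upstream;
            this.ctx = ctxBuilder.apply(upstream); // the context is built once and then kept
            this.dataBuilder = dataBuilder;
        }

        /** Returns the partition data, rebuilding it from upstream and context if it is missing. */
        D data() {
            if (data == null) {
                data = dataBuilder.apply(upstream, ctx);
                rebuilds++;
            }
            return data;
        }

        /** Simulates a node failure or rebalancing event that drops the local data. */
        void loseData() {
            data = null;
        }

        int rebuilds() {
            return rebuilds;
        }
    }

    /** Context = mean of the upstream values; data = values centered around that mean. */
    static Partition<Integer, Double, Double, double[]> samplePartition() {
        Map<Integer, Double> upstream = new TreeMap<>();
        upstream.put(1, 2.0);
        upstream.put(2, 4.0);
        upstream.put(3, 6.0);

        return new Partition<>(
            upstream,
            up -> up.values().stream().mapToDouble(Double::doubleValue).average().orElse(0.0),
            (up, mean) -> up.values().stream().mapToDouble(v -> v - mean).toArray());
    }

    public static void main(String[] args) {
        Partition<Integer, Double, Double, double[]> part = samplePartition();
        System.out.println(Arrays.toString(part.data())); // first build
        part.loseData();                                  // simulated failure
        System.out.println(Arrays.toString(part.data())); // rebuilt, same content
        System.out.println("rebuilds = " + part.rebuilds());
    }
}
```

In this model the rebuilt data equals the data that was lost because both inputs of the data builder, the upstream and the context, survive the simulated failure.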
See Also: Dataset, PartitionContextBuilder, PartitionDataBuilder

| Constructor and Description |
|---|
| DatasetFactory() |
| Modifier and Type | Method and Description |
|---|---|
| static <K,V,C extends Serializable,D extends AutoCloseable> Dataset<C,D> | create(DatasetBuilder<K,V> datasetBuilder, PartitionContextBuilder<K,V,C> partCtxBuilder, PartitionDataBuilder<K,V,C,D> partDataBuilder) - Creates a new instance of a distributed dataset using the specified partCtxBuilder and partDataBuilder. |
| static <K,V,C extends Serializable,D extends AutoCloseable> Dataset<C,D> | create(Ignite ignite, IgniteCache<K,V> upstreamCache, PartitionContextBuilder<K,V,C> partCtxBuilder, PartitionDataBuilder<K,V,C,D> partDataBuilder) - Creates a new instance of a distributed dataset using the specified partCtxBuilder and partDataBuilder. |
| static <K,V,C extends Serializable,D extends AutoCloseable> Dataset<C,D> | create(Map<K,V> upstreamMap, int partitions, PartitionContextBuilder<K,V,C> partCtxBuilder, PartitionDataBuilder<K,V,C,D> partDataBuilder) - Creates a new instance of a local dataset using the specified partCtxBuilder and partDataBuilder. |
| static <K,V> SimpleDataset<EmptyContext> | createSimpleDataset(DatasetBuilder<K,V> datasetBuilder, IgniteBiFunction<K,V,Vector> featureExtractor) - Creates a new instance of a distributed SimpleDataset using the specified featureExtractor. |
| static <K,V,C extends Serializable> SimpleDataset<C> | createSimpleDataset(DatasetBuilder<K,V> datasetBuilder, PartitionContextBuilder<K,V,C> partCtxBuilder, IgniteBiFunction<K,V,Vector> featureExtractor) - Creates a new instance of a distributed SimpleDataset using the specified partCtxBuilder and featureExtractor. |
| static <K,V> SimpleDataset<EmptyContext> | createSimpleDataset(Ignite ignite, IgniteCache<K,V> upstreamCache, IgniteBiFunction<K,V,Vector> featureExtractor) - Creates a new instance of a distributed SimpleDataset using the specified featureExtractor. |
| static <K,V,C extends Serializable> SimpleDataset<C> | createSimpleDataset(Ignite ignite, IgniteCache<K,V> upstreamCache, PartitionContextBuilder<K,V,C> partCtxBuilder, IgniteBiFunction<K,V,Vector> featureExtractor) - Creates a new instance of a distributed SimpleDataset using the specified partCtxBuilder and featureExtractor. |
| static <K,V> SimpleDataset<EmptyContext> | createSimpleDataset(Map<K,V> upstreamMap, int partitions, IgniteBiFunction<K,V,Vector> featureExtractor) - Creates a new instance of a local SimpleDataset using the specified featureExtractor. |
| static <K,V,C extends Serializable> SimpleDataset<C> | createSimpleDataset(Map<K,V> upstreamMap, int partitions, PartitionContextBuilder<K,V,C> partCtxBuilder, IgniteBiFunction<K,V,Vector> featureExtractor) - Creates a new instance of a local SimpleDataset using the specified partCtxBuilder and featureExtractor. |
| static <K,V> SimpleLabeledDataset<EmptyContext> | createSimpleLabeledDataset(DatasetBuilder<K,V> datasetBuilder, IgniteBiFunction<K,V,Vector> featureExtractor, IgniteBiFunction<K,V,double[]> lbExtractor) - Creates a new instance of a distributed SimpleLabeledDataset using the specified featureExtractor and lbExtractor. |
| static <K,V,C extends Serializable> SimpleLabeledDataset<C> | createSimpleLabeledDataset(DatasetBuilder<K,V> datasetBuilder, PartitionContextBuilder<K,V,C> partCtxBuilder, IgniteBiFunction<K,V,Vector> featureExtractor, IgniteBiFunction<K,V,double[]> lbExtractor) - Creates a new instance of a distributed SimpleLabeledDataset using the specified partCtxBuilder, featureExtractor and lbExtractor. |
| static <K,V> SimpleLabeledDataset<EmptyContext> | createSimpleLabeledDataset(Ignite ignite, IgniteCache<K,V> upstreamCache, IgniteBiFunction<K,V,Vector> featureExtractor, IgniteBiFunction<K,V,double[]> lbExtractor) - Creates a new instance of a distributed SimpleLabeledDataset using the specified featureExtractor and lbExtractor. |
| static <K,V,C extends Serializable> SimpleLabeledDataset<C> | createSimpleLabeledDataset(Ignite ignite, IgniteCache<K,V> upstreamCache, PartitionContextBuilder<K,V,C> partCtxBuilder, IgniteBiFunction<K,V,Vector> featureExtractor, IgniteBiFunction<K,V,double[]> lbExtractor) - Creates a new instance of a distributed SimpleLabeledDataset using the specified partCtxBuilder, featureExtractor and lbExtractor. |
| static <K,V> SimpleLabeledDataset<EmptyContext> | createSimpleLabeledDataset(Map<K,V> upstreamMap, int partitions, IgniteBiFunction<K,V,Vector> featureExtractor, IgniteBiFunction<K,V,double[]> lbExtractor) - Creates a new instance of a local SimpleLabeledDataset using the specified featureExtractor and lbExtractor. |
| static <K,V,C extends Serializable> SimpleLabeledDataset<C> | createSimpleLabeledDataset(Map<K,V> upstreamMap, int partitions, PartitionContextBuilder<K,V,C> partCtxBuilder, IgniteBiFunction<K,V,Vector> featureExtractor, IgniteBiFunction<K,V,double[]> lbExtractor) - Creates a new instance of a local SimpleLabeledDataset using the specified partCtxBuilder, featureExtractor and lbExtractor. |
public static <K,V,C extends Serializable,D extends AutoCloseable> Dataset<C,D> create(DatasetBuilder<K,V> datasetBuilder, PartitionContextBuilder<K,V,C> partCtxBuilder, PartitionDataBuilder<K,V,C,D> partDataBuilder)
Creates a new instance of a distributed dataset using the specified partCtxBuilder and partDataBuilder. This is the generic method that allows creating any Ignite Cache based dataset with any desired partition context and data.
Type Parameters:
K - Type of a key in upstream data.
V - Type of a value in upstream data.
C - Type of a partition context.
D - Type of partition data.
Parameters:
datasetBuilder - Dataset builder.
partCtxBuilder - Partition context builder.
partDataBuilder - Partition data builder.

public static <K,V,C extends Serializable,D extends AutoCloseable> Dataset<C,D> create(Ignite ignite, IgniteCache<K,V> upstreamCache, PartitionContextBuilder<K,V,C> partCtxBuilder, PartitionDataBuilder<K,V,C,D> partDataBuilder)
Creates a new instance of a distributed dataset using the specified partCtxBuilder and partDataBuilder. This is the generic method that allows creating any Ignite Cache based dataset with any desired partition context and data.
Type Parameters:
K - Type of a key in upstream data.
V - Type of a value in upstream data.
C - Type of a partition context.
D - Type of partition data.
Parameters:
ignite - Ignite instance.
upstreamCache - Ignite Cache with upstream data.
partCtxBuilder - Partition context builder.
partDataBuilder - Partition data builder.

public static <K,V,C extends Serializable> SimpleDataset<C> createSimpleDataset(DatasetBuilder<K,V> datasetBuilder, PartitionContextBuilder<K,V,C> partCtxBuilder, IgniteBiFunction<K,V,Vector> featureExtractor)
Creates a new instance of a distributed SimpleDataset using the specified partCtxBuilder and featureExtractor. This method fixes the partition data type to SimpleDatasetData but allows any desired type of partition context.
Type Parameters:
K - Type of a key in upstream data.
V - Type of a value in upstream data.
C - Type of a partition context.
Parameters:
datasetBuilder - Dataset builder.
partCtxBuilder - Partition context builder.
featureExtractor - Feature extractor used to extract features and build SimpleDatasetData.

public static <K,V,C extends Serializable> SimpleDataset<C> createSimpleDataset(Ignite ignite, IgniteCache<K,V> upstreamCache, PartitionContextBuilder<K,V,C> partCtxBuilder, IgniteBiFunction<K,V,Vector> featureExtractor)
Creates a new instance of a distributed SimpleDataset using the specified partCtxBuilder and featureExtractor. This method fixes the partition data type to SimpleDatasetData but allows any desired type of partition context.
Type Parameters:
K - Type of a key in upstream data.
V - Type of a value in upstream data.
C - Type of a partition context.
Parameters:
ignite - Ignite instance.
upstreamCache - Ignite Cache with upstream data.
partCtxBuilder - Partition context builder.
featureExtractor - Feature extractor used to extract features and build SimpleDatasetData.

public static <K,V,C extends Serializable> SimpleLabeledDataset<C> createSimpleLabeledDataset(DatasetBuilder<K,V> datasetBuilder, PartitionContextBuilder<K,V,C> partCtxBuilder, IgniteBiFunction<K,V,Vector> featureExtractor, IgniteBiFunction<K,V,double[]> lbExtractor)
Creates a new instance of a distributed SimpleLabeledDataset using the specified partCtxBuilder, featureExtractor and lbExtractor. This method fixes the partition data type to SimpleLabeledDatasetData but allows any desired type of partition context.
Type Parameters:
K - Type of a key in upstream data.
V - Type of a value in upstream data.
C - Type of a partition context.
Parameters:
datasetBuilder - Dataset builder.
partCtxBuilder - Partition context builder.
featureExtractor - Feature extractor used to extract features and build SimpleLabeledDatasetData.
lbExtractor - Label extractor used to extract labels and build SimpleLabeledDatasetData.

public static <K,V,C extends Serializable> SimpleLabeledDataset<C> createSimpleLabeledDataset(Ignite ignite, IgniteCache<K,V> upstreamCache, PartitionContextBuilder<K,V,C> partCtxBuilder, IgniteBiFunction<K,V,Vector> featureExtractor, IgniteBiFunction<K,V,double[]> lbExtractor)
Creates a new instance of a distributed SimpleLabeledDataset using the specified partCtxBuilder, featureExtractor and lbExtractor. This method fixes the partition data type to SimpleLabeledDatasetData but allows any desired type of partition context.
Type Parameters:
K - Type of a key in upstream data.
V - Type of a value in upstream data.
C - Type of a partition context.
Parameters:
ignite - Ignite instance.
upstreamCache - Ignite Cache with upstream data.
partCtxBuilder - Partition context builder.
featureExtractor - Feature extractor used to extract features and build SimpleLabeledDatasetData.
lbExtractor - Label extractor used to extract labels and build SimpleLabeledDatasetData.

public static <K,V> SimpleDataset<EmptyContext> createSimpleDataset(DatasetBuilder<K,V> datasetBuilder, IgniteBiFunction<K,V,Vector> featureExtractor)
Creates a new instance of a distributed SimpleDataset using the specified featureExtractor. This method fixes the partition context type to EmptyContext and the partition data type to SimpleDatasetData.
Type Parameters:
K - Type of a key in upstream data.
V - Type of a value in upstream data.
Parameters:
datasetBuilder - Dataset builder.
featureExtractor - Feature extractor used to extract features and build SimpleDatasetData.

public static <K,V> SimpleDataset<EmptyContext> createSimpleDataset(Ignite ignite, IgniteCache<K,V> upstreamCache, IgniteBiFunction<K,V,Vector> featureExtractor)
Creates a new instance of a distributed SimpleDataset using the specified featureExtractor. This method fixes the partition context type to EmptyContext and the partition data type to SimpleDatasetData.
Type Parameters:
K - Type of a key in upstream data.
V - Type of a value in upstream data.
Parameters:
ignite - Ignite instance.
upstreamCache - Ignite Cache with upstream data.
featureExtractor - Feature extractor used to extract features and build SimpleDatasetData.

public static <K,V> SimpleLabeledDataset<EmptyContext> createSimpleLabeledDataset(DatasetBuilder<K,V> datasetBuilder, IgniteBiFunction<K,V,Vector> featureExtractor, IgniteBiFunction<K,V,double[]> lbExtractor)
Creates a new instance of a distributed SimpleLabeledDataset using the specified featureExtractor and lbExtractor. This method fixes the partition context type to EmptyContext and the partition data type to SimpleLabeledDatasetData.
Type Parameters:
K - Type of a key in upstream data.
V - Type of a value in upstream data.
Parameters:
datasetBuilder - Dataset builder.
featureExtractor - Feature extractor used to extract features and build SimpleLabeledDatasetData.
lbExtractor - Label extractor used to extract labels and build SimpleLabeledDatasetData.

public static <K,V> SimpleLabeledDataset<EmptyContext> createSimpleLabeledDataset(Ignite ignite, IgniteCache<K,V> upstreamCache, IgniteBiFunction<K,V,Vector> featureExtractor, IgniteBiFunction<K,V,double[]> lbExtractor)
Creates a new instance of a distributed SimpleLabeledDataset using the specified featureExtractor and lbExtractor. This method fixes the partition context type to EmptyContext and the partition data type to SimpleLabeledDatasetData.
Type Parameters:
K - Type of a key in upstream data.
V - Type of a value in upstream data.
Parameters:
ignite - Ignite instance.
upstreamCache - Ignite Cache with upstream data.
featureExtractor - Feature extractor used to extract features and build SimpleLabeledDatasetData.
lbExtractor - Label extractor used to extract labels and build SimpleLabeledDatasetData.

public static <K,V,C extends Serializable,D extends AutoCloseable> Dataset<C,D> create(Map<K,V> upstreamMap, int partitions, PartitionContextBuilder<K,V,C> partCtxBuilder, PartitionDataBuilder<K,V,C,D> partDataBuilder)
Creates a new instance of a local dataset using the specified partCtxBuilder and partDataBuilder. This is the generic method that allows creating any local, Map based dataset with any desired partition context and data.
Type Parameters:
K - Type of a key in upstream data.
V - Type of a value in upstream data.
C - Type of a partition context.
D - Type of partition data.
Parameters:
upstreamMap - Map with upstream data.
partitions - Number of partitions the upstream Map will be divided into.
partCtxBuilder - Partition context builder.
partDataBuilder - Partition data builder.

public static <K,V,C extends Serializable> SimpleDataset<C> createSimpleDataset(Map<K,V> upstreamMap, int partitions, PartitionContextBuilder<K,V,C> partCtxBuilder, IgniteBiFunction<K,V,Vector> featureExtractor)
Creates a new instance of a local SimpleDataset using the specified partCtxBuilder and featureExtractor. This method fixes the partition data type to SimpleDatasetData but allows any desired type of partition context.
Type Parameters:
K - Type of a key in upstream data.
V - Type of a value in upstream data.
C - Type of a partition context.
Parameters:
upstreamMap - Map with upstream data.
partitions - Number of partitions the upstream Map will be divided into.
partCtxBuilder - Partition context builder.
featureExtractor - Feature extractor used to extract features and build SimpleDatasetData.

public static <K,V,C extends Serializable> SimpleLabeledDataset<C> createSimpleLabeledDataset(Map<K,V> upstreamMap, int partitions, PartitionContextBuilder<K,V,C> partCtxBuilder, IgniteBiFunction<K,V,Vector> featureExtractor, IgniteBiFunction<K,V,double[]> lbExtractor)
Creates a new instance of a local SimpleLabeledDataset using the specified partCtxBuilder, featureExtractor and lbExtractor. This method fixes the partition data type to SimpleLabeledDatasetData but allows any desired type of partition context.
Type Parameters:
K - Type of a key in upstream data.
V - Type of a value in upstream data.
C - Type of a partition context.
Parameters:
upstreamMap - Map with upstream data.
partitions - Number of partitions the upstream Map will be divided into.
partCtxBuilder - Partition context builder.
featureExtractor - Feature extractor used to extract features and build SimpleLabeledDatasetData.
lbExtractor - Label extractor used to extract labels and build SimpleLabeledDatasetData.

public static <K,V> SimpleDataset<EmptyContext> createSimpleDataset(Map<K,V> upstreamMap, int partitions, IgniteBiFunction<K,V,Vector> featureExtractor)
Creates a new instance of a local SimpleDataset using the specified featureExtractor. This method fixes the partition context type to EmptyContext and the partition data type to SimpleDatasetData.
Type Parameters:
K - Type of a key in upstream data.
V - Type of a value in upstream data.
Parameters:
upstreamMap - Map with upstream data.
partitions - Number of partitions the upstream Map will be divided into.
featureExtractor - Feature extractor used to extract features and build SimpleDatasetData.

public static <K,V> SimpleLabeledDataset<EmptyContext> createSimpleLabeledDataset(Map<K,V> upstreamMap, int partitions, IgniteBiFunction<K,V,Vector> featureExtractor, IgniteBiFunction<K,V,double[]> lbExtractor)
Creates a new instance of a local SimpleLabeledDataset using the specified featureExtractor and lbExtractor. This method fixes the partition context type to EmptyContext and the partition data type to SimpleLabeledDatasetData.
Type Parameters:
K - Type of a key in upstream data.
V - Type of a value in upstream data.
Parameters:
upstreamMap - Map with upstream data.
partitions - Number of partitions the upstream Map will be divided into.
featureExtractor - Feature extractor used to extract features and build SimpleLabeledDatasetData.
lbExtractor - Label extractor used to extract labels and build SimpleLabeledDatasetData.
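
The roles of the partitions, featureExtractor and lbExtractor parameters of the Map based methods can be sketched in plain Java. This is a toy model, not the Ignite implementation: `LocalPartitioningSketch`, `LabeledPart`, `split` and `extract` are hypothetical names, features are plain `double[]` rows rather than Ignite vectors, and the contiguous-chunk split is an assumed strategy that may differ from what the local dataset builder actually does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Toy model only: none of these names belong to the Ignite API.
final class LocalPartitioningSketch {
    /** Features and labels extracted from one partition, one row per upstream entry. */
    static final class LabeledPart {
        final double[][] features;
        final double[][] labels;

        LabeledPart(double[][] features, double[][] labels) {
            this.features = features;
            this.labels = labels;
        }
    }

    /** Splits the upstream map into at most {@code partitions} contiguous chunks (assumed strategy). */
    static <K, V> List<Map<K, V>> split(Map<K, V> upstream, int partitions) {
        if (partitions <= 0)
            throw new IllegalArgumentException("partitions must be positive");

        // Ceiling division, so that no more than `partitions` chunks are produced.
        int chunk = Math.max(1, (upstream.size() + partitions - 1) / partitions);

        List<Map<K, V>> res = new ArrayList<>();
        Map<K, V> cur = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : upstream.entrySet()) {
            cur.put(e.getKey(), e.getValue());
            if (cur.size() == chunk) {
                res.add(cur);
                cur = new LinkedHashMap<>();
            }
        }
        if (!cur.isEmpty())
            res.add(cur);
        return res;
    }

    /** Applies both extractors to every entry of a single partition. */
    static <K, V> LabeledPart extract(Map<K, V> part,
                                      BiFunction<K, V, double[]> featureExtractor,
                                      BiFunction<K, V, double[]> lbExtractor) {
        double[][] features = new double[part.size()][];
        double[][] labels = new double[part.size()][];
        int i = 0;
        for (Map.Entry<K, V> e : part.entrySet()) {
            features[i] = featureExtractor.apply(e.getKey(), e.getValue());
            labels[i] = lbExtractor.apply(e.getKey(), e.getValue());
            i++;
        }
        return new LabeledPart(features, labels);
    }

    /** Five rows of {feature, feature, label} split into two partitions. */
    static List<LabeledPart> sample() {
        Map<Integer, double[]> upstream = new LinkedHashMap<>();
        upstream.put(1, new double[] {1.0, 10.0, 0.0});
        upstream.put(2, new double[] {2.0, 20.0, 1.0});
        upstream.put(3, new double[] {3.0, 30.0, 0.0});
        upstream.put(4, new double[] {4.0, 40.0, 1.0});
        upstream.put(5, new double[] {5.0, 50.0, 0.0});

        List<LabeledPart> res = new ArrayList<>();
        for (Map<Integer, double[]> part : split(upstream, 2))
            res.add(extract(part,
                (k, v) -> Arrays.copyOfRange(v, 0, 2), // first two columns are the features
                (k, v) -> new double[] {v[2]}));       // last column is the label
        return res;
    }

    public static void main(String[] args) {
        for (LabeledPart part : sample())
            System.out.println(part.features.length + " rows, first label " + part.labels[0][0]);
    }
}
```

With five upstream entries and two partitions, this split yields one partition of three rows and one of two. Each partition holds its own feature and label arrays, which mirrors the idea that partition data is built per partition from the upstream.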
Ignite Database and Caching Platform : ver. 2.7.2 Release Date : February 6 2019