public interface DataStreamGenerator
| Modifier and Type | Field and Description |
|---|---|
| static int | FILL_CACHE_BATCH_SIZE: Size of batch for IgniteCache.putAll(Map). |
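
The batching idea behind a constant like this can be sketched in plain Java. This is a minimal illustration, not the library's implementation: `BatchedFill`, its `fill` method, the index-based keys, and the batch size of 4 are all hypothetical, and a `HashMap` stands in for the cache. The actual value of FILL_CACHE_BATCH_SIZE is not stated on this page.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.function.Consumer;
import java.util.stream.Stream;

/** Hypothetical helper: flushes stream elements to a sink in fixed-size batches. */
final class BatchedFill {
    /** Illustrative value only; not the real FILL_CACHE_BATCH_SIZE. */
    static final int BATCH_SIZE = 4;

    /**
     * Takes the first {@code limit} values, keys them by running index, and passes
     * them to {@code putAll} in maps of at most {@code batchSize} entries.
     * Returns the number of batches flushed.
     */
    static <V> int fill(Stream<V> values, int limit, int batchSize,
                        Consumer<Map<Integer, V>> putAll) {
        Map<Integer, V> batch = new HashMap<>();
        int batches = 0;
        int key = 0;
        Iterator<V> it = values.limit(limit).iterator();
        while (it.hasNext()) {
            batch.put(key++, it.next());
            if (batch.size() == batchSize) {
                putAll.accept(batch);      // one bulk write per full batch
                batch = new HashMap<>();
                batches++;
            }
        }
        if (!batch.isEmpty()) {            // flush the final partial batch
            putAll.accept(batch);
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        Map<Integer, String> cache = new HashMap<>(); // stand-in for a cache
        Stream<String> rows = Stream.iterate(0, i -> i + 1).map(i -> "row-" + i);
        int batches = fill(rows, 10, BATCH_SIZE, cache::putAll);
        System.out.println(cache.size() + " rows in " + batches + " batches");
        // prints: 10 rows in 3 batches
    }
}
```

Grouping writes this way trades a little memory for fewer bulk calls, which is the usual motivation for a batch-size constant.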
| Modifier and Type | Method and Description |
|---|---|
| default DatasetBuilder<Vector,Double> | asDatasetBuilder(int datasetSize, IgniteBiPredicate<Vector,Double> filter, int partitions): Convert the first N values from the stream to a DatasetBuilder. |
| default DatasetBuilder<Vector,Double> | asDatasetBuilder(int datasetSize, IgniteBiPredicate<Vector,Double> filter, int partitions, UpstreamTransformerBuilder upstreamTransformerBuilder): Convert the first N values from the stream to a DatasetBuilder. |
| default DatasetBuilder<Vector,Double> | asDatasetBuilder(int datasetSize, int partitions): Convert the first N values from the stream to a DatasetBuilder. |
| default Map<Vector,Double> | asMap(int datasetSize): Convert the first N values from the stream to a map. |
| default DataStreamGenerator | blur(RandomProducer rnd): Apply pseudorandom noise to vectors, leaving labels unchanged. |
| default <K> void | fillCacheWithCustomKey(int datasetSize, IgniteCache<K,LabeledVector<Double>> cache, Function<LabeledVector<Double>,K> keyMapper): Fill the given cache with labeled vectors from this generator, using a user-defined mapper from vectors to keys. |
| default void | fillCacheWithVecHashAsKey(int datasetSize, IgniteCache<Integer,LabeledVector<Double>> cache): Fill the given cache with labeled vectors from this generator as values and their hash codes as keys. |
| default void | fillCacheWithVecUUIDAsKey(int datasetSize, IgniteCache<UUID,LabeledVector<Double>> cache): Fill the given cache with labeled vectors from this generator as values and random UUIDs as keys. |
| Stream<LabeledVector<Double>> | labeled() |
| default Stream<LabeledVector<Double>> | labeled(IgniteFunction<Vector,Double> classifier) |
| default DataStreamGenerator | mapVectors(IgniteFunction<Vector,Vector> f): Apply a user-defined mapper to the vector stream, keeping labels. |
| default Stream<Vector> | unlabeled() |
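
The summary shows one abstract method, labeled(), with every other method declared as a default. A simplified plain-Java sketch of that pattern follows. It assumes nothing about the library's internals: `ToyStreamGenerator` and `ToyDemo` are hypothetical names, `List<Double>` stands in for Vector, and `Map.Entry` stands in for LabeledVector.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

/** Demo entry point for the hypothetical ToyStreamGenerator below. */
final class ToyDemo {
    /** Builds the i-th sample: features [i, 2i], label i mod 2. */
    static Map.Entry<List<Double>, Double> point(int i) {
        return Map.entry(List.of((double) i, 2.0 * i), (double) (i % 2));
    }

    public static void main(String[] args) {
        // Only labeled() has to be supplied; a lambda is enough.
        ToyStreamGenerator gen = () -> IntStream.iterate(0, i -> i + 1).mapToObj(ToyDemo::point);
        ToyStreamGenerator shifted = gen.mapVectors(
            v -> v.stream().map(x -> x + 10).collect(Collectors.toList()));
        System.out.println(shifted.asMap(3));
    }
}

/** Simplified stand-in: one abstract method, everything else derived from it. */
interface ToyStreamGenerator {
    /** Unbounded stream of (features, label) pairs; a fresh stream per call. */
    Stream<Map.Entry<List<Double>, Double>> labeled();

    /** Same stream with the labels dropped. */
    default Stream<List<Double>> unlabeled() {
        return labeled().map(Map.Entry::getKey);
    }

    /** New generator whose feature vectors are mapped by f; labels are kept. */
    default ToyStreamGenerator mapVectors(UnaryOperator<List<Double>> f) {
        return () -> this.labeled().map(e -> Map.entry(f.apply(e.getKey()), e.getValue()));
    }

    /** First datasetSize samples as a map; a repeated vector keeps its last label. */
    default Map<List<Double>, Double> asMap(int datasetSize) {
        return labeled().limit(datasetSize).collect(Collectors.toMap(
            Map.Entry::getKey, Map.Entry::getValue, (a, b) -> b, LinkedHashMap::new));
    }
}
```

Because each transformation returns another generator, calls such as mapVectors can be chained before the stream is finally materialized with asMap.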
static final int FILL_CACHE_BATCH_SIZE
Size of batch for IgniteCache.putAll(Map).

Stream<LabeledVector<Double>> labeled()
Returns a stream of LabeledVector according to the dataset shape.

default Stream<Vector> unlabeled()
Returns a stream of Vector according to the dataset shape.

default Stream<LabeledVector<Double>> labeled(IgniteFunction<Vector,Double> classifier)
classifier - User-defined classifier for the vector stream.
Returns a stream of LabeledVector according to the dataset shape and the user's classifier.

default DataStreamGenerator mapVectors(IgniteFunction<Vector,Vector> f)
f - Mapper applied to the vectors of the data stream.

default DataStreamGenerator blur(RandomProducer rnd)
rnd - Generator of pseudorandom scalars that modify vector components while preserving labels.

default Map<Vector,Double> asMap(int datasetSize)
datasetSize - Dataset size.

default DatasetBuilder<Vector,Double> asDatasetBuilder(int datasetSize, int partitions)
Convert the first N values from the stream to a DatasetBuilder.
datasetSize - Dataset size.
partitions - Number of partitions.

default DatasetBuilder<Vector,Double> asDatasetBuilder(int datasetSize, IgniteBiPredicate<Vector,Double> filter, int partitions)
Convert the first N values from the stream to a DatasetBuilder.
datasetSize - Dataset size.
filter - Data filter.
partitions - Number of partitions.

default DatasetBuilder<Vector,Double> asDatasetBuilder(int datasetSize, IgniteBiPredicate<Vector,Double> filter, int partitions, UpstreamTransformerBuilder upstreamTransformerBuilder)
Convert the first N values from the stream to a DatasetBuilder.
datasetSize - Dataset size.
filter - Data filter.
partitions - Number of partitions.
upstreamTransformerBuilder - Upstream transformer builder.

default <K> void fillCacheWithCustomKey(int datasetSize, IgniteCache<K,LabeledVector<Double>> cache, Function<LabeledVector<Double>,K> keyMapper)
K - Key type.
datasetSize - Number of rows to put.
cache - Cache.
keyMapper - Mapping from vectors to keys.

default void fillCacheWithVecHashAsKey(int datasetSize, IgniteCache<Integer,LabeledVector<Double>> cache)
datasetSize - Number of rows to put.
cache - Cache.

default void fillCacheWithVecUUIDAsKey(int datasetSize, IgniteCache<UUID,LabeledVector<Double>> cache)
datasetSize - Number of rows to put.
cache - Cache.
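
The three fillCache variants differ only in how a key is chosen for each row. The plain-Java sketch below illustrates that difference under stated assumptions: `KeyedFill` and its methods are hypothetical, a `Map` stands in for IgniteCache, `List<Double>` stands in for LabeledVector, and whether the real hash and UUID variants delegate to the custom-key variant is assumed here, not confirmed by this page.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.function.Function;
import java.util.stream.Stream;

/** Hypothetical stand-in showing three ways to key rows written to a cache. */
final class KeyedFill {
    /** Puts the first datasetSize rows into cache under keys chosen by keyMapper. */
    static <K, V> void fillWithCustomKey(Stream<V> rows, int datasetSize,
                                         Map<K, V> cache, Function<V, K> keyMapper) {
        rows.limit(datasetSize).forEach(row -> cache.put(keyMapper.apply(row), row));
    }

    /** Keys each row by its hash code, so equal rows overwrite one another. */
    static <V> void fillWithHashAsKey(Stream<V> rows, int datasetSize, Map<Integer, V> cache) {
        fillWithCustomKey(rows, datasetSize, cache, Object::hashCode);
    }

    /** Keys each row by a fresh random UUID, so every row is kept. */
    static <V> void fillWithUuidAsKey(Stream<V> rows, int datasetSize, Map<UUID, V> cache) {
        fillWithCustomKey(rows, datasetSize, cache, row -> UUID.randomUUID());
    }

    /** Unbounded stream that repeats the same row, to expose key collisions. */
    static Stream<List<Double>> sameRow() {
        return Stream.generate(() -> List.of(1.0, 2.0));
    }

    public static void main(String[] args) {
        Map<Integer, List<Double>> byHash = new HashMap<>();
        Map<UUID, List<Double>> byUuid = new HashMap<>();
        fillWithHashAsKey(sameRow(), 5, byHash);
        fillWithUuidAsKey(sameRow(), 5, byUuid);
        System.out.println(byHash.size() + " vs " + byUuid.size());
        // prints: 1 vs 5
    }
}
```

In this sketch, hash keys deduplicate identical rows while UUID keys preserve every generated row, which is worth keeping in mind when the resulting row count matters.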
GridGain In-Memory Computing Platform : ver. 8.9.26 Release Date : October 16 2025