Type Parameters:
K - Type of a key in upstream data.
V - Type of a value in upstream data.

public final class StringEncoderPreprocessor<K,V> extends EncoderPreprocessor<K,V> implements DeployableObject
This preprocessor can transform multiple columns whose indices are fixed during the training process. These indices can be defined via the .withEncodedFeature(featureIndex) call.
NOTE: it does not add a new column; it changes the data in place.
There is only one strategy for how StringEncoder handles unseen labels when you have fit a StringEncoder on one dataset and then use it to transform another: unseen labels are put in a special additional bucket, at an index equal to amountOfCategories.
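The behavior described above can be illustrated with a minimal, self-contained sketch. This is not the GridGain implementation: the class StringEncodingSketch and the method encodeInPlace are hypothetical names, rows are modeled as plain Object[] arrays rather than LabeledVector, null handling is omitted, and amountOfCategories is assumed to be the number of labels collected for a column during training.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in that mimics the documented behavior; not the GridGain class. */
public class StringEncodingSketch {
    /** Encodes the handled columns of {@code row} in place and returns the same array. */
    static Object[] encodeInPlace(Object[] row, Map<String, Integer>[] encodingValues,
        Set<Integer> handledIndices) {
        for (int i = 0; i < row.length; i++) {
            if (!handledIndices.contains(i))
                continue; // Columns that are not handled are left untouched.

            Map<String, Integer> codes = encodingValues[i];

            // Assumption: amountOfCategories is the number of labels seen during training.
            int amountOfCategories = codes.size();

            Integer code = codes.get(String.valueOf(row[i]));

            // Unseen labels go to the extra bucket at index amountOfCategories.
            row[i] = (double) (code != null ? code : amountOfCategories);
        }

        return row;
    }

    public static void main(String[] args) {
        @SuppressWarnings("unchecked")
        Map<String, Integer>[] encodingValues = new HashMap[2];
        encodingValues[0] = new HashMap<>();
        encodingValues[0].put("red", 0);
        encodingValues[0].put("green", 1);

        // Only column 0 is encoded; column 1 passes through unchanged.
        Set<Integer> handledIndices = new HashSet<>(Arrays.asList(0));

        Object[] seen = {"green", 3.5};
        Object[] unseen = {"purple", 7.0};

        System.out.println(Arrays.toString(encodeInPlace(seen, encodingValues, handledIndices)));   // [1.0, 3.5]
        System.out.println(Arrays.toString(encodeInPlace(unseen, encodingValues, handledIndices))); // [2.0, 7.0]
    }
}
```

In this sketch the row keeps its length: the string in column 0 is overwritten with its numeric code, and the label "purple", which was absent from the training mapping, receives the index 2 because two categories were known.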
| Modifier and Type | Field and Description |
|---|---|
| protected static long | serialVersionUID |

Fields inherited from class EncoderPreprocessor: basePreprocessor, encodingValues, handledIndices, KEY_FOR_NULL_VALUES

| Constructor and Description |
|---|
| StringEncoderPreprocessor(Map<String,Integer>[] encodingValues, Preprocessor<K,V> basePreprocessor, Set<Integer> handledIndices) Constructs a new instance of String Encoder preprocessor. |
| Modifier and Type | Method and Description |
|---|---|
| LabeledVector | apply(K k, V v) Applies this preprocessor. |
| List<Object> | getDependencies() Returns the dependencies of this object, which may be objects whose classes are defined on the client side and unknown to the server. |
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interfaces: map, andThen, andThen

protected static final long serialVersionUID

public StringEncoderPreprocessor(Map<String,Integer>[] encodingValues, Preprocessor<K,V> basePreprocessor, Set<Integer> handledIndices)
Parameters:
basePreprocessor - Base preprocessor.
handledIndices - Handled indices.

public LabeledVector apply(K k, V v)
Specified by:
apply in interface BiFunction<K,V,LabeledVector>
Parameters:
k - Key.
v - Value.

public List<Object> getDependencies()
Specified by:
getDependencies in interface DeployableObject
GridGain In-Memory Computing Platform : ver. 8.9.26 Release Date : October 16 2025