- Type Parameters:
K - Type of a key in upstream data.
V - Type of a value in upstream data.
- All Implemented Interfaces:
- Serializable, BiFunction<K,V,LabeledVector>, DeployableObject, IgniteBiFunction<K,V,LabeledVector>, Preprocessor<K,V>
public final class OneHotEncoderPreprocessor<K,V>
extends EncoderPreprocessor<K,V>
implements DeployableObject
Preprocessing function that performs one-hot encoding.
One-hot encoding maps a categorical feature,
represented as a label index (Double or String value),
to a binary vector with at most a single one-value indicating the presence of a specific feature value
from among the set of all feature values.
This preprocessor can transform multiple columns, whose indices are determined during the training process.
Each one-hot encoded binary vector appends its cells to the end of the current feature vector, in the order of the handled categorical features.
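The layout described above can be illustrated with a minimal, library-independent sketch. It is not the Ignite implementation: the class `OneHotSketch`, the method `encode`, and the shape of the `encodings` map are hypothetical names chosen for this example. The sketch also assumes that non-categorical cells keep their relative order at the front of the result, and that a category value missing from the mapping yields an all-zero block (one reading of "at most a single one-value"); the real preprocessor may treat unknown values differently.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** Illustrative sketch only; names and behavior are assumptions, not the Ignite API. */
final class OneHotSketch {
    /**
     * Builds a feature vector in which each handled categorical column is replaced
     * by a binary block appended after the remaining (numeric) features.
     *
     * @param row       Raw cells; handled columns hold category values, others hold numbers.
     * @param encodings Handled column index -> (category value -> position inside its block).
     *                  Every key is assumed to be a valid index into {@code row}.
     */
    static double[] encode(Object[] row, SortedMap<Integer, Map<Object, Integer>> encodings) {
        int numericCnt = row.length - encodings.size();

        int blocksSize = 0;
        for (Map<Object, Integer> mapping : encodings.values())
            blocksSize += mapping.size();

        double[] res = new double[numericCnt + blocksSize];

        // Copy the non-categorical cells first, preserving their order.
        int pos = 0;
        for (int i = 0; i < row.length; i++) {
            if (!encodings.containsKey(i))
                res[pos++] = ((Number) row[i]).doubleValue();
        }

        // Append one binary block per handled column, in ascending column order.
        int offset = numericCnt;
        for (Map.Entry<Integer, Map<Object, Integer>> e : encodings.entrySet()) {
            Integer idx = e.getValue().get(row[e.getKey()]);

            // Assumption: an unknown category leaves the whole block at zero.
            if (idx != null)
                res[offset + idx] = 1.0;

            offset += e.getValue().size();
        }

        return res;
    }

    public static void main(String[] args) {
        SortedMap<Integer, Map<Object, Integer>> encodings = new TreeMap<>();
        encodings.put(1, Map.of("red", 0, "green", 1, "blue", 2));
        encodings.put(3, Map.of("S", 0, "L", 1));

        Object[] row = {1.5, "green", 7.0, "L"};

        // Prints [1.5, 7.0, 0.0, 1.0, 0.0, 0.0, 1.0]
        System.out.println(Arrays.toString(encode(row, encodings)));
    }
}
```

In this sketch the two numeric cells come first, followed by a three-cell block for column 1 and a two-cell block for column 3, so the resulting vector has 2 + 3 + 2 = 7 cells.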
- See Also:
This preprocessor always creates a separate column for NULL values.
NOTE: the index value associated with NULL will be located in the binary vector according to the frequency of NULL values.
,
Serialized Form