The model updating interface in Ignite ML retrains an already trained model on a new portion of data, using the state of the earlier model as a starting point. The interface is part of the DatasetTrainer class and mirrors the training interface, with the already trained model as the first parameter:
M update(M mdl, DatasetBuilder<K, V> datasetBuilder, IgniteBiFunction<K, V, Vector> featureExtractor, IgniteBiFunction<K, V, L> lbExtractor)
M update(M mdl, Ignite ignite, IgniteCache<K, V> cache, IgniteBiFunction<K, V, Vector> featureExtractor, IgniteBiFunction<K, V, L> lbExtractor)
M update(M mdl, Ignite ignite, IgniteCache<K, V> cache, IgniteBiPredicate<K, V> filter, IgniteBiFunction<K, V, Vector> featureExtractor, IgniteBiFunction<K, V, L> lbExtractor)
M update(M mdl, Map<K, V> data, int parts, IgniteBiFunction<K, V, Vector> featureExtractor, IgniteBiFunction<K, V, L> lbExtractor)
M update(M mdl, Map<K, V> data, IgniteBiPredicate<K, V> filter, int parts, IgniteBiFunction<K, V, Vector> featureExtractor, IgniteBiFunction<K, V, L> lbExtractor)
The interface enables online learning and batch online learning. With online learning, you train a model and, when a new training example arrives (such as a click on a website), you update the model as if it had been trained on that example too. Batch online learning updates the model with a batch of examples instead of a single training example. Some models support both update strategies and some support only batch updating, depending on the learning algorithm. The update capabilities of each model are described below.
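The difference between the two strategies can be sketched with a toy linear model trained by gradient descent. This is a minimal illustration and not Ignite ML code: the class OnlineLinearModel and its methods are hypothetical names, and squared error is assumed as the loss.

```java
/** Hypothetical toy model for illustration only; not part of Ignite ML. */
final class OnlineLinearModel {
    private final double[] weights;
    private double bias;
    private final double learningRate;

    OnlineLinearModel(int features, double learningRate) {
        this.weights = new double[features];
        this.learningRate = learningRate;
    }

    double predict(double[] x) {
        double sum = bias;
        for (int i = 0; i < weights.length; i++)
            sum += weights[i] * x[i];
        return sum;
    }

    /** Online learning: one gradient step on a single example. */
    void update(double[] x, double y) {
        updateBatch(new double[][] {x}, new double[] {y});
    }

    /** Batch online learning: one gradient step on the mean gradient of a batch. */
    void updateBatch(double[][] xs, double[] ys) {
        double[] grad = new double[weights.length];
        double biasGrad = 0.0;
        for (int r = 0; r < xs.length; r++) {
            // Prediction error of the current state on this row.
            double err = predict(xs[r]) - ys[r];
            for (int i = 0; i < grad.length; i++)
                grad[i] += err * xs[r][i];
            biasGrad += err;
        }
        for (int i = 0; i < weights.length; i++)
            weights[i] -= learningRate * grad[i] / xs.length;
        bias -= learningRate * biasGrad / xs.length;
    }

    public static void main(String[] args) {
        OnlineLinearModel model = new OnlineLinearModel(1, 0.1);
        model.update(new double[] {1.0}, 1.0);
        model.updateBatch(new double[][] {{1.0}, {2.0}}, new double[] {1.0, 2.0});
        System.out.println(model.predict(new double[] {1.0}));
    }
}
```

In this sketch both strategies start from the existing weights rather than from scratch, which is the property the update interface relies on.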
Each trainer implements this interface in its own way. The following sections describe the updating process for each algorithm.
KMeans
Model updating takes the already learned centroids and adjusts them using the new rows. We recommend using batch online learning for this model, for two reasons. First, the dataset must contain at least k rows, where k is the number of clusters. Second, a dataset with too few rows can move the centroids to invalid positions.
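One common way to adjust existing centroids with new rows is a count-weighted running mean, sketched below. This is an assumption made for illustration and not necessarily how the Ignite trainer does it; CentroidUpdater, its methods, and the per-centroid counts array are hypothetical.

```java
/** Hypothetical sketch of a centroid update; not the Ignite ML implementation. */
final class CentroidUpdater {
    /** Moves each centroid toward the new rows assigned to it, using a running mean. */
    static void update(double[][] centroids, long[] counts, double[][] batch) {
        // Mirrors the restriction from the text: the batch needs at least k rows.
        if (batch.length < centroids.length)
            throw new IllegalArgumentException("Batch must contain at least k rows");
        for (double[] row : batch) {
            int nearest = nearest(centroids, row);
            counts[nearest]++;
            double step = 1.0 / counts[nearest];
            for (int d = 0; d < row.length; d++)
                centroids[nearest][d] += step * (row[d] - centroids[nearest][d]);
        }
    }

    /** Returns the index of the centroid closest to the row (squared Euclidean distance). */
    static int nearest(double[][] centroids, double[] row) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centroids.length; c++) {
            double dist = 0.0;
            for (int d = 0; d < row.length; d++) {
                double diff = row[d] - centroids[c][d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] centroids = {{0.0, 0.0}, {10.0, 10.0}};
        long[] counts = {1L, 1L};
        update(centroids, counts, new double[][] {{2.0, 0.0}, {10.0, 12.0}});
        System.out.println(java.util.Arrays.deepToString(centroids));
    }
}
```

When a centroid has seen only a few rows, each new row shifts it by a large step, which illustrates why very small update datasets can drag centroids to poor positions.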
KNN
Model updating simply adds the new dataset to the old dataset, so model updating is not restricted in any way.
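Because a nearest-neighbor model is essentially its stored dataset, an update can be pictured as concatenating the old and new rows. The sketch below uses a hypothetical KnnSketch class with integer labels and squared Euclidean distance; it only illustrates the idea and does not reflect Ignite's distributed storage.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory KNN classifier; not part of Ignite ML. */
final class KnnSketch {
    private final List<double[]> rows;
    private final List<Integer> labels;
    private final int k;

    KnnSketch(List<double[]> rows, List<Integer> labels, int k) {
        this.rows = new ArrayList<>(rows);
        this.labels = new ArrayList<>(labels);
        this.k = k;
    }

    /** Updating returns a model over the old rows plus the new rows, keeping the same k. */
    KnnSketch update(List<double[]> newRows, List<Integer> newLabels) {
        List<double[]> allRows = new ArrayList<>(rows);
        allRows.addAll(newRows);
        List<Integer> allLabels = new ArrayList<>(labels);
        allLabels.addAll(newLabels);
        return new KnnSketch(allRows, allLabels, k);
    }

    int size() {
        return rows.size();
    }

    /** Majority vote among the k nearest stored rows; ties go to the closer neighbor. */
    int predict(double[] x) {
        List<Integer> idx = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++)
            idx.add(i);
        idx.sort(Comparator.comparingDouble(i -> squaredDistance(rows.get(i), x)));
        Map<Integer, Integer> votes = new HashMap<>();
        int best = labels.get(idx.get(0));
        for (int i = 0; i < Math.min(k, idx.size()); i++) {
            int lb = labels.get(idx.get(i));
            int v = votes.merge(lb, 1, Integer::sum);
            if (v > votes.getOrDefault(best, 0))
                best = lb;
        }
        return best;
    }

    private static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++)
            sum += (a[d] - b[d]) * (a[d] - b[d]);
        return sum;
    }

    public static void main(String[] args) {
        List<double[]> rows = new ArrayList<>();
        rows.add(new double[] {0.0, 0.0});
        List<Integer> labels = new ArrayList<>();
        labels.add(0);
        KnnSketch model = new KnnSketch(rows, labels, 1);
        System.out.println(model.update(rows, labels).size());
    }
}
```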
ANN
As in the case of KNN, a new trainer should provide the same distance measure and k-value. These parameters are important because ANN internally uses KMeans and statistics over the centroids that KMeans provides. During an update, the trainer takes the centroid statistics from the previous training and updates them with the new observations. From this point of view, ANN allows “mini-batch” online learning, where the batch size is equal to the k-parameter.
Neural Network (NN)
NN updating takes the current state of the neural network and updates it according to the error gradient on the new dataset. The only requirement is that feature vectors are compatible between the datasets.
Logistic Regression
Logistic regression inherits all restrictions from the neural network trainer because it uses a perceptron internally.
Linear Regression
The LinearRegressionSGD trainer inherits all restrictions from the neural network trainer.
LinearRegressionLSQRTrainer restores the state from the previous training and uses it as a first approximation when learning on a new dataset. Therefore, LinearRegressionLSQRTrainer also requires only feature vector compatibility.
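The effect of using an earlier solution as the first approximation can be sketched with any iterative least-squares solver. The example below swaps in plain gradient descent for LSQR to stay short, so it shows the warm-start idea only and not the LSQR algorithm; WarmStartSolver, Result, and all parameters are hypothetical.

```java
/** Hypothetical warm-start sketch using gradient descent instead of LSQR. */
final class WarmStartSolver {
    /** Learned weights and the number of iterations that were needed. */
    static final class Result {
        final double[] weights;
        final int iterations;

        Result(double[] weights, int iterations) {
            this.weights = weights;
            this.iterations = iterations;
        }
    }

    /** Minimizes mean squared error, starting from the given initial weights. */
    static Result fit(double[][] xs, double[] ys, double[] initial,
                      double learningRate, double tolerance, int maxIterations) {
        // Feature vector compatibility: the old state must match the new data.
        if (initial.length != xs[0].length)
            throw new IllegalArgumentException("Feature vector size mismatch");
        double[] w = initial.clone();
        for (int it = 1; it <= maxIterations; it++) {
            double[] grad = new double[w.length];
            for (int r = 0; r < xs.length; r++) {
                double err = -ys[r];
                for (int i = 0; i < w.length; i++)
                    err += w[i] * xs[r][i];
                for (int i = 0; i < w.length; i++)
                    grad[i] += 2.0 * err * xs[r][i] / xs.length;
            }
            double norm = 0.0;
            for (int i = 0; i < w.length; i++) {
                w[i] -= learningRate * grad[i];
                norm += grad[i] * grad[i];
            }
            // Stop once the gradient is small enough.
            if (Math.sqrt(norm) < tolerance)
                return new Result(w, it);
        }
        return new Result(w, maxIterations);
    }

    public static void main(String[] args) {
        double[][] xs = {{1.0}, {2.0}, {3.0}};
        double[] ys = {2.0, 4.0, 6.0};
        Result first = fit(xs, ys, new double[] {0.0}, 0.05, 1e-6, 10000);
        Result warm = fit(xs, ys, first.weights, 0.05, 1e-6, 10000);
        System.out.println(first.iterations + " vs " + warm.iterations);
    }
}
```

When the new dataset follows roughly the same relationship as the old one, starting from the previous weights typically needs far fewer iterations than starting from zero.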
SVM
The SVM trainer uses the state of the learned model as a first approximation during training. Therefore, the algorithm requires only feature vector compatibility.
Decision Tree
There is no single correct way to update a decision tree, so updating simply trains a new model on the given dataset.
GDB
GDB trainer updating takes the already learned models from the composition and tries to minimize the error gradient on the given dataset by training new models that predict the gradient. It also uses a convergence checker: if the error on the new dataset is not large, GDB skips the update stage. Therefore, GDB requires only feature vector compatibility.
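The update loop described here can be sketched as follows, assuming squared error (so the negative gradient is simply the residual), a single numeric feature, and one-split stumps as weak learners. BoostingSketch and its tolerance-based convergence check are hypothetical simplifications and not the Ignite ML classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical gradient boosting sketch for one feature; not part of Ignite ML. */
final class BoostingSketch {
    private final List<DoubleUnaryOperator> models = new ArrayList<>();
    private final double tolerance;

    BoostingSketch(double tolerance) {
        this.tolerance = tolerance;
    }

    int size() {
        return models.size();
    }

    /** The composition predicts the sum of all weak models. */
    double predict(double x) {
        double sum = 0.0;
        for (DoubleUnaryOperator m : models)
            sum += m.applyAsDouble(x);
        return sum;
    }

    double meanSquaredError(double[] xs, double[] ys) {
        double sum = 0.0;
        for (int i = 0; i < xs.length; i++) {
            double err = ys[i] - predict(xs[i]);
            sum += err * err;
        }
        return sum / xs.length;
    }

    /** Adds weak models fitted to residuals; returns 0 if the error is already small. */
    int update(double[] xs, double[] ys, int maxNewModels) {
        int added = 0;
        // Convergence check: skip (or stop) when the current composition fits well enough.
        while (added < maxNewModels && meanSquaredError(xs, ys) > tolerance) {
            double[] residuals = new double[xs.length];
            for (int i = 0; i < xs.length; i++)
                residuals[i] = ys[i] - predict(xs[i]);
            models.add(fitStump(xs, residuals));
            added++;
        }
        return added;
    }

    /** One-split stump: splits at the mean feature value and predicts the mean target per side. */
    static DoubleUnaryOperator fitStump(double[] xs, double[] targets) {
        double split = 0.0;
        for (double x : xs)
            split += x / xs.length;
        double leftSum = 0.0, rightSum = 0.0;
        int leftCnt = 0, rightCnt = 0;
        for (int i = 0; i < xs.length; i++) {
            if (xs[i] <= split) {
                leftSum += targets[i];
                leftCnt++;
            } else {
                rightSum += targets[i];
                rightCnt++;
            }
        }
        double left = leftCnt == 0 ? 0.0 : leftSum / leftCnt;
        double right = rightCnt == 0 ? 0.0 : rightSum / rightCnt;
        double threshold = split;
        return x -> x <= threshold ? left : right;
    }

    public static void main(String[] args) {
        BoostingSketch gdb = new BoostingSketch(0.01);
        double[] xs = {1.0, 2.0, 3.0, 4.0};
        System.out.println(gdb.update(xs, new double[] {0.0, 0.0, 10.0, 10.0}, 5));
        System.out.println(gdb.update(xs, new double[] {0.0, 0.0, 12.0, 12.0}, 5));
    }
}
```

Each new model depends on the predictions of the models before it, which is why members of such a composition cannot be removed independently.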
Random Forest (RF)
The RF trainer simply trains new decision trees on the given dataset and adds them to the already learned composition. RF therefore requires feature vector compatibility, and the dataset must contain more than one element, because a decision tree cannot be trained on such a small dataset. Unlike the models in a trained GDB composition, RF models do not depend on each other, so if the composition becomes too big, a user can manually remove some models.
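A rough sketch of this behavior, under simplifying assumptions: a single numeric feature, one-split stumps standing in for full decision trees, and bootstrap sampling with a fixed seed. ForestSketch and removeOldest are hypothetical names chosen for the example, not Ignite ML API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical random forest sketch for one feature; not part of Ignite ML. */
final class ForestSketch {
    private final List<DoubleUnaryOperator> trees = new ArrayList<>();
    private final Random rnd = new Random(42);

    int size() {
        return trees.size();
    }

    /** The composition predicts the average over all trees. */
    double predict(double x) {
        double sum = 0.0;
        for (DoubleUnaryOperator t : trees)
            sum += t.applyAsDouble(x);
        return trees.isEmpty() ? 0.0 : sum / trees.size();
    }

    /** Trains new trees on bootstrap samples of the dataset and appends them. */
    void update(double[] xs, double[] ys, int newTrees) {
        if (xs.length < 2)
            throw new IllegalArgumentException("Dataset must contain more than one element");
        for (int t = 0; t < newTrees; t++) {
            double[] bx = new double[xs.length];
            double[] by = new double[xs.length];
            for (int i = 0; i < xs.length; i++) {
                int pick = rnd.nextInt(xs.length);
                bx[i] = xs[pick];
                by[i] = ys[pick];
            }
            trees.add(fitStump(bx, by));
        }
    }

    /** Trees are independent, so old ones can be dropped without retraining the rest. */
    void removeOldest(int count) {
        trees.subList(0, Math.min(count, trees.size())).clear();
    }

    /** One-split stump: splits at the mean feature value and predicts the mean target per side. */
    private static DoubleUnaryOperator fitStump(double[] xs, double[] targets) {
        double split = 0.0, total = 0.0;
        for (int i = 0; i < xs.length; i++) {
            split += xs[i] / xs.length;
            total += targets[i] / xs.length;
        }
        double leftSum = 0.0, rightSum = 0.0;
        int leftCnt = 0, rightCnt = 0;
        for (int i = 0; i < xs.length; i++) {
            if (xs[i] <= split) {
                leftSum += targets[i];
                leftCnt++;
            } else {
                rightSum += targets[i];
                rightCnt++;
            }
        }
        // An empty side falls back to the overall mean of the sample.
        double left = leftCnt == 0 ? total : leftSum / leftCnt;
        double right = rightCnt == 0 ? total : rightSum / rightCnt;
        double threshold = split;
        return x -> x <= threshold ? left : right;
    }

    public static void main(String[] args) {
        ForestSketch forest = new ForestSketch();
        forest.update(new double[] {1.0, 2.0, 3.0, 4.0}, new double[] {0.0, 0.0, 10.0, 10.0}, 5);
        forest.removeOldest(2);
        System.out.println(forest.size());
    }
}
```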