GridGain Developers Hub

Troubleshooting

Common Issues

Deployment Unit Not Found

DeploymentUnitNotFoundException: No deployment unit found
  • Ensure deployment completed successfully

  • Verify that the model is deployed by running the cluster unit list CLI command

  • Check the name and version in the job parameters and compare them with the deployment unit name and version

  • Ensure the deployment unit contains the actual model files (not only .metadata or .lock files)

  • Check that files are readable and not Git LFS artifacts
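The last two checks can be scripted against a local copy of the files you deployed. The following is a minimal sketch, not a GridGain tool: the function name inspect_unit_dir is hypothetical, and it assumes that a Git LFS artifact is a pointer file, which begins with the line "version https://git-lfs.github.com/spec/v1".

```python
from pathlib import Path

# Git LFS pointer files are small text files that start with this line.
LFS_PREFIX = b"version https://git-lfs.github.com/spec/v1"
# File name endings listed above as not being actual model files.
IGNORED_ENDINGS = (".metadata", ".lock")


def inspect_unit_dir(unit_dir):
    """Return (model_files, problems) for a local directory of unit files.

    Hypothetical helper: it only looks at the local file system and does
    not talk to the cluster.
    """
    model_files, problems = [], []
    for path in sorted(Path(unit_dir).rglob("*")):
        if not path.is_file() or path.name.endswith(IGNORED_ENDINGS):
            continue
        try:
            # Reading the first bytes doubles as a readability check.
            with path.open("rb") as fh:
                head = fh.read(len(LFS_PREFIX))
        except OSError as exc:
            problems.append(f"{path.name}: not readable ({exc})")
            continue
        if head == LFS_PREFIX:
            problems.append(f"{path.name}: Git LFS pointer, not the real file")
            continue
        model_files.append(path.name)
    if not model_files:
        problems.append("no actual model files found")
    return model_files, problems
```

If problems is not empty, fetch the real files (for example, with git lfs pull) and redeploy the unit.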

Model Not Found

ModelNotFoundException: No model found matching criteria
  • Make sure that the deployment unit exists

  • Make sure that the type parameter in the JobParameters matches the engine that the model was created with (pytorch, tensorflow, onnx)

  • Verify you have the correct model files for your model type:

    • TensorFlow models: *.pb

    • ONNX models: *.onnx (self-contained, no config needed)
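To confirm that a directory holds files for the type you pass in the job parameters, a small local check can help. This is a sketch under stated assumptions: find_model_files is a hypothetical helper, and the extension table combines the list above with the *.pt extension mentioned for model weights later in this section.

```python
from pathlib import Path

# Expected extension per engine type (assumed mapping, see lead-in).
EXPECTED_EXTENSION = {"tensorflow": ".pb", "onnx": ".onnx", "pytorch": ".pt"}


def find_model_files(model_dir, model_type):
    """List file names under model_dir whose extension fits model_type."""
    ext = EXPECTED_EXTENSION.get(model_type.lower())
    if ext is None:
        raise ValueError(f"unknown model type: {model_type!r}")
    return sorted(
        p.name for p in Path(model_dir).rglob(f"*{ext}") if p.is_file()
    )
```

An empty result for the type you configured suggests that either the type parameter or the deployed files are wrong.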

Model Initialization Exception

ModelInitializationException:

Files to look for:

The essential files vary by model type and framework:

  • Model weights: *.onnx, *.pb, or *.pt

  • Model config: config.json (not needed for ONNX models, which are self-contained)

  • Model variables directory (TensorFlow SavedModel layout):

    model-directory/
    ├── saved_model.pb
    └── variables/
        ├── variables.data-00000-of-00001
        └── variables.index

Additional files may be required, depending on the model:

  • Tokenizer: tokenizer.json, tokenizer_config.json (for some models)

  • Vocabulary: vocab.txt, special_tokens_map.json (for some NLP models)
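For the directory layout shown above, the presence of each part can be verified before deployment. A minimal sketch, assuming the model follows that layout exactly; missing_saved_model_parts is a hypothetical helper and checks only the three entries from the tree, not the optional tokenizer or vocabulary files.

```python
from pathlib import Path


def missing_saved_model_parts(model_dir):
    """Return the parts of the layout above that are absent in model_dir."""
    root = Path(model_dir)
    variables = root / "variables"
    missing = []
    if not (root / "saved_model.pb").is_file():
        missing.append("saved_model.pb")
    if not (variables / "variables.index").is_file():
        missing.append("variables/variables.index")
    # Data shards are named variables.data-NNNNN-of-NNNNN.
    if not (variables.is_dir() and any(variables.glob("variables.data-*"))):
        missing.append("variables/variables.data-*")
    return missing
```

An empty list means all three parts are present; any returned entry names what to restore before redeploying.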

Translator Not Found

TranslatorNotFoundException:
  • Make sure that you have used the correct translator for the model

  • If you use a user-defined input, output, translator, or translatorFactory, make sure your custom compute job and marshaller are deployed correctly

  • Make sure that the input and output classes match the translator that has been used

Memory Issues

OutOfMemoryError during model loading/deploying
  • Increase the JVM heap size to match the size of the deployed model

  • Consider model quantization for large models
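As a rough starting point for sizing, you can compare the on-disk size of the model with the heap limit passed to the JVM through the -Xmx option. The sketch below is only a heuristic: the helper names are hypothetical, and the headroom factor of 2.0 is an assumed rule of thumb rather than a documented requirement. Actual memory use depends on the engine and the model.

```python
from pathlib import Path

# Binary unit suffixes accepted by the JVM -Xmx option.
UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}


def parse_heap_size(value):
    """Convert an -Xmx style value such as '4g' or '512m' to bytes."""
    value = value.strip().lower()
    if value and value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)


def model_size_bytes(model_dir):
    """Total size in bytes of all files under model_dir."""
    return sum(p.stat().st_size for p in Path(model_dir).rglob("*") if p.is_file())


def heap_looks_sufficient(model_dir, xmx, headroom=2.0):
    """Assumed heuristic: heap should be at least headroom x model size."""
    return model_size_bytes(model_dir) * headroom <= parse_heap_size(xmx)
```

For example, heap_looks_sufficient("model-directory", "4g") returns False when the files total more than 2 GiB, which would suggest raising -Xmx or quantizing the model.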