Hi. I have my pre-trained model in a Docker container and I want to run it for inference. What choices do I have?
Hi Peter, thanks for the question! Within the AI4EOSC platform you have several options to run your model for inference:
- Deploy using a serverless approach with OSCAR: this is the first option to consider, because you do not consume resources while the model is not executing. With OSCAR, you can deploy the Docker container with your model asynchronously or synchronously. The asynchronous mode is event-driven: the execution of the model is triggered automatically when a new file is added to an object-storage system. The synchronous mode is based on HTTPS endpoints: the model is triggered when a call is made to the endpoint. Finally, you can also explore the “exposed services” if you need a model that is always up and listening for calls, which avoids the latency of loading the model.
- Deploy in dedicated resources: this solution relies on the Nomad cluster used for training in the AI4EOSC platform. Although the model weights are loaded just once, you consume resources even when you are not actively making predictions.
- Deploy in your own resources: with this option, we provide you with the means to use your own resources, where you control the whole execution environment. The Infrastructure Manager (IM) is provided to deploy and configure the environment that runs your Docker container in your own cloud. You do not need to be an AI4OS member to use this option.
- Deploy in the EOSC EU Node: again, this solution gives you more control over the execution environment. In the EOSC EU Node you can choose to deploy your containerized model in VMs or in containers (based on OKD). There’s no need to be an AI4OS member. The EOSC EU Node offers free, limited resources to any European researcher.
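To make the synchronous OSCAR option more concrete, here is a minimal Python sketch of what a client call to an HTTPS endpoint could look like. It is only an illustration under assumptions: the `/run/<service>` path, the bearer-token header, and the base64-encoded body should be checked against the OSCAR documentation, and the endpoint, service name, and token below are hypothetical placeholders.

```python
import base64
import urllib.request


def build_sync_request(endpoint: str, service: str, token: str,
                       payload: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST request for a synchronous call."""
    # Assumed URL layout: <endpoint>/run/<service name>.
    url = f"{endpoint.rstrip('/')}/run/{service}"
    # Assumption: the input is sent base64-encoded in the request body.
    body = base64.b64encode(payload)
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


def invoke_sync(request: urllib.request.Request, timeout: float = 60.0) -> bytes:
    """Send the request and return the raw response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


# Hypothetical values, for illustration only.
request = build_sync_request(
    endpoint="https://oscar.example.org",
    service="my-model",
    token="<access-token>",
    payload=b"contents of the input file",
)
print(request.get_method(), request.full_url)
```

Building the request separately from sending it lets you inspect the URL and headers before anything goes over the network. Once the values point at a real service, `invoke_sync(request)` would perform the call and return the model output as bytes.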
You can find more details of each option, and a brief analysis of pros and cons of each solution here: Deployment options in AI4OS — AI4OS/AI4EOSC documentation
Please also consider having a look at our latest article on inference approaches in the EGI Inspired Magazine: AI Inference in Action: Deployment Strategies Learnt from AI4EOSC and iMagine - EGI