Worth checking out MLEM (Python, open source), which can really ease model deployment, especially if you need FastAPI serving. Saves a ton of boilerplate. Docs and info: https://mlem.ai/
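To give a rough idea of how it cuts boilerplate, here's a minimal sketch (assuming MLEM's Python `save` API and `serve` CLI as documented around 2022; the scikit-learn model and the `models/rf` path are just illustrative):

```python
# Minimal sketch: train a scikit-learn model, save it with MLEM,
# then serve it over FastAPI with a single CLI command.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from mlem.api import save

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier().fit(X, y)

# save() stores the model together with metadata (framework, input
# schema inferred from sample_data), so no hand-written serving
# code is needed later.
save(model, "models/rf", sample_data=X)
```

Serving is then one command instead of a hand-rolled FastAPI app (the exact CLI form may differ between MLEM versions, so check the docs):

```
mlem serve fastapi --model models/rf
```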
Thanks for sharing, OdedM. Have you used MLEM yourself in a real-world project?
Yes, I've deployed several projects with it, but in full disclosure, those were for testing and evaluation purposes, as I'm involved in the project.
The project itself is relatively young; it was open-sourced around mid-2022 and has been gaining good traction since.
Thank you for the article. What about TF Serving or TorchServe?
These are useful tools when the model you deploy is heavy, e.g. a large neural network, since they optimize serving concerns like inference latency. This matters in online deployment, but not in batch systems, of course.
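For comparison, a typical TF Serving launch uses the official Docker image (a sketch assuming a TensorFlow SavedModel already exported to a local directory; the `/path/to/my_model` path and `my_model` name are placeholders, and 8501 is TF Serving's REST port):

```
docker run -p 8501:8501 \
  --mount type=bind,source=/path/to/my_model,target=/models/my_model \
  -e MODEL_NAME=my_model -t tensorflow/serving
```

After that, predictions are served at http://localhost:8501/v1/models/my_model:predict, with no serving code written at all; the trade-off is that you're tied to TensorFlow models, whereas a tool like MLEM targets framework-agnostic deployment.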