Three phases so far:
- Design: discovering the problem and confirming that ML is the best solution for it
- Train: experiment tracking and the training pipeline ⇒ outputs the production-ready ML model
- Operate: deploying the model to production
To deploy a model, we first need to ask a question:
Do we need the predictions immediately, or can we wait an hour, a week, or a month?
Depending on the answer to that question:
- If we can wait, we should use the batch deployment mode, also called offline mode. The model is not up and running all the time; instead, we apply it to new data on a regular basis (every day, hour, week, …).
- If we cannot wait and need immediate predictions, we go with the online mode, where the model is up and running all the time, waiting for a query and returning a response/prediction immediately.
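A minimal sketch of the batch idea, to make it concrete. Everything here is an assumption for illustration: `predict` is a hypothetical stand-in for a trained model, `run_batch_job` is a made-up name, and the records are toy data. In a real setup a scheduler would trigger the job once per period.

```python
from datetime import date


def predict(features):
    """Hypothetical stand-in for a trained model (a fixed linear rule)."""
    return 2.5 * features["distance_km"] + 3.0


def run_batch_job(records, run_date):
    """Score all records collected since the last run in one go."""
    return [
        {
            "id": record["id"],
            "run_date": run_date.isoformat(),
            "prediction": predict(record),
        }
        for record in records
    ]


# Toy data standing in for the new records of one period.
new_records = [
    {"id": 1, "distance_km": 4.0},
    {"id": 2, "distance_km": 10.0},
]

# The model only "runs" while this job runs; nothing waits for queries.
results = run_batch_job(new_records, date(2024, 1, 1))
for row in results:
    print(row)
```

The point of the sketch is that the model is loaded, applied to the accumulated data, and then idle until the next scheduled run.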
There are two modes of online deployment:
- Web service
- Streaming
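A rough sketch of how the two online modes can differ, under some loud assumptions: in the web service style the caller sends a request and waits for the response, while in the streaming style events are put on a stream and a consumer scores them as it reads them. `predict`, `handle_request`, and `consume` are hypothetical names, and an in-memory `queue.Queue` stands in for a real event stream.

```python
import json
import queue


def predict(features):
    """Hypothetical stand-in for a trained model (a fixed linear rule)."""
    return 2.5 * features["distance_km"] + 3.0


def handle_request(body):
    """Web service style: one request in, one response out, caller waits."""
    features = json.loads(body)
    return json.dumps({"prediction": predict(features)})


def consume(stream):
    """Streaming style: read events off the stream and score each one."""
    outputs = []
    while not stream.empty():
        event = stream.get()
        outputs.append({"id": event["id"], "prediction": predict(event)})
    return outputs


# Web service: the client blocks until it gets the prediction back.
response = handle_request('{"distance_km": 4.0}')
print(response)

# Streaming: the producer pushes events and moves on; scoring happens separately.
stream = queue.Queue()
stream.put({"id": 1, "distance_km": 10.0})
stream.put({"id": 2, "distance_km": 2.0})
scored = consume(stream)
print(scored)
```

In both cases the model stays loaded and ready; what changes is whether the producer of the data waits for the prediction.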
