Once a model is in the registry, you need to expose it as an API. MLflow makes this easy with built-in serving tools.
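The commands below reference the model through a registry alias, models:/Iris_Classifier@champion. That alias must be assigned in the registry before serving; a minimal sketch, assuming version 1 of the registered Iris_Classifier model is the one you want to promote:

from mlflow import MlflowClient

client = MlflowClient()
# Point the "champion" alias at a concrete registered version
# (version 1 is an assumption here).
client.set_registered_model_alias(name="Iris_Classifier", alias="champion", version=1)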
1. Local HTTP Serving
You can spin up a REST API server with a single command. This is great for testing or low-latency local applications.
mlflow models serve -m "models:/Iris_Classifier@champion" --port 5001 --no-conda

(Note: MLflow 2.x removed the --no-conda flag; on recent versions, pass --env-manager local instead.)

2. Querying the Model
Use curl or Python’s requests library to get predictions.
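With curl, for example (a minimal sketch; the feature names are assumptions about the schema the model was trained with):

curl -X POST http://127.0.0.1:5001/invocations \
  -H "Content-Type: application/json" \
  -d '{"dataframe_split": {"columns": ["sepal_length", "sepal_width", "petal_length", "petal_width"], "data": [[5.1, 3.5, 1.4, 0.2]]}}'

The same request with Python's requests: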
import requests

# Payload in MLflow's "dataframe_split" format. The column names must match
# the schema the model was trained with; iris feature names are used here
# to match the Iris_Classifier model.
data = {
    "dataframe_split": {
        "columns": ["sepal_length", "sepal_width", "petal_length", "petal_width"],
        "data": [[5.1, 3.5, 1.4, 0.2]]
    }
}
response = requests.post("http://127.0.0.1:5001/invocations", json=data)
print(f"Prediction: {response.json()}")3. Docker Deployment¶
For production-scale deployment targets such as Kubernetes or AWS SageMaker, use Docker. MLflow can build a ready-to-run image for you.
mlflow models build-docker -m "models:/Iris_Classifier@champion" -n "iris-classifier-image"

Then run it:
docker run -p 8080:8080 iris-classifier-image
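The container exposes the same /invocations endpoint, now on port 8080. A quick smoke test (feature values are placeholders):

import requests

payload = {
    "dataframe_split": {
        "columns": ["sepal_length", "sepal_width", "petal_length", "petal_width"],
        "data": [[6.3, 2.9, 5.6, 1.8]],
    }
}
# Images built by mlflow models build-docker serve on port 8080 by default.
resp = requests.post("http://127.0.0.1:8080/invocations", json=payload)
print(resp.json())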