[Avg. reading time: 6 minutes]
Model Serving
mlflow server
Instantly turn a registered model into a REST API endpoint.
Make sure the MLflow tracking server from the earlier example is still running. If it is not, start it again:
mlflow server --host 127.0.0.1 --port 8080 \
--backend-store-uri sqlite:///mlflow.db
Windows
SET MLFLOW_TRACKING_URI=http://127.0.0.1:8080
macOS/Linux
export MLFLOW_TRACKING_URI=http://127.0.0.1:8080
Serve the Model
mlflow models serve \
-m "models:/Linear_Regression_Model/1" \
--host 127.0.0.1 \
--port 5001 \
--env-manager local
Use the Model
curl -X POST "http://127.0.0.1:5001/invocations" \
-H "Content-Type: application/json" \
--data '{"inputs": [{"ENGINESIZE": 2.0}, {"ENGINESIZE": 3.0}, {"ENGINESIZE": 4.0}]}'
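The same request can be sent from Python. The following is a minimal sketch using only the standard library; the helper names build_payload and predict are illustrative (not part of MLflow), and the URL assumes the model is being served on port 5001 as in the command above.

```python
import json
from urllib import request


def build_payload(engine_sizes):
    """Build the JSON body in the same "inputs" shape as the curl call."""
    return {"inputs": [{"ENGINESIZE": float(v)} for v in engine_sizes]}


def predict(engine_sizes, url="http://127.0.0.1:5001/invocations", timeout=10):
    """POST the payload to the scoring endpoint and return the parsed JSON reply."""
    body = json.dumps(build_payload(engine_sizes)).encode("utf-8")
    req = request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Raises urllib.error.URLError if the model server is not running.
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, predict([2.0, 3.0, 4.0]) sends the same three rows as the curl example.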
Pros
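- Zero-code serving: Just one CLI command; no need to build an API yourself.
- Auto-handles environment: Can recreate the model's logged dependencies in a virtualenv or conda environment. With --env-manager local, as used above, it runs in the current environment instead.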
- Ideal for testing and demos.
- Supports model URIs: registry (models:/<name>/<version>), run (runs:/<run_id>/<path>), or a local path.
Cons
- Single-threaded process.
- Limited customization.
- Minimal built-in monitoring.
- Not suited for blue-green / CI/CD promotion pipelines.
FastAPI
- Modern, high-performance Python web framework for building REST APIs.
- FastAPI turns Python functions into fully documented, high-performance REST APIs with minimal code.
- Built on ASGI (Asynchronous Server Gateway Interface).
- Designed for speed, type safety, and developer productivity.
Key Features
- Fast execution: Async by design; the FastAPI docs describe its performance as on par with Node.js and Go.
- Automatic validation: Uses Pydantic models to validate and parse JSON inputs.
- Auto-generated API docs: Swagger UI available at /docs, ReDoc at /redoc.
- Type hints = API schema: Python typing directly defines request/response schema.
- Easy to test & extend: Works great with Docker, CI/CD, and modern MLOps stacks.
- Supports both sync & async: Plain def endpoints run in a thread pool, so blocking ML inference does not stall async def endpoints in the same app.
export MLFLOW_TRACKING_URI=http://127.0.0.1:8080
Open uni_multi_model in VSCode
cd uni_multi_model
uvicorn fast_app:app --host 127.0.0.1 --port 5002
Uvicorn
- A Python application server: the process that accepts HTTP connections and runs your app code.
- A lightweight, lightning-fast ASGI server (ASGI = Asynchronous Server Gateway Interface).
- Uses uvloop (fast event loop) and httptools (HTTP parser) when they are installed, for example via pip install "uvicorn[standard]"; also supports WebSockets.
- Works great with FastAPI and Pydantic.
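To show what "ASGI server" means in practice, the sketch below is a bare ASGI application with no framework: an async callable that receives a scope and two channels, which is the interface Uvicorn calls. The call_once helper is a hypothetical stand-in for the server, used only to drive the app by hand.

```python
import asyncio


async def app(scope, receive, send):
    """Smallest useful ASGI app: answer every HTTP request with plain text."""
    if scope["type"] != "http":
        return  # this sketch only handles HTTP, not lifespan or WebSocket events
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"hello from ASGI"})


async def call_once(asgi_app, path="/"):
    """Play the server's role: call the app once and collect what it sends."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": path, "headers": []}
    await asgi_app(scope, receive, send)
    return sent
```

Saved as, say, asgi_demo.py (a hypothetical file name), the app could be started the same way as the FastAPI one: uvicorn asgi_demo:app --host 127.0.0.1 --port 5003. FastAPI builds this same interface for you from ordinary Python functions.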