[Avg. reading time: 6 minutes]

Model Serving

mlflow server

Instantly turn a registered model into a REST API endpoint.

Make sure the MLflow tracking server from the earlier example is still running:

mlflow server --host 127.0.0.1 --port 8080 \
--backend-store-uri sqlite:///mlflow.db
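
If the server is up, the MLflow tracking UI is available at http://127.0.0.1:8080. A quick smoke test from the shell (the /health endpoint is assumed to be available on this MLflow version):

curl http://127.0.0.1:8080/health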

Windows

SET MLFLOW_TRACKING_URI=http://127.0.0.1:8080

MAC/Linux

export MLFLOW_TRACKING_URI=http://127.0.0.1:8080

Serve the Model

mlflow models serve \
  -m "models:/Linear_Regression_Model/1" \
  --host 127.0.0.1 \
  --port 5001 \
  --env-manager local
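
The -m flag accepts any MLflow model URI, not only registry paths. For example, a model can be served straight from a run (the run ID below is a placeholder):

mlflow models serve \
  -m "runs:/<RUN_ID>/model" \
  --host 127.0.0.1 \
  --port 5001 \
  --env-manager local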

Use the Model

curl -X POST "http://127.0.0.1:5001/invocations" \
  -H "Content-Type: application/json" \
  --data '{"inputs": [{"ENGINESIZE": 2.0}, {"ENGINESIZE": 3.0}, {"ENGINESIZE": 4.0}]}'

Pros

  • Zero-code serving: Just one CLI command — no need to build an API yourself.
  • Auto-handles environment: Loads dependencies automatically.
  • Ideal for testing and demos.
  • Supports MLflow model URIs (e.g. models:/, runs:/).

Cons

  • Single-threaded process.
  • Limited customization.
  • Minimal built-in monitoring.
  • Not suited for blue-green or CI/CD promotion pipelines.

FastAPI

  • Modern, high-performance Python web framework for building REST APIs.

  • FastAPI turns Python functions into fully documented, high-performance REST APIs with minimal code.

  • Built on ASGI (Asynchronous Server Gateway Interface).

  • Designed for speed, type safety, and developer productivity.

Key Features

  • Fast execution: Comparable to Node.js & Go — async by design.
  • Automatic validation: Uses Pydantic models to validate and parse JSON inputs (illustrated in the fast_app sketch below).
  • Auto-generated API docs: Swagger UI available at /docs, ReDoc at /redoc.
  • Type hints = API schema: Python typing directly defines request/response schema.
  • Easy to test & extend: Works great with Docker, CI/CD, and modern MLOps stacks.
  • Supports both sync & async: You can mix blocking ML inference and async endpoints.

Before starting the FastAPI app, point it at the MLflow tracking server so it can load the registered model:

export MLFLOW_TRACKING_URI=http://127.0.0.1:8080

Open uni_multi_model in VS Code, then start the app:

cd uni_multi_model
uvicorn fast_app:app --host 127.0.0.1 --port 5002
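
fast_app.py lives in the course repo; a minimal sketch of what such an app might look like, assuming it serves the registered Linear_Regression_Model with a single ENGINESIZE feature (the /predict route and field names here are illustrative, not taken from the repo):

import mlflow.pyfunc
import pandas as pd
from fastapi import FastAPI
from pydantic import BaseModel

# Load the registered model once at startup (uses MLFLOW_TRACKING_URI)
model = mlflow.pyfunc.load_model("models:/Linear_Regression_Model/1")

app = FastAPI()

# Pydantic model: the type hint doubles as the request schema
class CarFeatures(BaseModel):
    ENGINESIZE: float

@app.post("/predict")
def predict(features: CarFeatures):
    # FastAPI has already validated and parsed the JSON body
    df = pd.DataFrame([features.model_dump()])  # Pydantic v2
    prediction = model.predict(df)
    return {"prediction": float(prediction[0])}

With the app running, it can be called the same way as the MLflow endpoint, and the auto-generated Swagger UI is available at http://127.0.0.1:5002/docs:

curl -X POST "http://127.0.0.1:5002/predict" \
  -H "Content-Type: application/json" \
  --data '{"ENGINESIZE": 2.0}'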

Uvicorn

  • An application server used to run Python web application code.
  • A lightweight, lightning-fast ASGI server (ASGI = Asynchronous Server Gateway Interface).
  • Built on uvloop (fast event loop) and httptools (HTTP parser), with native WebSocket support.
  • Works great with FastAPI and Pydantic.
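
Because Uvicorn is a standalone server, it also addresses the single-process limitation noted for mlflow models serve; the flags below are standard Uvicorn options (the worker count is just an example):

# Production-style: multiple worker processes
uvicorn fast_app:app --host 127.0.0.1 --port 5002 --workers 4

# Development: auto-reload on code changes (single worker)
uvicorn fast_app:app --host 127.0.0.1 --port 5002 --reload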

#modelserving #mlflow #fastapi

Ver 0.3.6

Last change: 2025-12-02