YAML
Introduction
- YAML Ain’t Markup Language.
- Human-readable alternative to JSON.
- Indentation is significant (as in Python).
- Used for configuration, not for programming logic.
Key Principles
- Whitespace indentation -> hierarchy (spaces only; tabs are not allowed for indentation)
- Colon (:) -> key-value pair
- Dash (-) -> list item
- Hash (#) -> comment
Use Cases in MLOps
- MLflow experiment configs (parameters, environments)
- Kubernetes -> Pods, Services, Deployments
- Docker Compose -> multi-container setups
- CI/CD pipelines -> GitHub Actions, GitLab CI, Azure DevOps
The same configuration in JSON:
{
  "experiment": "CO2_Regression",
  "params": {
    "alpha": 0.1,
    "max_iter": 100
  },
  "tags": ["linear_regression", "mlflow"]
}
And in YAML:
experiment: CO2_Regression
params:
  alpha: 0.1
  max_iter: 100
tags:
  - linear_regression
  - mlflow
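The JSON and YAML snippets above describe the same data. A minimal sketch, assuming the third-party PyYAML package is installed (`pip install pyyaml`), that parses both and compares the results:

```python
import json

import yaml  # PyYAML, a third-party package (assumed to be installed)

json_text = """
{
  "experiment": "CO2_Regression",
  "params": {"alpha": 0.1, "max_iter": 100},
  "tags": ["linear_regression", "mlflow"]
}
"""

yaml_text = """
experiment: CO2_Regression
params:
  alpha: 0.1
  max_iter: 100
tags:
  - linear_regression
  - mlflow
"""

from_json = json.loads(json_text)
from_yaml = yaml.safe_load(yaml_text)

# Both formats decode to the same nested dict/list structure.
print(from_json == from_yaml)
```

`yaml.safe_load` is used rather than `yaml.load` because it only builds plain Python types, which is usually what a configuration file needs.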
Validate YAML files with YAMLLint or a VS Code YAML validator extension.
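A syntax check can also be scripted. The sketch below assumes PyYAML is installed; `is_valid_yaml` is a hypothetical helper name, and it only checks that the text parses, not that the keys make sense for any particular tool.

```python
import yaml  # PyYAML, a third-party package (assumed to be installed)


def is_valid_yaml(text: str) -> bool:
    """Return True if `text` parses as YAML, False on a syntax error."""
    try:
        yaml.safe_load(text)
        return True
    except yaml.YAMLError:
        return False


good = "params:\n  alpha: 0.1\n  max_iter: 100\n"
# Same content, but the second nested key is indented by one space instead of two.
bad = "params:\n  alpha: 0.1\n max_iter: 100\n"

print(is_valid_yaml(good), is_valid_yaml(bad))
```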
YAML Data Structures
Scalars (strings, numbers, booleans)
learning_rate: 0.01
early_stopping: true
experiment_name: "CO2_Prediction"
Lists
models:
  - linear_regression
  - random_forest
  - xgboost
Dictionaries (maps)
params:
  n_estimators: 100
  max_depth: 5
Multi-line strings (literal block with |)
description: |
  This is a multi-line string.
  It preserves line breaks.
  Useful for comments/description/notes.
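To see how these structures arrive in application code, here is a small sketch, assuming PyYAML, that loads one document containing each kind of value and prints the Python type it was decoded to.

```python
import yaml  # PyYAML, a third-party package (assumed to be installed)

doc = yaml.safe_load("""
learning_rate: 0.01
early_stopping: true
experiment_name: "CO2_Prediction"
models:
  - linear_regression
  - random_forest
  - xgboost
params:
  n_estimators: 100
  max_depth: 5
description: |
  This is a multi-line string.
  It preserves line breaks.
""")

# Print each top-level key with the Python type its value was decoded to.
for key, value in doc.items():
    print(f"{key}: {type(value).__name__}")

# The literal block keeps its line breaks as "\n" characters.
print(repr(doc["description"]))
```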
Putting it together
experiment:
  name: CO2_Regression
  params:
    alpha: 0.1
    max_iter: 100
  metrics:
    - mse
    - r2
  description: |
    Model built using Linear Regression.
    We can use univariate or multivariate.
environments:
  development:
    database: sqlite
  production:
    database: mysql
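Reading values back out of such a file is plain dictionary and list access. The sketch below assumes PyYAML and assumes that `environments` sits at the top level next to `experiment`; `database_for` is a hypothetical helper introduced only for illustration.

```python
import yaml  # PyYAML, a third-party package (assumed to be installed)

config_text = """
experiment:
  name: CO2_Regression
  params:
    alpha: 0.1
    max_iter: 100
  metrics:
    - mse
    - r2
environments:
  development:
    database: sqlite
  production:
    database: mysql
"""

config = yaml.safe_load(config_text)


def database_for(env_name: str) -> str:
    """Look up the database configured for one environment (hypothetical helper)."""
    return config["environments"][env_name]["database"]


# Nested keys become nested dicts; the metrics list becomes a Python list.
alpha = config["experiment"]["params"]["alpha"]
metrics = config["experiment"]["metrics"]
print(alpha, metrics, database_for("production"))
```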
Default Values
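Define an anchor with &anchorName, reference it with *anchorName, and pull its keys into another mapping with the merge key <<. Keys set after the merge override the inherited values.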
base_config: &base
  host: localhost
  port: 3306
development:
  <<: *base
  database: dev_db
production:
  <<: *base
  database: prod_db
  host: prod.server.com
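The effect of the anchor and merge key is easiest to see after loading. A minimal sketch, assuming PyYAML, which supports the merge key (it is a YAML 1.1 convention, so other parsers may not):

```python
import yaml  # PyYAML, a third-party package (assumed to be installed)

text = """
base_config: &base
  host: localhost
  port: 3306

development:
  <<: *base
  database: dev_db

production:
  <<: *base
  database: prod_db
  host: prod.server.com
"""

cfg = yaml.safe_load(text)

# development inherits host and port from the anchored base mapping.
print(cfg["development"])
# production inherits port but overrides host with its own value.
print(cfg["production"])
```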
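Using Environment Variables
YAML itself does not expand environment variables; the tool or application that loads the file has to substitute ${USERPROFILE}.
config:
  path: ${USERPROFILE}\folder1
Mac/Linux/Git Bash
export USERPROFILE="sometext"
Command Prompt (no quotes, because cmd stores them as part of the value)
set USERPROFILE=sometext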
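Because the YAML parser leaves the `${USERPROFILE}` placeholder untouched, the loading code has to fill it in. One possible approach, sketched here under the assumption that PyYAML is installed, is to expand variables on the raw text with the standard library's `os.path.expandvars` and parse afterwards. The variable is set inside the script only to keep the example reproducible.

```python
import os

import yaml  # PyYAML, a third-party package (assumed to be installed)

# Hypothetical value, set here only so the example behaves the same everywhere.
os.environ["USERPROFILE"] = "sometext"

raw = "config:\n  path: ${USERPROFILE}\\folder1\n"

# Substitute ${VAR} placeholders in the raw text first, then parse the result.
expanded = os.path.expandvars(raw)
config = yaml.safe_load(expanded)

print(config["config"]["path"])
```

Expanding before parsing is simple, but a value containing YAML-special characters could change the structure of the document; expanding each string after parsing avoids that at the cost of a little more code.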
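YAML Variables
YAML has no built-in variables. References such as ${variables.base_url} are a convention that the loading application or a configuration library must resolve.
variables:
  base_url: http://example.com
endpoints:
  user: ${variables.base_url}/user
  admin: ${variables.base_url}/admin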
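A plain YAML parser returns the `${variables.base_url}` references as literal text, so something has to resolve them after loading. The sketch below is one hypothetical way to do that with PyYAML and a regular expression; `lookup` and `resolve` are illustrative names, not part of any library, and nested references are not handled.

```python
import re

import yaml  # PyYAML, a third-party package (assumed to be installed)

text = """
variables:
  base_url: http://example.com
endpoints:
  user: ${variables.base_url}/user
  admin: ${variables.base_url}/admin
"""

data = yaml.safe_load(text)
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def lookup(path: str):
    """Follow a dotted path such as 'variables.base_url' through the loaded data."""
    node = data
    for part in path.split("."):
        node = node[part]
    return node


def resolve(value):
    """Return a copy of `value` with every ${dotted.path} replaced by its target."""
    if isinstance(value, dict):
        return {key: resolve(item) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve(item) for item in value]
    if isinstance(value, str):
        return PLACEHOLDER.sub(lambda match: str(lookup(match.group(1))), value)
    return value


resolved = resolve(data)
print(resolved["endpoints"])
```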
https://github.com/gchandra10/python_yaml_demo.git